Error when creating external table in Athena: "mismatched input 'EXTERNAL'. Expecting: 'MATERIALIZED', 'OR', 'ROLE', 'SCHEMA', 'TABLE', 'VIEW'"

0

Hello, I’m new to Athena so I’m not sure if this is the appropriate place to post this.

I’m trying to import Azure data into Athena, so that it can be ingested downstream by AWS Quicksight. I’ve downloaded a usage details .csv file from Azure, placed it in S3, and am trying to create an external table in Athena, based on this .csv file.

Here’s the error I receive when trying to create an external table based on .csv file (when clicking on the explain button within Athena):

line 1:48: mismatched input 'EXTERNAL'. Expecting: 'MATERIALIZED', 'OR', 'ROLE', 'SCHEMA', 'TABLE', 'VIEW' This query ran against the "<masking my database name>" database, unless qualified by the query.

I’ve tried executing this both through a manually written SQL statement and through the “Create table from S3 bucket data” feature. Both leading to the above error. I even tried converting all data types to string (thinking it could be some kind of data type mismatch), but to no avail.

I’d greatly appreciate any help in troubleshooting this, as the error message feels vague.

My create table statement (I masked the DB name and S3 location for confidentiality reasons):

CREATE EXTERNAL TABLE IF NOT EXISTS <masked for reasons of confidentiality>.azure_usage_apr_2023 ( BillingAccountId string, BillingAccountName string, BillingPeriodStartDate date, BillingPeriodEndDate date, BillingProfileId string, BillingProfileName string, AccountOwnerId string, AccountName string, SubscriptionId string, SubscriptionName string, Date date, Product string, PartNumber string, MeterId string, ServiceFamily string, MeterCategory string, MeterSubCategory string, MeterRegion string, MeterName string, Quantity double, EffectivePrice double, Cost double, UnitPrice double, BillingCurrency string, ResourceLocation string, AvailabilityZone string, ConsumedService string, ResourceId string, ResourceName string, ServiceInfo1 string, ServiceInfo2 string, AdditionalInfo string, Tags string, InvoiceSectionId date, InvoiceSection string, CostCenter string, UnitOfMeasure string, ResourceGroup string, ReservationId string, ReservationName string, ProductOrderId string, ProductOrderName string, OfferId string, IsAzureCreditEligible string, Term string, PublisherName string, PlanName string, ChargeType string, Frequency string, PublisherType string, PayGPrice double, PricingModel string, CostAllocationRuleName string, benefitId string, benefitName string ) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ('field.delim' = ',') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '<S3 location is masked for reasons of confidentiality>' TBLPROPERTIES ( 'classification' = 'csv', 'skip.header.line.count' = '1' );

Column headers of .csv file (I’m not posting data for reasons of confidentiality – it’s a standard Azure usage details format):

BillingAccountId,BillingAccountName,BillingPeriodStartDate,BillingPeriodEndDate,BillingProfileId,BillingProfileName,AccountOwnerId,AccountName,SubscriptionId,SubscriptionName,Date,Product,PartNumber,MeterId,ServiceFamily,MeterCategory,MeterSubCategory,MeterRegion,MeterName,Quantity,EffectivePrice,Cost,UnitPrice,BillingCurrency,ResourceLocation,AvailabilityZone,ConsumedService,ResourceId,ResourceName,ServiceInfo1,ServiceInfo2,AdditionalInfo,Tags,InvoiceSectionId,InvoiceSection,CostCenter,UnitOfMeasure,ResourceGroup,ReservationId,ReservationName,ProductOrderId,ProductOrderName,OfferId,IsAzureCreditEligible,Term,PublisherName,PlanName,ChargeType,Frequency,PublisherType,PayGPrice,PricingModel,CostAllocationRuleName,benefitId,benefitName

JR
asked a year ago1334 views
1 Answer
0

Have you checked the documentation for limitations of SERDE for CSV (Comma-Separated Values) ? Refer: https://docs.aws.amazon.com/athena/latest/ug/supported-serdes.html Snippet here: Use the LazySimpleSerDe for CSV, TSV, and custom-delimited files if your data does not include values enclosed in quotes or if it uses the java.sql.Timestamp format.

Creating an external table manually in Athena could be error prone, mistake with properties or empty properties might also cause this error , would you consider the use of AWS Glue Crawler? It will update the Glue Data Catalog tables for you and is fully integrated with Athena tables and views. You can then also easily review the logs and errors in AWS Glue console.

AWS
Chan_N
answered 10 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions