Saving results of a query which uses an external function into a new table in Athena v.3

2

I'm following this tutorial to perform text analytics using UDFs. https://aws.amazon.com/blogs/machine-learning/translate-and-analyze-text-using-sql-functions-with-amazon-athena-amazon-translate-and-amazon-comprehend/

The example where data are not saved works fine work fine in Athena v.3 USING EXTERNAL FUNCTION detect_sentiment_all(col1 VARCHAR, lang VARCHAR) RETURNS VARCHAR LAMBDA 'textanalytics-udf', EXTERNAL FUNCTION detect_entities_all(col1 VARCHAR, lang VARCHAR) RETURNS VARCHAR LAMBDA 'textanalytics-udf' SELECT *, detect_sentiment_all(review_body, language) AS sentiment, detect_entities_all(review_body, language) AS entities FROM amazon_reviews_with_language WHERE language IN ('ar', 'hi', 'ko', 'zh-TW', 'ja', 'zh', 'de', 'pt', 'en', 'it', 'fr', 'es')

However, the example where a table is created to save the results does not work in Athena v.3, but it works in Athena v.2.

CREATE TABLE amazon_reviews_with_text_analysis WITH (format='parquet') AS USING EXTERNAL FUNCTION detect_sentiment_all(col1 VARCHAR, lang VARCHAR) RETURNS VARCHAR LAMBDA 'textanalytics-udf', EXTERNAL FUNCTION detect_entities_all(col1 VARCHAR, lang VARCHAR) RETURNS VARCHAR LAMBDA 'textanalytics-udf' SELECT *, detect_sentiment_all(review_body, language) AS sentiment, detect_entities_all(review_body, language) AS entities FROM amazon_reviews_with_language WHERE language IN ('ar', 'hi', 'ko', 'zh-TW', 'ja', 'zh', 'de', 'pt', 'en', 'it', 'fr', 'es')

It fails with an error: line 2:1: mismatched input 'USING'. Expecting: '(', <query>

The question is how can I save data into a table when using external function in Athena v.3?

MikeW
질문됨 일 년 전352회 조회
1개 답변
0

Hello,

Please put the 'USING' clause before the 'CREATE TABLE'

Please ensure that the database has a location associated in AWS Glue.

For example:

USING EXTERNAL FUNCTION detect_dominant_language(col1 VARCHAR) RETURNS VARCHAR LAMBDA 'textanalytics-udf'

CREATE TABLE amazon_reviews_with_language WITH ( format='parquet' ) AS

SELECT *, detect_dominant_language(review_body) AS language

FROM amazon_reviews_parquet

LIMIT 5000

References:

https://docs.aws.amazon.com/athena/latest/ug/create-table.html

https://docs.amazonaws.cn/en_us/athena/latest/ug/create-table-as.html

https://docs.aws.amazon.com/athena/latest/ug/alter-table-set-location.html

AWS
지원 엔지니어
답변함 일 년 전
  • Thanks, mate. Appreciate the suggestion. I'm using the database that was created previously. The location was not set in the Glue Catalog, so I set it. Then I moved 'USING' clause before 'CREATE TABLE' and now I get the following error: GENERIC_INTERNAL_ERROR: Can not create a Path from an empty string

    I also tried adding a location in the 'CREATE TABLE' statement like this:

    USING EXTERNAL FUNCTION detect_dominant_language(col1 VARCHAR) RETURNS VARCHAR LAMBDA 'textanalytics-udf' CREATE TABLE amazon_reviews_with_language WITH ( format='parquet' ) LOCATION 's3://my-bucket/db_path/amazon_reviews_with_language' AS SELECT *, detect_dominant_language(review_body) AS language FROM amazon_reviews_parquet LIMIT 5000

    But it is not working in Athena v.3: Error: mismatched input 'LOCATION'. Expecting: '%', ')', '*', '+', ',', '-', '.', '/', 'AND', 'AS', 'AT', 'OR', '[', '||', <EOF>, <predicate>

  • I'm also getting errors in Engine V3 with a Lambda UDF + CTAS statement. My current workaround is to revert to Engine V2 and place USING EXTERNAL FUNCTION after CREATE TABLE AS.

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠