SQL Query Transform | AWS Glue Job

0

I am transforming my table by adding new columns using SQL Query Transform in AWS Glue Job studio.

visual diagram for transformation

SQL aliases- study

Existing Schema from data catalog - study id, patient id, patient age

I want to transform the existing schema by adding new columns.

new columns - AccessionNo

Transformed schema - study id, patient id, patient age, AccessionNo

SQL query - alter table study add columns (AccessionNo int)

Error it gives- pyspark.sql.utils.AnalysisException: Invalid command: 'study' is a view not a table.; line 2 pos 0; 'AlterTable V2SessionCatalog(spark_catalog), default.study, 'UnresolvedV2Relation [study], V2SessionCatalog(spark_catalog), default.study, [org.apache.spark.sql.connector.catalog.TableChange$AddColumn@1e7cbfec]

I tried looking at AWS official doc for SQL transform and it says queries should be in Spark Sql syntax and my query is also in Spark Sql syntax. https://docs.aws.amazon.com/glue/latest/ug/transforms-sql.html

What is the exact issue and please help me resolve. Thanks

Prabhu
질문됨 일 년 전860회 조회
1개 답변
0
수락된 답변

A DDL like that is mean to alter and actual catalog table, not an immutable view like is "study".
You also have to add some content to the column (even if it's a NULL placeholder you will fill in later).
Even better if you can set the value you need here using other columns.
For instance:

select *, 0 as AccessionNo from study
profile pictureAWS
전문가
답변함 일 년 전
  • Pls accept apologies for the late response as I was busy with MVP deadlines. It actually worked. Thanks a lot @Gonzalo Herreros.

    So, a follow up question. when a data catalog table is getting transformed with SQL transform, does the catalog table always referenced as a View?

  • They are separate things, a DDL (ALTER TABLE) updates the table, all the other transformations just exists while you make the query

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인