SQL Query Transform | AWS Glue Job

0

I am transforming my table by adding new columns using SQL Query Transform in AWS Glue Job studio.

visual diagram for transformation

SQL aliases- study

Existing Schema from data catalog - study id, patient id, patient age

I want to transform the existing schema by adding new columns.

new columns - AccessionNo

Transformed schema - study id, patient id, patient age, AccessionNo

SQL query - alter table study add columns (AccessionNo int)

Error it gives- pyspark.sql.utils.AnalysisException: Invalid command: 'study' is a view not a table.; line 2 pos 0; 'AlterTable V2SessionCatalog(spark_catalog), default.study, 'UnresolvedV2Relation [study], V2SessionCatalog(spark_catalog), default.study, [org.apache.spark.sql.connector.catalog.TableChange$AddColumn@1e7cbfec]

I tried looking at AWS official doc for SQL transform and it says queries should be in Spark Sql syntax and my query is also in Spark Sql syntax. https://docs.aws.amazon.com/glue/latest/ug/transforms-sql.html

What is the exact issue and please help me resolve. Thanks

Prabhu
已提問 1 年前檢視次數 860 次
1 個回答
0
已接受的答案

A DDL like that is mean to alter and actual catalog table, not an immutable view like is "study".
You also have to add some content to the column (even if it's a NULL placeholder you will fill in later).
Even better if you can set the value you need here using other columns.
For instance:

select *, 0 as AccessionNo from study
profile pictureAWS
專家
已回答 1 年前
  • Pls accept apologies for the late response as I was busy with MVP deadlines. It actually worked. Thanks a lot @Gonzalo Herreros.

    So, a follow up question. when a data catalog table is getting transformed with SQL transform, does the catalog table always referenced as a View?

  • They are separate things, a DDL (ALTER TABLE) updates the table, all the other transformations just exists while you make the query

您尚未登入。 登入 去張貼答案。

一個好的回答可以清楚地回答問題並提供建設性的意見回饋,同時有助於提問者的專業成長。

回答問題指南