Questions tagged with Amazon Redshift


Trying to track down a Redshift Assert error

Since today we have been getting this error; the DBT/SQL was not touched. I'm receiving errors in many processes with this same message. A full refresh in DBT clears it, but when the first incremental runs, it starts to fail again with the same error, and the "context" is not helping. I ran the query manually and had no trouble at all. It's a query that creates a table, deletes from the destination table (with a WHERE), and inserts into it, all managed by DBT that hasn't been modified for more than 6 months; the query runs daily and also hasn't changed in the last months. The only thing that I think changed was the Redshift version via auto-maintenance yesterday, but I can't do a rollback to validate, or can I? I was on the "current" maintenance track and just switched to "trailing"; now I'm on 1.0.38361 and can't find the changelog for that version.

```
20:26:44 |
20:26:44 | 1 of 1 START incremental model xxxxxx......... [RUN]
20:26:51 | 1 of 1 ERROR creating incremental model xxxxxx [ERROR in 6.44s]
20:26:51 |
20:26:51 | Finished running 1 incremental model in 8.55s.

Completed with 1 error and 0 warnings:

Database Error in model xxxxx (models/xxxxx.sql)
  Assert
  DETAIL:
  -----------------------------------------------
  error:    Assert
  code:     1000
  context:  size >= 0 - size=-2, 0=0.
  query:    62706875
  location: tbl_trans.cpp:867
  process:  padbmaster [pid=70356]
  -----------------------------------------------
  compiled SQL at target/run/xxxxx/xxxxx.sql

Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
```

Any idea what could be happening? Thanks!
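For reference, the maintenance track switch can also be done from the CLI; a minimal sketch with a placeholder cluster name (as far as I know this only affects which patches future maintenance windows apply, so I'm not sure it amounts to a rollback):

```
# Switch the cluster's maintenance track from "current" to "trailing";
# "my-cluster" is a placeholder identifier.
aws redshift modify-cluster \
    --cluster-identifier my-cluster \
    --maintenance-track-name trailing
```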
1 answer · 0 votes · 2 views · AWS-User-3279708 · asked 12 hours ago

Is there a way to create a Redshift table from a Glue table's schema?

Athena tables can be created from Glue tables, which can have schemas based on crawlers. **Is it also possible to use the schema of a Glue table to generate a *Redshift-compatible* `CREATE TABLE` statement?** I tried `SHOW CREATE TABLE encounter;` in Athena, and then tried plugging the resulting `CREATE TABLE` statement into Redshift, but got an error:

```
ERROR: syntax error at or near "`" Position: 23.
```

I can go through the statement Athena generated and clean it up to fit Redshift requirements, like taking out the backticks, but I'm wondering if there's a more direct way to generate a table based on a Glue table. This is the `CREATE TABLE` statement that Athena generated:

```
CREATE EXTERNAL TABLE `encounter`(
  `resourcetype` string COMMENT 'from deserializer',
  `id` string COMMENT 'from deserializer',
  `meta` struct<lastupdated:string,profile:array<string>> COMMENT 'from deserializer',
  `identifier` array<struct<use:string,system:string,value:string>> COMMENT 'from deserializer',
  `status` string COMMENT 'from deserializer',
  `class` struct<system:string,code:string> COMMENT 'from deserializer',
  `type` array<struct<coding:array<struct<system:string,code:string,display:string>>,text:string>> COMMENT 'from deserializer',
  `subject` struct<reference:string,display:string> COMMENT 'from deserializer',
  `participant` array<struct<type:array<struct<coding:array<struct<system:string,code:string,display:string>>,text:string>>,period:struct<start:string,end:string>,individual:struct<reference:string,display:string>>> COMMENT 'from deserializer',
  `period` struct<start:string,end:string> COMMENT 'from deserializer',
  `location` array<struct<location:struct<reference:string,display:string>>> COMMENT 'from deserializer',
  `serviceprovider` struct<reference:string,display:string> COMMENT 'from deserializer',
  `reasoncode` array<struct<coding:array<struct<system:string,code:string,display:string>>>> COMMENT 'from deserializer')
ROW FORMAT SERDE
  'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
  'paths'='class,id,identifier,location,meta,participant,period,reasonCode,resourceType,serviceProvider,status,subject,type')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://bucket/Encounter/'
TBLPROPERTIES (
  'CrawlerSchemaDeserializerVersion'='1.0',
  'CrawlerSchemaSerializerVersion'='1.0',
  'UPDATED_BY_CRAWLER'='healthlake-export-crawler',
  'averageRecordSize'='1561',
  'classification'='json',
  'compressionType'='none',
  'objectCount'='14',
  'recordCount'='53116',
  'sizeKey'='83059320',
  'typeOfData'='file')
```

Here's an example of what the original data looks like (it's synthetic data, so not PHI):

```
{
  "period": {
    "start": "2019-11-18T13:53:49-08:00",
    "end": "2019-11-18T14:23:49-08:00"
  },
  "subject": {
    "reference": "Patient/92e36d1e-66a2-4e77-9f50-155f7edf819c",
    "display": "Cyndi533 Bogan287"
  },
  "serviceProvider": {
    "reference": "Organization/3ecb1bdd-03d7-3fd2-b52d-8df2a04f5b0a",
    "display": "SOUTH SHORE SKIN CENTER, LLC"
  },
  "id": "b39745ae-14dd-46b3-9345-2916efa759ad",
  "type": [{
    "coding": [{
      "system": "http://snomed.info/sct",
      "code": "410620009",
      "display": "Well child visit (procedure)"
    }],
    "text": "Well child visit (procedure)"
  }],
  "class": {
    "system": "http://terminology.hl7.org/CodeSystem/v3-ActCode",
    "code": "AMB"
  },
  "participant": [{
    "period": {
      "start": "2019-11-18T13:53:49-08:00",
      "end": "2019-11-18T14:23:49-08:00"
    },
    "individual": {
      "reference": "Practitioner/c51e847b-fcd0-3f98-98a7-7e4274a2e6f3",
      "display": "Dr. Jacquelyne425 O'Reilly797"
    },
    "type": [{
      "coding": [{
        "system": "http://terminology.hl7.org/CodeSystem/v3-ParticipationType",
        "code": "PPRF",
        "display": "primary performer"
      }],
      "text": "primary performer"
    }]
  }],
  "resourceType": "Encounter",
  "status": "finished",
  "meta": {
    "lastUpdated": "2022-04-08T15:40:39.926Z"
  }
}
```
2 answers · 0 votes · 7 views · yann · asked 20 days ago

How to use psycopg2 to load data into Redshift tables with the COPY command

I am trying to load data from an EC2 instance into Redshift tables but cannot figure out how to do this using the COPY command. I have tried the following to create the SQL queries:

```
from psycopg2 import sql

def copy_query_creator(table_name, schema):
    copy_sql_template = sql.SQL(
        "COPY {table_name} from stdin iam_role 'iam_role' DATEFORMAT 'MM-DD-YYYY' "
        "TIMEFORMAT 'MM-DD-YYYY HH12:MI:SS AM' ACCEPTINVCHARS fixedwidth {schema}"
    ).format(table_name=sql.Identifier(table_name), schema=schema)
    return copy_sql_template
```

and

```
def copy_query_creator_2(table_name, iam_role, schema):
    copy_sql_base = """
    COPY {}
    FROM STDIN
    iam_role {}
    DATEFORMAT 'MM-DD-YYYY'
    TIMEFORMAT 'MM-DD-YYYY HH12:MI:SS AM'
    ACCEPTINVCHARS
    fixedwidth {}""".format(table_name, iam_role, schema)
    print(copy_sql_base)
    return copy_sql_base
```

where schema is the fixedwidth_spec in the example snippet below:

```
copy table_name
from 's3://mybucket/prefix'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
fixedwidth 'fixedwidth_spec';
```

The function that uses the created query looks like so:

```
def copy_query(self, filepath):
    schema = Query.create_schema()      # returns the formatted fixedwidth_spec
    table_name = Query.get_table_def()  # returns the table_name
    print(copy_query_creator_2(table_name, iam_role, schema))
    self.connect()
    with self.connection.cursor() as cursor:
        try:
            with open(filepath) as f:
                cursor.copy_expert(copy_query_creator_2(table_name, iam_role, schema), f)
            print('copy worked')
            logging.info(f'{copy_query_creator_2(table_name, iam_role, schema)} ran; {cursor.rowcount} records copied.')
        except (Exception, psycopg2.Error) as error:
            logging.error(error)
            print(error)
```

The two attempts return errors. The first returns 'Composed elements must be Composable, got %r instead', while the latter returns 'error at or near STDIN'. Please help.
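Redshift's COPY loads from S3 (or DynamoDB/EMR/SSH hosts) rather than from a psycopg2 STDIN stream, so one common pattern is to stage the file in S3 first. A minimal sketch of that route, with placeholder bucket/role names:

```
# A minimal sketch, not production code: stage the file in S3, then issue a
# regular Redshift COPY through psycopg2. Bucket/table/role names are placeholders.
import boto3
import psycopg2

def copy_via_s3(conn, filepath, table_name, fixedwidth_spec):
    s3 = boto3.client("s3")
    key = f"staging/{table_name}"
    s3.upload_file(filepath, "my-staging-bucket", key)

    copy_stmt = (
        f"COPY {table_name} "
        f"FROM 's3://my-staging-bucket/{key}' "
        "IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole' "
        "DATEFORMAT 'MM-DD-YYYY' "
        "TIMEFORMAT 'MM-DD-YYYY HH12:MI:SS AM' "
        "ACCEPTINVCHARS "
        f"FIXEDWIDTH '{fixedwidth_spec}'"
    )
    with conn.cursor() as cur:
        cur.execute(copy_stmt)
    conn.commit()
```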
0 answers · 0 votes · 3 views · AWS-User-8813229 · asked 20 days ago

Redshift stored procedure transaction keeps state after commit that leads to ERROR 1023 Serializable isolation violation

This stored procedure (excerpt) reliably generates ERROR: 1023 DETAIL: Serializable isolation violation on table tbl_process! If only a single invocation occurs (Fast or Slow), the process always completes successfully, inserting two rows (including updating the first). However, if two overlapping invocations occur, with the first being a "Slow" process followed (before completing) by a "Fast" process, then the "Fast" process always completes successfully whilst the "Slow" process generates an exception (always mentioning 3 transactions forming the cycle). Based on the Redshift documentation I understood the BEGIN LOCK COMMIT END construct should prevent this! It appears the scalar var_process_start used in the UPDATE statement has retained some state from Transaction #1 that causes Transaction #2 to fail - but only if any other process's transactions have been committed to the table. Note that the LOCK or INSERT in Transaction #2 does not fail - only the UPDATE. How can this exception be avoided?

```
DECLARE
    var_process_start INTEGER ;
BEGIN
    BEGIN -- TRANSACTION #1
        LOCK TABLE tbl_process ;
        INSERT INTO tbl_process ( process_user , process_pid , start_tstp )
            VALUES ( CURRENT_USER , PG_BACKEND_PID() , GETDATE() ) ;
        -- Collect the tbl_process identity value for #1
        SELECT INTO var_process_start MAX(process_event) FROM tbl_process ;
        COMMIT ;
    END ; -- TRANSACTION #1

    /* Slow or Fast running processing here */
    /* Sets var_start_tstp and var_finish_tstp values */

    BEGIN -- TRANSACTION #2
        -- Avoid serializable isolation violations and deadlocks
        LOCK TABLE tbl_process ;
        -- Commit new process entry #2
        INSERT INTO tbl_process ( process_user , process_pid , start_tstp , finish_tstp )
            VALUES ( CURRENT_USER , PG_BACKEND_PID() , var_start_tstp , var_finish_tstp ) ;
        IF var_process_start > 0 THEN
            -- Record the end of this processing run on #1
            UPDATE tbl_process
               SET finish_tstp = GETDATE()
             WHERE process_event = var_process_start ;
            -- ^^^ this statement generates the serializable isolation violation!
        END IF ;
        -- Finished - commit all work to database
        COMMIT ;
    END ; -- TRANSACTION #2
```
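One restructuring I've been weighing (a sketch only, not verified against this failure mode, and it changes the bookkeeping: each run owns a single row instead of writing a second one) keys the closing UPDATE to the row this backend inserted, so it never touches rows written by other processes:

```
-- Sketch: one row per invocation, keyed by this backend's PID, so the
-- closing UPDATE only touches the row this process inserted.
-- Not verified against the failure above; it drops the second row entirely.
BEGIN ;                             -- TRANSACTION #1
LOCK TABLE tbl_process ;
INSERT INTO tbl_process ( process_user , process_pid , start_tstp )
    VALUES ( CURRENT_USER , PG_BACKEND_PID() , GETDATE() ) ;
COMMIT ;

/* Slow or Fast processing here */

BEGIN ;                             -- TRANSACTION #2
LOCK TABLE tbl_process ;
UPDATE tbl_process
   SET finish_tstp = GETDATE()
 WHERE process_pid = PG_BACKEND_PID()
   AND finish_tstp IS NULL ;
COMMIT ;
```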
1 answer · 0 votes · 6 views · Frank Hamersley · asked a month ago

Redshift JDBC driver getdate() returns time in server time zone instead of UTC

The Java code mentioned below was working fine with Redshift JDBC driver 1.2, but now shows erroneous behaviour with Redshift JDBC driver 2.1. Can anyone confirm whether a workaround is possible in this Java code, or whether this is a bug in driver 2.1 itself?

## Information about the Java code

* In a timestamp data-typed column, the value is inserted using "getdate()".
* It is supposed to add the time in UTC (without timezone information).
* How the insert query is written:

```
connection.prepareStatement("INSERT INTO sampleTable (tsColumn, ...) VALUES (getdate(), ...)");
```

### JDBC driver 1.2.41.1065 inserts the correct date in UTC

* With the JDBC 4.2-compatible driver 1.2.41.1065 (https://s3.amazonaws.com/redshift-downloads/drivers/jdbc/1.2.41.1065/RedshiftJDBC42-no-awssdk-1.2.41.1065.jar, see https://docs.aws.amazon.com/redshift/latest/mgmt/configure-jdbc-connection.html#jdbc-previous-versions), the query inserts the value in UTC: "2022-04-18 12:38:25".
* This is correct as per https://docs.aws.amazon.com/redshift/latest/dg/r_GETDATE.html.

### JDBC driver 2.1.0.5 inserts the date in PDT (server time zone), which is INCORRECT

* With the JDBC 4.2-compatible driver version 2.1 without the AWS SDK (https://s3.amazonaws.com/redshift-downloads/drivers/jdbc/2.1.0.5/redshift-jdbc42-2.1.0.5.jar, see https://docs.aws.amazon.com/redshift/latest/mgmt/jdbc20-download-driver.html), the value inserted is "2022-04-18 05:38:25". It is in PDT without timezone information.
* The driver is automatically converting the UTC time to the PDT timezone where the Java code is running. This is incorrect.
* Expected: the value inserted should be the UTC time.

#### Workaround attempted

* I attempted to modify the getdate() part of the query as follows, but it didn't work:

```
convert_timezone('UTC', getdate())
```

### Questions

* Is this a bug in driver 2.1?
* Is there any workaround/configuration/code-fix possible in the Java code explained above?

### Related

* Also posted on Stack Overflow: https://stackoverflow.com/q/71924153/4270739
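One client-side workaround I'm considering (a sketch only, under the assumption that binding the timestamp explicitly with a UTC Calendar sidesteps the driver's zone conversion; not verified against 2.1.0.5): compute the current time in Java and bind it instead of calling getdate() in SQL:

```
// A sketch of a client-side workaround: bind an explicit UTC timestamp
// rather than relying on getdate() through the driver. Assumes the UTC
// Calendar overload bypasses the driver's local-zone conversion.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.Timestamp;
import java.util.Calendar;
import java.util.TimeZone;

public class UtcInsert {
    public static void insertRow(Connection connection) throws Exception {
        try (PreparedStatement ps = connection.prepareStatement(
                "INSERT INTO sampleTable (tsColumn) VALUES (?)")) {
            Calendar utc = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
            // The Calendar argument tells the driver which zone to use
            // when rendering the bound value.
            ps.setTimestamp(1, new Timestamp(System.currentTimeMillis()), utc);
            ps.executeUpdate();
        }
    }
}
```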
1 answer · 0 votes · 4 views · Hitesh · asked a month ago

AWS Redshift Maintenance Announcement (January 9th - March 24th 2022)

**Major Version (01/19/2022)**

We will be patching your Amazon Redshift clusters during your system maintenance window in the coming weeks. The timing of the patch will depend on your region and maintenance window settings. You can view or change your maintenance window settings from the AWS Management Console. After your cluster has been patched, the new cluster version will be Build 1.0.34934. Please contact us at redshift-feedback@amazon.com if you have any questions.

This version includes the following new features and improvements:

• Amazon Redshift: The database users that are automatically created when using the GetClusterCredentials API to get temporary credentials will now be created with the "IAM:" prefix in the database.
• Amazon Redshift: User names from this version on are treated as case sensitive when surrounded by double quotes, irrespective of the value of the configuration parameter "enable_case_sensitive_identifier".
• Amazon Redshift: Customers can use ST_Intersection to calculate the intersection of two geometries.
• Amazon Redshift: The spatial reference system id (SRID) is now discovered automatically during COPY of shapefiles when a .prj file exists next to the .shp file.
• Amazon Redshift: Customers can combine two sketches that appear in separate columns into one.
• Amazon Redshift: Added support for the AUTO table property for distribution style in CTAS.
• Amazon Redshift: Customers can use the ST_GeoHash function to return the geohash value of an input point.
• Amazon Redshift: Amazon Redshift ML now supports unsupervised training with K-Means clustering.
• Amazon Redshift: Added support for converting interleaved sort keys to compound sort keys or no sort key with the ALTER SORTKEY command.

Additionally, the following fixes are included:

• With the performance improvement for the numeric datatype, this resolves a sig11 issue for all customers using drivers that send numerics to the Redshift server in binary mode, such as the .NET driver.

---

**Minor Versions**

**(02/07/2022)**

We will be patching your Amazon Redshift clusters during your system maintenance window in the coming weeks. The timing of the patch will depend on your region and maintenance window settings. You can view or change your maintenance window settings from the AWS Management Console. After your cluster has been patched, the new cluster version will be Build 1.0.35480. Please contact us at redshift-feedback@amazon.com if you have any questions.

In addition to what is included in the major version, this version includes the following new features and improvements:

• Amazon Redshift: Public preview of Redshift Streaming Ingestion for Kinesis Data Streams (KDS), which pulls data from a KDS stream into a Redshift materialized view (MV) in near real time with high throughput. Please note: preview features are not to be used to run production workloads. To get started, see the Redshift documentation.

Additionally, the following fixes are included:

• Fixed a rare situation where, with materialized view auto refresh enabled, external functions caused Redshift cluster instability.

---

**(02/15/2022)**

We will be patching your Amazon Redshift clusters during your system maintenance window in the coming weeks. The timing of the patch will depend on your region and maintenance window settings. You can view or change your maintenance window settings from the AWS Management Console. After your cluster has been patched, the new cluster version will be Build 1.0.35649. Please contact us at redshift-feedback@amazon.com if you have any questions.

In addition to what is included in the major version, this version includes the following new features and improvements:

• Miscellaneous fixes

---

**(03/03/2022)**

We will be patching your Amazon Redshift clusters during your system maintenance window in the coming weeks. The timing of the patch will depend on your region and maintenance window settings. You can view or change your maintenance window settings from the AWS Management Console. After your cluster has been patched, the new cluster version will be Build 1.0.36224. Please contact us at redshift-feedback@amazon.com if you have any questions.

In addition to what is included in the major version, this version includes the following new features and improvements:

• Miscellaneous fixes

---

**(03/03/2022)**

We will be patching your Amazon Redshift clusters during your system maintenance window in the coming weeks. The timing of the patch will depend on your region and maintenance window settings. You can view or change your maintenance window settings from the AWS Management Console. After your cluster has been patched, the new cluster version will be Build 1.0.36905. Please contact us at redshift-feedback@amazon.com if you have any questions.

In addition to what is included in the major version, this version includes the following new features and improvements:

• Miscellaneous fixes
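To illustrate the ST_Intersection item announced above, a minimal usage sketch (the two polygons are made-up examples):

```
-- A minimal usage sketch for the new ST_Intersection function;
-- the two overlapping squares are made-up examples.
SELECT ST_AsText(
         ST_Intersection(
           ST_GeomFromText('POLYGON((0 0, 4 0, 4 4, 0 4, 0 0))'),
           ST_GeomFromText('POLYGON((2 2, 6 2, 6 6, 2 6, 2 2))')
         )
       );
```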
1 answer · 1 vote · 20 views · meghanaAtAWS · asked a month ago

AWS Redshift Maintenance Announcement (March 2nd - April 4th 2022)

**Major Version (03/02/2022)**

We will be patching your Amazon Redshift clusters during your system maintenance window in the coming weeks. The timing of the patch will depend on your region and maintenance window settings. You can view or change your maintenance window settings from the AWS Management Console. After your cluster has been patched, the new cluster version will be Build 1.0.36236. Please contact us at redshift-feedback@amazon.com if you have any questions.

This version includes the following new features and improvements:

• Amazon Redshift: The COPY query now supports an IGNOREALLERRORS keyword, which ignores all errors encountered while scanning and parsing the records in a COPY query.
• Amazon Redshift: Customers are now able to leverage Azure Active Directory natively to manage their Redshift identities and simplify authorization based on their Azure Active Directory group memberships.
• Amazon Redshift: Role-based access control is now supported in Redshift. Customers can create roles, grant privileges to roles, and grant roles to database users.

Additionally, the following fixes are included:

• Fixes an issue that caused some data sharing queries to become stuck when running on Concurrency Scaling clusters.

---

**Minor Versions**

**(03/14/2022)**

We will be patching your Amazon Redshift clusters during your system maintenance window in the coming weeks. The timing of the patch will depend on your region and maintenance window settings. You can view or change your maintenance window settings from the AWS Management Console. After your cluster has been patched, the new cluster version will be Build 1.0.36433. Please contact us at redshift-feedback@amazon.com if you have any questions.

In addition to what is included in the major version, this version includes the following new features and improvements:

• Miscellaneous fixes

---

**(03/24/2022)**

We will be patching your Amazon Redshift clusters during your system maintenance window in the coming weeks. The timing of the patch will depend on your region and maintenance window settings. You can view or change your maintenance window settings from the AWS Management Console. After your cluster has been patched, the new cluster version will be Build 1.0.36926. Please contact us at redshift-feedback@amazon.com if you have any questions.

In addition to what is included in the major version, this version includes the following new features and improvements:

• Miscellaneous fixes

---

**(04/04/2022)**

We will be patching your Amazon Redshift clusters during your system maintenance window in the coming weeks. The timing of the patch will depend on your region and maintenance window settings. You can view or change your maintenance window settings from the AWS Management Console. After your cluster has been patched, the new cluster version will be Build 1.0.37176. Please contact us at redshift-feedback@amazon.com if you have any questions.

In addition to what is included in the major version, this version includes the following new features and improvements:

• Miscellaneous fixes
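To illustrate the role-based access control item announced above, a minimal usage sketch (role, table, and user names are placeholders):

```
-- A minimal sketch of the newly announced role-based access control;
-- role, table, and user names are placeholders.
CREATE ROLE sales_analyst;
GRANT SELECT ON TABLE sales TO ROLE sales_analyst;
GRANT ROLE sales_analyst TO alice;
```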
1 answer · 2 votes · 31 views · meghanaAtAWS · asked a month ago

How to replicate IF NOT EXISTS in Redshift

I am trying to replicate functionality from SQL Server in Redshift where I have to ignore a column if it already exists, and otherwise add it to the table. I have come across these posts on Stack Overflow, but couldn't find a proper solution in them:

1) https://stackoverflow.com/questions/65103448/redshift-alter-table-if-not-exists
2) https://stackoverflow.com/questions/42035068/redshift-add-column-if-not-exists
3) https://stackoverflow.com/questions/42669237/workaround-in-redshift-for-add-column-if-not-exists

I am able to get a TRUE or FALSE for the columns that I want to check, but I don't know how to ALTER the table to add or remove one. I am also a bit new to Redshift, so I would appreciate a response! These are some of my attempts:

```
IF (SELECT EXISTS(SELECT *
                  FROM pg_catalog.pg_table_def
                  WHERE schemaname = 'my_schema'
                    AND tablename = 'my_table'
                    AND "column" = 'my_new_column')) <> TRUE
THEN
    ALTER TABLE my_table ADD COLUMN my_new_column varchar
END IF;
```

```
CREATE OR REPLACE PROCEDURE if_else()
LANGUAGE plpgsql
AS $$
BEGIN
    IF (SELECT EXISTS(SELECT *
                      FROM pg_catalog.pg_table_def
                      WHERE schemaname = 'my_schema'
                        AND tablename = 'my_table'
                        AND "column" = 'my_new_column')) <> TRUE
    THEN
        ALTER TABLE my_table ADD COLUMN my_new_column varchar
    END IF;
END;
$$;

CALL if_else();
```

A few more failed attempts:

```
CREATE OR REPLACE PROCEDURE alter_my_table()
AS $$
BEGIN
    ALTER TABLE my_table ADD COLUMN my_new_column varchar
END;
$$ LANGUAGE plpgsql;

SELECT CASE
         WHEN COUNT(*) THEN 'warning: column exists already.'
         ELSE CALL alter_my_table();
       END
FROM pg_catalog.pg_table_def
WHERE schemaname = 'my_schema'
  AND tablename = 'my_table'
  AND "column" = 'my_new_column'
```

Thank you for your time.
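For reference, this is the shape I suspect would work (a sketch, not tested on my cluster): keep the existence check inside a stored procedure and run the DDL through EXECUTE. One extra assumption: pg_table_def only lists tables whose schema is on the search_path, so 'my_schema' must be on it for the check to see the table.

```
-- A sketch, not tested: conditional ALTER inside a stored procedure,
-- with the DDL issued via EXECUTE. Assumes 'my_schema' is on search_path
-- (pg_table_def only shows tables reachable through it).
CREATE OR REPLACE PROCEDURE add_column_if_not_exists()
LANGUAGE plpgsql
AS $$
BEGIN
    IF NOT EXISTS (
        SELECT 1
        FROM pg_catalog.pg_table_def
        WHERE schemaname = 'my_schema'
          AND tablename  = 'my_table'
          AND "column"   = 'my_new_column'
    ) THEN
        EXECUTE 'ALTER TABLE my_schema.my_table ADD COLUMN my_new_column varchar';
    END IF;
END;
$$;

CALL add_column_if_not_exists();
```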
1 answer · 0 votes · 5 views · AWS-User-4229592 · asked 2 months ago

Redshift Clear Text Passwords and Secret keys exposed?

Hi there, I received the following email about my Redshift cluster:

> We are reaching out to inform you your Amazon Redshift cluster(s) may have been affected by an issue caused by a change introduced on October 13, 2021, where your password and/or your Secret_Access_Key may have been inadvertently written in plain text to your cluster's audit logs (stl_user_activity_log). We do not have any indication that these credentials have been accessed. We applied a patch on January 19, 2022, to fix the issue for all clusters in all AWS regions.

> As a cautionary measure, we recommend that you: (1) Review any access to your cluster(s) in your audit log files from October 13, 2021 through January 19, 2022, such as those by authorized applications, to ensure your access credentials and passwords were not accessed; (2) Immediately change your cluster's password and/or generate a new Secret_Access_Key for use with COPY and UNLOAD commands for moving files between Amazon S3 and Amazon Redshift; and (3) Scan and sanitize your audit log files that were created between October 13, 2021 through January 19, 2022, both dates inclusive, to remove any occurrences of clear text passwords and security keys in them.

However, looking on my cluster, I can't see an stl_user_activity_log:

> Select * from stl_user_activity_log;

> SQL Error [42P01]: ERROR: relation "pg_catalog.stl_user_activity_log" does not exist

Was the email pointing out the wrong audit logs, or should I not be looking for these audit logs in a table? We have S3 audit logging enabled, but browsing through those files I don't see anything either.
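In case it clarifies what I've tried, this is roughly how I've been scanning the S3 audit logs; the bucket name and prefix are placeholders, and the 'useractivitylog' substring in the file names is my assumption from our bucket listing:

```
# A rough sketch of scanning the S3 user activity logs for leaked credentials;
# bucket name and prefix are placeholders for the actual audit-logging target.
aws s3 sync s3://my-audit-log-bucket/redshift-logs/ ./redshift-logs/
find ./redshift-logs -name '*useractivitylog*.gz' -exec zgrep -il 'password' {} +
```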
1 answer · 0 votes · 8 views · AWS-User-5958751 · asked 2 months ago

COPY from S3 to Redshift with manifest fails

Hi, I need to load data from Aurora (MySQL) to Redshift, and using S3 is one of the available options. I can extract data from Aurora (MySQL) to S3 using:

```
SELECT * FROM data_table
INTO OUTFILE S3 's3-XXX://bucket_name/aurora_files/data_table'
FORMAT CSV HEADER
FIELDS TERMINATED BY ';'
LINES TERMINATED BY '\n'
OVERWRITE ON;
```

and load the same data into Redshift using:

```
copy data_table
from 's3://bucket_name/aurora_files/data_table.part_00000'
access_key_id 'XXX'
secret_access_key 'XXX'
csv delimiter ';'
ignoreheader 1
timeformat 'YYYY-MM-DD HH:MI:SS'
region 'XXX';
```

If I try to extract the data with a manifest and load it via the manifest, I get the following error:

```
[2022-03-14 18:08:52] [XX000] ERROR: S3 path "s3-XXX://bucket_name/aurora_files/data_table.part_00000" has invalid format.
[2022-03-14 18:08:52] Detail:
[2022-03-14 18:08:52] -----------------------------------------------
[2022-03-14 18:08:52] error:  S3 path "s3-XXX://bucket_name/aurora_files/data_table.part_00000" has invalid format.
[2022-03-14 18:08:52] code:      8001
[2022-03-14 18:08:52] context:   Parsing S3 Bucket
[2022-03-14 18:08:52] query:     312924
[2022-03-14 18:08:52] location:  s3_utility.cpp:133
[2022-03-14 18:08:52] process:   padbm@ster [pid=13049]
[2022-03-14 18:08:52] -----------------------------------------------
```

The following commands are used to create the S3 file and load it into Redshift with a manifest:

```
SELECT * FROM data_table
INTO OUTFILE S3 's3-XXX://bucket_name/aurora_files/data_table'
FORMAT CSV HEADER
FIELDS TERMINATED BY ';'
LINES TERMINATED BY '\n'
MANIFEST ON
OVERWRITE ON;
```

```
copy data_table
from 's3://bucket_name/aurora_files/data_table.manifest'
access_key_id 'XXX'
secret_access_key 'XXX'
csv delimiter ';'
ignoreheader 1
timeformat 'YYYY-MM-DD HH:MI:SS'
region 'XXX'
manifest;
```

What could be the issue?
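Judging by the error, the manifest Aurora wrote appears to carry the same `s3-XXX://` prefix used in the INTO OUTFILE S3 statement, while Redshift's COPY only parses plain `s3://` URLs. For reference, the entry format Redshift expects in a manifest looks like this (bucket and key reuse this question's placeholders):

```
{
  "entries": [
    {"url": "s3://bucket_name/aurora_files/data_table.part_00000", "mandatory": true}
  ]
}
```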
1 answer · 0 votes · 5 views · nenkie76 · asked 2 months ago

Redshift External Table - unable to query struct data types using Avro as a format

Hi, I have created an external table in Redshift as below:

```
CREATE EXTERNAL TABLE "someschema"."avro_simple_nested" (
    "actor_id" varchar(200),
    "first_name" varchar(200),
    "last_name" varchar(200),
    "last_update" bigint,
    "version" int,
    "debut_film" struct<
        "name":varchar(200),
        "score":int
    >
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
WITH SERDEPROPERTIES (
    'avro.schema.literal'='{
        "namespace": "example.avro",
        "type": "record",
        "name": "actor",
        "fields": [
            {"name": "actor_id", "type": "string"},
            {"name": "first_name", "type": "string"},
            {"name": "last_name", "type": "string"},
            {"default": 0, "name": "last_update", "type": "long"},
            {"name": "version", "type": "int"},
            {"name": "debut_film", "type": {
                "type": "record",
                "name": "debut_film_name",
                "fields": [
                    {"name": "name", "type": "string"},
                    {"name": "score", "type": "int"}
                ]}}
        ]
    }'
)
STORED AS AVRO
LOCATION 's3://....../..../avro-files-simple-nested/'
```

If I try to query the table using *SELECT * from Table*, I get:

**ERROR: Cannot expand query to include unsupported column type for column "debut_film".**

If I query only the non-struct columns, I am able to see the data. If I check the same data in Athena, it shows fine: the table is shown correctly and the data is as expected. If I query the SVV_EXTERNAL_COLUMNS table, you can see the table definition is correct:

`debut_film struct<name:varchar(200),score:int>`

Has anyone else used an external table over Avro data and managed to view nested data in Redshift? Thanks!
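For completeness, this is the alias-qualified dot notation Redshift documents for querying nested Spectrum columns (with ORC/Parquet/JSON/Ion); whether it applies to Avro-backed tables is exactly the open question here. Column names are taken from the table above:

```
-- The documented Spectrum navigation syntax for nested columns;
-- untested against this Avro-backed table.
SELECT t.actor_id,
       t.debut_film.name  AS debut_name,
       t.debut_film.score AS debut_score
FROM someschema.avro_simple_nested AS t;
```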
2 answers · 0 votes · 10 views · AWS-User-5186938 · asked 3 months ago