
Exclude file types when creating external table with redshift spectrum


I want to create an external table over S3 data so that I can query it with Redshift Spectrum. The specified location contains both Parquet and mp4 files.

I only want to read the Parquet files, and I specified STORED AS PARQUET. But when I run a SELECT statement I get the following error:

SQL Error [XX000]: ERROR: Spectrum Scan Error context: File '....mp4' has an invalid version number.

This is my SQL statement:

CREATE EXTERNAL TABLE schema_xyz_ext.xyz_ext ( xyz VARCHAR, abc BIGINT ) STORED AS PARQUET LOCATION 's3://path/xyz/';

Is there any way to exclude the mp4 files?

asked a year ago, 122 views
2 Answers

I hardly use external tables (and be aware that by using them you're bringing a complex additional system, Redshift Spectrum, with plenty of complex and unexpected behaviour, into your system), but with COPY you can specify a manifest.

I think I remember you can specify a manifest when defining an external table?

If so, you'll need to update the manifest whenever new files should be picked up.
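To make the manifest idea concrete, here is a rough sketch of building one that lists only the Parquet files. It assumes the manifest layout I believe Spectrum expects (an "entries" list where each entry has a "url" and a "meta" block carrying "content_length"); check that against the CREATE EXTERNAL TABLE documentation before relying on it. The function name, the bucket name "my-bucket" and the sample listing are made up for illustration. In practice the (key, size) pairs would come from an S3 listing, for example boto3's list_objects_v2.

```python
import json


def build_spectrum_manifest(bucket, objects, suffix=".parquet"):
    """Build a manifest dict that lists only the objects ending in `suffix`.

    `objects` is an iterable of (key, size_in_bytes) pairs, e.g. taken
    from an S3 listing. The entry layout is an assumption, see above.
    """
    entries = [
        {"url": f"s3://{bucket}/{key}", "meta": {"content_length": size}}
        for key, size in objects
        if key.lower().endswith(suffix)  # skips the .mp4 files
    ]
    return {"entries": entries}


if __name__ == "__main__":
    # Hypothetical listing of the mixed folder from the question.
    listing = [
        ("xyz/part-0000.parquet", 1048576),
        ("xyz/clip.mp4", 73400320),
    ]
    manifest = build_spectrum_manifest("my-bucket", listing)
    print(json.dumps(manifest, indent=2))
```

You would upload the resulting JSON to S3 and, if external tables do accept a manifest, point the table's LOCATION at that manifest object instead of at the folder. The manifest has to be regenerated whenever Parquet files are added or removed.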

The real solution is not to mix files that belong to the external table with files that don't. Set things up so the two kinds are stored and used separately. Then it just works by itself.

--
Max Ganz II
https://www.redshift-observatory.ch

answered a year ago

Thank you very much for your answer. But I need / want to use Redshift Spectrum.

Unfortunately, separating the files is not an option in the actual project.

Thanks.

answered a year ago
