Redshift Spectrum: no way to glance at tables with nested data

0

Hi,
I'm enjoying working with Redshift Spectrum, but I find it difficult to quickly scan a table with nested fields. By that I mean something along the lines of "select * from table limit 100"

This is impossible if the table has nested fields, because Spectrum doesn't allow "*" queries on these.

"Query 1 ERROR: ERROR: Nested tables do not support '*' in the SELECT clause."

But there are few alternatives for me - say I have dozens of columns in my table, do I need to enumerate them while somehow deducing which ones are nested and thus to be avoided?

There is a related issue that I can't do "select nested_field from table" either, presumably because Spectrum can't serialize the field in any way.

These two issues require me to know the schema of all my tables before writing my queries, which is quite user unfriendly. It would be really nice if Spectrum supported displaying string representations of nested fields, if only for these simple selects, they are crucial for exploratory work. (I also use Presto, Hive, and Spark, neither of those has a problem with this.)

Thanks!

ondrejj
질문됨 5년 전2519회 조회
3개 답변
0

Hi ondrejj,

If you're sure Presto readily allows you to query this external data in an exploratory way, then I respectfully suggest that you use Athena, that uses Presto as it's query engine, to do your exploratory SQL and Redshift Spectrum for actual analysis.

Assuming you are using the Glue data catalog for your external table catalog then Athena and Redshift can directly share the same table definitions. If you've already done the DDL in Redshift Spectrum then should be able to use it in Athena as is with little to no setup, nothing to provision or spin up, and the only cost would be the Athena scan costs.

I hope this helps,
-Kurt

klarson
답변함 5년 전
0

We'd love to use Athena, but we're not using Glue and we're not currently planning on doing so.

We also like to query Redshift at the same time as S3 (which both Spectrum and Presto allow), so that's another deal breaker.

Last but not least, we don't want to use multiple tools (like Athena + Spectrum), rather one that supports all our workloads - currently Presto leads the way, but we like Spectrum's simplicity in terms of ops.

ondrejj
답변함 5년 전
0

Thank you for for this feature suggestion. We have definitely heard requests for this feature and it will be considered for our roadmap.

We do not comment on the timing of new features until they are announced but new feature releases are noted in our regular maintenance announcements at the top of this forum and on our What's New page. https://aws.amazon.com/redshift/whats-new/

답변함 5년 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠