Redshift Spectrum: no way to glance at tables with nested data


Hi,
I'm enjoying working with Redshift Spectrum, but I find it difficult to quickly scan a table with nested fields. By that I mean something along the lines of "select * from table limit 100".

This is impossible if the table has nested fields, because Spectrum doesn't allow "*" queries on these.

"Query 1 ERROR: ERROR: Nested tables do not support '*' in the SELECT clause."

But there are few alternatives for me - say I have dozens of columns in my table, do I need to enumerate them while somehow deducing which ones are nested and thus to be avoided?
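One possible workaround for the enumeration step, sketched in Python: read the column names and types from the catalog, drop anything whose type looks nested, and generate the SELECT list. This assumes the column metadata is available as (name, type) pairs, for example from Redshift's SVV_EXTERNAL_COLUMNS view (columnname, external_type), and that nested types are reported as Hive-style strings starting with struct<, array<, or map<. The helper names below are hypothetical, and the schema, table, and column names are made up for illustration.

```python
from typing import Iterable, Tuple

# Assumption: nested columns are reported with Hive-style type strings.
NESTED_PREFIXES = ("struct<", "array<", "map<")


def is_nested(external_type: str) -> bool:
    """Return True if the type string describes a struct, array, or map."""
    return external_type.strip().lower().startswith(NESTED_PREFIXES)


def quote_ident(name: str) -> str:
    """Double-quote an identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_preview_query(
    schema: str,
    table: str,
    columns: Iterable[Tuple[str, str]],
    limit: int = 100,
) -> str:
    """Build a SELECT over the scalar columns only, skipping nested ones.

    `columns` is a sequence of (column name, external type) pairs, e.g. the
    rows returned by (assumed metadata query):
      SELECT columnname, external_type FROM svv_external_columns
      WHERE schemaname = '...' AND tablename = '...' ORDER BY columnnum;
    """
    scalar = [quote_ident(name) for name, typ in columns if not is_nested(typ)]
    if not scalar:
        raise ValueError("table has no scalar columns to preview")
    return "SELECT {} FROM {}.{} LIMIT {}".format(
        ", ".join(scalar), quote_ident(schema), quote_ident(table), int(limit)
    )


# Hypothetical table with two scalar and two nested columns.
example_columns = [
    ("id", "bigint"),
    ("name", "varchar(64)"),
    ("address", "struct<city:string,zip:string>"),
    ("tags", "array<string>"),
]
print(build_preview_query("spectrum", "customers", example_columns))
```

This only automates the "enumerate and avoid" step; it does not display the nested fields themselves, which is the feature being asked for.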

There is a related issue that I can't do "select nested_field from table" either, presumably because Spectrum can't serialize the field in any way.

These two issues require me to know the schema of all my tables before writing my queries, which is quite user-unfriendly. It would be really nice if Spectrum supported displaying string representations of nested fields, if only for these simple selects, since they are crucial for exploratory work. (I also use Presto, Hive, and Spark; none of them has a problem with this.)

Thanks!

ondrejj
asked 5 years ago, 2428 views
3 Answers

Hi ondrejj,

If you're sure Presto readily allows you to query this external data in an exploratory way, then I respectfully suggest that you use Athena, which uses Presto as its query engine, for your exploratory SQL and Redshift Spectrum for the actual analysis.

Assuming you are using the Glue Data Catalog as your external table catalog, Athena and Redshift can directly share the same table definitions. If you've already done the DDL in Redshift Spectrum, then you should be able to use it in Athena as is, with little to no setup and nothing to provision or spin up; the only cost would be the Athena scan costs.

I hope this helps,
-Kurt

klarson
answered 5 years ago

We'd love to use Athena, but we're not using Glue and we're not currently planning on doing so.

We also like to query Redshift at the same time as S3 (which both Spectrum and Presto allow), so that's another deal breaker.

Last but not least, we don't want to use multiple tools (like Athena + Spectrum), rather one that supports all our workloads - currently Presto leads the way, but we like Spectrum's simplicity in terms of ops.

ondrejj
answered 5 years ago

Thank you for this feature suggestion. We have definitely heard requests for this feature and it will be considered for our roadmap.

We do not comment on the timing of new features until they are announced, but new feature releases are noted in our regular maintenance announcements at the top of this forum and on our What's New page: https://aws.amazon.com/redshift/whats-new/

answered 5 years ago
