- Newest
- Most votes
- Most comments
Hi alexandrec,
If you query a partitioned table and do not specify the partition in a WHERE clause then Athena will scan the entire table. If you do specify a partition with a WHERE clause then Athena will scan only the data in the specified partition. This is mentioned in Amazon Athena documentation [1].
This would be why the first query scans the partitions correctly but the second one does not.
I hope this answers your question about the behavior you experience with this query. You may also benefit from [2] which outlines some Athena performance tuning tips.
[1] https://docs.aws.amazon.com/athena/latest/ug/partitions.html#partitions-considerations-limitations
[2] https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/
JOIN clause conditions should be understood by the underlying engine as filter predicates. No where in the documentation it is said anything about that specifically. Otherwise, Athena becomes a huge money sink for very large tables with thousands of partitions. The only solution is to reduce the amount of partitions, so we scan less data, but that comes at the cost of losing data in the table. Where can I get help from an AWS Athena engineer?
Hi alexandrec,
You can raise a technical support case at [1] to seek assistance from an AWS Athena engineer. Note that you will need a support plan that is Developer or above. See [2] for details on creating a support case with AWS.
[1] https://console.aws.amazon.com/support/home#/case/create?issueType=technical [2] https://docs.aws.amazon.com/awssupport/latest/user/case-management.html
Relevant content
- asked a year ago
- Accepted Answerasked 8 months ago
- asked 7 years ago
- AWS OFFICIALUpdated 3 months ago
- AWS OFFICIALUpdated 2 months ago
- AWS OFFICIALUpdated 2 months ago
- AWS OFFICIALUpdated 4 months ago
Did you try using a WHERE condition in the final select statement as a DUMMY statement. eg.
where 1=1
or
where fr.timestamp > 1671633935
Yes, dummy statements do not work.