By using AWS re:Post, you agree to the AWS re:Post Terms of Use

I have a partitioned table like this year=2024/month=02/day=25 , will using a where clause like Date(Year-month-hour ) > Date('2024-02-20') reduce data scanned by using partitions ?

0

I have a partitioned table in athena based on

I query the data with this where clause DATE(year || '-' || month || '-' || day) >= from_iso8601_timestamp('2024-03-18T00:00:00')

does it still make use of partitioning to reduce the data scanned

asked 8 months ago610 views
2 Answers
0

For partition pruning to work, you would have to use a where like: WHERE year='2024' and month='02' and day='25'

AWS
answered 8 months ago
profile picture
EXPERT
reviewed 8 months ago
0

Adding to the above answer, you can consider enabling partition projection and use date format for year, month and day partitions. https://docs.aws.amazon.com/athena/latest/ug/partition-projection-supported-types.html

To enable partition projection, you can define the partitioning column and table properties as shown below -

CREATE EXTERNAL TABLE tableName(
col1 string,
col2 int,
.....)
PARTITIONED BY (
   `timestamp` string)
LOCATION
	's3://bucket/prefix/'
TBLPROPERTIES (
  'projection.enabled'='true', 
  'projection.timestamp.format'='yyyy-MM-dd', 
  'projection.timestamp.interval'='1', 
  'projection.timestamp.interval.unit'='DAYS', 
  'projection.timestamp.range'='2023-01-01,NOW', 
  'projection.timestamp.type'='date', 
  'storage.location.template'='s3://bucket/prefix/${timestamp}')

With this you can use following in where clause timestamp >= '2024-03-18'. In this case you can pass date as string and need not use 'from_iso8601_timestamp' function.

AWS
answered 8 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions