Athena - not respecting ResultReuseByAgeConfiguration correctly

Hi all,

I run some AWS Athena queries with the ResultReuseByAgeConfiguration option set. I would expect Athena to cache the results, but it seems to be caching them too aggressively. I am using parameterised queries, and Athena does not appear to take the parameter values into consideration when deciding whether a cached result can be reused. Is this expected behaviour, or possibly a bug in how Athena manages caching?

To demonstrate, I created a very simple prepared statement in Athena.

PREPARE TEST_CACHE FROM
SELECT ?

Running the following code, you would expect all tests to pass (the value passed in and the value returned are the same):

import time

import boto3

def query(v, reuse=False):
    """Run the TEST_CACHE prepared statement with parameter v and compare the result to v."""
    athena = boto3.client('athena')
    query_id = athena.start_query_execution(
        QueryString='EXECUTE TEST_CACHE',
        ExecutionParameters=[v],
        QueryExecutionContext={'Database': 'default'},
        WorkGroup='ca-main-athena',
        ResultReuseConfiguration={
            'ResultReuseByAgeConfiguration': {
                'Enabled': reuse,
                'MaxAgeInMinutes': 60
            }
        }
    )['QueryExecutionId']

    # Poll until the query leaves the QUEUED/RUNNING states.
    while True:
        state = athena.get_query_execution(QueryExecutionId=query_id)['QueryExecution']['Status']['State']
        if state in ('QUEUED', 'RUNNING'):
            time.sleep(1)
        else:
            break

    # Fail early with a clear message if the query did not succeed.
    if state != 'SUCCEEDED':
        raise RuntimeError(f"query {query_id} finished in state {state}")

    result = athena.get_query_results(QueryExecutionId=query_id)
    # Row 0 holds the column header; row 1 is the first data row.
    r = result['ResultSet']['Rows'][1]['Data'][0]['VarCharValue']
    if r == v:
        print(f"result match - expected {v} and got {r}")
    else:
        print(f"result does not match - expected {v} and got {r}")

# = works as expected - result reuse is turned off
query("alpha",False)
query("bravo",False)
query("charlie",False)

# = does not work as expected - result reuse is turned on
query("delta",True)
query("echo",True)
query("foxtrot",True)

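The output shows that the caching is not correct: with result reuse turned on, every query returns the cached result of the earlier "charlie" query, regardless of the parameter passed.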

$ python test_athena.py 
result match - expected alpha and got alpha
result match - expected bravo and got bravo
result match - expected charlie and got charlie
result does not match - expected delta and got charlie
result does not match - expected echo and got charlie
result does not match - expected foxtrot and got charlie
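
One way to confirm from the client side whether a given execution was served from the cache is to look at the ResultReuseInformation.ReusedPreviousResult flag that get_query_execution returns under Statistics. The following is only a minimal sketch: the helper name reused_previous_result is hypothetical (it is not part of boto3), and the two sample responses are hand-written dicts for illustration, not real Athena output.

```python
def reused_previous_result(response):
    """Return True if a get_query_execution response reports result reuse.

    `response` is the dict returned by athena.get_query_execution(...).
    Missing keys are treated as "not reused".
    """
    stats = response.get("QueryExecution", {}).get("Statistics", {})
    reuse_info = stats.get("ResultReuseInformation", {})
    return bool(reuse_info.get("ReusedPreviousResult", False))


# Hand-written sample responses (not real Athena output), for illustration only.
cached = {"QueryExecution": {"Statistics": {"ResultReuseInformation": {"ReusedPreviousResult": True}}}}
fresh = {"QueryExecution": {"Statistics": {"ResultReuseInformation": {"ReusedPreviousResult": False}}}}

print(reused_previous_result(cached))  # True
print(reused_previous_result(fresh))   # False
```

Passing the response from the final get_query_execution poll in the test script to this helper should indicate whether the "delta", "echo" and "foxtrot" runs were answered from a previously cached result.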
massyn, asked 2 months ago (151 views)

1 Answer

Hi massyn, this post has been brought to the Athena team's attention and added to our ticket queue for investigation. You will receive a followup here when we have more details. Thank you for your submission!

mattmyo (AWS), answered 2 months ago
  • We have resolved this issue. You should be able to see the intended behaviour for parameterized queries with caching now.
