Querying Athena from Lambda function - QUEUED state?

0

Hi there,
I've been successfully querying s3 via athena from inside a lambda function for quite some time but it has suddenly stopped working. Further investigation shows that the response from get_query_execution() is returned a state of 'QUEUED' (which i was led to believe is not used?!)

My code snippet is as follows:
while (max_execution > 0 and state in ['RUNNING']):
max_execution = max_execution - 1
print("maxexecution=" + str(max_execution))
response = client.get_query_execution(QueryExecutionId = execution_id)

    if 'QueryExecution' in response and \  
            'Status' in response\['QueryExecution'] and \  
            'State' in response\['QueryExecution']\['Status']:  

            state = response\['QueryExecution']\['Status']\['State']  
            print(state)  
            if state == 'SUCCEEDED':  
                print("Query SUCCEEDED: {}".format(execution_id))  

                s3_key = 'athena_output/' _ execution_id _ '.csv'  
                print(s3_key)  
                local_filename = '/tmp/' _ execution_id _ '.csv'  
                print(local_filename)  

                rows = \[]  
                try:  
                    print("s3key =" + s3_key)  
                    print("localfilename = " + local_filename)  
                    s3.Bucket(BUCKET).download_file(s3_key, local_filename)  
                    with open(local_filename) as csvfile:  
                        reader = csv.DictReader(csvfile)  
                        for row in reader:  
                            rows.append(row)  
                except botocore.exceptions.ClientError as e:  
                    if e.response\['Error']\['Code'] == "404":  
                        print("The object does not exist.")  
                        print(e)  
                    else:  
                        raise  
                return json.dumps(rows)  
            elif state == 'FAILED':  
                return False  
    time.sleep(10)  
return False  

Obviously it is returning false as i don't cater for get_query_execution() returning a state of 'QUEUED' - what can cause it to return this state? From the doco i've seen this state isn't supposed to be in use yet? How should this be handled? is it as simple as adding 'QUEUED' to the while condition? e.g. while (max_execution > 0 and state in ['RUNNING', 'QUEUED'])

jaldons
asked 4 years ago1234 views
2 Answers
0

Hi jaldons,

David here from AWS. I'm a Premium Support engineer from the Big Data team, working with EMR, Athena and other related services.

The Athena team recently deployed a host of new functionality for Athena, including more granular CloudWatch metrics for Athena queries.

For more information:
AWS What's New page: https://aws.amazon.com/about-aws/whats-new/analytics/?whats-new-content.sort-by=item.additionalFields.postDateTime&whats-new-content.sort-order=desc&awsf.whats-new-products=general-products%23amazon-athena
Athena docs on CloudWatch metrics: https://docs.aws.amazon.com/athena/latest/ug/query-metrics-viewing.html

As part of the deployment of more granular metrics, Athena now includes a "QUEUED" status for queries. This status indicates that an Athena query is waiting for resources to be allocated for processing. Query flow is roughly:
SUBMITTED -> QUEUED -> RUNNING -> COMPLETED/FAILED
Note that queries that fail due to system errors can be put back into the queue and retried.

I apologise for the frustration that this change has caused.

It seems like the forum formatting has stripped some elements from your code snippets.
However, I think that your WHILE loop is working on an array of the possible query statuses, which didn't previously cater for "QUEUED".
If that is the case, then yes, adding "QUEUED" to that array will allow your application to handle the new status.

Is this affecting a production system in your environment?
Please feel free to message me privately with more details, and I can see what I can do to help you.

Kind regards,
David F.

answered 4 years ago
0

Thanks for that - at least it means i'm not going crazy! Is there some way in which these sort of changes are communicated normally? Luckily the particular API's i'm dealing with aren't in production yet but it would have been disastrous if we were a month further along... luckily it's an easy fix but knowing before it breaks would be preferable

jaldons
answered 4 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions