Athena/Glue Timestamp Format in Microseconds

Hello,

I would like to change the timestamp representation in Athena and Glue to microseconds. We crawl the data via Glue, store it in Glue tables, and query it via Athena. By default the timestamp is expressed in milliseconds, so the fractional seconds from our Parquet files are truncated. However, we need the full precision. Here is an example:

Athena/Glue: YYYY-MM-DD hh:mm:ss.fff
Real Data: YYYY-MM-DD hh:mm:ss.ffffff

Is there a way to get the full fractional seconds? Trying to recover the lost digits with a function in Athena just pads the fractional seconds with zeros. :(

Thanks in advance.

Best regards, Susa

Susa
asked a year ago, 2282 views
2 Answers
Accepted Answer

As far as I know, neither AWS Glue nor Amazon Athena natively supports timestamp precision beyond milliseconds; that is, microseconds and nanoseconds are not supported.

Timestamps in AWS Glue and Athena are represented in the YYYY-MM-DD HH:MI:SS.fff format (millisecond precision), as you've noted. This is a limitation of the current timestamp data type in these services. You can read more about the supported data types in the official documentation, "Data types in Amazon Athena".

There are a couple of possible workarounds, but they are not perfect and may require additional data manipulation:

  1. String Field: Store your timestamps as a string field with your desired precision in your Parquet files. You can then convert or manipulate these strings in your Athena queries as needed.

    You could use the string format YYYY-MM-DD hh:mm:ss.ffffff and cast it as a timestamp when necessary in your Athena queries:

    SELECT DATE_PARSE(your_timestamp_string, '%Y-%m-%d %H:%i:%s.%f') 
    FROM your_table

    Keep in mind that this will still truncate your timestamp to milliseconds when casting it as a timestamp.

  2. Separate Fields: Another approach is to separate the date and time components and the fractional seconds into separate fields. This would allow you to maintain precision for your fractional seconds in a separate numeric field, although it would complicate date/time operations.
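Both workarounds have to be applied when the data is written, before it reaches the Parquet files. The following is a minimal Python sketch of that preparation step, not a Glue or Athena API: the function names `to_precise_string`, `split_timestamp`, and `join_timestamp` are hypothetical, and it assumes the source values are available as Python `datetime` objects.

```python
from datetime import datetime
from typing import Tuple


def to_precise_string(ts: datetime) -> str:
    """Workaround 1: render the timestamp as text with six fractional digits."""
    return ts.strftime("%Y-%m-%d %H:%M:%S.%f")


def split_timestamp(ts: datetime) -> Tuple[str, int]:
    """Workaround 2: whole-second part as text, microseconds as an integer."""
    return ts.strftime("%Y-%m-%d %H:%M:%S"), ts.microsecond


def join_timestamp(seconds_part: str, micros: int) -> datetime:
    """Rebuild the full-precision value from the two separate fields."""
    base = datetime.strptime(seconds_part, "%Y-%m-%d %H:%M:%S")
    return base.replace(microsecond=micros)


# Hypothetical sample value with microsecond precision.
sample = datetime(2023, 6, 1, 12, 34, 56, 123456)

precise = to_precise_string(sample)
seconds_part, micros = split_timestamp(sample)

print(precise)
print(seconds_part, micros)
```

With the string form, the column stays lexically sortable because the format is fixed-width. With the split form, the whole-second column can still be parsed as a timestamp while the integer column carries the sub-second digits.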

If this is a crucial requirement for your use case, I would recommend reaching out to AWS support or your AWS account manager to discuss possible solutions or to submit a feature request.


EXPERT
answered a year ago
EXPERT
reviewed 2 months ago

Thanks a million for your fast and precise answer. We will go for the first approach since it is the easiest and safest method in my opinion. We will discuss a feature request with our account manager.

Best regards, Susa

Susa
answered a year ago
  • Hi Susa, I’m happy to help. Please don’t forget to click on “Accept Answer” as that helps other members of the community find this article easily on search, and recognizes my effort to help other people with their AWS questions.
