Amazon Athena takes a long time to run my query. I want a list of all S3 objects in CSV format.

asked 2 years ago, 205 views
2 Answers
1

Hi

S3 Inventory (not real time) can be an option, depending on what you are trying to achieve. It can produce the object list as CSV, and you can also query that output with Athena. Amazon S3 Inventory is one of the tools Amazon S3 provides to help manage your storage. You can use it to audit and report on the replication and encryption status of your objects. You can also use it to simplify and speed up business workflows and big data jobs, because it provides a scheduled alternative to the synchronous Amazon S3 List API operations. Amazon S3 Inventory does not use the List API to audit your objects and does not affect the request rate of your bucket. For S3 Inventory pricing, see https://aws.amazon.com/s3/pricing/.

Amazon S3 Inventory provides comma-separated values (CSV), Apache Optimized Row Columnar (ORC), or Apache Parquet output files that list your objects and their corresponding metadata. The files are generated on a daily or weekly basis for an S3 bucket or for a shared prefix (that is, objects whose names begin with a common string).
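As a rough illustration of the options described above (output format, daily or weekly schedule, replication and encryption status), here is a minimal sketch in Python that builds an inventory configuration and passes it to boto3's put_bucket_inventory_configuration. The helper names, the configuration ID, the bucket names, and the destination prefix are all assumptions made up for this example, and the destination bucket also needs a bucket policy that lets Amazon S3 write the report to it.

```python
def build_inventory_configuration(config_id, destination_bucket_arn,
                                  output_format="CSV", frequency="Daily",
                                  destination_prefix="inventory"):
    """Build the InventoryConfiguration dict (hypothetical helper)."""
    if output_format not in ("CSV", "ORC", "Parquet"):
        raise ValueError("output_format must be CSV, ORC, or Parquet")
    if frequency not in ("Daily", "Weekly"):
        raise ValueError("frequency must be Daily or Weekly")
    return {
        "Id": config_id,
        "IsEnabled": True,
        # List only the current version of each object.
        "IncludedObjectVersions": "Current",
        "Schedule": {"Frequency": frequency},
        # Extra metadata columns to include in the report.
        "OptionalFields": [
            "Size",
            "LastModifiedDate",
            "StorageClass",
            "ReplicationStatus",
            "EncryptionStatus",
        ],
        "Destination": {
            "S3BucketDestination": {
                "Bucket": destination_bucket_arn,  # ARN, not the bucket name
                "Format": output_format,
                "Prefix": destination_prefix,
            }
        },
    }


def enable_inventory(source_bucket, config):
    """Apply the configuration to the source bucket (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_inventory_configuration(
        Bucket=source_bucket,
        Id=config["Id"],
        InventoryConfiguration=config,
    )


# Example with placeholder names (assumptions, not real resources):
config = build_inventory_configuration(
    "all-objects-csv", "arn:aws:s3:::example-inventory-destination"
)
# enable_inventory("example-source-bucket", config)
```

The first report can take a while to arrive after the configuration is created, so this suits recurring listings rather than a one-off, immediate answer.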

For more information, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-inventory.html

AWS
answered 2 years ago
AWS
EXPERT
reviewed 2 years ago
  • Based on the limited information in the question, the option from this answer seems the most appropriate.

1

You need to consider the volume of data you're trying to scan: not just the amount you expect to retrieve, but how much data has to be scanned to produce that result. If you have a lot of data, you'll need to partition it and limit each query to fewer partitions to avoid timeouts and long-running queries. As an example, see this blog post about using Partition Projection for Athena queries on CloudTrail logs in S3 - https://www.linkedin.com/pulse/using-athena-partition-projection-query-cloudtrail-other-kinsman/?lipi=urn%3Ali%3Apage%3Ad_flagship3_profile_view_base_post_details%3Brq0yhJ20SQKlSN9blwev9g%3D%3D
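To make the idea of targeting fewer partitions concrete, here is a minimal sketch in Python that builds a query restricted to a single partition and submits it with boto3's start_query_execution. The table name, the partition column dt, the column names, the database, and the output location are all hypothetical; substitute whatever your own table defines.

```python
def build_partition_query(table, partition_column, partition_value,
                          columns=("bucket", "key", "size")):
    """Build a SELECT that reads one partition only (hypothetical helper)."""
    column_list = ", ".join(columns)
    # Escape single quotes so the value stays a valid SQL string literal.
    safe_value = partition_value.replace("'", "''")
    return (
        f"SELECT {column_list} FROM {table} "
        f"WHERE {partition_column} = '{safe_value}'"
    )


def run_query(query, database, output_location):
    """Submit the query to Athena and return its execution ID."""
    import boto3  # imported lazily so the builder above works without boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Example with placeholder names (assumptions, not real resources):
query = build_partition_query("s3_inventory", "dt", "2024-01-01-00-00")
# run_query(query, "example_db", "s3://example-athena-results/")
```

Because the WHERE clause filters on the partition column, Athena only has to read the files under that partition. Athena writes the results of a SELECT query as a CSV file to the output location, which lines up with the CSV listing the question asks for.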

EXPERT
answered 2 years ago
AWS
EXPERT
reviewed 2 years ago
