Athena Federated Query Alternative for DocumentDB


We have a couple of DocumentDB instances running and want to visualize the data in QuickSight. We started down the path of using Athena Federated Query as outlined in this article: https://aws.amazon.com/blogs/big-data/visualize-mongodb-data-from-amazon-quicksight-using-amazon-athena-federated-query/

After spending a couple of weeks troubleshooting why Athena wouldn't query our DocumentDB instance, we discovered that the Athena connector for DocumentDB does not support camel-cased names. Rather than converting all the database and collection names in our DocumentDB instances to lowercase, we are wondering whether anyone has found another approach to visualizing data from DocumentDB in QuickSight.

texnoob
asked 8 months ago, 353 views
1 Answer
Accepted Answer

1. AWS DMS (Database Migration Service):

  • Use AWS DMS to replicate data from DocumentDB to another supported datastore like Amazon Redshift or RDS.
  • Once the data is in Redshift or RDS, you can easily connect it to QuickSight for visualization.
  • Note: This solution might introduce some latency (depending on the replication frequency), and there could be costs associated with the additional infrastructure and data transfer.
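
As a rough illustration of the DMS option, the sketch below builds a table-mapping document that selects the collections in one database and asks DMS to lowercase the schema and table names on the target. The rule types and the `convert-lowercase` action are standard DMS table-mapping features; the database name `SalesDb`, the rule names, and the helper `build_table_mappings` are made up for this example, and how the rules behave with a DocumentDB source should be verified against your own task before relying on them.

```python
import json


def build_table_mappings(database: str, collection_pattern: str = "%") -> str:
    """Return a DMS table-mapping JSON string (hypothetical helper).

    With a DocumentDB source, DMS treats the database as the schema and
    each collection as a table. "%" is the DMS wildcard.
    """
    rules = [
        {
            # Pick which collections to replicate.
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-collections",
            "object-locator": {
                "schema-name": database,
                "table-name": collection_pattern,
            },
            "rule-action": "include",
        },
        {
            # Lowercase the schema (database) name on the target.
            "rule-type": "transformation",
            "rule-id": "2",
            "rule-name": "lowercase-schema",
            "rule-target": "schema",
            "object-locator": {"schema-name": database},
            "rule-action": "convert-lowercase",
        },
        {
            # Lowercase each table (collection) name on the target.
            "rule-type": "transformation",
            "rule-id": "3",
            "rule-name": "lowercase-tables",
            "rule-target": "table",
            "object-locator": {
                "schema-name": database,
                "table-name": collection_pattern,
            },
            "rule-action": "convert-lowercase",
        },
    ]
    return json.dumps({"rules": rules}, indent=2)


if __name__ == "__main__":
    # Paste the printed JSON into the task's table-mapping editor.
    print(build_table_mappings("SalesDb"))
```

Renaming on the target this way leaves the source DocumentDB names untouched, which is the point of avoiding the lowercase conversion mentioned in the question.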

2. Custom Lambda Function:

  • Write a custom AWS Lambda function to periodically export data from DocumentDB and store it in Amazon S3 as CSV, Parquet, or another supported format.
  • Use QuickSight's native S3 integration to visualize the data.
  • This approach also introduces some latency but offers more flexibility in terms of data transformation and processing.
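
A minimal sketch of the Lambda export, assuming a small collection that fits in memory, a function deployed inside the cluster's VPC with `pymongo` packaged alongside it, and a connection string that already carries the TLS options DocumentDB needs. The environment variable names (`DOCDB_URI`, `DOCDB_DATABASE`, `DOCDB_COLLECTION`, `EXPORT_BUCKET`), the `exports/` key prefix, and the helper names are all invented for illustration.

```python
import csv
import io
import os


def flatten(doc, prefix=""):
    """Flatten nested dicts into dotted column names, e.g. {"a": {"b": 1}} -> {"a.b": 1}.

    Lists and other non-dict values are kept as-is and end up as str(value) in the CSV.
    """
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def to_csv(docs):
    """Serialize an iterable of documents to CSV text with a header row."""
    rows = [flatten(d) for d in docs]
    # Union of all keys, sorted so the column order is stable between runs.
    columns = sorted({col for row in rows for col in row})
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # fields missing from a document are written as empty
    return buf.getvalue()


def handler(event, context):
    """Lambda entry point: dump one collection to a CSV object in S3."""
    # Imported here so the helpers above can be exercised without these packages.
    import boto3
    from pymongo import MongoClient

    collection_name = os.environ["DOCDB_COLLECTION"]
    client = MongoClient(os.environ["DOCDB_URI"])
    collection = client[os.environ["DOCDB_DATABASE"]][collection_name]

    body = to_csv(collection.find({}))
    # Lowercase only the S3 key; the DocumentDB names stay unchanged.
    key = f"exports/{collection_name.lower()}.csv"
    boto3.client("s3").put_object(
        Bucket=os.environ["EXPORT_BUCKET"], Key=key, Body=body.encode("utf-8")
    )
    return {"bucket": os.environ["EXPORT_BUCKET"], "key": key}
```

The function could be triggered on a schedule to control how stale the data gets. QuickSight's S3 data source reads a manifest file that points at the exported objects, so a manifest referencing the export prefix would be needed as well. For larger collections, paging through the cursor and writing Parquet would likely be a better fit than a single in-memory CSV.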

3. ETL Using AWS Glue:

  • Use AWS Glue to extract data from DocumentDB, transform it as needed, and then load it into another datastore such as Redshift or S3.
  • This approach can be more scalable and flexible than using DMS, especially if you have complex transformation requirements.

4. Direct API Integration (Last Resort):

  • Build a custom application that queries DocumentDB directly and then serves the results via an API.
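  • QuickSight has no generic REST API data source, so the API's output would still need to be loaded into a source QuickSight supports, such as files in S3.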
  • This method requires more development effort and is not as scalable as the other solutions, but it might be viable for smaller datasets or specific use cases.

5. Feedback to AWS:

  • If you believe that the camel casing limitation is a significant impediment, consider providing feedback to AWS. AWS often prioritizes features and fixes based on user feedback.
  • You can provide feedback through the AWS Management Console, the AWS Support channels, or even AWS forums.

6. Third-Party Tools:

  • There are third-party tools and connectors available that can help bridge the gap between databases like DocumentDB and BI tools like QuickSight. Consider exploring some of these options if none of the AWS-native solutions are suitable.

Remember, the best approach depends on your specific use case, the volume of data, the frequency of data changes, and your tolerance for latency between data updates in DocumentDB and visualization in QuickSight.

answered 8 months ago
