Accessing DynamoDB data from SageMaker

1

I have some data in my DynamoDB database and I want to access it in my SageMaker notebook. To my surprise, after some research, it looks like I need to transfer the data from DynamoDB into an S3 bucket and read it from a CSV file there. Is that correct? What is the best way of doing this?

3 Answers
3

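You can read the data directly in your SageMaker notebook with the boto3 client. Initialize a DynamoDB client and run a Scan, which returns the items in the table. A single Scan call returns at most 1 MB of data, so for larger tables keep calling it with the LastEvaluatedKey from the previous response until none is returned.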

Note that the results come back as JSON in DynamoDB's typed attribute format, so you will need to convert them into something useful for the rest of your notebook, such as a pandas DataFrame.

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html
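As a rough illustration of the Scan approach described above, here is a minimal sketch. The function names and the table name are hypothetical, and it assumes the notebook's execution role is allowed to call dynamodb:Scan on the table. boto3 and pandas are imported lazily so the pagination helper can be tried on its own.

```python
def scan_all_items(client, table_name):
    """Collect every item from a table by following Scan pagination."""
    items = []
    kwargs = {"TableName": table_name}
    while True:
        page = client.scan(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No LastEvaluatedKey means the whole table has been read.
            return items
        # Resume the next Scan call where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


def load_table_as_dataframe(table_name):
    """Scan a table and return its items as a pandas DataFrame."""
    import boto3
    import pandas as pd
    from boto3.dynamodb.types import TypeDeserializer

    client = boto3.client("dynamodb")
    deserializer = TypeDeserializer()
    # Turn typed values such as {"S": "abc"} into plain Python values.
    records = [
        {name: deserializer.deserialize(value) for name, value in item.items()}
        for item in scan_all_items(client, table_name)
    ]
    return pd.DataFrame(records)


# Hypothetical usage inside the notebook:
# df = load_table_as_dataframe("my-table")
```

Numbers come back from the deserializer as decimal.Decimal, so you may want to cast those columns before doing numeric work. A full Scan also consumes read capacity in proportion to the table size, which matters for large tables.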

AWS
EXPERT
answered 2 years ago
0

Hey, so in short, that is correct: you will have to export your DynamoDB table to S3. Here is a guide on how to do that: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataExport.HowItWorks.html (Note: DynamoDB can export your table data in two formats, DynamoDB JSON and Amazon Ion.)
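If you would rather start the export from code than from the console, something like the following sketch may help. The function names, table ARN, and bucket are placeholders, and it assumes point-in-time recovery is enabled on the table, which the export feature requires.

```python
def start_s3_export(client, table_arn, bucket, prefix="dynamodb-export/"):
    """Request a full-table export to S3 and return the export ARN."""
    response = client.export_table_to_point_in_time(
        TableArn=table_arn,
        S3Bucket=bucket,
        S3Prefix=prefix,
        ExportFormat="DYNAMODB_JSON",  # the other supported format is "ION"
    )
    return response["ExportDescription"]["ExportArn"]


def export_status(client, export_arn):
    """Return the export's status: IN_PROGRESS, COMPLETED, or FAILED."""
    response = client.describe_export(ExportArn=export_arn)
    return response["ExportDescription"]["ExportStatus"]


# Hypothetical usage (the ARN and bucket name are placeholders):
# import boto3
# client = boto3.client("dynamodb")
# arn = start_s3_export(
#     client,
#     "arn:aws:dynamodb:us-east-1:123456789012:table/my-table",
#     "my-bucket",
# )
# print(export_status(client, arn))
```

The export runs asynchronously, so check the status until it reports COMPLETED before reading the files from the bucket.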

Another way to approach this is to export your DynamoDB table directly to a .csv file and then upload that file to S3: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/workbench.querybuilder.exportcsv.html
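Once the .csv file is in S3, reading it from the notebook could look roughly like this. The function name, bucket, and key are placeholders, and the sketch assumes the file is UTF-8 encoded with a header row.

```python
import csv
import io


def read_csv_from_s3(s3_client, bucket, key):
    """Download a CSV object from S3 and return its rows as dictionaries."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    text = response["Body"].read().decode("utf-8")
    # DictReader uses the header row as keys; every value is a string.
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical usage (bucket and key are placeholders):
# import boto3
# import pandas as pd
# rows = read_csv_from_s3(boto3.client("s3"), "my-bucket", "exports/table.csv")
# df = pd.DataFrame(rows)
```

All values arrive as strings this way, so convert numeric columns after loading them into a DataFrame.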

You can also use AWS Glue to transfer your table from DynamoDB to S3. Glue has a bit of a learning curve, but the documentation should give you the tools you need: https://docs.aws.amazon.com/glue/latest/dg/how-it-works.html

SageMaker Data Wrangler can also import data from many sources other than S3, although DynamoDB is not supported yet. You can check the supported data sources for Data Wrangler here: https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler.html

Hope this helps, and please let me know if you require anything else!

answered 2 years ago
0

Hello there,

Yes, there is no direct connection between DynamoDB and SageMaker. You should export the data to S3 and then read it from SageMaker, or you can use AWS Glue or Amazon EMR to create a connection between them.

You can check the following documentation:

https://aws.amazon.com/blogs/big-data/analyze-data-in-amazon-dynamodb-using-amazon-sagemaker-for-real-time-prediction/

answered 2 years ago
