1 Answer
You have a couple of options:
- Amazon SageMaker Data Wrangler: You can use Databricks as a data source in SageMaker Data Wrangler. This lets you interactively query the data stored in Databricks using SQL and preview it before importing. Once the data is imported, you can cleanse it, engineer features, and prepare it for training. See this blog post for more information: https://aws.amazon.com/blogs/machine-learning/prepare-data-from-databricks-for-machine-learning-using-amazon-sagemaker-data-wrangler/
- AWS Glue: Assuming the Delta Lake tables are stored in S3, you can build a Glue job to read, transform, and prepare the data for SageMaker training. More info can be found here: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-delta-lake.html
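For the Glue option, here is a rough sketch of what the job could look like. It assumes Glue 4.0 with the `--datalake-formats` job parameter set to `delta`. The function names, the S3 paths, and the `dropna()` cleaning step are placeholders I made up for illustration, not anything from your setup, so adjust them to your data.

```python
# Sketch only: function names, paths, and the cleaning step are hypothetical.


def prepare_training_data(spark, delta_path, output_path):
    """Read a Delta Lake table from S3, clean it, and write Parquet for training."""
    # The "delta" format is available once the job enables Delta Lake support.
    df = spark.read.format("delta").load(delta_path)
    # Placeholder transform: drop rows with nulls. Replace with real feature prep.
    cleaned = df.dropna()
    # Write to an S3 prefix that a SageMaker training job can use as input.
    cleaned.write.mode("overwrite").parquet(output_path)
    return cleaned


def build_glue_job_definition(job_name, role_arn, script_location):
    """Build keyword arguments for boto3's glue.create_job (Glue 4.0 assumed)."""
    return {
        "Name": job_name,
        "Role": role_arn,
        "GlueVersion": "4.0",
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        # Job parameter that turns on Delta Lake support in the Glue job.
        "DefaultArguments": {"--datalake-formats": "delta"},
        "WorkerType": "G.1X",
        "NumberOfWorkers": 2,
    }
```

Inside the Glue script you would get `spark` from `GlueContext(SparkContext.getOrCreate()).spark_session` and call `prepare_training_data` with your own paths. To register the job, pass the dictionary to `boto3.client("glue").create_job(**definition)`. Check the Glue documentation linked above for any additional Spark configuration your Glue version needs.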
answered 2 years ago