All Content tagged with AWS Glue

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.

Hi, my product is multi-tenant, and each tenant has an isolated AWS RDS PostgreSQL database. I have over 2000 tenants, i.e. 2000 RDS databases, and that number will keep growing over time. My product is kind...
2 answers, 0 votes, 33 views
asked a day ago
I have the Python script below in an AWS Glue job. For incremental load logic, I have now set the job bookmark option to Enable and also provided a temporary path. I have a "btb" table in an Oracle database. The...
1 answer, 0 votes, 15 views
RahulD
asked a day ago
I am having some inconsistencies with my Glue crawler and Glue ETL job and I need some help to figure out the best setup. At the moment, I have an S3 bucket where I store CSV files and I partition the...
0 answers, 0 votes, 18 views
Vas
asked a day ago
I have the Python script below, where I have enabled the job bookmark and also provided a path in the temp directory to create bookmark files. The problem is that it is creating a bookmark JSON file which is empty. I don't...
1 answer, 0 votes, 12 views
RahulD
asked 2 days ago
Hello AWS Experts, I’m new to AWS Glue bookmarking and need some assistance with an issue I’m facing. Scenario: I have a testing Glue job that processes CSV files from the Landing/ folder, converts...
0 answers, 0 votes, 16 views
Sampath
asked 5 days ago
I have been testing the direct write to Iceberg feature in Firehose and I have come to realize that there is a problem with the feature. It does not always work. Let me elaborate: - I created an IAM...
0 answers, 0 votes, 23 views
Humaid
asked 5 days ago
[Extracting key insights from Amazon S3 access logs with AWS Glue for Ray](https://aws.amazon.com/blogs/big-data/extracting-key-insights-from-amazon-s3-access-logs-with-aws-glue-for-ray/) introduces a...
2 answers, 1 vote, 29 views
asked 5 days ago
**Issue:** Our aim is to reduce logging in order to control the data ingestion reported by the 'PutLogEvent' metrics of CloudWatch. In the past, when we ran our Glue job against 35 GB of data, we were billed ~2K for CloudWatch, most of...
1 answer, 0 votes, 45 views
asked 6 days ago
I have the Python script below, which currently generates several gz files of 4 MB each in an S3 bucket. This is what AWS Glue creates by default. But now I want to create multiple files of each file size...
2 answers, 0 votes, 35 views
RahulD
asked 8 days ago
I have the Python script below in an AWS Glue job. For incremental load logic, I have now set the job bookmark option to Enable. I then tried to run the Glue job again, but it did not create any temporary...
1 answer, 0 votes, 30 views
RahulD
asked 9 days ago
We have data stored in Cosmos DB NoSQL and need to migrate it to Snowflake using AWS Glue with a Change Data Capture (CDC) approach. Our objective is to perform CRUD operations based on CDC to handle...
1 answer, 0 votes, 15 views
sowndar
asked 9 days ago