All Content tagged with AWS Glue
AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.
I am using AWS Glue to write an ETL pipeline that reads data from an S3 bucket, processes it, and writes it back to the bucket while also creating a Glue catalog table. The code I am using is the...
I'm now trying to use the AWS CLI to set the jobRunQueuing param after the job is created; however, the command below is not working:
aws glue update-job --job-name my-job --job-update '{"JobRunQueuingEnabled":...
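The command above is truncated, so the cause of the failure cannot be read from it. As a general note, `UpdateJob` overwrites the previous job definition completely, so a common approach is to fetch the current definition, change one field, and send the whole thing back. Below is a minimal sketch of that idea with boto3. The helper names `build_job_update` and `enable_job_run_queuing` are hypothetical, and the set of keys stripped from the `get_job` response is an assumption that should be checked against the actual response for your job.

```python
# Assumption: these keys appear in a get_job response but are not accepted
# (or are deprecated) in JobUpdate. Verify against your own job definition.
READ_ONLY_KEYS = {"Name", "CreatedOn", "LastModifiedOn", "AllocatedCapacity"}


def build_job_update(job, queuing_enabled):
    """Build a JobUpdate dict from an existing job definition (pure function)."""
    update = {k: v for k, v in job.items() if k not in READ_ONLY_KEYS}
    # Assumption: MaxCapacity cannot be combined with WorkerType/NumberOfWorkers,
    # so drop it when the job is sized by workers.
    if "WorkerType" in update or "NumberOfWorkers" in update:
        update.pop("MaxCapacity", None)
    update["JobRunQueuingEnabled"] = queuing_enabled
    return update


def enable_job_run_queuing(job_name, queuing_enabled=True):
    """Fetch the job, flip the queuing flag, and write the definition back."""
    import boto3  # imported lazily so build_job_update works without AWS access

    glue = boto3.client("glue")
    job = glue.get_job(JobName=job_name)["Job"]
    return glue.update_job(
        JobName=job_name,
        JobUpdate=build_job_update(job, queuing_enabled),
    )
```

Calling `enable_job_run_queuing("my-job")` would need valid AWS credentials and a boto3 version recent enough to know the `JobRunQueuingEnabled` field.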
Glue jobs have a new job queuing feature that runs job run requests sequentially once the max concurrency limit has been reached.
This feature can be enabled by setting jobRunQueuingEnabled to true from...
I encountered the following error:
```
Caused by: java.net.SocketTimeoutException: connect timed out
```
while the script was executing
```
# Script generated for node Amazon Redshift
AmazonRedshift_nodexxxxxxxxxxx =...
```
Hi folks,
I have a partitioned table in Athena that uses dynamic partition projection, enabled with the following table properties:
```
projection.account.type injected
projection.region.type ...
```
I have created a DynamoDB table that stores some data, then created a Glue crawler that crawls it to store the table's metadata so I can query it using Athena. I am seeing the DynamoDB table got...
I would like clear, step-by-step instructions for installing the connector. I was able to download the Docker image, but where does it go? This is my first time using Glue and setting it up this way.
Currently the crawler and the Glue job are scheduled to run every 15 minutes. But they run even when there is no data, which leads to higher cost.
To streamline this, one option is to use an event-driven approach as...
Hi
I've created a crawler that responds to SNS Queue messages and crawls some text files in an S3 bucket.
The bucket has sub-folders that the crawler seems to have identified and created a partition...
In AWS DMS, I have a Serverless replication, but I now want to modify it just to add an extra table. No matter what I change, I get this error: Task Settings CloudWatchLogGroup or CloudWatchLogStream...
Hello,
I have data in an AWS S3 bucket in the format year/month/day/hour/file1.json, file2.json, ..., file6.json.
In the JSON file, the data are not separated by a comma or a new line...
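For files like this, where JSON values sit back to back with no delimiter, one option is to decode them one at a time with the standard library's `json.JSONDecoder.raw_decode`, which returns the parsed value together with the index where it ended. The sketch below assumes each file is small enough to read into memory as a single string and contains only complete JSON values; the function name `iter_concatenated_json` is hypothetical.

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value from a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # Returns the decoded value and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


if __name__ == "__main__":
    sample = '{"id": 1}{"id": 2} {"id": 3}'
    for record in iter_concatenated_json(sample):
        print(record)
```

The decoded records could then be written out one per line (JSON Lines), which is a layout that line-oriented readers generally handle more easily.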
Hello,
I have job bookmark logic in Glue, with a bookmark key defined because my source is JDBC Oracle.
Sometimes, as a fallback solution, I might want to reset the job bookmark with a certain previously persisted...