All Content tagged with AWS Glue

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.

2170 results
I'm running `TRUNCATE` commands in my AWS Glue jobs on Iceberg tables managed by AWS Lake Formation with hybrid access mode enabled, but I receive "not authorized to perform: s3:DeleteObject" errors. ...
Hi, the AWS docs say that data used by Glue jobs is encrypted with KMS keys at rest and in transit, which is ambiguous because Glue jobs might use S3 instead of their local disk. ...
1 answer, 0 votes, 54 views, asked 3 months ago
While using **AWS DataZone Catalog**, I observed an issue with **glossary term navigation from assets**. ### **Observed behavior** * Assets (e.g., Glue Tables) show associated **Glossary Terms** as ...
1 answer, 0 votes, 53 views, asked 4 months ago
I have an IAM role for my bucket that has the following permissions: "s3:PutObject", "s3:GetObject", "iam:PassRole", "kms:Decrypt", "kms:Encrypt", "kms:GenerateDataKey", "s3:ListBucket", "s3:DeleteObj...
1 answer, 0 votes, 63 views, asked 4 months ago
Production streaming applications across Apache Flink, Spark Streaming, Kafka Streams, Kinesis Client Library and AWS Lambda might experience throttling on AWS Glue Schema Registry's GetSchemaVersion ...
I am attempting to use Hudi 1.1.1 in an AWS Glue 5.0 job. To avoid conflicts with the pre-installed Glue Hudi libraries, I have set the job parameter `--datalake-formats` to an empty string (""). Howe...
2 answers, 0 votes, 105 views, asked 4 months ago
Apache Iceberg is a leading open table format for analytics. When paired with Kafka Connect, teams can build streaming pipelines into Iceberg tables. However, a known issue with the Kafka Connect Ice...
Looking to implement sort compaction on an S3 Tables Iceberg table following the blog post: > [https://aws.amazon.com/blogs/aws/new-improve-apache-iceberg-query-performance-in-amazon-s3-with-sort-and...
5 answers, 0 votes, 363 views, asked 4 months ago
I'm using Amazon Kinesis Data Firehose to deliver streaming data into an Apache Iceberg table registered in the AWS Glue Data Catalog, which is part of a non-default catalog (s3tablescatalog/pulse-pro...
1 answer, 0 votes, 157 views, asked 4 months ago
I am building a Lakehouse solution using AWS Glue visual ETL. When writing the dataset using the target S3 node in the visual editor, there is no option to specify writemode() to overwrite. When I checked ...
1 answer, 0 votes, 89 views, asked 4 months ago
I'm trying to install Python modules in my AWS Glue Python Shell job using wheel files stored in Amazon Simple Storage Service (Amazon S3). My job runs in a private Virtual Private Cloud (VPC) with Am...
We have a large-scale S3 data lake with the following characteristics: - Source: AWS Flink application writing Parquet files directly to S3 - Volume: ~4000 Parquet files per hour, ranging from 200GB ...
1 answer, 0 votes, 81 views, asked 4 months ago