Questions tagged with Data Lakes
I'm a beginner in data engineering.
I have a task to create a data lakehouse and I'm trying to understand how to do it using these tools: DMS, S3, Glue and Hudi.
I already created a simple data...
1 answer, 0 votes, 655 views, asked 2 years ago
Hello! I am looking for an equivalent to a solution that Microsoft has promoted called IDP (Intelligent Data Platform); it combines governance, operations, and analytics in one. They promote Synapse with AML...
0 answers, 0 votes, 137 views, asked 2 years ago
I'm investigating and deploying https://docs.aws.amazon.com/solutions/latest/data-lake-solution/welcome.html
Looking at the GitHub repo https://github.com/aws-solutions/aws-data-lake-solution it looks...
3 answers, 0 votes, 432 views, asked 2 years ago
Hi, I have data in Delta format in S3 and
created an external table with the following query:
>CREATE EXTERNAL TABLE delta_mongo.transactions
> ( `_id` string ,
>account_id bigint,
>bucket_end_date...
0 answers, 0 votes, 149 views, asked 2 years ago
From time to time I receive a CSV file with a single row, and it breaks the Glue Crawler because a file needs at least 2 rows to be classified as a CSV.
Is there a way I can provide a custom CSV...
0 answers, 0 votes, 172 views, asked 2 years ago
We have an incoming file with fixed-length fields (.dat).
For example:
|2|123 |AWS |0505 |3
When the Glue Crawler crawls the file, it ignores all the int/long values that have trailing...
0 answers, 0 votes, 134 views, asked 2 years ago
Hello, according to the step-by-step guide in the official AWS Athena user guide (link at the end of the question), it should be possible to connect Tableau Desktop to Athena and Lake...
0 answers, 0 votes, 331 views, asked 2 years ago
I need to pre-process some data on S3 before the Glue Crawler crawls the data. For this I created an S3 Object Lambda to do the pre-processing. If I test the Object Lambda using the CLI, it provides...
0 answers, 1 vote, 189 views, asked 2 years ago
We are using an AWS EMR Hadoop cluster where PrestoDB is running, and we want to secure PrestoDB.
We need a complete step-by-step implementation guide using Apache Ranger (open source) or LDAP.
Application...
0 answers, 0 votes, 167 views, asked 2 years ago
I have created a templated job (with parameters) to ingest data from different tables (passing the database and table as parameters) and write the data to S3 (passing the destination bucket as...
1 answer, 1 vote, 409 views, asked 2 years ago
Hudi Clustering
I am using EMR 6.6.0, which has Hudi 0.10.1. I am trying to bulk insert and do inline clustering using Hudi, but it seems it is not clustering the files according to the file size specified. But it is still...
1 answer, 0 votes, 510 views, asked 2 years ago
AWS Health Lake
I was trying to open the documentation for Health Lake, and this is what I get:
Bad Request Your browser sent a request that the server could not understand. size of request header field excess server...
1 answer, 0 votes, 321 views, asked 2 years ago