Questions tagged with Analytics

Hello. I have an Athena query that uses UNLOAD to write data to my S3 bucket. The query works well, but the output files do not include a header row (column names). I do not see an explicit parameter that would add the header to the compressed (.gz) CSV files. Any help would be appreciated. Thanks.

```
UNLOAD (SELECT * FROM dataplace.datatable WHERE file_date = date '2022-07-01')
TO 's3://my/super/bucket'
WITH (format='TEXTFILE', field_delimiter = ',')
```
2 answers, 0 votes, 50 views, asked 3 months ago
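If no header option is available for the unloaded files, one possible workaround is to add the header afterwards. A gzip file may consist of several concatenated members, so a separately compressed header line can be placed in front of an existing `.gz` file without recompressing the data. The sketch below shows only that step on in-memory bytes; the function name `prepend_header` and the column names are hypothetical, and downloading or re-uploading the S3 objects is left out.

```python
import gzip


def prepend_header(gz_data: bytes, columns: list[str], delimiter: str = ",") -> bytes:
    """Return gzip bytes whose decompressed content starts with a header row.

    A gzip stream may contain several concatenated members, so the header is
    compressed on its own and placed in front of the existing data unchanged.
    """
    header_line = delimiter.join(columns) + "\n"
    return gzip.compress(header_line.encode("utf-8")) + gz_data


# Stand-in for one unloaded file (hypothetical content and column names).
unloaded = gzip.compress(b"1,2022-07-01\n2,2022-07-01\n")
with_header = prepend_header(unloaded, ["id", "file_date"])
print(gzip.decompress(with_header).decode("utf-8"))
```

Most gzip readers handle multi-member files, but it is worth confirming that the tool consuming the files does.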
I am trying to assign a Lake Formation tag to a user or role, in the same way we assign tags to the Data Catalog.
2 answers, 0 votes, 41 views, asked by Krishna 3 months ago
How do I include PySpark libraries when running an AWS Glue Python job?
1 answer, 0 votes, 52 views, asked 3 months ago
I'm trying to copy a CSV file from an S3 bucket to Redshift using these options: `IGNOREHEADER AS 1 delimiter ',' region as 'us-east-1' csv;`. Even though the table schema and the CSV file match and the data contains no nulls, I still get errors such as: Invalid digit, Value 'B', Pos 0, Type: Integer; Invalid dateformat; Char length exceeded. These are the columns present in the data: Uploaddate, Batchid, Inventoryid, Itemtype, Model, Supplierid, Quantity, Unit, Costcenter, Inventorydate, Embodiedco2, Usageco2, Riskadjustment.
1 answer, 0 votes, 40 views, asked 3 months ago
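An error such as `Invalid digit, Value 'B', Pos 0, Type: Integer` often means that a text value is landing in an integer column, for example because the column order in the file differs from the table. A minimal sketch for checking a sample of the file locally before running COPY follows. The column types in `EXPECTED_TYPES` and the date format are assumptions for illustration only, since the real table definition is not shown in the question.

```python
import csv
import io
from datetime import datetime

# Hypothetical types for a few of the columns; adjust to the real table DDL.
EXPECTED_TYPES = {
    "Uploaddate": "date",
    "Batchid": "int",
    "Quantity": "int",
    "Inventorydate": "date",
}
DATE_FORMAT = "%Y-%m-%d"  # assumed date format


def find_bad_values(csv_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, column, value) for every value that fails its check."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        for column, kind in EXPECTED_TYPES.items():
            value = row.get(column) or ""  # missing fields count as empty
            try:
                if kind == "int":
                    int(value)
                elif kind == "date":
                    datetime.strptime(value, DATE_FORMAT)
            except ValueError:
                problems.append((line_number, column, value))
    return problems


# Made-up sample: the second data row has a text batch ID and a different date format.
sample = (
    "Uploaddate,Batchid,Quantity,Inventorydate\n"
    "2022-11-01,17,5,2022-10-30\n"
    "2022-11-01,B17,5,30/10/2022\n"
)
for problem in find_bad_values(sample):
    print(problem)
```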
Hi, I have a table in Athena which has 'testcase' and 'tags' fields. Here are sample records from the table:
* testcase1 has tags 'tag1,tag2,tag3'
* testcase2 has tags 'tag2,tag4'
* testcase3 has tags 'tag2'
* testcase4 does not have tags

I want to get all the tags from this table in QuickSight:
* tag1
* tag2
* tag3
* tag4

I've tried adding a new dataset from Athena, choosing the Use Custom SQL option, and using the STRING_SPLIT function, but cannot make it work. Can you please help me with the SQL query? Many thanks, Yj Liu
1 answer, 0 votes, 47 views, asked 3 months ago
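`STRING_SPLIT` is a SQL Server function; Athena's engine provides `split` and `UNNEST` instead, so the custom SQL would likely need those. The Python sketch below only illustrates the intended result (split each comma-separated value, skip rows without tags, keep the distinct tags) using the sample records from the question. It is not the Athena query itself, and `distinct_tags` is a hypothetical helper name.

```python
# Sample records from the question; None stands for a row without tags.
rows = [
    ("testcase1", "tag1,tag2,tag3"),
    ("testcase2", "tag2,tag4"),
    ("testcase3", "tag2"),
    ("testcase4", None),
]


def distinct_tags(records):
    """Split each comma-separated tags value and return the sorted distinct tags."""
    tags = set()
    for _testcase, tag_string in records:
        if not tag_string:  # skip rows with no tags
            continue
        tags.update(tag.strip() for tag in tag_string.split(","))
    return sorted(tags)


print(distinct_tags(rows))
```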
Hi, we want to migrate our Informatica jobs to AWS Glue. Is there any tool or process to migrate the jobs?
1 answer, 0 votes, 39 views, asked 4 months ago
I have an include path like this one: s3://my-datalake/projects/. In this projects folder, I have these folders: daily-2022-11-05, daily-2022-11-06, incremental_123456, and incremental_234567. Each of these folders contains a Parquet file. When the crawler runs, I want it to exclude everything whose name starts with incremental_. I tried the exclude pattern `incremental_**/**`. This works for one crawler but not for another: when I run the second crawler, it either does not update the table or fails.
1 answer, 0 votes, 28 views, asked 4 months ago
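To reason about which objects an exclude pattern covers, it can help to test the pattern locally against sample keys. The sketch below approximates glob matching in which `*` and `?` stay within one path segment and `**` crosses segment boundaries, applied to paths relative to the include path. `glob_to_regex` and `is_excluded` are hypothetical helpers and only an approximation, not the matcher the crawler actually uses; the file names are made up.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a simple glob into a regex: '**' crosses '/', '*' and '?' do not."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def is_excluded(relative_path: str, exclude_patterns: list[str]) -> bool:
    """True if the path (relative to the include path) fully matches any pattern."""
    return any(glob_to_regex(p).fullmatch(relative_path) for p in exclude_patterns)


# Paths relative to s3://my-datalake/projects/ (file names are made up).
paths = [
    "daily-2022-11-05/data.parquet",
    "daily-2022-11-06/data.parquet",
    "incremental_123456/data.parquet",
    "incremental_234567/data.parquet",
]
kept = [p for p in paths if not is_excluded(p, ["incremental_**/**"])]
print(kept)
```

If the pattern behaves as expected in a local check like this, a difference between two crawlers may come from their include paths, since the pattern is evaluated relative to each crawler's own include path.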
Using Glue crawlers, I created Glue tables and am querying them from Athena. How do I convert a string such as "2022-11-16T00:00:00.000Z" to a date? I have tried the to_date function.
1 answer, 0 votes, 113 views, asked 4 months ago
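The value `2022-11-16T00:00:00.000Z` is an ISO 8601 timestamp, so in Athena a function such as `from_iso8601_timestamp` is probably a better fit than `to_date`. To show how the string breaks down into fields, here is a small Python sketch that parses the same value. The format string is an interpretation of the single sample value, not something taken from the table definition.

```python
from datetime import date, datetime, timezone

raw = "2022-11-16T00:00:00.000Z"

# %f consumes the fractional seconds; the literal trailing Z marks UTC,
# so the UTC time zone is attached explicitly after parsing.
parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
as_date: date = parsed.date()

print(parsed.isoformat())
print(as_date)
```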
Hello, I am creating an ETL job to transfer data from a MySQL table to an S3 bucket. I need to know how to build the mapping between the two nodes so that each row has the following format.

MySQL format:
col1,col2,col3,col4
xx,yy,ss,dd
rr,ll,ff,gg

The JSON files should look like the following (each file should contain 1000 rows from the MySQL table):
{'col1':'xx', 'col2':'yy','col3':'ss' , 'col4':'dd'}
{'col1':'rr', 'col2':'ll','col3':'ff' , 'col4':'gg'}

I appreciate your comments and help! With many thanks, B
2 answers, 0 votes, 47 views, asked 4 months ago
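The batching part of this can be sketched independently of Glue: group the rows into chunks of 1000 and write each chunk as one JSON Lines document. The sketch below uses in-memory rows and hypothetical names (`chunked`, `to_json_lines`); reading from MySQL and writing to S3 are left out. Note that valid JSON uses double quotes, unlike the single-quoted sample in the question.

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def chunked(rows: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def to_json_lines(batch: list[dict]) -> str:
    """Serialize a batch as one JSON object per line."""
    return "\n".join(json.dumps(row) for row in batch) + "\n"


# Stand-in for rows read from the MySQL table (made-up values).
rows = [
    {"col1": f"a{i}", "col2": f"b{i}", "col3": f"c{i}", "col4": f"d{i}"}
    for i in range(2500)
]

# One string per output file, each holding at most 1000 rows.
files = [to_json_lines(batch) for batch in chunked(rows, 1000)]
print([len(f.splitlines()) for f in files])
```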
Hey, I need to replicate, in a pivot table I created in QuickSight, the "Summarize value field by AVERAGE" option that is easy to use in Excel. How can I do it?
0 answers, 0 votes, 15 views, asked 4 months ago
How can I calculate a baseline value that has a filter? For example, what I am trying to do is sumOver(sum(X),[{Row in Pivot table}, {Column in Pivot table}=0])
0 answers, 0 votes, 17 views, asked 4 months ago
Hi all, I ran a simple query on Athena using Athena engine version 3.

```
create table if not exists "test"."total_customer" as
select * from "A_customer"
union all
select * from "B_customer"
```

And I got this error: `HIVE_PATH_ALREADY_EXISTS: Target directory for table 'test.total_customer' already exists: s3://xxxx/total_customer/2022/11/27/tables/ef02d72d-75e2-4b14-b53f-95d591094cfa. You may need to manually clean the data at location 's3://xxxx/total_customer/2022/11/27/tables/ef02d72d-75e2-4b14-b53f-95d591094cfa' before retrying. Athena will not delete data in your account.`

The weird thing is that it works perfectly when I use UNION instead of UNION ALL, so I'm not sure whether this is an Athena engine version 3 error or a query error. Thank you for taking a look at my issue.
1 answer, 0 votes, 85 views, asked 4 months ago