Questions tagged with Analytics
Content language: English
Sort by most recent
Hello. I have an Athena query that uses UNLOAD to write data to my S3 bucket. The query works quite well; however, I do not get the header row (column names) in the output files. I do not see an explicit parameter that would attach the headers to the compressed (.gz) CSV files. Any help would be appreciated. Thanks.
```
UNLOAD (SELECT * FROM dataplace.datatable WHERE file_date = date '2022-07-01')
TO 's3://my/super/bucket'
WITH (format='TEXTFILE', field_delimiter = ',')
```
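As far as I can tell, UNLOAD has no header option. One common workaround (a sketch only; `col1` and `col2` stand in for your real column names) is to UNION ALL a literal header row onto the data, casting every data column to varchar so the types line up, and sorting so the header comes first:

```sql
-- Sketch: prepend a header row via UNION ALL (column names are placeholders).
-- Every data column must be cast to varchar so the header row type-checks.
UNLOAD (
    SELECT col1, col2 FROM (
        SELECT 0 AS sort_key, 'col1' AS col1, 'col2' AS col2
        UNION ALL
        SELECT 1 AS sort_key,
               CAST(col1 AS varchar) AS col1,
               CAST(col2 AS varchar) AS col2
        FROM dataplace.datatable
        WHERE file_date = date '2022-07-01'
    )
    ORDER BY sort_key
)
TO 's3://my/super/bucket'
WITH (format = 'TEXTFILE', field_delimiter = ',')
```

Note that the ORDER BY may collapse the output to a single file, so this is only practical for modest result sizes; the alternative is to add the header line to the files after the fact.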
I am trying to grant a Lake Formation tag to a user/role, like how we assign the same to the Data Catalog.
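If the goal is granting LF-tag permissions to a principal, this can be done with `aws lakeformation grant-permissions` using an `LFTag` resource. A sketch, where the principal ARN, tag key, and tag values are all placeholders:

```shell
# Sketch: allow a role to describe an LF-tag and associate it with resources.
# The ARN, tag key, and tag values below are placeholders.
aws lakeformation grant-permissions \
    --principal DataLakePrincipalIdentifier=arn:aws:iam::111122223333:role/MyAnalystRole \
    --permissions "DESCRIBE" "ASSOCIATE" \
    --resource '{"LFTag": {"TagKey": "team", "TagValues": ["analytics"]}}'
```

The same grant can also be made in the Lake Formation console under the LF-Tags permissions pages.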
How do I include PySpark libraries when running an AWS Glue Python job?
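For context, as far as I know PySpark is only available in Glue Spark (ETL) jobs, not in Python shell jobs. In an ETL job the `pyspark` and `awsglue` modules are provided by the runtime and need no packaging; the standard boilerplate (job-specific names are placeholders) looks like:

```python
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# pyspark and awsglue come with the Glue Spark runtime,
# so they do not need to be shipped with the job script.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# ... your transformations here ...

job.commit()
```

For a Python shell job, extra pure-Python libraries can be supplied as .whl/.egg files via the job's Python library path, but Spark itself is not available there.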
I'm trying to copy a CSV file from an S3 bucket to Redshift, using these options:
```
IGNOREHEADER AS 1
delimiter ','
region as 'us-east-1'
csv;
```
Even though the table schema and the CSV file have the same data without nulls, I am still facing errors such as: Invalid digit, Value 'B', Pos 0, Type: Integer; Invalid date format; Char length exceeded.
These are the columns present in the data: Uploaddate, Batchid, Inventoryid, Itemtype, Model, Supplierid, Quantity, Unit, Costcenter, Inventorydate, Embodiedco2, Usageco2, Riskadjustment.
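Errors like `Invalid digit, Value 'B', Pos 0, Type: Integer` usually mean the CSV's field order does not match the table's column order, so a text value lands in an integer column. One thing worth trying (a sketch; the table name and IAM role ARN are placeholders) is to list the table columns in the order they appear in the file, so each CSV field is loaded into the intended column:

```sql
COPY inventory (uploaddate, batchid, inventoryid, itemtype, model, supplierid,
                quantity, unit, costcenter, inventorydate, embodiedco2,
                usageco2, riskadjustment)
FROM 's3://my-bucket/path/file.csv'
IAM_ROLE 'arn:aws:iam::111122223333:role/MyRedshiftRole'
REGION 'us-east-1'
CSV
IGNOREHEADER 1
DATEFORMAT 'auto';
```

If the order already matches, checking the file for stray quoting or delimiter issues with `STL_LOAD_ERRORS` is the next step.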
Hi,
I have a table in Athena which has 'testcase' and 'tags' fields. Here are some sample records from the table:
* testcase1 has tags 'tag1,tag2,tag3'
* testcase2 has tags 'tag2,tag4'
* testcase3 has tags 'tag2'
* testcase4 has no tags
I want to get all the distinct tags from this table in QuickSight:
* tag1
* tag2
* tag3
* tag4
I've tried adding a new dataset from Athena and choosing the Use custom SQL option, using a STRING_SPLIT function, but I cannot make it work.
Can you please help me with the SQL query?
Many Thanks,
Yj Liu
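For the Athena side of this: Athena's engine (Trino/Presto) uses `split` plus `UNNEST` rather than `STRING_SPLIT`. A custom-SQL sketch (the table name `testcases` is a placeholder) that expands the comma-separated tags into one row per tag:

```sql
SELECT DISTINCT trim(tag) AS tag
FROM testcases
CROSS JOIN UNNEST(split(tags, ',')) AS t(tag)
WHERE tags IS NOT NULL AND tags <> ''
ORDER BY tag
```

Rows with no tags (like testcase4) are filtered out by the WHERE clause, and `trim` removes any spaces around the commas.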
Hi,
We want to migrate our Informatica jobs to AWS Glue. Is there any tool or process to migrate the jobs?
I have an include path like this one: s3://my-datalake/projects/.
In this project folder, I have these folders: daily-2022-11-05, daily-2022-11-06, incremental_123456, and incremental_234567. Each of these folders contains a Parquet file.
Now, when the crawler runs, I want it to exclude everything that starts with incremental_ in its name.
I tried using `incremental_**/**`. This works for one crawler but not for the other. By "isn't working" I mean that when I run the crawler, it either doesn't update the table or fails.
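For what it's worth, in Glue exclude patterns a single `*` matches within one path component while `**` crosses folder boundaries, so a pattern along these lines (relative to the include path, worth verifying on your layout) should exclude the incremental folders and everything under them:

```
incremental_*/**
```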
Using Glue crawlers, I created Glue tables and am querying them from Athena.
How do I convert a string like "2022-11-16T00:00:00.000Z" to Date format?
I have tried the to_date function!
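`to_date` is a Hive-style function; in Athena's SQL engine an ISO-8601 string with a trailing `Z` can be parsed with `from_iso8601_timestamp` and then truncated to a date. A sketch (the column and table names are placeholders):

```sql
-- Literal example:
SELECT date(from_iso8601_timestamp('2022-11-16T00:00:00.000Z'));

-- On a table column:
SELECT date(from_iso8601_timestamp(event_time)) AS event_date
FROM my_table;
```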
Hello,
I am creating an ETL job to transfer data from a MySQL table to an S3 bucket. I just need to know how to build the mapping between the two nodes. I need each row in the following format:
Mysql format:
col1,col2,col3,col4
xx,yy,ss,dd
rr,ll,ff,gg
The JSON files should look like the following (each file should contain 1000 rows from the MySQL table):
{"col1": "xx", "col2": "yy", "col3": "ss", "col4": "dd"}
{"col1": "rr", "col2": "ll", "col3": "ff", "col4": "gg"}
I appreciate your comments and help!
With many thanks
B
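Setting the Glue specifics aside, the row-to-JSON-Lines shaping itself can be sketched in plain Python (the function and variable names here are illustrative, not a Glue API): group the rows into chunks of 1000 and serialize each chunk as one JSON Lines file body.

```python
import json
from itertools import islice

def rows_to_json_files(rows, columns, chunk_size=1000):
    """Yield one JSON Lines string per chunk of `chunk_size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            break
        # Each row becomes a {column: value} object on its own line.
        yield "\n".join(
            json.dumps(dict(zip(columns, row))) for row in chunk
        )

# Example with the rows from the question:
columns = ["col1", "col2", "col3", "col4"]
rows = [("xx", "yy", "ss", "dd"), ("rr", "ll", "ff", "gg")]
files = list(rows_to_json_files(rows, columns, chunk_size=1000))
print(files[0])
```

In a Glue job, as far as I know, the equivalent is usually writing a DynamicFrame with the `json` format, where the rows-per-file count is controlled by partitioning rather than an explicit chunk size.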
Hey, I need to replicate, in a pivot table I have created in QuickSight, the "Summarize value field by AVERAGE" behavior that is easily done in Excel. How can I do it?
How can I calculate a baseline value that has a filter? For example, what I am trying to do is sumOver(sum(X),[{Row in Pivot table}, {Column in Pivot table}=0])
Hi all,
I run a simple query on Athena using Athena engine version 3.
```
create table if not exists "test"."total_customer" as
select *
from "A_customer"
union all
select *
from "B_customer"
```
And I faced this error:
`HIVE_PATH_ALREADY_EXISTS: Target directory for table 'test.total_customer' already exists: s3://xxxx/total_customer/2022/11/27/tables/ef02d72d-75e2-4b14-b53f-95d591094cfa. You may need to manually clean the data at location 's3://xxxx/total_customer/2022/11/27/tables/ef02d72d-75e2-4b14-b53f-95d591094cfa' before retrying. Athena will not delete data in your account.`
The weird thing is that it works perfectly when I use UNION instead of UNION ALL, so I'm not sure whether this is an Athena engine version 3 bug or an error in the query.
Thank you guys for taking a look at my issue.
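One workaround that may help in the meantime: the error indicates that a previous (possibly failed) run already left files at the auto-generated CTAS location. Either clean up that S3 prefix and drop the half-created table, or point the CTAS at a fresh, empty prefix explicitly (the location below is a placeholder):

```sql
CREATE TABLE IF NOT EXISTS "test"."total_customer"
WITH (external_location = 's3://your-bucket/total_customer_new/') AS
SELECT *
FROM "A_customer"
UNION ALL
SELECT *
FROM "B_customer"
```

Note that `external_location` must point to an empty S3 location, or the same error will recur.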