How to transfer a whole dataset at once from BigQuery to S3 with AWS Glue?
Hi all,
I hope all is well with you.
I am trying to migrate data from Google BigQuery to AWS S3. I have an issue on the source side of the job (in the console), as shown below.
Connection options:
Enter additional key-value pairs for your data source connection:
Key: parentProject, Value: serene-craft-3363XX
Key: table, Value: bigquery-public-data:austin_bikeshare.bikeshare_stations
My question: if I want to specify a dataset instead of a table, how should that be done?
Note that I tried entering the dataset name instead, but it did not work.
Can you please advise? Thanks in advance, Basem
Hi,
If I guess correctly, you are using AWS Glue Studio and the AWS Glue BigQuery connector.
Currently the Glue BigQuery connector works at the table level (as the BigQuery Spark connector does).
If you want to export all the tables in a dataset, you can edit the script generated by Glue Studio and customize it.
You would first need to add the google.cloud Python library using the method mentioned here.
Then, before you read the table, you read the list of tables in the dataset as described here.
Finally, you iterate over the tables and read/write each one to S3.
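The steps above can be sketched roughly as follows. This is only an illustration, not the script Glue Studio generates: the helper names (`bigquery_table_option`, `list_table_ids`, `export_dataset`), the `connection_name` and `s3_base` parameters, the Parquet output format, and the `connection_type="marketplace.spark"` value are all assumptions, so copy the real read options from your own generated script. It also assumes the google-cloud-bigquery package and Google credentials are available to the job.

```python
def bigquery_table_option(source_project, dataset, table_id):
    # Same "project:dataset.table" form as the `table` value in the question.
    return f"{source_project}:{dataset}.{table_id}"


def list_table_ids(source_project, dataset):
    # Imported lazily: google-cloud-bigquery must be added to the Glue job first.
    from google.cloud import bigquery

    client = bigquery.Client()
    return [t.table_id for t in client.list_tables(f"{source_project}.{dataset}")]


def export_dataset(glue_context, table_ids, parent_project, source_project,
                   dataset, connection_name, s3_base):
    """Read each table through the connector and write it under s3_base."""
    written = []
    for table_id in table_ids:
        frame = glue_context.create_dynamic_frame.from_options(
            # Assumption: use whatever connection_type your generated script has.
            connection_type="marketplace.spark",
            connection_options={
                "parentProject": parent_project,
                "table": bigquery_table_option(source_project, dataset, table_id),
                "connectionName": connection_name,
            },
        )
        # One S3 prefix per table so the outputs do not overwrite each other.
        target = f"{s3_base.rstrip('/')}/{dataset}/{table_id}/"
        glue_context.write_dynamic_frame.from_options(
            frame=frame,
            connection_type="s3",
            connection_options={"path": target},
            format="parquet",
        )
        written.append(target)
    return written
```

In the job script you would call `list_table_ids("bigquery-public-data", "austin_bikeshare")` once and pass the result, together with the existing `glueContext`, to `export_dataset`.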
That is one possibility. The other would be to use an orchestrator such as Step Functions (an alternative could be Airflow) to run a Python script that reads the list of tables, and then execute your job (once parametrized by table name) in parallel for each table.
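For the orchestrator route, a state machine definition could look something like the sketch below, which fans out over a list of tables with a Map state and starts one Glue job run per table. The function names, the `--table_name` job argument, the `tables`/`table` input keys, and the concurrency limit are assumptions made for illustration; the step that produces the table list (for example a Lambda function) is left out.

```python
import json


def build_state_machine(job_name, max_concurrency=5):
    """Return a Step Functions definition that runs one Glue job per table."""
    return {
        "Comment": "Run one Glue job run per BigQuery table",
        "StartAt": "ExportEachTable",
        "States": {
            "ExportEachTable": {
                "Type": "Map",
                "ItemsPath": "$.tables",          # list supplied in the execution input
                "MaxConcurrency": max_concurrency,
                "ItemProcessor": {
                    "ProcessorConfig": {"Mode": "INLINE"},
                    "StartAt": "RunGlueJob",
                    "States": {
                        "RunGlueJob": {
                            "Type": "Task",
                            # .sync makes the state wait for the job run to finish
                            "Resource": "arn:aws:states:::glue:startJobRun.sync",
                            "Parameters": {
                                "JobName": job_name,
                                # hypothetical job argument carrying the table name
                                "Arguments": {"--table_name.$": "$.table"},
                            },
                            "End": True,
                        }
                    },
                },
                "End": True,
            }
        },
    }


def build_execution_input(table_ids):
    # One object per table so each Map iteration can read "$.table".
    return {"tables": [{"table": t} for t in table_ids]}


def definition_json(job_name):
    return json.dumps(build_state_machine(job_name), indent=2)
```

Inside the Glue job, the table name could then be read with `getResolvedOptions(sys.argv, ["table_name"])` from `awsglue.utils` and used in place of the hard-coded table.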
Hope this helps.