Running PySpark Jobs Locally with the AWS Glue ETL library - on Windows


I have followed the steps outlined in "Developing using the AWS Glue ETL library - Python" on Windows, found here:

After following the installation instructions, it's unclear how to actually execute a Spark job successfully on my local machine. In PowerShell I have:

a simple PySpark job:

from pyspark.context import SparkContext
from pyspark.sql import SparkSession

sc = SparkContext()
spark = SparkSession(sc)

I have:

  1. navigated to the aws-glue-libs directory
  2. in PowerShell, attempted .\bin\gluesparksubmit F:\programming\
  3. the command produces no output at all
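When gluesparksubmit exits silently like this, the local environment is often incomplete. A minimal sanity-check sketch; the variable names and path fragments it looks for (SPARK_HOME, JAVA_HOME, aws-glue-libs on PYTHONPATH) reflect a typical local Glue setup and are assumptions, not the official requirements list:

```python
import os


def check_glue_env(env=None):
    """Return a list of likely problems with a local Glue dev environment.

    The checks below are assumptions about a typical aws-glue-libs setup;
    consult the AWS Glue local-development docs for the authoritative list.
    """
    env = os.environ if env is None else env
    problems = []
    if not env.get("SPARK_HOME"):
        problems.append("SPARK_HOME is not set")
    if "aws-glue-libs" not in env.get("PYTHONPATH", ""):
        problems.append("aws-glue-libs does not appear on PYTHONPATH")
    if not env.get("JAVA_HOME"):
        problems.append("JAVA_HOME is not set (Spark needs a JVM)")
    return problems


if __name__ == "__main__":
    for problem in check_glue_env():
        print("WARNING:", problem)
```

Running this before submitting a job can turn a silent failure into an actionable message.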

Can you please provide a correct example of how to execute an AWS Glue job locally?

The ultimate goal here is to develop my Glue jobs locally in PyCharm before deploying them to the AWS Glue service.

asked a year ago · 475 views

1 Answer

Those scripts are written for Linux Bash, not PowerShell.
It might be possible to get them working under a Cygwin shell, but it is probably easier to use the Docker option.
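For reference, the Docker option mentioned above looks roughly like this. The image tag and container paths are assumptions based on the amazon/aws-glue-libs images AWS has published; verify the current image name and layout in the AWS Glue local-development documentation before relying on them:

```shell
# Pull a published Glue local-development image (tag is an assumption; check what AWS currently publishes)
docker pull amazon/aws-glue-libs:glue_libs_1.0.0_image_01

# Run the job inside the container, mounting a local scripts folder.
# Windows host path and the in-container paths are illustrative only.
docker run -it -v C:\programming\glue-jobs:/home/glue_user/workspace `
    amazon/aws-glue-libs:glue_libs_1.0.0_image_01 `
    gluesparksubmit /home/glue_user/workspace/job.py
```

Because the Bash scripts then run inside a Linux container, the PowerShell incompatibility disappears, and the same workflow can be wired into PyCharm as a remote/Docker interpreter.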

AWS
answered a year ago
