Running PySpark Jobs Locally with the AWS Glue ETL library - On windows

0

I have followed the steps outlined to install Developing using the AWS Glue ETL library - Python on Windows found here: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-libraries.html

After following the installation instructions, it's unclear how to actually execute a spark job successfully locally In powershell I have:

a simple pyspark job called my_script.py:

from pyspark.context import SparkContext
from pyspark.sql import SparkSession

sc = SparkContext()
spark = SparkSession(sc)

I have:

  1. navigated to the aws-glue-libs directory
  2. in PowerShell attempted.\bin\gluesparksubmit F:\programming\my_script.py
  3. the output seems to be nothing

Can you please provide correct example on how to execute a aws glue job locally?

The ultimate goal here is to develop my glue jobs locally in Pycharm before deploying to the AWS Glue Service.

Adriano
feita há um ano491 visualizações
1 Resposta
0

Those scripts are for Linux Bash, not PowerShell.
It might be possible to get then work using a Cygwin shell but it might be easier if you use the docker option

profile pictureAWS
ESPECIALISTA
respondido há um ano

Você não está conectado. Fazer login para postar uma resposta.

Uma boa resposta responde claramente à pergunta, dá feedback construtivo e incentiva o crescimento profissional de quem perguntou.

Diretrizes para responder a perguntas