Is there a hello world for running PyDeequ in Glue?


Hi there, I'm trying to use PyDeequ in a Glue 3.0 ETL job, following this tutorial.

However I get this error:

2023-07-01 13:57:35,492 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(73)): Error from Python:Traceback (most recent call last):
  File "/tmp/glue_job_integration.py", line 144, in <module>
    .onData(sdf)
  File "/home/spark/.local/lib/python3.7/site-packages/pydeequ/analyzers.py", line 52, in onData
    return AnalysisRunBuilder(self._spark_session, df)
  File "/home/spark/.local/lib/python3.7/site-packages/pydeequ/analyzers.py", line 124, in __init__
    self._AnalysisRunBuilder = self._jvm.com.amazon.deequ.analyzers.runners.AnalysisRunBuilder(df._jdf)
TypeError: 'JavaPackage' object is not callable

I realize the example was written for SageMaker. Does anybody have a suggestion for getting it to work in Glue? This is my first time using Deequ. Thanks for reading!
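One possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: py4j returns a `JavaPackage` placeholder for any dotted name it cannot resolve to a class loaded in the JVM, so `'JavaPackage' object is not callable` usually suggests the Deequ JAR is not on the job's classpath. Below is a minimal sketch of a check that could run before `.onData(sdf)`. The helper names `jvm_class_is_loaded` and `check_deequ_on_classpath` are hypothetical, and `spark` is assumed to be the job's existing `SparkSession`.

```python
def jvm_class_is_loaded(handle):
    """Return True if a py4j handle resolved to a real JVM class.

    py4j hands back a `JavaPackage` placeholder for a dotted name it
    cannot resolve, which is the object the traceback tried to call.
    """
    return type(handle).__name__ != "JavaPackage"


def check_deequ_on_classpath(spark):
    """Hypothetical helper: probe the class that failed in the traceback."""
    # `spark._jvm` is the private py4j view that pydeequ itself relies on.
    handle = spark._jvm.com.amazon.deequ.analyzers.runners.AnalysisRunBuilder
    return jvm_class_is_loaded(handle)
```

If the check returns `False`, one thing to try is supplying the Deequ JAR through the Glue job parameter `--extra-jars` (pointing at a copy in S3) and installing pydeequ through `--additional-python-modules`, using a Deequ build that matches the Spark version of the Glue release. The exact JAR version is not something this post confirms.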

Asked 1 year ago · 351 views
1 answer
