Invoking EvaluateDataQuality().process_rows returns a NoSuchElementException error


I'm trying to customize a script built via Visual ETL and came across an error. These are the steps I followed:

  • A DynamicFrame created from a CSV file in S3 needs to have its data types evaluated.
  • I enforce a custom schema using the apply_mapping function, resetting the frame's schema.
  • Then I convert NaN values to None (by converting the DynamicFrame to a DataFrame and then back to a DynamicFrame).
  • I pass this transformed Parsed_dynamic_Frame (a DynamicFrame) to EvaluateDataQuality().process_rows for validation with the ruleset Rules = [ ColumnDataType "Col_name" = "integer" ]:
data_schema_dquee_results_set = EvaluateDataQuality().process_rows(
    frame=Parsed_dynamic_Frame,
    ruleset=ruleset,
    publishing_options={
        "dataQualityEvaluationContext": "data_quality_schema_rule_set",
        # "enableDataQualityCloudWatchMetrics": True,
        "enableDataQualityResultsPublishing": True
    },
    additional_options={"performanceTuning.caching": "CACHE_NOTHING"}
)

I encounter the following error:

: An error occurred while calling z:com.amazonaws.services.glue.dq.EvaluateDataQuality.processRows.
: java.util.NoSuchElementException: None.get
	at scala.None$.get(Option.scala:529)
	at scala.None$.get(Option.scala:527)
	at com.amazonaws.services.glue.dq.EvaluateDataQualityHelper.processInner(EvaluateDataQuality.scala:283)
	at com.amazonaws.services.glue.dq.EvaluateDataQualityHelper.process(EvaluateDataQuality.scala:157)
	at com.amazonaws.services.glue.dq.EvaluateDataQualityHelper.process$(EvaluateDataQuality.scala:152)
	at com.amazonaws.services.glue.dq.EvaluateDataQuality$.process(EvaluateDataQuality.scala:64)
	at com.amazonaws.services.glue.dq.EvaluateDataQuality$.processRows(EvaluateDataQuality.scala:124)
	at com.amazonaws.services.glue.dq.EvaluateDataQuality.processRows(EvaluateDataQuality.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:750)

I couldn't find much about this error. Can someone help me?

FYI, this is all done in AWS Glue Notebooks with the following settings:

%glue_version 4.0
%idle_timeout 2880
%worker_type G.1X
%number_of_workers 2

Prog
asked 5 months ago · 261 views
1 Answer
Accepted Answer

The "publish" option requires information about the job in order to register it so it's visible in the job quality tab.
Since you are running a notebook rather than a job, it fails. You would need to disable enableDataQualityResultsPublishing, as you did for enableDataQualityCloudWatchMetrics. I'll open a ticket so this is handled better. Alternatively, try setting the properties manually with some placeholders and see if it lets you run the code:

spark._jvm.java.lang.System.setProperty("spark.glue.JOB_NAME", "MyNotebook")
spark._jvm.java.lang.System.setProperty("spark.glue.JOB_RUN_ID", "jr_1234")
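
For the option of disabling publishing, here is a minimal sketch of building the publishing options so that publishing is switched off outside a job. It assumes that a job run receives a --JOB_NAME argument on sys.argv while a notebook session does not; build_publishing_options is a hypothetical helper name and not part of the Glue API, and tying the CloudWatch metrics flag to the same check is also an assumption.

```python
import sys


def build_publishing_options(context, argv=None):
    """Build publishing_options for EvaluateDataQuality().process_rows.

    Assumption: a Glue job run is started with a --JOB_NAME argument,
    whereas an interactive notebook session is not.
    """
    argv = sys.argv if argv is None else argv
    # True only when a job name was passed on the command line.
    has_job = any(
        arg == "--JOB_NAME" or arg.startswith("--JOB_NAME=") for arg in argv
    )
    return {
        "dataQualityEvaluationContext": context,
        # Both flags need job information, so enable them only for a job run.
        "enableDataQualityCloudWatchMetrics": has_job,
        "enableDataQualityResultsPublishing": has_job,
    }


# Hypothetical argument lists standing in for a notebook and a job run.
notebook_options = build_publishing_options(
    "data_quality_schema_rule_set", argv=["notebook"]
)
job_options = build_publishing_options(
    "data_quality_schema_rule_set", argv=["script.py", "--JOB_NAME", "my_job"]
)
print(notebook_options)
print(job_options)
```

The resulting dictionary can be passed as publishing_options to EvaluateDataQuality().process_rows in place of a hard-coded one, so the same script runs in a notebook and as a job.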

AWS
EXPERT
answered 5 months ago
EXPERT
reviewed 5 months ago
  • @Gonzalo Herreros, setting enableDataQualityResultsPublishing to False prevented the error. But does that mean I won't be able to access the DataQualityRulesPass, DataQualityRulesFail, DataQualityRulesSkip, and DataQualityEvaluationResult parameters?

  • It looks like I actually need the job results for further processing.
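
On processing the results without publishing: process_rows returns a DynamicFrameCollection, and scripts generated by Visual ETL typically pull the outcomes out of it with SelectFromCollection.apply(dfc=..., key="ruleOutcomes") (or key="rowLevelOutcomes" for the per-row columns). Assuming that collection is still returned when publishing is disabled, which this thread does not confirm, the rule outcomes could be summarized roughly as below. summarize_rule_outcomes and the sample rows are hypothetical; the "Rule" and "Outcome" column names are the ones used by the ruleOutcomes frame.

```python
from collections import Counter


def summarize_rule_outcomes(rows):
    """Count outcomes per status and list the rules that failed.

    `rows` is a list of dicts with at least "Rule" and "Outcome" keys.
    """
    counts = Counter(row["Outcome"] for row in rows)
    failed = [row["Rule"] for row in rows if row["Outcome"] == "Failed"]
    return dict(counts), failed


# Hypothetical sample rows. In a Glue session they would stand in for
# something like: [r.asDict() for r in rule_outcomes.toDF().collect()],
# where rule_outcomes comes from SelectFromCollection (an assumption here).
sample_rows = [
    {"Rule": 'ColumnDataType "Col_name" = "integer"', "Outcome": "Failed"},
    {"Rule": 'IsComplete "Col_name"', "Outcome": "Passed"},
]
counts, failed_rules = summarize_rule_outcomes(sample_rows)
print(counts)
print(failed_rules)
```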
