- Newest
- Most votes
- Most comments
Hello jweyenberg,
Below is the key error message:
An error occurred while calling o79.getDynamicFrame.\n: java.lang.RuntimeException: class com.amazonaws.services.gluejobexecutor.model.AccessDeniedException:User: arn:aws:sts::390172919295:assumed-role/AWSGlueServiceRole-zohodatacrawl/GlueJobRunnerSession is not authorized to perform: lakeformation:GetDataAccess on resource: arn:aws:glue:us-east-2:390172919295:table/appdata/payment_csv because no identity-based policy allows the lakeformation:GetDataAccess action
which indicates the role (AWSGlueServiceRole-zohodatacrawl) you are using to run the Glue job does not have "GetDataAccess" action permission on the table "arn:aws:glue:us-east-2:390172919295:table/appdata/payment_csv".
You can add below policy into your role to fix this:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "LakeFormationDataAccess",
"Effect": "Allow",
"Action": [
"lakeformation:GetDataAccess"
],
"Resource": "*"
}
]
}
Reference: https://docs.aws.amazon.com/lake-formation/latest/dg/upgrade-glue-lake-formation-step3.html
As Lake Formation is known for multi-layer security, if you meet other Lake Formation related permission errors, you can have a check on this page: https://docs.aws.amazon.com/lake-formation/latest/dg/troubleshooting.html
OP posting the full error logs for additional detail:
Aug 26, 2022, 5:44:08 AM Pending execution 2022-08-26 12:44:21,934 main WARN JNDI lookup class is not available because this JRE does not support JNDI. JNDI string lookups will not be available, continuing configuration. java.lang.ClassNotFoundException: org.apache.logging.log4j.core.lookup.JndiLookup at java.net.URLClassLoader.findClass(URLClassLoader.java:387) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at org.apache.logging.log4j.util.LoaderUtil.loadClass(LoaderUtil.java:173) at org.apache.logging.log4j.util.LoaderUtil.newInstanceOf(LoaderUtil.java:211) at org.apache.logging.log4j.util.LoaderUtil.newCheckedInstanceOf(LoaderUtil.java:232) at org.apache.logging.log4j.core.util.Loader.newCheckedInstanceOf(Loader.java:301) at org.apache.logging.log4j.core.lookup.Interpolator.<init>(Interpolator.java:95) at org.apache.logging.log4j.core.config.AbstractConfiguration.<init>(AbstractConfiguration.java:114) at org.apache.logging.log4j.core.config.DefaultConfiguration.<init>(DefaultConfiguration.java:55) at org.apache.logging.log4j.core.layout.PatternLayout$Builder.build(PatternLayout.java:430) at org.apache.logging.log4j.core.layout.PatternLayout.createDefaultLayout(PatternLayout.java:324) at org.apache.logging.log4j.core.appender.ConsoleAppender$Builder.<init>(ConsoleAppender.java:121) at org.apache.logging.log4j.core.appender.ConsoleAppender.newBuilder(ConsoleAppender.java:111) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.createBuilder(PluginBuilder.java:158) at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:119) at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:813) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:753) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:745) at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:389) at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:169) at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:181) at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:446) at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:520) at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:536) at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:214) at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:146) at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41) at org.apache.logging.log4j.LogManager.getContext(LogManager.java:194) at org.apache.logging.log4j.LogManager.getLogger(LogManager.java:597) at org.apache.spark.metrics.sink.MetricsConfigUtils.<clinit>(MetricsConfigUtils.java:12) at org.apache.spark.metrics.sink.MetricsProxyInfo.fromConfig(MetricsProxyInfo.java:17) at com.amazonaws.services.glue.cloudwatch.CloudWatchLogsAppenderCommon.<init>(CloudWatchLogsAppenderCommon.java:62) at com.amazonaws.services.glue.cloudwatch.CloudWatchLogsAppenderCommon$CloudWatchLogsAppenderCommonBuilder.build(CloudWatchLogsAppenderCommon.java:79) at com.amazonaws.services.glue.cloudwatch.CloudWatchAppender.activateOptions(CloudWatchAppender.java:73) at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307) at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172) at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104) at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842) at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768) at org.apache.log4j.PropertyConfigurator.parseCatsAndRenderers(PropertyConfigurator.java:672) at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:516) at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580) at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526) at org.apache.log4j.LogManager.<clinit>(LogManager.java:127) at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:81) at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:358) at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:383) at org.apache.spark.network.util.JavaUtils.<clinit>(JavaUtils.java:41) at org.apache.spark.internal.config.ConfigHelpers$.byteFromString(ConfigBuilder.scala:67) at org.apache.spark.internal.config.ConfigBuilder$$anonfun$bytesConf$1.apply(ConfigBuilder.scala:235) at org.apache.spark.internal.config.ConfigBuilder$$anonfun$bytesConf$1.apply(ConfigBuilder.scala:235) at org.apache.spark.internal.config.TypedConfigBuilder$$anonfun$transform$1.apply(ConfigBuilder.scala:101) at org.apache.spark.internal.config.TypedConfigBuilder$$anonfun$transform$1.apply(ConfigBuilder.scala:101) at org.apache.spark.internal.config.TypedConfigBuilder.createWithDefault(ConfigBuilder.scala:143) at org.apache.spark.internal.config.package$.<init>(package.scala:121) at org.apache.spark.internal.config.package$.<clinit>(package.scala) at org.apache.spark.SparkConf$.<init>(SparkConf.scala:716) at org.apache.spark.SparkConf$.<clinit>(SparkConf.scala) at org.apache.spark.SparkConf.set(SparkConf.scala:95) at org.apache.spark.SparkConf$$anonfun$loadFromSystemProperties$3.apply(SparkConf.scala:77) at org.apache.spark.SparkConf$$anonfun$loadFromSystemProperties$3.apply(SparkConf.scala:76) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:221) at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:428) at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:428) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) at org.apache.spark.SparkConf.loadFromSystemProperties(SparkConf.scala:76) at org.apache.spark.SparkConf.<init>(SparkConf.scala:71) at org.apache.spark.SparkConf.<init>(SparkConf.scala:58) at com.amazonaws.services.glue.SparkProcessLauncherPlugin$class.getSparkConf(ProcessLauncher.scala:41) at com.amazonaws.services.glue.ProcessLauncher$$anon$1.getSparkConf(ProcessLauncher.scala:78) at com.amazonaws.services.glue.ProcessLauncher.<init>(ProcessLauncher.scala:84) at com.amazonaws.services.glue.ProcessLauncher.<init>(ProcessLauncher.scala:78) at com.amazonaws.services.glue.ProcessLauncher$.main(ProcessLauncher.scala:29) at com.amazonaws.services.glue.ProcessLauncher.main(ProcessLauncher.scala) 2022-08-26 12:44:21,938 main INFO Log4j appears to be running in a Servlet environment, but there's no log4j-web module available. If you want better web container support, please add the log4j-web JAR to your web archive or server lib directory. GlueTelmty1*,SM,GAUGE,2022-08-26T12:44:26Z,,jr_2201071918b9e489e76f8ab97876e7ba3a757354740fb454ff6f706dbbe16a63,glue.spark-application-
Here is the stack trace from the run here:
22/08/24 15:24:40 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] {
"Event": "GlueETLJobExceptionEvent",
"Timestamp": 1661354680434,
"Failure Reason": "Traceback (most recent call last):\n File \"/tmp/wp-zoho-parquet-transform\", line 20, in <module>\n datasource0 = glueContext.create_dynamic_frame.from_catalog(database = \"appdata\", table_name = \"payment_csv\", transformation_ctx = \"datasource0\")\n File \"/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py\", line 642, in from_catalog\n return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)\n File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 186, in create_dynamic_frame_from_catalog\n return source.getFrame(**kwargs)\n File \"/opt/amazon/lib/python3.6/site-packages/awsglue/data_source.py\", line 36, in getFrame\n jframe = self._jsource.getDynamicFrame()\n File \"/opt/amazon/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py\", line 1257, in __call__\n answer, self.gateway_client, self.target_id, self.name)\n File \"/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py\", line 63, in deco\n return f(*a, **kw)\n File \"/opt/amazon/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py\", line 328, in get_return_value\n format(target_id, \".\", name), value)\npy4j.protocol.Py4JJavaError: An error occurred while calling o79.getDynamicFrame.\n: java.lang.RuntimeException: class com.amazonaws.services.gluejobexecutor.model.AccessDeniedException:User: arn:aws:sts::390172919295:assumed-role/AWSGlueServiceRole-zohodatacrawl/GlueJobRunnerSession is not authorized to perform: lakeformation:GetDataAccess on resource: arn:aws:glue:us-east-2:390172919295:table/appdata/payment_csv because no identity-based policy allows the lakeformation:GetDataAccess action (Service: AWSLakeFormation; Status Code: 400; Error Code: AccessDeniedException; Request ID: 4813cf44-7a93-4f9e-b833-88ba1f383457; Proxy: null) (Service: AWSGlueJobExecutor; Status Code: 400; Error Code: AccessDeniedException; Request ID: 253e2ef8-35e6-46f9-9a63-3300136a7ead; Proxy: null)\n\tat com.amazonaws.services.glue.remote.LakeformationCredentialsProvider.refresh(LakeformationCredentialsProvider.scala:51)\n\tat com.amazonaws.services.glue.remote.LakeformationCredentialsProvider.<init>(LakeformationCredentialsProvider.scala:78)\n\tat sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)\n\tat sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)\n\tat sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)\n\tat java.lang.reflect.Constructor.newInstance(Constructor.java:423)\n\tat com.amazonaws.services.glue.remote.MichiganAWSCredentialProviderProxy$.get(MichiganAWSCredentialProviderProxy.scala:14)\n\tat com.amazonaws.services.glue.util.FileSchemeWrapper.setHadoopConfiguration(FileSchemeWrapper.scala:50)\n\tat com.amazonaws.services.glue.util.FileSchemeWrapper.executeWith(FileSchemeWrapper.scala:81)\n\tat com.amazonaws.services.glue.util.FileSchemeWrapper.executeWithQualifiedScheme(FileSchemeWrapper.scala:89)\n\tat com.amazonaws.services.glue.HadoopDataSource.getDynamicFrame(DataSource.scala:543)\n\tat com.amazonaws.services.glue.DataSource$class.getDynamicFrame(DataSource.scala:97)\n\tat com.amazonaws.services.glue.HadoopDataSource.getDynamicFrame(DataSource.scala:240)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\n\tat py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\n\tat py4j.Gateway.invoke(Gateway.java:282)\n\tat py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\n\tat py4j.commands.CallCommand.execute(CallCommand.java:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n\tat java.lang.Thread.run(Thread.java:750)\n",
"Stack Trace": [
{
"Declaring Class": "get_return_value",
"Method Name": "format(target_id, \".\", name), value)",
"File Name": "/opt/amazon/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py",
"Line Number": 328
},
{
"Declaring Class": "deco",
"Method Name": "return f(*a, **kw)",
"File Name": "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py",
"Line Number": 63
},
{
"Declaring Class": "__call__",
"Method Name": "answer, self.gateway_client, self.target_id, self.name)",
"File Name": "/opt/amazon/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py",
"Line Number": 1257
},
{
"Declaring Class": "getFrame",
"Method Name": "jframe = self._jsource.getDynamicFrame()",
"File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/data_source.py",
"Line Number": 36
},
{
"Declaring Class": "create_dynamic_frame_from_catalog",
"Method Name": "return source.getFrame(**kwargs)",
"File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
"Line Number": 186
},
{
"Declaring Class": "from_catalog",
"Method Name": "return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)",
"File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py",
"Line Number": 642
},
{
"Declaring Class": "<module>",
"Method Name": "datasource0 = glueContext.create_dynamic_frame.from_catalog(database = \"appdata\", table_name = \"payment_csv\", transformation_ctx = \"datasource0\")",
"File Name": "/tmp/wp-zoho-parquet-transform",
"Line Number": 20
}
],
"Last Executed Line number": 20,
"script": "wp-zoho-parquet-transform"
}
Relevant content
- AWS OFFICIALUpdated 3 years ago
- AWS OFFICIALUpdated a year ago
There is not enough information to resolve this. It would help if you could paste the complete stack trace.
The best guess would be related to permissions to write into S3(Lake formation or S3 permisions, etc) OR and issue with flattening your struct. See if you can do it using Spark instead:
resolvechoice3 = resolvechoice3.toDF()#convert to data frame resolvechoice3 = resolvechoice3.select(flatten(resolvechoice3.schema))#flatten resolvechoice3 = DynamicFrame.fromDF(resolvechoice3, glue_context, "final_convert")
Thank you! I posted the full trace as an answer below; no luck with the Spark additions unfortunately.
Hello jweyenberg, Can you post the stack trace before and after the exact error message "An error occurred while calling o79.getDynamicFrame. java.lang.reflect.InvocationTargetException"?
Ah, sorry - misunderstood. Please see that also in an Answer response below.