(Glue 3.0 / Glue 4.0) How do you use 'useConnectionProperties' and 'connectionName' when opening a JDBC connection?

0

The Glue JDBC Connection documentation states:

If you already have a JDBC connection defined, you can reuse the configuration properties defined in it, such as: url, user and password; so you don't have to specify them in the code as connection options. To do so, use the following connection properties:

  • "useConnectionProperties": Set it to "true" to indicate you want to use the configuration from a connection.
  • "connectionName": Enter the connection name to retrieve the configuration from, the connection must be defined in the same region as the job.

There is no further documentation of how to use these properties. I have tried to set these properties as kwargs to glueContext.create_dynamic_frame.from_options(), but the method continues to throw an error if url is not specified in connectionOptions:

dbtable = 'schema.table'
query = f"select top 1 * from {dbtable}"

dyf = glueContext.create_dynamic_frame.from_options(connection_type="sqlserver", 
                                                    useConnectionProperties="true", 
                                                    connectionName="My-Glue-Connection",
                                                                        connection_options={
                                                                            "dbtable": dbtable,
                                                                            "sampleQuery": query
                                                                        })
Py4JJavaError: An error occurred while calling o212.getDynamicFrame.
: java.util.NoSuchElementException: key not found: url
	at scala.collection.MapLike$class.default(MapLike.scala:228)
	at scala.collection.AbstractMap.default(Map.scala:59)
	at scala.collection.MapLike$class.apply(MapLike.scala:141)
	at scala.collection.AbstractMap.apply(Map.scala:59)
	at com.amazonaws.services.glue.util.JDBCWrapper$.apply(JDBCUtils.scala:913)
	at com.amazonaws.services.glue.util.JDBCWrapper$.apply(JDBCUtils.scala:909)
	at com.amazonaws.services.glue.JDBCDataSource.getDynamicFrame(DataSource.scala:943)
	at com.amazonaws.services.glue.DataSource$class.getDynamicFrame(DataSource.scala:97)
	at com.amazonaws.services.glue.SparkSQLDataSource.getDynamicFrame(DataSource.scala:709)
	...

The same error occurs if I pass the two properties via connection_options:

dyf = glueContext.create_dynamic_frame.from_options(connection_type="sqlserver", 
                                                        connection_options={
                                                            "useConnectionProperties": "true", 
                                                            "connectionName": "IDAP-Glue-Connection",
                                                            "dbtable": dbtable,
                                                            "sampleQuery": query
                                                        })
Py4JJavaError: An error occurred while calling o123.getDynamicFrame.
: java.util.NoSuchElementException: key not found: url
...

How is this feature intended to be used? The only methods I've found to use a Glue Connection to read from a JDBC database are really unwieldy, I would expect tighter integration of Glue Connections in Glue Jobs.

已提问 1 年前948 查看次数
1 回答
0

Hello,

Looking at the errors i can observe you are not able to use connection options in glue script with a defined connection in glue catalog

To simpilfy the process of generating the script you can use glue studio that automatically generates a script when you have a connection defined.

Glue studio generates a function directJDBCSource that returns a dynamic frame and parameters such as connectionName,connectionType, database="test",table="test.testing",redshiftTmpDir="" . The values of these parameter can be supplied to to the function to produce a dynamic frame that extracts data from a jdbc source

You can copy the script to any other any other custom job where you have a connection defined in catalog

AWS
Sahil_S
已回答 1 年前

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则