Use Glue schema registry when reading from Kinesis


I want to store the schema for Avro-formatted messages in the Glue Schema Registry and use that schema when reading records from a Kinesis data stream. Currently, to read records from the stream, I'm using something like:

    avro_schemas = {
        "record1": """
        {
            "type": "record",
            "name": "record1",
            "fields": [
                {"name": "intField", "type": "int"},
                {"name": "strField", "type": "string"}
            ]
        }
        """
    }

    dataframe = glueContext.create_data_frame.from_options(
        connection_type="kinesis",
        connection_options={
            "typeOfData": "kinesis",
            "streamARN": <stream_arn>,
            "startingPosition": "latest",
            "classification": "avro",
            "inferSchema": "false",
            # Pass the schema string for the record type being read
            "avroSchema": avro_schemas["record1"]
        },
        transformation_ctx="kinesis_data_frame"
    )

How can I read the schema from the registry and use it to create the data frame?

YK
asked 5 months ago, 665 views
1 Answer
Accepted answer

You can use boto3 to fetch the schema definition with the Glue client's get_schema_version call: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue/client/get_schema_version.html
Normally you don't need to do that. Instead, you create a table based on the schema in the registry and then use that table as the source in the streaming job.
See: https://docs.aws.amazon.com/glue/latest/dg/add-job-streaming.html#create-table-streaming
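
As a rough illustration of the boto3 route, the sketch below fetches the latest schema version from the registry and plugs its definition into the same connection options used in the question. The registry name "my-registry", the schema name "record1", the placeholder stream ARN, and the helper function names are all assumptions made for this example, not values from the original post.

```python
import json


def fetch_avro_schema(glue_client, registry_name, schema_name):
    """Return the latest schema definition (a JSON string) from the Glue Schema Registry."""
    response = glue_client.get_schema_version(
        SchemaId={"RegistryName": registry_name, "SchemaName": schema_name},
        SchemaVersionNumber={"LatestVersion": True},
    )
    return response["SchemaDefinition"]


def kinesis_avro_options(stream_arn, avro_schema):
    """Build the Kinesis connection options from the question, using the fetched schema."""
    json.loads(avro_schema)  # fail early if the definition is not valid JSON
    return {
        "typeOfData": "kinesis",
        "streamARN": stream_arn,
        "startingPosition": "latest",
        "classification": "avro",
        "inferSchema": "false",
        "avroSchema": avro_schema,
    }


def main():
    # boto3 is imported here so the helpers above can be exercised without AWS access.
    import boto3

    glue = boto3.client("glue")
    # "my-registry" and "record1" are hypothetical names; substitute your own.
    schema = fetch_avro_schema(glue, "my-registry", "record1")
    # Hypothetical placeholder ARN; substitute your stream's ARN.
    options = kinesis_avro_options(
        "arn:aws:kinesis:us-east-1:111122223333:stream/example", schema
    )
    # Inside a Glue streaming job, where glueContext is defined, the options
    # would then be passed exactly as in the question:
    # dataframe = glueContext.create_data_frame.from_options(
    #     connection_type="kinesis",
    #     connection_options=options,
    #     transformation_ctx="kinesis_data_frame",
    # )
    return options
```

The job's IAM role would need permission to read from the schema registry (for example glue:GetSchemaVersion) for the call in main() to succeed. Taking the client as a parameter keeps the helpers easy to test with a stub.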

AWS
EXPERT
answered 5 months ago
