Use Glue schema registry when reading from Kinesis

I want to store the schema for Avro-formatted messages in the Glue Schema Registry and use that schema when reading records from a Kinesis data stream. Currently, to read records from the stream, I'm using something like:

    avro_schemas = {
        "record1": """
        {
          "type": "record",
          "name": "record1",
          "fields": [
            {"name": "intField", "type": "int"},
            {"name": "strField", "type": "string"}
          ]
        }
        """
    }

    dataframe = glueContext.create_data_frame.from_options(
        connection_type="kinesis",
        connection_options={
            "typeOfData": "kinesis",
            "streamARN": <stream_arn>,  # placeholder for the stream ARN
            "startingPosition": "latest",
            "classification": "avro",
            "inferSchema": "false",
            "avroSchema": avro_schemas["record1"]
        },
        transformation_ctx="kinesis_data_frame"
    )

How can I read the schema from the registry and use it to create the data frame?

YK
asked 5 months ago, 664 views
1 Answer
Accepted answer

You can use boto3 to fetch the schema from the registry: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue/client/get_schema_version.html
Normally you don't need to do that. Instead, you create a table based on the schema and then use that table in the streaming job.
See: https://docs.aws.amazon.com/glue/latest/dg/add-job-streaming.html#create-table-streaming
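To illustrate the boto3 route, here is a minimal sketch, not a definitive implementation. It assumes the schema is stored as Avro and that the latest version is the one wanted; the helper names `fetch_avro_schema` and `kinesis_avro_options`, and the registry and schema names passed to them, are hypothetical. The connection options are the ones from the question.

```python
def fetch_avro_schema(registry_name, schema_name, glue_client=None):
    """Return the latest schema definition (a JSON string) from the registry."""
    if glue_client is None:
        import boto3  # imported lazily so a stub client can be passed instead
        glue_client = boto3.client("glue")
    response = glue_client.get_schema_version(
        SchemaId={"RegistryName": registry_name, "SchemaName": schema_name},
        SchemaVersionNumber={"LatestVersion": True},
    )
    return response["SchemaDefinition"]


def kinesis_avro_options(stream_arn, avro_schema):
    """Build the connection options from the question, using the fetched schema."""
    return {
        "typeOfData": "kinesis",
        "streamARN": stream_arn,
        "startingPosition": "latest",
        "classification": "avro",
        "inferSchema": "false",
        "avroSchema": avro_schema,
    }
```

In the job, the result would then be passed along the lines of `connection_options=kinesis_avro_options(stream_arn, fetch_avro_schema("my-registry", "record1"))` in the existing `from_options` call, where `"my-registry"` and `"record1"` stand in for your own registry and schema names.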

AWS
EXPERT
answered 5 months ago
