Kinesis Data transformation - using Lambda vs Glue


I am planning to move all the filtered logs from a CloudWatch log group through Kinesis Firehose to an S3 bucket as Parquet files. Because CloudWatch Logs always pushes gzipped data to Kinesis Firehose, I had to add a Lambda to unzip the data.
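For context, an unzip Lambda of the kind described above might look roughly like the sketch below. It assumes the standard Firehose transformation event shape (`records`, `recordId`, base64 `data`) and the CloudWatch Logs subscription payload (`messageType`, `logEvents`). Emitting one JSON object per line is an illustrative choice, not something stated in this thread.

```python
import base64
import gzip
import json


def transform_record(record):
    """Decompress one Firehose record that carries CloudWatch Logs data."""
    payload = json.loads(gzip.decompress(base64.b64decode(record["data"])))

    # CloudWatch Logs also sends control messages to check the destination;
    # they carry no log events, so mark them as dropped.
    if payload.get("messageType") != "DATA_MESSAGE":
        return {
            "recordId": record["recordId"],
            "result": "Dropped",
            "data": record["data"],
        }

    # Assumption: write each log event as its own newline-terminated JSON object.
    lines = "".join(json.dumps(event) + "\n" for event in payload["logEvents"])
    return {
        "recordId": record["recordId"],
        "result": "Ok",
        "data": base64.b64encode(lines.encode("utf-8")).decode("utf-8"),
    }


def lambda_handler(event, context):
    # Firehose expects exactly one output entry per input recordId.
    return {"records": [transform_record(r) for r in event["records"]]}
```

The function only decompresses and reshapes the data. Whether Parquet conversion also belongs here is the question below.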

Now I am unsure whether the conversion of this filtered JSON data to Parquet should be done by the Lambda (the one invoked to unzip the data) or by Firehose using a Glue table. Will adding AWS Glue to convert the record format incur additional cost? Is it feasible to convert the data format in the Lambda itself? What are the pros and cons of each option?

I would appreciate some guidance on this.

2 Answers

Hi Neisha,

To stream the filtered logs through Kinesis Firehose to an S3 bucket as Parquet files, the preferred approach is to use a Glue table to convert your JSON input data to Parquet, because Kinesis Firehose has built-in integration with Glue tables for schema specification and record format conversion [1].
If your input data is in a format other than JSON, you can use your Lambda function to convert it to JSON first.

Converting the format in the Lambda function itself means developing and maintaining your own conversion logic, and it will likely increase the function's execution time. Glue is therefore the go-to choice for this conversion, especially with Kinesis Firehose, which integrates with it directly.
Some examples of this setup are given in [2] and [3] for your reference.
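As a rough illustration of the Glue-based route, the sketch below builds the `ExtendedS3DestinationConfiguration` dictionary that boto3's `create_delivery_stream` accepts, with the Lambda processor and the Glue-backed format conversion both enabled. Every ARN, database name, and table name here is a hypothetical placeholder, and the buffering and compression values are assumptions to check against [1].

```python
def build_extended_s3_config(bucket_arn, role_arn, lambda_arn,
                             glue_database, glue_table, region):
    """Return an ExtendedS3DestinationConfiguration dict for Firehose."""
    return {
        "RoleARN": role_arn,
        "BucketARN": bucket_arn,
        # Assumption: format conversion needs a larger buffer (64 MB or more).
        "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
        # Parquet is compressed by the serializer, so S3-level compression is off.
        "CompressionFormat": "UNCOMPRESSED",
        # Step 1: the Lambda unzips the CloudWatch Logs payload into JSON.
        "ProcessingConfiguration": {
            "Enabled": True,
            "Processors": [{
                "Type": "Lambda",
                "Parameters": [
                    {"ParameterName": "LambdaArn", "ParameterValue": lambda_arn},
                ],
            }],
        },
        # Step 2: Firehose converts JSON to Parquet using the Glue table schema.
        "DataFormatConversionConfiguration": {
            "Enabled": True,
            "SchemaConfiguration": {
                "RoleARN": role_arn,
                "DatabaseName": glue_database,
                "TableName": glue_table,
                "Region": region,
                "VersionId": "LATEST",
            },
            "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
            "OutputFormatConfiguration": {
                "Serializer": {"ParquetSerDe": {"Compression": "SNAPPY"}},
            },
        },
    }


# Hypothetical values for illustration only.
config = build_extended_s3_config(
    bucket_arn="arn:aws:s3:::example-log-bucket",
    role_arn="arn:aws:iam::123456789012:role/example-firehose-role",
    lambda_arn="arn:aws:lambda:us-east-1:123456789012:function:example-unzip",
    glue_database="example_db",
    glue_table="example_logs",
    region="us-east-1",
)
# With boto3 installed and credentials configured, this could be passed as:
# boto3.client("firehose").create_delivery_stream(
#     DeliveryStreamName="example-stream",
#     DeliveryStreamType="DirectPut",
#     ExtendedS3DestinationConfiguration=config,
# )
```

With this arrangement the Lambda keeps a single responsibility (decompression), and the Parquet schema lives in the Glue table rather than in code.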

Regarding cost, you will pay for AWS Glue Data Catalog storage and requests. See the pricing page [4].

References:
[1] https://docs.aws.amazon.com/firehose/latest/dev/record-format-conversion.html
[2] https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-kinesisfirehose-deliverystream.html#aws-resource-kinesisfirehose-deliverystream--examples
[3] https://catalog.us-east-1.prod.workshops.aws/workshops/2300137e-f2ac-4eb9-a4ac-3d25026b235f/en-US/lab-3-kdf/kinesis
[4] https://aws.amazon.com/glue/pricing/

Thanks,
Atul

Answered 8 months ago

Go with Glue, as IBAtulAnand mentioned. It means less overhead, and there is no conversion logic to maintain in the Lambda.

Expert
Answered 8 months ago
