KDA (Flink) to S3


A customer is building a streaming solution that processes messages from their own customers. The stream carries data from multiple customers, and each JSON record has a customerId field identifying the customer that originally sent it.

They want to store a copy of this data on S3, partitioned by customerId. We could do this with multiple Firehoses (one per customer). Is it possible to do this with Flink instead, i.e. have Flink read from the Kinesis stream and write to multiple destinations on S3, with files split by customerId? I'm guessing it is possible, but is it easy, and can we use any existing libraries?

Thanks

AWS
Nick
Asked 4 years ago · 709 views
1 Answer
Accepted Answer

This blog post describes how to use KDA/Flink for exactly that use case: https://aws.amazon.com/blogs/big-data/streaming-etl-with-apache-flink-and-amazon-kinesis-data-analytics/ . Its section "Persisting data in Amazon S3 with data partitioning" explains how to implement the partitioning, and the source code is on GitHub: https://github.com/aws-samples/amazon-kinesis-analytics-streaming-etl
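To make the partitioning idea concrete, here is a minimal, library-independent sketch of the per-record logic: derive an S3 sub-prefix from each record's customerId. In Flink's file sinks this decision is made by a BucketAssigner, which returns a bucket id (a subdirectory under the sink's base path) for every element. The code below is not taken from the linked sample; the names `bucket_id` and `partition`, the `customerId=<id>` prefix layout, the `unknown` fallback, and the `s3://example-bucket/data` base path are all assumptions for illustration.

```python
import json
from collections import defaultdict


def bucket_id(record: str, default: str = "unknown") -> str:
    """Return the sub-prefix for one JSON record, e.g. 'customerId=acme'.

    Records that are not valid JSON objects, or that lack a customerId,
    go to a fallback prefix instead of failing the job (an assumption;
    you may prefer to route them to a dead-letter location).
    """
    try:
        customer_id = json.loads(record).get("customerId")
    except (json.JSONDecodeError, AttributeError):
        # Not JSON at all, or JSON that is not an object (list, number, null).
        customer_id = None
    if customer_id is None or str(customer_id).strip() == "":
        customer_id = default
    # Keep the id from introducing extra path levels under the prefix.
    safe = str(customer_id).replace("/", "_")
    return f"customerId={safe}"


def partition(records, base_path="s3://example-bucket/data"):
    """Group raw records by the full prefix they would be written under."""
    groups = defaultdict(list)
    for record in records:
        groups[f"{base_path}/{bucket_id(record)}"].append(record)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        '{"customerId": "acme", "value": 1}',
        '{"customerId": "globex", "value": 2}',
        '{"customerId": "acme", "value": 3}',
    ]
    for prefix, items in sorted(partition(sample).items()):
        print(prefix, len(items))
```

In a Flink job the body of `bucket_id` would roughly correspond to what a custom BucketAssigner computes per element, while the sink itself handles file rolling and writing to S3.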

Answered 4 years ago
