Real time application insights architecture on AWS
We have several monolithic desktop applications (Win/Mac/Nix) that generate JSON events and expose them via POST. Currently it is a one-way deal: outbound from the app to the stream, used for nothing but analytics. We are now looking to analyze those events to determine whether actionable "data sequences" start to appear.
My question is (and opinions are welcome): what is the most efficient (#1 performant, #2 cost effective, #3 resilient) way to stream those records (250-500 byte messages, at a peak of 1,000 per second) to AWS for the purpose of:
1. Transforming the record data to a known model (e.g. standard date format, simple transforms)
2. Analyzing the normalized record (ML: predictive analytics or iterative learning)
3. Persisting the record
4. Based on the analysis, sending back a payload (or a pointer to a CDN-stored payload) to the original calling app
In a nutshell, we are looking to do a type of on-premises, near-real-time performance analytics of the applications, using a single on-premises service that sends JSON out to AWS-native cloud services, which do all the work and return information as quickly as possible.
Thanks.
<><
Hi,
There are multiple options and patterns. One possibility is an architecture based on the following components:
- Amazon Kinesis Data Streams: you stream your data to Kinesis; this can also be done with an API Gateway in front of it to capture your POST events.
- Amazon Kinesis Data Firehose + Lambda to transform the data and make it available in S3 (after the transformation, the Lambda could also call a SageMaker inference endpoint).
- If the analysis or transformation needs to be more complex, you could consider using Flink on Kinesis Data Analytics.
- Amazon S3 to store the transformed data; you could also store the payload there and have your CDN point to it.
- Lambda to send back the payload, possibly with API Gateway to publish the APIs.
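To make the Firehose + Lambda transformation step more concrete, here is a minimal sketch of a transformation Lambda in Python. The record contract (base64 `data`, `recordId`, and a `result` of `Ok` or `ProcessingFailed`) is the one Firehose uses for data transformation. The event fields `ts`, `app` and `payload`, and the shape of the normalized model, are assumptions made up for illustration, since the question does not show the real event schema.

```python
import base64
import json
from datetime import datetime, timezone


def normalize(event: dict) -> dict:
    """Map a raw app event onto a hypothetical normalized model.

    Assumption: raw events carry an epoch-seconds "ts" field, an "app"
    name and an optional "payload" object. Adjust to the real schema.
    """
    ts = datetime.fromtimestamp(float(event["ts"]), tz=timezone.utc)
    return {
        "app": str(event.get("app", "unknown")).lower(),
        "timestamp": ts.isoformat(),  # standard ISO 8601 date format
        "payload": event.get("payload", {}),
    }


def lambda_handler(event, context):
    """Firehose transformation handler: one output record per input record."""
    output = []
    for record in event["records"]:
        try:
            raw = json.loads(base64.b64decode(record["data"]))
            # Newline-delimited JSON keeps the objects in S3 easy to query.
            body = json.dumps(normalize(raw)) + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(body.encode("utf-8")).decode("utf-8"),
            })
        except (KeyError, ValueError, TypeError):
            # Malformed events are flagged so Firehose can route them
            # to its error output instead of silently dropping them.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record.get("data", ""),
            })
    return {"records": output}
```

A call to a SageMaker endpoint could be added inside `normalize` or right after it, but keep in mind that this adds latency to every record in the batch.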
Two examples of these patterns can be seen in this reference architecture for "Monitoring Streaming Data with Machine Learning" and in this blog post on "Real-Time In-Stream Inference with AWS Kinesis, SageMaker, & Apache Flink".
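For the load mentioned in the question (250-500 byte messages, peak of 1,000 per second), a rough sizing of a provisioned Kinesis data stream can be sketched as below. It uses the per-shard write limits of 1,000 records per second and 1 MB per second. The `target_utilization` parameter is a hypothetical headroom knob of my own, not an AWS setting, and on-demand capacity mode would make this calculation unnecessary.

```python
import math

# Per-shard write limits for a provisioned Kinesis data stream.
SHARD_MAX_RECORDS_PER_SEC = 1_000
SHARD_MAX_BYTES_PER_SEC = 1_000_000  # 1 MB/s, read conservatively as 10^6 bytes


def shards_needed(records_per_sec: int, record_bytes: int,
                  target_utilization: float = 0.5) -> int:
    """Estimate the shard count for a given peak write load.

    target_utilization is a hypothetical headroom factor: 0.5 means
    the stream is sized so that peak load uses half of its capacity.
    """
    by_count = records_per_sec / SHARD_MAX_RECORDS_PER_SEC
    by_bytes = records_per_sec * record_bytes / SHARD_MAX_BYTES_PER_SEC
    # The tighter of the two limits decides how many shards are required.
    return max(1, math.ceil(max(by_count, by_bytes) / target_utilization))


if __name__ == "__main__":
    # Peak from the question: 1,000 records/s at up to 500 bytes each.
    print(shards_needed(1_000, 500, target_utilization=1.0))
    print(shards_needed(1_000, 500, target_utilization=0.5))
```

At this message size the record-count limit is reached before the byte limit (about 0.5 MB/s), so a single shard would run exactly at its limit at peak; two shards leave room for bursts and retries.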
Hope this helps.