There are many ways to design this type of architecture and, as you've seen, some customers will do things differently. It depends on their comfort level with various technologies, their requirements from an ingest and analytics perspective, and their budget.
Because there's no one "right" answer and this is a complex problem to solve, I'd recommend that you reach out to a local AWS Solutions Architect for a discussion. They can guide you and help find the best solution for you.
In this case, the architecture that you have is fine and is very cost-effective, but as above, there are always other ways of doing it.
Thank you! Are you available to have a discussion?
Typically, yes. But there's no way on re:Post to (securely) exchange identities and details; and there's no way of telling if we're even close to being in the same timezone. I'd recommend that you contact your local AWS office - wherever "local" is.
I'm in the United States, Eastern Standard Time. I'm also not sensitive about sharing my own personal contact info here for the time being. I'll delete this later, but should you decide to reach out my email is ep@pcom.global.
Brett nailed this. :) I honestly like the API Gateway -> Lambda -> Kinesis Data Streams -> Firehose -> OpenSearch -> Grafana (or whatever dashboard tool you like) stack. You have very little code to write with it. If you need to do some aggregation work, you can add KDA (Kinesis Data Analytics) into the mix, with the existing KDS (Kinesis Data Stream) as your source and another KDS as your sink. But like Brett said, there are so many options.
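To give a feel for how little code the Lambda step needs, here is a minimal sketch of a handler that forwards an API Gateway proxy event to a Kinesis Data Stream. The stream name `ad-events`, the `campaign_id` parameter used as the partition key, and the helper names are all assumptions for illustration, not anything from your setup. The client is passed in so the logic can be tried without AWS credentials; in Lambda you would pass `boto3.client("kinesis")`.

```python
import json
import os

# Hypothetical stream name; in Lambda this would come from configuration.
STREAM_NAME = os.environ.get("STREAM_NAME", "ad-events")


def build_record(event):
    """Turn an API Gateway proxy event into keyword arguments for put_record."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    payload = {"params": params, "body": event.get("body")}
    # Assumption: campaign_id is a reasonable partition key; missing values
    # fall back to "unknown" because ad tracking calls may omit parameters.
    partition_key = str(params.get("campaign_id", "unknown"))
    return {
        "StreamName": STREAM_NAME,
        # Newline-delimited JSON keeps records separable downstream.
        "Data": (json.dumps(payload) + "\n").encode("utf-8"),
        "PartitionKey": partition_key,
    }


def make_handler(kinesis_client):
    """Return a Lambda-style handler bound to the given Kinesis client."""
    def handler(event, context):
        kinesis_client.put_record(**build_record(event))
        return {"statusCode": 202, "body": json.dumps({"accepted": True})}
    return handler


# In a real Lambda module (requires boto3, which the Lambda runtime provides):
#   import boto3
#   handler = make_handler(boto3.client("kinesis"))
```

Everything after the stream (Firehose, OpenSearch, Grafana) is configuration rather than code, which is why this stack stays small.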
For the purposes of ad tracking (which does not always pass the full set of parameters), it seems like a semi-structured database like Redshift would be the way to go. However, I'm concerned about whether making 50,000 INSERT requests per day is... well... smart.
50,000 inserts a day isn't a lot - that's less than one per second.
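The arithmetic behind that, as a quick sketch (the function name is just for illustration). Note this is an average; real ad traffic is usually bursty, so peaks will be higher than this figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_per_second(events_per_day):
    """Average events per second if the load were spread evenly over a day."""
    return events_per_day / SECONDS_PER_DAY


rate = average_rate_per_second(50_000)
print(f"{rate:.2f} inserts per second on average")  # prints 0.58
```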
To answer the question you asked specifically: yes. If you don't have a specific need for any of the cloud-native services, you can eliminate them.
You can load data into Amazon Redshift from a range of data sources, including Amazon S3, Amazon RDS, Amazon DynamoDB, Amazon EMR, AWS Glue, AWS Data Pipeline, or any SSH-enabled host on Amazon EC2 or on-premises. Amazon Redshift attempts to load your data in parallel into each compute node to maximize the rate at which you can ingest data into your data warehouse cluster. Clients can also connect to Amazon Redshift using ODBC or JDBC and issue SQL INSERT commands to insert the data. Please note that this is slower than loading from S3 or DynamoDB, since those methods load data in parallel to each compute node, while SQL INSERT statements load via the single leader node.
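As a rough illustration of the parallel path, here is a sketch that builds a Redshift COPY statement for JSON files staged in S3. The table name, bucket, and role ARN in the usage lines are made-up placeholders, and the helper assumes its inputs come from trusted configuration (it does no escaping). You would run the resulting SQL through whatever ODBC/JDBC or Python client you already use.

```python
def build_copy_statement(table, s3_uri, iam_role_arn):
    """Build a Redshift COPY statement that loads JSON files from S3.

    Assumes the arguments are trusted configuration values, not user input,
    since they are interpolated directly into the SQL text.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS JSON 'auto';"
    )


# Hypothetical names, for illustration only.
sql = build_copy_statement(
    table="ad_events",
    s3_uri="s3://example-bucket/ad-events/2024/01/01/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
print(sql)
```

With a load like this, the individual tracking events land in S3 first (for example via Firehose) and Redshift ingests them in bulk, rather than each event becoming its own INSERT through the leader node.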