Questions tagged with Architecture Strategy
AWS Real-Time Ad Tracker Architecture
Hello. I'm attempting to build an ad-tracking application that can attribute, store, and then query and analyze website visitor information in real or near-real time. Unfortunately, I'm having difficulty designing the application architecture, as I am new to AWS overall. So far, I expect my application to look like this:

1. API Gateway to serve as a secure endpoint for websites and ad servers to send website visitor information (think UTM parameters, device resolution, internal IDs, etc.)
2. Lambda/Node.js to route and attribute session information
3. DynamoDB for its ability to handle high-volume write rates in a cost-efficient way
4. S3 to create frequent/on-demand backups of DynamoDB, which can then be analyzed by...
5. ? I'm considering passing all S3 data back for client-side processing in my dashboard.

**However:** I just found [this case study with Nasdaq](https://aws.amazon.com/solutions/case-studies/nasdaq-case-study/?pg=ln&sec=c) utilizing [Redshift and other services shown here](https://aws.amazon.com/redshift/?p=ft&c=aa&z=3). Judging from the 'Data' label featured in the first illustration of the latter link (clickstreams, transactions, etc.), it appears to be exactly what I need. So, I suppose my question is, from a cost, simplicity, and efficiency standpoint: would it just be easier to eliminate DynamoDB and S3 and instead configure my Lambda functions to send their data directly into Redshift? Any guidance would be greatly appreciated, thank you!
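For context, the API Gateway → Lambda → DynamoDB leg described in the question could be sketched roughly as below. This is a minimal illustration only; the table name `ad_hits`, the `pk`/`sk` key schema, and all field names are assumptions, not a recommended design:

```python
import time
import uuid

def build_visit_item(params):
    """Build a DynamoDB item from the query-string parameters a tracking
    endpoint might receive. Field names here are illustrative."""
    session_id = params.get("session_id") or str(uuid.uuid4())
    return {
        "pk": f"SESSION#{session_id}",          # partition key: one session per partition
        "sk": f"HIT#{int(time.time() * 1000)}", # sort key: millisecond timestamp per hit
        "utm_source": params.get("utm_source", "direct"),
        "utm_medium": params.get("utm_medium", "none"),
        "utm_campaign": params.get("utm_campaign", "none"),
        "resolution": params.get("res", "unknown"),
    }

def handler(event, context):
    """Lambda entry point behind an API Gateway proxy integration."""
    params = event.get("queryStringParameters") or {}
    item = build_visit_item(params)
    # In the real function the item would be written with the AWS SDK, e.g.:
    # boto3.resource("dynamodb").Table("ad_hits").put_item(Item=item)
    return {"statusCode": 200, "body": item["pk"]}
```

A session-per-partition layout like this keeps writes spread across partitions, which is what makes DynamoDB attractive for the high-volume ingest step regardless of whether analysis later happens in Redshift.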
Find out hidden costs when moving to AWS
Hello everyone 👋 I am opening this post to ask for some cost-related information, as I would like to estimate how much money I would be paying for the architecture of my service. The architecture is quite simple: I would like to ship some data from Google Cloud infrastructure to an AWS S3 bucket and then download it to an EC2 machine to process it. This is a picture of that diagram:

![AWS arch diagram](/media/postImages/original/IMu5VnT59oTDuiZ7T2sYOHPg)

With regard to costs, and as far as I have found, I would **only** be paying the **network egress costs** for the data transfer from Google Cloud to AWS, plus the costs of hosting the information in S3. As stated in the ["overview of data transfer costs for common architectures"](https://aws.amazon.com/blogs/architecture/overview-of-data-transfer-costs-for-common-architectures/) and the [Amazon S3 pricing guide - data transfer section](https://aws.amazon.com/s3/pricing/):

- I don't pay for data transferred from an Amazon S3 bucket to any AWS service(s) within the same AWS Region as the S3 bucket (including to a different account in the same AWS Region) if I use an Internet Gateway to access that data.
- I don't pay for data transferred in from the internet.

Am I right? Am I missing anything in this diagram? (Sorry for the networking abstraction I'm making in the diagram above. As stated in the paragraph above, I'd be accessing the S3 bucket through an Internet Gateway, with both EC2 and S3 running in the same region.) Thanks a lot in advance!
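Under the question's own assumptions (inbound transfer to AWS free, same-region S3 → EC2 reads free), the back-of-envelope math reduces to two terms. The rates below are illustrative placeholders only, not current prices; the GCP and S3 pricing pages should be checked for the actual regions and tiers involved:

```python
def estimate_monthly_cost(gb_transferred, gb_stored,
                          gcp_egress_per_gb=0.12,
                          s3_storage_per_gb_month=0.023):
    """Rough monthly cost estimate for the GCP -> S3 -> EC2 pipeline.

    Both default rates are placeholders for illustration. Inbound transfer
    to AWS and same-region S3 -> EC2 reads are treated as free, per the
    question's assumptions.
    """
    egress = gb_transferred * gcp_egress_per_gb      # paid to Google Cloud
    storage = gb_stored * s3_storage_per_gb_month    # paid to AWS (S3 Standard)
    return {"gcp_egress": egress, "s3_storage": storage,
            "total": egress + storage}
```

Note the estimate deliberately omits S3 request charges (PUT/GET per-request fees) and the EC2 instance itself, which would also appear on the bill.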
Having trouble deciding the optimal architecture
So I am working on a B2B product. The landing page of our product is, e.g., example.com, and once an organization purchases a subscription (let's say org1 purchases it), we need to provide them a custom web dashboard that can be accessed through the URL org1.example.com. Now I have some questions about this: Can we make this subdomain available only to org1's members and not make it publicly available? I am a novice, but can we use concepts like a Route 53 hosted zone with a Client VPN endpoint or something like that? If that is possible, is it also possible to attach a single SSL certificate to this domain? Anything would be helpful.
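One common pattern for this (an assumption here, not the only option) is a wildcard DNS record (`*.example.com`) plus a wildcard ACM certificate, with tenant access enforced in the application rather than at the network layer: the app reads the Host header, resolves the org, and then checks that the logged-in user belongs to that org. A sketch of the host-parsing step:

```python
def org_from_host(host, base_domain="example.com"):
    """Extract the org slug from a Host header like 'org1.example.com'.

    Returns None for the bare landing-page domain, for nested subdomains,
    or for hosts outside base_domain. The caller is then expected to
    enforce that the authenticated user is a member of the returned org.
    """
    host = host.lower().split(":")[0]          # normalize case, drop any port
    suffix = "." + base_domain
    if host == base_domain or not host.endswith(suffix):
        return None
    sub = host[: -len(suffix)]
    return sub if sub and "." not in sub else None
```

With this approach a single wildcard certificate covers every org subdomain, and "private to org1 members" becomes an authorization rule in the app rather than a VPN requirement, though a Client VPN endpoint remains an option if network-level isolation is genuinely required.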
KDA failover strategy
Currently I am trying out a KDA + Flink setup, and the first thing that gives me pause is the failover strategy. There is only one option for failover: take a savepoint -> destroy the cluster -> start a new cluster -> start the job from the savepoint, and this option results in a few minutes of downtime. I am aware that AWS uses a somewhat different strategy for the maintenance window, where both clusters run at the same time and only the job stop/start creates downtime, which is much shorter than the only available option. Does anyone know if and when this second option will be available for cases like autoscaling, user maintenance, etc.? And will it be compatible with custom autoscaling rules?
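For reference, the single available path (snapshot, stop, start-from-snapshot) can be scripted against the KinesisAnalyticsV2 API roughly like this. The function takes the client as a parameter so it can be exercised with a stub; call and parameter names follow the boto3 `kinesisanalyticsv2` client, but the application and snapshot names are placeholders:

```python
def snapshot_and_restart(client, app_name, snapshot_name):
    """Scripted snapshot -> stop -> start-from-snapshot failover.

    `client` is expected to behave like a boto3 kinesisanalyticsv2
    client (or a test stub with the same method names).
    """
    client.create_application_snapshot(ApplicationName=app_name,
                                       SnapshotName=snapshot_name)
    client.stop_application(ApplicationName=app_name)
    client.start_application(
        ApplicationName=app_name,
        RunConfiguration={
            "ApplicationRestoreConfiguration": {
                "ApplicationRestoreType": "RESTORE_FROM_CUSTOM_SNAPSHOT",
                "SnapshotName": snapshot_name,
            }
        },
    )
```

The downtime the question describes sits between the `stop_application` and the job reaching RUNNING after `start_application`; each step is also asynchronous in practice, so a real script would poll `describe_application` between calls.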
Integration Test Strategy/tool/library for AWS config rules
Hi, we are using the RDK to build/test/deploy our Config rules in a multi-account architecture. We would like to perform integration tests for the Config rules deployed in the accounts, hence I am looking for suggestions for any library or approach that can help me test my rules. (The idea is to deploy dummy resources in the test account, run each rule against those resources, then collect the test results of all the rules and report them back to the pipeline.)
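One possible shape for the described flow: deploy the dummy resources, trigger `start_config_rules_evaluation` on the boto3 `config` client, poll `get_compliance_details_by_config_rule`, and finally collapse the evaluation results into a pass/fail verdict for the pipeline. The summarizing step, which is the only pure-logic part, might look like this (a sketch; the surrounding AWS calls are indicated in the docstring only):

```python
def summarize_rule_results(evaluations):
    """Collapse AWS Config evaluation results into a pass/fail summary
    suitable for reporting back to a pipeline.

    `evaluations` mirrors the EvaluationResults list returned by e.g.
      config.get_compliance_details_by_config_rule(ConfigRuleName=rule)
    Only the ComplianceType field is inspected here. A rule "passes"
    only if it produced at least one COMPLIANT result and no
    NON_COMPLIANT ones.
    """
    counts = {}
    for ev in evaluations:
        c = ev.get("ComplianceType", "NOT_APPLICABLE")
        counts[c] = counts.get(c, 0) + 1
    passed = counts.get("NON_COMPLIANT", 0) == 0 and counts.get("COMPLIANT", 0) > 0
    return {"passed": passed, "counts": counts}
```

Requiring at least one COMPLIANT result guards against a silently-skipped rule (empty result set) being reported as a pass.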
AWS IoT Greengrass (V2) and Video Streaming
Hello, the use case I have is this: there are two types of AWS IoT Greengrass V2 core devices, connected (in the same private LAN) in a hub-and-spoke architecture. None of them are connected to client devices (Greengrass is being used because of its IPC and orchestration benefits):

1. [Spoke] An AWS IoT Greengrass V2 core device directly attached to a camera. The video stream is sent to a Hub AWS IoT Greengrass V2 core device for ML processing (inference) that must be near-real time.
2. [Hub] An AWS IoT Greengrass V2 core device that processes and fans out video streams: **A)** to an ML inference interface (**a local component of the hub**), **B)** to Kinesis Data Firehose (S3; to re-train the model), and **C)** to an AWS Kinesis Video Stream (for humans to view the video online).

I have a couple of questions:

1. Is the architecture feasible? Does it make sense?
2. What is the best (performance- and security-wise) technology (open source, AWS component, protocol) to use on the Spoke and Hub devices to send the video stream from the spokes to the hub (the video has to be high quality with minimal/no compression to keep the inference accuracy high)?
3. Can the Stream Manager component of AWS IoT Greengrass V2 send streams from the Hub in fan-out mode (**e.g., to two different destinations concurrently, AWS Kinesis Data Firehose and AWS Kinesis Video Streams**)?

Thank you, Yossi
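On question 3: Stream Manager's supported export targets include Kinesis Data Streams and S3 (whether Firehose or Kinesis Video Streams are reachable directly should be verified against the current Greengrass docs), and fan-out is expressed by attaching several export configurations to a single local stream. The sketch below uses plain dicts to mirror the shape of the `stream_manager` SDK's `MessageStreamDefinition`/`ExportDefinition`; a real component would build the corresponding SDK objects, and all names here are placeholders:

```python
def fan_out_export_definition(stream_name):
    """Shape sketch of a Stream Manager message-stream definition that
    exports one local stream to two destinations concurrently.

    Plain dicts stand in for the stream_manager SDK classes
    (MessageStreamDefinition, ExportDefinition, KinesisConfig, ...);
    identifiers and destination names are illustrative.
    """
    return {
        "name": stream_name,
        "export_definition": {
            # Several export configs on one stream => fan-out.
            "kinesis": [
                {"identifier": "ToKinesis", "kinesis_stream_name": "video-metadata"}
            ],
            "s3_task_executor": [
                {"identifier": "ToS3Retrain", "bucket": "retraining-frames"}
            ],
        },
    }
```

Note that raw high-bitrate video for spoke → hub transport is usually better carried by a streaming protocol on the LAN (e.g. RTSP/RTP), with Stream Manager handling the cloud-bound exports.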
Lambda using docker - Billed time very high for small duration
Hi, we are running a Lambda from a Docker image (ECR). The code is in Python with a lot of dependencies (image size ~700 MB). We noticed that our code execution is very short (~650 ms), but the billed time is very high in comparison (4500 ms), with an init time of ~3800 ms. So basically init time is 80% of our billed time for each execution.

My previous pipeline: every 45 min we spin up 200 instances of the Lambda with different arguments at the same time (I guess 200 cold starts then?).

My new pipeline: every 45 min I invoke the same Lambda with no argument (1 cold start), and this Lambda invokes the same Lambda 200 times, each with an argument (200 warm starts?). But with the new pipeline the problem is still the same and I don't see any improvement:

```
REPORT RequestId: f9d67348-2deb-410e-b874-b29e8b3569b2 Duration: 653.58 ms Billed Duration: 4353 ms Memory Size: 300 MB Max Memory Used: 284 MB Init Duration: 3698.46 ms
```

So here are my 3 questions:

- Is 3800 ms a normal cold start time?
- In my new pipeline, why is the init duration not improving?
- What approach do you recommend to reduce/fix the cost?

Thank you!
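One relevant detail: 200 concurrent invocations require 200 separate execution environments, so the fan-out Lambda cannot turn them into warm starts; each new environment pays its own init. Besides provisioned concurrency, a common mitigation is deferring heavy imports so module init stays cheap; a minimal sketch, where stdlib `json` stands in for whatever heavy package dominates the ~3.8 s init in the 700 MB image:

```python
import importlib

_heavy = None  # module-level cache: survives between warm invocations

def get_heavy_dependency(module_name="json"):
    """Import an expensive dependency lazily, on first call only.

    "json" is a stand-in for the real heavy ML/data package; in a real
    handler the import cost is then paid inside the first invocation's
    Duration instead of during Init Duration.
    """
    global _heavy
    if _heavy is None:
        _heavy = importlib.import_module(module_name)  # paid once per container
    return _heavy

def handler(event, context):
    heavy = get_heavy_dependency()
    return {"loaded": heavy.__name__}
```

This shifts rather than removes the import cost, so it mainly helps when many code paths don't need the heavy package at all; trimming the image and dependency set is the other lever for genuinely shrinking init time.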
Architecture for multi-region ECS application
Hi everyone, I just wanted to get feedback on my proposed solution for a multi-region dockerized ECS app. Currently we have the following resources in Region A:

```
Postgres DB (used for user accounts only)
Backend+Frontend NextJS App (Dockerized) - ECS
Backend Microservice App for conversion of files (Dockerized) - ECS
Backend 3rd party API + Datastore (also deployed in other regions) - unknown architecture
```

I now need to deploy to Regions B and C. The backend 3rd-party API is already deployed in these regions. I am thinking of deploying the following resources to those regions:

```
Backend+Frontend NextJS App (Dockerized)
Backend Microservice App for conversion of files (Dockerized)
```

Our app logs in the user (authentication + authorization) using the 3rd-party API, and after login we can see which region their data is in. So after login I can bounce them + their token to the appropriate region. I cannot use Route 53 routing reliably because the source of truth about their region is only available after login, and, for example, they may (rarely) be accessing from Region B (if they are travelling) while their datastore is in Region C (in which case I need to bounce them to Region C). I also don't need to replicate our database to other regions, because it only stores their account information for billing purposes, so the performance impact is minimal and only incurred on login/logout. Currently we have low 10s of users, so I can easily restructure and deploy a different architecture if/when we start scaling. Critique is welcome!
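The post-login bounce described above reduces to a small routing decision. A sketch, where the region → endpoint map and the hostnames are entirely hypothetical:

```python
# Hypothetical region -> regional app endpoint map; names are placeholders.
REGION_ENDPOINTS = {
    "us-east-1": "https://us.app.example.com",
    "eu-west-1": "https://eu.app.example.com",
    "ap-southeast-1": "https://ap.app.example.com",
}

def post_login_redirect(data_region, current_region):
    """After login, decide whether to bounce the user (plus token) to the
    region that holds their datastore.

    Returns the target endpoint, or None when the user is already being
    served from the region their data lives in.
    """
    if data_region == current_region:
        return None
    target = REGION_ENDPOINTS.get(data_region)
    if target is None:
        raise ValueError(f"no deployment for region {data_region}")
    return target
```

Keeping this map in configuration (rather than code) would make adding Region D a deploy-time change, which fits the stated plan of restructuring later if scale demands it.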
Does an IaaS architecture also include services such as AWS (IoT) analytics?
Hello, I have a question regarding cloud service types (IaaS, PaaS & SaaS). For an internship, the company I work at has asked me to look into a specific cloud solution. I wish to create a scalable IoT-capable cloud solution for customers. So in turn, I have done quite a bit of research on cloud/edge computing for the past few weeks. However, one thing still confuses me a little. If I used an IaaS cloud architecture, would I still be able to use services like AWS (IoT) analytics, or would this mean it isn't an IaaS architecture anymore? I looked all over, but I can't find an answer for this specific question. I hoped to find some IaaS/PaaS/SaaS cloud architecture diagrams to gain some clarity, but haven't found these. Could someone help me clarify this? Thank you very much!
Self-service Developer Platform architecture on EKS with Crossplane
Hello AWS re:Post, I am currently designing a solution that would allow my users to provision workloads on demand through a click-to-deploy interface. In order to minimize costs, I thought of an architecture as follows: 2 K8s clusters (1 on-premise and 1 on AWS).

- On-premise: Developer Platform management cluster
  - nodePool mgmt
    - Crossplane
    - Customer cluster definition
    - Catalog definition
  - nodePool tools
    - APIs
    - Prometheus
    - Grafana
    - ...
- AWS: Customer resources cluster
  - nodePool customerName (one per customer)
    - ns mgmt (created on customer registration)
    - ns tenant (one per tenant: production, development, ...)

If you don't know Crossplane, you can look at the documentation here: https://crossplane.io/docs/v1.7/, but it is a bit like AWS Controllers for Kubernetes: it provides infrastructure abstraction as Kubernetes CRDs. First, I want to get feedback on this architecture and hear any better ideas/suggestions. Secondly, I would like an opinion on how to manage clients on the AWS side, please. Thanks in advance for your answers.
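For readers unfamiliar with Crossplane, the "customer cluster definition / catalog definition" pieces above would typically be CompositeResourceDefinitions (XRDs) in the management cluster. A sketch of such a manifest, built as a Python dict ready for `yaml.safe_dump`; the API group and kind names are placeholders, not anything from the question:

```python
def cluster_claim_xrd(api_group="platform.example.org"):
    """Manifest for a Crossplane CompositeResourceDefinition exposing a
    'CustomerCluster' claim, the kind of catalog entry the management
    cluster would hold. Group and kind names are illustrative.

    Per the XRD convention, metadata.name must be '<plural>.<group>'.
    """
    return {
        "apiVersion": "apiextensions.crossplane.io/v1",
        "kind": "CompositeResourceDefinition",
        "metadata": {"name": f"xcustomerclusters.{api_group}"},
        "spec": {
            "group": api_group,
            "names": {"kind": "XCustomerCluster", "plural": "xcustomerclusters"},
            "claimNames": {"kind": "CustomerCluster", "plural": "customerclusters"},
            "versions": [{
                "name": "v1alpha1",
                "served": True,
                "referenceable": True,
                "schema": {"openAPIV3Schema": {"type": "object"}},
            }],
        },
    }
```

The `claimNames` entry is what lets a namespaced claim (one per customer namespace) drive provisioning, which maps naturally onto the per-customer nodePool/namespace layout sketched in the question.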
How to query all items of a specific type?
I have a DynamoDB table with two main types of items: groups and users. I am using groupIds and userIds as partition keys, and userIds and groupIds as sort keys. In this way I am able to map many-to-many relationships: querying a userId I can get all of the user's groups, and querying a groupId I can get all of the users in that group. I'm wondering if there is a way to architect it so I can easily query all groups without performing a Scan. My thought is to have a partition key that is just something like "groups" and a sort key of groupIds. Is this a proper approach?
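The idea in the question is a recognized single-table pattern: give every group's metadata item a constant partition key so a single Query (rather than a Scan) returns all groups; a sparse GSI is the other common option. A key-building sketch, where the `pk`/`sk` names and prefixes are illustrative:

```python
def group_keys(group_id):
    """Keys for a group's metadata item. The constant partition key
    'GROUP' lets Query(pk='GROUP') list every group without a Scan."""
    return {"pk": "GROUP", "sk": f"GROUP#{group_id}"}

def membership_keys(group_id, user_id):
    """Adjacency-list items for the many-to-many relationship, written
    in both directions so each side is queryable by its own key."""
    return [
        {"pk": f"GROUP#{group_id}", "sk": f"USER#{user_id}"},  # users in a group
        {"pk": f"USER#{user_id}", "sk": f"GROUP#{group_id}"},  # groups of a user
    ]
```

The main caveat with the constant partition key is that all group-metadata reads and writes land on one partition, which is fine for a modest number of groups but worth rethinking (e.g. via a write-sharded key or a GSI) if the group count or write rate grows large.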