Best way to run a web scraper from my S3 website and save the returned information to a database
I currently have a website hosted on S3 with a working RDS database connection. I would like to run a dynamic web scraper from this site that activates based on user input. Ideally, the user will click a button which will activate a search for a specific product across a specified list of retail sites. This data will be displayed on the site and also saved to the database. It doesn't matter if the data goes directly to the site or to the database. I would like to use an existing web scraping tool such as Scrapy. What is the best way to handle this?
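Since S3 static hosting cannot run server-side code, a common pattern for this is an API Gateway endpoint backed by a Lambda function: the button calls the endpoint, and the Lambda runs the scrape and writes the results to RDS before returning them to the page. A minimal sketch of such a handler (the `scrape_product` helper, the site list, and the event shape are illustrative assumptions, not from the question; in practice the scraping would be done with Scrapy or similar):

```python
import json

# Hypothetical placeholder for the real scraping logic; in practice this
# would run a Scrapy spider (or similar) against the retail sites.
def scrape_product(product, sites):
    return [{"site": s, "product": product, "price": None} for s in sites]

def lambda_handler(event, context):
    # API Gateway proxy integration: the button click POSTs the product name.
    body = json.loads(event.get("body") or "{}")
    product = body.get("product")
    if not product:
        return {"statusCode": 400, "body": json.dumps({"error": "missing product"})}
    results = scrape_product(product, ["site-a.example", "site-b.example"])
    # The results would also be INSERTed into the RDS database here
    # (e.g. with a MySQL client library) before being returned to the page.
    return {"statusCode": 200, "body": json.dumps(results)}
```

One caveat with this sketch: long-running scrapes may exceed Lambda's timeout, in which case queueing the job (e.g. SQS) and polling for results is a safer design.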
How to keep an instance running after disconnecting SSH
So currently I have a Lightsail instance, with a domain connected to it. I SSH into the instance and do `./run.sh` which is a script I have to start a webserver. Great, the webserver now works and the domain is also working. But as soon as I disconnect the SSH connection, the website is no longer up. If I go back into it again, I can see that the processes are not running anymore, but my instance has been active this entire time. How do I keep it running when I'm not connected to it?
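For context, the usual cause is that processes started in an SSH session receive SIGHUP when the session ends. One common fix is to start the server detached with `nohup` (a sketch; `run.sh` is the script from the question, and the log/pid file names are arbitrary):

```shell
# Start the server immune to the SIGHUP sent when the SSH session closes,
# redirecting output to a log file and remembering the PID for later.
nohup ./run.sh > webserver.log 2>&1 &
echo $! > webserver.pid
disown            # also remove the job from this shell's job table
```

For a long-lived site, a `tmux`/`screen` session or a systemd service is more robust, since a service can also restart the server after crashes and reboots.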
Host S3 website with on-premises DNS server without Route53
I have an S3 bucket set up for "Static Website Hosting", a CloudFront distribution with our own SSL certificate and allowing Host header, a Global Accelerator IP address attached to an Application Load Balancer. The Application Load Balancer does a 301 redirect to the S3 bucket. I know that you can set an EC2 instance as a target group and forward to that to preserve the original URL. Can you also preserve the original URL using this method of 301 redirect to a Cloudfront distribution tied to an S3 bucket?
Secure HTTP API Gateway with WAF
Hi team, I want to secure my HTTP API Gateway (not REST API Gateway) with WAF. Since HTTP API Gateway doesn't support WAF directly, I read this article for a workaround: https://wellarchitectedlabs.com/security/300_labs/300_multilayered_api_security_with_cognito_and_waf/3_prevent_requests_from_accessing_api_directly/ However, in my case the origin of my distribution is an S3 bucket (static website hosting is not enabled) that hosts an Angular app: **users** => **CloudFront** (with the Angular app in an S3 bucket as origin; the bucket is not configured for static website hosting because content is served through CloudFront, not directly from S3) => **API GW** => **NLB** => **Fargate cluster**. Is there a way to use the method explained in the article above when my origin is an S3 bucket with no static website hosting enabled? My objective is to protect my HTTP API Gateway with WAF. The solution in the article, with a custom header and Secrets Manager, is exactly what I need, but it isn't applicable in my case because my CloudFront distribution has S3 as its origin.
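Worth noting: the article's technique (CloudFront attaches a secret custom header, and the backend rejects requests that lack it) doesn't actually depend on what the S3 origin serves; the header can be configured on the CloudFront origin/behavior that forwards API paths to the HTTP API Gateway, and verified in the Fargate service. A hedged sketch of the verification side only (the header name is an assumption; in practice the secret would come from Secrets Manager and be rotated):

```python
import hmac

# Assumed header name, configured as an origin custom header on the
# CloudFront behavior that forwards /api/* to the HTTP API Gateway.
ORIGIN_VERIFY_HEADER = "x-origin-verify"

def is_from_cloudfront(headers, expected_secret):
    """Reject requests that did not arrive through the CloudFront distribution."""
    supplied = headers.get(ORIGIN_VERIFY_HEADER, "")
    # Constant-time comparison so the secret can't be recovered via timing.
    return hmac.compare_digest(supplied, expected_secret)
```

Since HTTP API Gateway can't have WAF attached, the WAF web ACL would sit on the CloudFront distribution instead, with this header check blocking callers that bypass CloudFront and hit the API directly.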
Best way to publicly host a website of HTML/JS files and connect them to an RDS database?
I already have a working public domain and an S3 bucket for the HTML files. I also have an RDS instance running a MySQL database. I would like to be able to run queries against the RDS database and dynamically display, sort, add, remove, and update records based on site user inputs. For example, I have a basic search function, and I would like that search to run a query that selects the relevant records from the database and displays them on the site. Is there a way I can accomplish this within S3, or is there another AWS service I should use?
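S3 alone can't do this, since static hosting only serves files; the typical pattern is for the browser's JavaScript to call an API Gateway + Lambda endpoint that queries RDS. As an illustration of the search function's database side, a sketch that builds a parameterized MySQL query (the `products` table and its columns are assumptions for illustration; the returned SQL and parameters would be passed to e.g. a `cursor.execute(sql, params)` call):

```python
def build_search_query(term):
    # Parameterized query: the user's input is never interpolated into the
    # SQL string itself, which prevents SQL injection.
    sql = "SELECT id, name, price FROM products WHERE name LIKE %s ORDER BY name"
    params = ("%" + term + "%",)
    return sql, params
```

The Lambda would run this query and return the rows as JSON for the page to render.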
Better metric math for ALB 5XX error count in CloudWatch
We are trying to work out the right metric math for an alarm threshold based on request count: for example, if we receive 100 requests and 10 percent of them return HTTP 5XX error codes, a CloudWatch alarm should be triggered.
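One way to express this is a metric-math alarm: one query for `RequestCount`, one for the 5XX count, and an expression computing the percentage. A sketch of the `put_metric_alarm` parameters (the load balancer dimension value is a placeholder; depending on where the errors originate you may want `HTTPCode_Target_5XX_Count` instead of, or in addition to, the ELB metric):

```python
# Keyword arguments for cloudwatch.put_metric_alarm(**alarm): alarm when the
# percentage of requests returning 5XX in a 5-minute period reaches 10%.
ALB_DIMENSION = [{"Name": "LoadBalancer", "Value": "app/my-alb/0123456789abcdef"}]

alarm = {
    "AlarmName": "alb-5xx-error-rate",
    "EvaluationPeriods": 1,
    "Threshold": 10.0,  # percent of requests that are 5XX
    "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    "TreatMissingData": "notBreaching",
    "Metrics": [
        {
            "Id": "e1",
            # Guard against divide-by-zero when there is no traffic.
            "Expression": "IF(requests > 0, errors / requests * 100, 0)",
            "Label": "5XX error rate (%)",
            "ReturnData": True,
        },
        {
            "Id": "requests",
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/ApplicationELB",
                    "MetricName": "RequestCount",
                    "Dimensions": ALB_DIMENSION,
                },
                "Period": 300,
                "Stat": "Sum",
            },
            "ReturnData": False,
        },
        {
            "Id": "errors",
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/ApplicationELB",
                    "MetricName": "HTTPCode_ELB_5XX_Count",
                    "Dimensions": ALB_DIMENSION,
                },
                "Period": 300,
                "Stat": "Sum",
            },
            "ReturnData": False,
        },
    ],
}
```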
Wordpress on Lightsail goes down daily
Hi, I am new to AWS services. I bought a domain (on AWS) and created a WordPress instance on Lightsail (0.5 GB bundle), and it was working fine. On the second day it became unreachable ("Unhealthy" in the health checks). So I deleted the WordPress instance and the DNS, repeated all the steps from the beginning, and the WordPress website was up and working again. Today, about 4 hours ago, my WordPress site became unreachable again ("Unhealthy" in the health checks). When I try `curl -v` I get:

```
* Trying 3.xx.xxx.x...
* TCP_NODELAY set
* Connected to xxxxxx.org (3.xx.xxx.x) port 80 (#0)
> GET / HTTP/1.1
> Host: xxxxxx.org
> User-Agent: curl/7.64.1
> Accept: */*
```

and it gets stuck there. The static IP and the domain name are correct. Could this be a Lightsail failure? What should I check? If useful, I can share the IP and domain name in clear. Since it is a website for helping the Ukrainian people, I would really like to get it working as soon as possible. Thanks, Domenico
Performing a Latency Test
Hi there, I am new to cloud computing and I have been asked whether I could run a latency test on our system. Basically, our hardware communicates over the internet with our application, which is hosted on AWS using Fargate with Amazon ECS to run containers (as detailed below). My question is: is it possible to run a latency test when our application is hosted on the cloud and structured in this way? UI hosting: AWS Fargate with Amazon ECS to run containers, NLB, AWS Ireland region. Backend: Amazon ECS with EC2 to run containers, Amazon ECS with AWS Fargate, AWS Ireland region.
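Yes: from the hardware's point of view the Fargate/NLB stack is just a network endpoint, so latency can be measured entirely client-side by timing round trips and reporting percentiles. A generic sketch of such a harness (the `probe` callable is a stand-in; in practice it would be an HTTPS request or TCP connect to the NLB's hostname):

```python
import statistics
import time

def measure_latency(probe, trials=50):
    """Time `probe()` over several trials; return latency percentiles in ms."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        probe()  # e.g. one request to the NLB endpoint
        samples.append((time.perf_counter() - start) * 1000.0)
    qs = statistics.quantiles(samples, n=100)  # 99 percentile cut points
    return {"p50": qs[49], "p95": qs[94], "p99": qs[98]}
```

Reporting p95/p99 rather than the average matters for latency, since tail latency is what users and hardware timeouts actually feel.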
How do I restrict access to an S3 bucket behind a transit gateway?
Suppose I have 2 VPCs, VPC-A and VPC-B. My workspace (where traffic originates from) is in VPC-A, and I have my application in VPC-B. These VPCs are attached by a transit gateway and live in two accounts, Account-A and Account-B respectively.

Suppose I want to make a private S3 bucket in Account-B. This S3 bucket houses a static site which we only want to make visible to the interconnected network, VPC-A and VPC-B. How would I go about doing this without modifying any resources in Account-A?

Note: it seems you cannot just use an S3 gateway VPC endpoint. As per the docs:

> Endpoint connections cannot be extended out of a VPC. Resources on the other side of a VPN connection, VPC peering connection, transit gateway, AWS Direct Connect connection, or ClassicLink connection in your VPC cannot use the endpoint to communicate with resources in the endpoint service.

(This is because a typical gateway endpoint works by modifying a subnet's route table so that traffic destined for the public S3 IP ranges is redirected through the endpoint. If modifications can only be made in Account-B, adding those routes in VPC-A's route tables would be impossible.)
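Unlike the gateway endpoint, an S3 *interface* endpoint (PrivateLink) in VPC-B is reachable across a transit gateway, so traffic from VPC-A can reach it without touching Account-A's route tables; the bucket policy can then restrict access to that endpoint. A sketch of such a policy (the bucket name and endpoint ID are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowOnlyViaInterfaceEndpoint",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-static-site-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "aws:SourceVpce": "vpce-0123456789abcdef0"
        }
      }
    }
  ]
}
```

Clients in VPC-A would then use the endpoint-specific DNS name (or a private hosted zone resolving the bucket hostname to the endpoint's IPs) to reach the site.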
Files for Signed cookie-based authentication with Amazon CloudFront and AWS Lambda@Edge + Email Logging
I'm trying to follow the demo in the blog posts:

1. [Signed cookie-based authentication with Amazon CloudFront and AWS Lambda@Edge: Part 1 – Authentication](https://aws.amazon.com/blogs/networking-and-content-delivery/signed-cookie-based-authentication-with-amazon-cloudfront-and-aws-lambdaedge-part-1-authentication/)
2. [Signed cookie-based authentication with Amazon CloudFront and AWS Lambda@Edge: Part 2 – Authorization](https://aws.amazon.com/blogs/networking-and-content-delivery/signed-cookie-based-authentication-with-amazon-cloudfront-and-aws-lambdaedge-part-2-authorization/)

However, there aren't any files provided for `login.html`, `restricted-content.html`, and `assets/*`. My questions are:

1. How could I find these .html files, or how could I create them on my own?
2. How could I modify the Lambda@Edge function if I wanted to log the email addresses to a file in an S3 bucket?
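For question 2, one option is for the Lambda@Edge function to write a small object per successful sign-in to an S3 bucket after validating the token (note Lambda@Edge has no environment variables, so the bucket name must be hard-coded or fetched at runtime, and the function's role needs `s3:PutObject`; the write also adds latency to the request path). A sketch with the S3 client injected so it can be exercised without AWS (the bucket and key layout are assumptions, not from the blog posts):

```python
import datetime
import json

def log_email(s3_client, bucket, email):
    """Write one JSON object per successful authentication; returns the key."""
    now = datetime.datetime.now(datetime.timezone.utc)
    # Date-partitioned keys keep the log browsable and Athena-friendly.
    key = f"auth-logs/{now:%Y/%m/%d}/{now:%H%M%S}-{email}.json"
    body = json.dumps({"email": email, "timestamp": now.isoformat()})
    s3_client.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    return key
```

In the actual function, `s3_client` would be a `boto3.client("s3")` created outside the handler, and `email` would come from the decoded token claims.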
CloudFront breaks session stickiness
We have a fairly simple web app infrastructure: two backend EC2 instances in a target group, an internet-facing ALB, and CloudFront. Session stickiness is enabled on the target group. When we have only one default behavior, with the Managed-CachingDisabled and Managed-AllViewer policies, session stickiness works fine; obviously this doesn't cache anything. When I add a new behavior to cache resources in the media folder, session stickiness no longer works. That behavior's cache policy includes one query string parameter in the cache key and is otherwise like Managed-CachingOptimized. Both CloudFront and the ALB are on HTTPS, with the same SSL certificate installed on both. I don't really understand why stickiness stops working even though we see the AWSALB cookie on cached items too. Please let me know if you need any other details.