Storage

AWS offers a complete range of services for you to store, access, govern, and analyze your data to reduce costs, increase agility, and accelerate innovation. Choose from object, file, and block storage services, as well as backup and data migration options, to build the foundation of your cloud IT environment.

Recent questions


Using DMS and SCT for extracting/migrating data from Cassandra to S3

IHAC who is scoping an architecture using DMS and SCT. I had a few questions I was hoping you could get answered for me.

1. Does AWS DMS support data validation with Cassandra as a source? I don't see it here - https://docs.aws.amazon.com/dms/latest/userguide/CHAP_BestPractices.html#CHAP_BestPractices.DataValidation - but I do see Cassandra as a valid source here: https://aws.amazon.com/about-aws/whats-new/2018/09/aws-dms-aws-sct-now-support-the-migration-of-apache-cassandra-databases/
2. Does AWS DMS support ongoing replication with Cassandra as a source? Reading the docs, it looks like if I wanted to extract data from Cassandra and write it to S3 (using DMS), then post-process that data into a different format (like JSON) and write it to a different S3 bucket, I could do so by attaching a Lambda to the original S3 event from the DMS extract and drop. Can you confirm my understanding?
3. How is incremental data loaded on an ongoing basis after the initial load from Cassandra (with DMS)? In the docs it looks like it's stored in S3 in CSV form. Does DMS write one CSV per source table and keep appending to or updating the existing CSV? Does it create one CSV per row, per batch, etc.? I'm wondering how the S3 event described in question 2 would be triggered if I wanted to continuously post-process updates as they come in, in real time, and convert the source data from Cassandra into JSON stored on S3.
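For reference, this is roughly the post-processing Lambda I have in mind for question 2: a minimal sketch, assuming the DMS drops are CSV files with a header row. The bucket name and key layout are placeholders, not DMS guarantees.

```python
import csv
import io
import json
from urllib.parse import unquote_plus

import boto3

s3 = boto3.client("s3")
OUTPUT_BUCKET = "my-json-output-bucket"  # placeholder, not a real bucket


def handler(event, context):
    """Triggered by s3:ObjectCreated events on the DMS target bucket."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # S3 event keys are URL-encoded (spaces arrive as '+')
        key = unquote_plus(record["s3"]["object"]["key"])
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        # Assumes each drop is a CSV with a header row; adjust if the
        # task settings produce headerless CSVs.
        rows = list(csv.DictReader(io.StringIO(body)))
        out_key = key.rsplit(".", 1)[0] + ".json"
        s3.put_object(
            Bucket=OUTPUT_BUCKET,
            Key=out_key,
            Body=json.dumps(rows).encode("utf-8"),
            ContentType="application/json",
        )
```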
0 answers · 0 votes · 2 views · AWS-User-7019446 · asked 3 days ago

_temp AWS Lake Formation blueprint pipeline tables appear to an IAM user in the Athena editor although I didn't give this user permission on them

_temp Lake Formation blueprint pipeline tables appear to an IAM user in the Athena editor, although I didn't give this user permission on them. Below is the policy granted to this IAM user; in the Lake Formation permissions I also didn't give this user any permissions on the _temp tables:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "Stmt1652364721496",
          "Action": [
            "athena:BatchGetNamedQuery",
            "athena:BatchGetQueryExecution",
            "athena:GetDataCatalog",
            "athena:GetDatabase",
            "athena:GetNamedQuery",
            "athena:GetPreparedStatement",
            "athena:GetQueryExecution",
            "athena:GetQueryResults",
            "athena:GetQueryResultsStream",
            "athena:GetTableMetadata",
            "athena:GetWorkGroup",
            "athena:ListDataCatalogs",
            "athena:ListDatabases",
            "athena:ListEngineVersions",
            "athena:ListNamedQueries",
            "athena:ListPreparedStatements",
            "athena:ListQueryExecutions",
            "athena:ListTableMetadata",
            "athena:ListTagsForResource",
            "athena:ListWorkGroups",
            "athena:StartQueryExecution",
            "athena:StopQueryExecution"
          ],
          "Effect": "Allow",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": [
            "glue:GetDatabase",
            "glue:GetDatabases",
            "glue:BatchDeleteTable",
            "glue:GetTable",
            "glue:GetTables",
            "glue:GetPartition",
            "glue:GetPartitions",
            "glue:BatchGetPartition"
          ],
          "Resource": ["*"]
        },
        {
          "Sid": "Stmt1652365282568",
          "Action": "s3:*",
          "Effect": "Allow",
          "Resource": [
            "arn:aws:s3:::queryresults-all",
            "arn:aws:s3:::queryresults-all/*"
          ]
        },
        {
          "Effect": "Allow",
          "Action": ["lakeformation:GetDataAccess"],
          "Resource": ["*"]
        }
      ]
    }
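One possible explanation is the Glue statement's `"Resource": "*"`, which lets the user enumerate every database and table in the catalog (including the blueprint's _temp tables) regardless of Lake Formation grants. A hedged sketch of scoping that statement to a single database; every name below is a placeholder:

```python
import json

import boto3

# All names below are placeholders for illustration.
REGION = "us-east-1"
ACCOUNT_ID = "111122223333"
DATABASE = "my_database"

scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabase",
                "glue:GetDatabases",
                "glue:GetTable",
                "glue:GetTables",
                "glue:GetPartition",
                "glue:GetPartitions",
                "glue:BatchGetPartition",
            ],
            # Scope to the catalog, one database, and its tables instead
            # of "*", so unrelated databases/tables are not enumerable.
            "Resource": [
                f"arn:aws:glue:{REGION}:{ACCOUNT_ID}:catalog",
                f"arn:aws:glue:{REGION}:{ACCOUNT_ID}:database/{DATABASE}",
                f"arn:aws:glue:{REGION}:{ACCOUNT_ID}:table/{DATABASE}/*",
            ],
        }
    ],
}

iam = boto3.client("iam")
iam.put_user_policy(
    UserName="analytics-user",       # placeholder user
    PolicyName="scoped-glue-access",
    PolicyDocument=json.dumps(scoped_policy),
)
```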
1 answer · 0 votes · 2 views · AWS-User-6747057 · asked 3 days ago

`RequestTimeout`s for S3 put requests from a Lambda in a VPC for larger payloads

# Update #

I increased the timeout of the Lambda to 5 minutes, and the timeout of the PUT request to the S3 bucket to 5 minutes as well. Before this the request itself would time out, but now I'm actually getting a response back from S3: a 400 Bad Request response. The error code is `RequestTimeout`, and the message in the payload of the response is "Your socket connection to the server was not read from or written to within the timeout period." This exact same code works 100% of the time for a small payload (on the order of 1KB), but apparently for payloads on the order of 1MB it starts breaking. There is no logic in _my code_ that does anything differently based on the size of the payload. I've read similar issues that suggest the problem is the wrong number of bytes being provided in the "content-length" header, but I've never provided a value for that header. Furthermore, the Lambda works flawlessly when executed in my local environment. The problem definitely appears to be a networking one. At first glance it might seem like this is just an issue with the Lambda being able to reach services outside of the VPC, but that's not the case, because the Lambda _does_ work exactly as expected for smaller file sizes (<1KB). So it's not that it flat out can't communicate with S3. Scratching my head here...

# Original #

I use S3 to host images for an application. In my local testing environment the images upload at an acceptable speed. However, when I run the exact same code from an AWS Lambda (in my VPC), the speeds are untenably slow. I've concluded this because I've tested with smaller images (<1KB) and they work 100% of the time without any changes to the code. Then I use 1MB payloads and they fail 98% of the time. I know the request to S3 is the issue because logs made from within the Lambda indicate that execution reaches the upload request but almost never successfully passes it (it times out).
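If it's useful, this is roughly how I'd configure the client to rule out SDK-side timeout and Content-Length issues: a minimal sketch, assuming boto3, with the helper and all names hypothetical. The idea is to set explicit socket timeouts and hand `put_object` a fully materialized `bytes` body, so the SDK computes Content-Length itself and the socket is never left waiting on a slow stream.

```python
import base64

import boto3
from botocore.config import Config

# Generous socket timeouts plus standard retries; values are illustrative.
s3 = boto3.client(
    "s3",
    config=Config(
        connect_timeout=10,
        read_timeout=300,
        retries={"max_attempts": 3, "mode": "standard"},
    ),
)


def load_image_bytes(event):
    # Hypothetical stand-in for however the image reaches the Lambda,
    # e.g. decoding a base64 field from the event payload.
    return base64.b64decode(event["image_base64"])


def handler(event, context):
    payload = load_image_bytes(event)  # fully materialized bytes, not a stream
    s3.put_object(
        Bucket="my-image-bucket",  # placeholder bucket
        Key="images/upload.jpg",   # placeholder key
        Body=payload,
    )
```

For a Lambda in a VPC, it's also worth confirming there is a route to S3 at all times, e.g. via an S3 gateway VPC endpoint or a NAT gateway; without one, large transfers in particular tend to stall.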
0 answers · 0 votes · 3 views · AWS-User-8949906 · asked 4 days ago

Help with copying an S3 bucket to another location: missing objects

Hello All, Today I was trying to copy a directory from one location to another, and was using the following command to execute my copy:

    aws s3 cp s3://bucketname/directory/ s3://bucketname/directory/subdirectory --recursive

The copy took overnight to complete because it was 16.4 TB in size, but when I got into work the next day it had completed. But when I do a compare between the two locations, I get the following:

* bucketname/directory/ - 103,690 objects - 16.4 TB
* bucketname/directory/subdirectory/ - 103,650 objects - 16.4 TB

So there is a 40-object difference between the source location and the destination location. I tried using the following command to copy over the files that were missing:

    aws s3 sync s3://bucketname/directory/ s3://bucket/directory/subdirectory/

which returned no results. It sat for maybe two minutes or so, and then just returned to the next line. I am at my wits' end trying to copy the missing objects, and my boss thinks that I lost the data, so I need to figure out a way to get the difference between the source and destination copied over. If anyone could help me with this, I would REALLY appreciate it. I am a newbie with AWS, so I may not understand everything that I am told, but I will try anything to get this resolved. I am doing all the commands through an EC2 instance that I SSH into, and then use AWS CLI commands. Thanks to anyone who might be able to help me. Take care, -Tired & Frustrated :)
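In case it helps, here is a minimal boto3 sketch of a cross-check: list both prefixes, diff the relative keys, and copy whatever is missing. The bucket and prefixes are the examples from the post, and since the destination sits inside the source, the source listing has to exclude it.

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "bucketname"                # poster's example bucket
SRC = "directory/"
DST = "directory/subdirectory/"


def relative_keys(prefix, exclude=None):
    """Yield keys under `prefix`, relative to it."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            # The destination sits *inside* the source prefix, so skip it
            # when listing the source.
            if exclude and key.startswith(exclude):
                continue
            rel = key[len(prefix):]
            if rel:
                yield rel


missing = set(relative_keys(SRC, exclude=DST)) - set(relative_keys(DST))
print(f"{len(missing)} objects missing from the destination")
for rel in sorted(missing):
    # copy_object handles objects up to 5 GB; larger ones need a
    # multipart copy (e.g. boto3's managed s3.copy).
    s3.copy_object(
        Bucket=BUCKET,
        Key=DST + rel,
        CopySource={"Bucket": BUCKET, "Key": SRC + rel},
    )
```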
1 answer · 0 votes · 2 views · AWS-User-1139934 · asked 5 days ago

Non-guessable CloudFront URL

I'm wondering if there's a way to make the S3 path unguessable. Let's suppose I have an S3 path like this: https://s3-bucket.com/{singer_id}/album/song/song.mp3. This file will be served through CloudFront, so the path will be https://cloudfront-dist-id.com/{singer_id}/album/song/song.mp3?signature=... (I'm using signed URLs). My question is: is it possible to make /{singer_id}/album/song/song.mp3 unguessable by hashing it, for example with a Lambda or Lambda@Edge function, so the client sees a URL like https://cloudfront-dist-id.com/some_hash?signature=...? Thanks in advance. https://stackoverflow.com/questions/70885356/non-guessable-cloudfront-url

I am also facing this issue. The question may arise: why is a hash needed, since signed URLs are secure? For my part, I need such a URL with the S3 path hidden. I am using the same AWS bucket both for retrieving images for internal use without signed URLs, and for sharing files with others using signed URLs.

Internal-use CDN without a signed URL, after a CNAME: https://data.example.com/{singer_id}/album/song/song.mp3
Signed URL: https://secured-data.example.com/{singer_id}/album/song/song.mp3?signature=...&Expires=...

Since both use the same AWS bucket, if someone guesses the path from the signed URL, they can access the content at https://data.example.com/{singer_id}/album/song/song.mp3 and the file opens. In this scenario, I want to map {singer_id}/album/song/song.mp3 to some new value so the file is served under a new name.
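One way to get an opaque path is a Lambda@Edge function on the origin request that maps a random id back to the real S3 key before CloudFront fetches it. A minimal sketch, assuming a Python Lambda@Edge handler and an in-code lookup table; in practice the mapping might live in DynamoDB or be derived from an encrypted token:

```python
# Hypothetical mapping from opaque ids to real keys; the values here
# are placeholders for illustration only.
OPAQUE_TO_KEY = {
    "3f7a9c51e2": "/singer-123/album/song/song.mp3",
}


def handler(event, context):
    """Lambda@Edge origin-request handler: rewrite opaque URI to real key."""
    request = event["Records"][0]["cf"]["request"]
    opaque_id = request["uri"].lstrip("/")
    real_key = OPAQUE_TO_KEY.get(opaque_id)
    if real_key is None:
        # Short-circuit with a 404 instead of forwarding to the origin.
        return {"status": "404", "statusDescription": "Not Found"}
    request["uri"] = real_key
    return request
```

The viewer then only ever sees https://cloudfront-dist-id.com/3f7a9c51e2?signature=..., and the real key never appears in any shareable URL.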
1 answer · 0 votes · 5 views · Techxonia · asked 5 days ago
