
Questions tagged with Compute


Data transfer speeds from S3 bucket -> EC2 SLURM cluster are slower than S3 bucket -> Google SLURM cluster

Hello, I am currently benchmarking big-data multi-cloud transfer speeds at a range of parallel read counts, using a cluster of EC2 instances and comparable Google Cloud machines. I first detected an issue when using `c5n.2xlarge` EC2 instances for my worker nodes, reading a 7 GB dataset in multiple formats from an S3 bucket. I have verified that the bucket is in the same cloud region as the EC2 nodes, yet the data transfer ran far slower to the EC2 instances than it did to GCP. The data does not go into EBS; it is read into memory, and the data chunks are freed when the process completes. Here is a list of things I have tried to diagnose the problem:

1. Upgrading to a bigger instance type. I am aware that each instance type has a network bandwidth limit, and I saw a read-speed increase when I changed to a `c5n.9xlarge` (from your documentation, there should be 50 Gbps of bandwidth), but it was still slower than reading from S3 to a Google VM that is much farther away on the network. I also upgraded the instance type again, but there was little to no speed increase. Note that hyperthreading is turned off on each EC2 instance.
2. Changing the S3 parameter `max_concurrent_requests` to `100`. I am using Python to benchmark these speeds, so this parameter was passed into a `storage_options` dictionary that is used in the various remote data access APIs (see the [Dask documentation](https://docs.dask.org/en/stable/how-to/connect-to-remote-data.html#:~:text=%22config_kwargs%22%3A%20%7B%22s3%22%3A%20%7B%22addressing_style%22%3A%20%22virtual%22%7D%7D%2C) for more info). Editing this parameter had no effect on the transfer speeds.
3. Verifying that enhanced networking is active on all worker nodes and the controller node.
4. Performing the data transfer directly from a worker node's command line on both the AWS and GCP machines. This was done to rule out my testing code being at fault, and the results were the same: S3 -> EC2 was slower than S3 -> GCP.
5. Varying how many cores of each EC2 instance were used in each SLURM job. Each Google worker node has 4 cores and 16 GB of memory, so each job I submit there takes up an entire node. However, after upgrading my EC2 worker node instances, there are clearly more than 4 cores per node. To maintain a fair comparison, I configured each SLURM job to use only 8 cores per node in my EC2 cluster (I am performing 40 parallel reads at maximum, so if my understanding is correct, each node will have 8 separate data-stream connections, with 5 nodes active at a time on `c5n.9xlarge` instances). I also tried allocating all of a node's resources to my 40 parallel reads (two instances with all 18 cores active, plus a third worker instance with only 4 cores active), but there was no effect.

I'm fairly confident there is a solution to this, but I am having an extremely difficult time figuring out what it is. I know that setting an endpoint shouldn't be the problem, because GCP is faster than EC2 even though egress is occurring there. Any help would be appreciated, because I want to make sure I get an accurate picture of S3 -> EC2 before presenting my work. Please let me know if more information is needed!
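For reference, a minimal sketch of the kind of harness described above: it times N concurrent in-memory fetches so throughput can be compared across clouds at a given parallelism level. The `fetch` callable, key names, and the `max_pool_connections` setting shown in the comment are illustrative assumptions, not the asker's actual code; in a real run `fetch` would be backed by `s3fs` as the Dask docs describe.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def benchmark_parallel_reads(fetch, keys, workers):
    """Time `workers` concurrent calls to `fetch`, one per key.

    Returns (elapsed_seconds, total_bytes) so throughput at a given
    level of parallelism can be compared across clouds.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sizes = list(pool.map(lambda k: len(fetch(k)), keys))
    return time.perf_counter() - start, sum(sizes)


# In the real benchmark, `fetch` would read from S3 entirely in memory,
# e.g. (hypothetical configuration, bucket/key names are placeholders):
#   import s3fs
#   fs = s3fs.S3FileSystem(config_kwargs={"max_pool_connections": 100})
#   fetch = lambda key: fs.cat(key)   # returns bytes; never touches EBS

if __name__ == "__main__":
    # Dry run with a dummy 1 KiB payload instead of a live S3 call.
    dummy = lambda k: b"x" * 1024
    elapsed, nbytes = benchmark_parallel_reads(
        dummy, [f"key-{i}" for i in range(40)], workers=40
    )
    print(nbytes)  # 40 reads * 1024 bytes = 40960
```

One point worth checking with a harness like this: `max_concurrent_requests` is an AWS CLI `s3` setting, so when going through `s3fs`/botocore it may simply be ignored, which would explain why changing it had no effect.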
1 answer · 0 votes · 38 views · asked 2 months ago

yum update python not working in AL2 EC2 instance (from Elastic Beanstalk)

The security advisory here https://alas.aws.amazon.com/AL2/ALAS-2022-1802.html indicates that the AL2 Python package has been patched and an update is available (in python-2.7.18-1.amzn2.0.5.aarch64). The advisory directs:
```
Issue Correction:
Run yum update python to update your system.
```
However, executing `yum update python` does not update the package; no update to the package is found. Why is the package update not applied?
```
[ec2-user@ip-redacted ~]$ yum info python
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
207 packages excluded due to repository priority protections
Installed Packages
Name        : python
Arch        : aarch64
Version     : 2.7.18
Release     : 1.amzn2.0.4
Size        : 139 k
Repo        : installed
Summary     : An interpreted, interactive, object-oriented programming language
URL         : http://www.python.org/
License     : Python
Description : Python is an interpreted, interactive, object-oriented programming
            : language often compared to Tcl, Perl, Scheme or Java. Python includes
            : modules, classes, exceptions, very high level dynamic data types and
            : dynamic typing. Python supports interfaces to many system calls and
            : libraries, as well as to various windowing systems (X11, Motif, Tk,
            : Mac and MFC).
            :
            : Programmers can write new built-in modules for Python in C or C++.
            : Python can be used as an extension language for applications that need
            : a programmable interface.
            :
            : Note that documentation for Python is provided in the python-docs
            : package.
            :
            : This package provides the "python" executable; most of the actual
            : implementation is within the "python-libs" package.

[ec2-user@ip-redacted ~]$ sudo yum update python
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
amzn2-core                                    | 3.7 kB  00:00:00
207 packages excluded due to repository priority protections
No packages marked for update
[ec2-user@ip-redacted ~]$
```
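The output above shows `207 packages excluded due to repository priority protections`, which suggests the yum priorities plugin may be hiding the patched build. A diagnostic sketch along those lines (an assumption about the cause, not a confirmed fix; run on the AL2 instance itself):

```shell
# Refresh cached repo metadata so yum sees newly published packages
sudo yum clean metadata

# List every python build the repos offer, with the priorities plugin
# disabled, to see whether 2.7.18-1.amzn2.0.5 is being excluded
yum --disableplugin=priorities --showduplicates list available python

# If the patched release appears above, update with the plugin
# disabled for this one transaction
sudo yum --disableplugin=priorities update python
```

If the patched build still does not appear even with the plugin disabled, the package may not yet be published to the mirror the instance resolves, in which case re-checking after the repo metadata propagates would be the next step.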
0 answers · 0 votes · 27 views · asked 2 months ago

Answering Questions on Compute - Ask the Experts

Welcome to the follow-up discussion for the "Answering Your re:Post Questions on Compute" Twitch event. If you missed the live [Twitch event](https://www.twitch.tv/aws) on August 15, you can access the [recording](https://www.youtube.com/watch?v=lzadlmq4LcM0). If you have a question you'd like to see discussed during the live event, post it using the **comment button** below. After the live event, you can continue asking questions until August 22.

Join experts Rob Higareda, Sukhmeet Mata, and Mike Wells, who will answer in detail the questions asked in this post and other questions posted on re:Post regarding Compute.

**Our AWS experts**

**Rob Higareda** is a Principal Technical Account Manager who works with Enterprise customers to lead them on their AWS journey. Rob is an advocate for AWS security services and enjoys helping customers adopt AWS services via our catalog of workshops.

**Sukhmeet Mata** has helped his customers develop highly scalable ML solutions in areas such as computer vision, genomics/gene processing, and market forecasting. As a TAM, Sukhmeet puts his technical and coding skills to use by helping customers build, manage, and deploy solutions on the cloud, and by supporting their modernization and digitalization journeys. He has helped customers become operationally efficient by optimizing costs with automation, and has developed high-level solution architectures and system designs for new and existing workloads.

**Mike Wells** is a Sr. Technical Account Manager, a role that allows him to put his architecture, development, and cloud knowledge into practice helping customers achieve cloud excellence. Mike is based in Boulder, Colorado, and is an avid outdoorsman and cyclist.

**Ask questions from Tuesday, August 9, through August 22 by using the "Comment" button.**

***- Please be sure to upvote answers if they helped you -***
4 answers · 1 vote · 307 views · asked 2 months ago by a Support Engineer