Join us at re:Invent 2022!

8 minute read
Content level: Foundational

Learn about all the AWS Inferentia and Trainium re:Invent sessions


Are you coming to re:Invent this year and looking for exciting sessions to attend? We will have many sessions to help you learn more about our purpose-built ML accelerators: AWS Inferentia and AWS Trainium. These sessions focus on why Inferentia and Trainium may be a good fit for you, how to optimize your models for the highest performance while lowering costs, and how to create sustainable solutions that accelerate deep learning applications in the cloud. We have deep-dive sessions to help you learn more about the technology and hands-on workshops to help you get started.

Join us at re:Invent this year at one of the sessions below, or just drop by to say hi!

Find the sessions below in the re:Invent session catalog


Breakouts

Accelerate deep learning and innovate faster with AWS Trainium

CMP313
Amazon EC2 Trn1 instances, powered by AWS Trainium chips, are purpose-built for high-performance deep learning training and offer up to 50 percent cost-to-train savings over equivalent GPU-based instances. In this session, learn about AWS Trainium and Trn1 innovations, the AWS collaboration with PyTorch and Hugging Face, and the successes users have seen.

Session type: Breakout Session
Tracks: Compute (CMP), CMP: AWS Silicon Innovation, CMP: ML Infrastructure
Allocation Tracking: Compute

Sustainability and AWS silicon

SUS206
With the world's need for computing ever increasing and machine learning becoming mainstream, continually innovating at the chip level is critical to sustainably powering the workloads of the future. In this session, learn how AWS continues to innovate in chip design as the organization works toward Amazon's goal of achieving net-zero carbon by 2040. Find out about the carbon emissions associated with silicon manufacturing and hardware usage, and how the AWS design process delivers higher power efficiency and a lower carbon footprint for chips designed by AWS. Learn how sustainability is integrated into Pinterest's AWS architecture decisions.

Session type: Breakout Session
Tracks: Compute (CMP), CEN: Sustainability, CMP: AWS Silicon Innovation
Allocation Tracking: Sustainability
Area of Interest: Sustainability, Graviton, Innovation at AWS

Reduced costs and better performance for startups with AWS Inferentia

CMP226
Amazon EC2 Inf1 instances, powered by AWS Inferentia chips, deliver up to 70 percent lower cost per inference and up to 2.3 times higher throughput than comparable GPU-based Amazon EC2 instances. Attend this session to hear how startup companies have realized these benefits to grow their businesses and deliver innovative experiences to their end users.

Session type: Breakout Session
Tracks: Compute (CMP), CMP: AWS Silicon Innovation, CMP: ML Infrastructure
Allocation Tracking: Compute

AI parallelism explained: How Amazon Search scales deep-learning training

CMP209
Transformer-based models have driven rapid growth in model size and complexity over the past few years (now exceeding 100 billion parameters), with proportional gains in accuracy and capability. Broader adoption of these advances is held back by the difficulty of scaling training across heterogeneous infrastructure. In this session, dive deep into parallelism strategies. Learn how Amazon Search trains large language models using various parallelism strategies and deploys them into production at scale. See a demo of all strategies (including DeepSpeed and PyTorch FSDP) and open-source code; a minimal FSDP sketch follows the session details below.

Session type: Breakout Session
Tracks: Compute (CMP), CMP: ML Infrastructure
Services: AI for Data Analysts, Amazon Machine Learning (Amazon ML)
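If you want a feel for what sharded data parallelism looks like in code before the session, here is a minimal sketch that wraps a stand-in transformer in PyTorch's FullyShardedDataParallel. This is our illustration of the general technique, not Amazon Search's production setup; the model and tensors are placeholders.

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# One process per GPU, typically launched with torchrun.
dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# Stand-in model; in practice this would be a large transformer.
model = torch.nn.Transformer(d_model=512, nhead=8).cuda()
model = FSDP(model)  # shards parameters, gradients, and optimizer state across ranks

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
src = torch.rand(10, 32, 512).cuda()  # (sequence, batch, features)
tgt = torch.rand(20, 32, 512).cuda()

loss = model(src, tgt).sum()
loss.backward()   # gradients are reduce-scattered across ranks
optimizer.step()  # each rank updates only its own parameter shard
```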

Silicon innovation at AWS

CMP201
Organizations are bringing diverse workloads onto AWS at a faster rate than ever before. To run those workloads with the performance and cost that users expect, AWS often innovates on their behalf, delivering breakthroughs down to the silicon level. AWS efforts in silicon design began with the AWS Nitro System but quickly extended to AWS Graviton processors and purpose-built inference chips with AWS Inferentia. In this session, explore the AWS journey into silicon innovation and learn about some of the thought processes, learnings, and results from the experience so far.

Session type: Breakout Session
Tracks: Compute (CMP), CMP: AWS Silicon Innovation, Architecture (ARC)
Allocation Tracking: Compute

Choosing the right accelerator for training and inference

CMP207
Amazon EC2 provides the broadest and deepest portfolio of instances for machine learning applications. From GPU-based high-performance instances such as P4 and G5 to Trn1 and Inf1 instances purpose-built with AWS silicon for the best price performance, there's a right instance for each of your machine learning workloads. In this session, learn about these instances, see benchmarks, and get ideal use case guidelines for each. See a demo of how to launch and scale machine learning workloads in production.

Session type: Breakout Session
Tracks: Compute (CMP), CMP: ML Infrastructure, AI & ML (AIM)
Allocation Tracking: Cloud Operations


Workshops

Train & deploy a Hugging Face NLP model with AWS Trainium & AWS Inferentia

CMP206
Amazon EC2 Trn1 instances, powered by AWS Trainium chips, and Amazon EC2 Inf1 instances, powered by AWS Inferentia chips, are built to provide high-performance, low-cost machine learning training and inference in the cloud. In this workshop, learn how to train a Hugging Face NLP model on a Trn1 instance and then deploy it on an Inf1 instance. The workshop also covers how to use NeuronPerf to generate performance benchmarks for your models; a brief compile-and-benchmark sketch follows the session details below. You must bring your laptop to participate.

Session type: Workshop
Tracks: Compute (CMP), CMP: ML Infrastructure
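For a preview of the Inf1 half of this flow, here is a minimal sketch that compiles a toy PyTorch model with torch-neuron and benchmarks it with NeuronPerf. NeuronPerf entry points can shift between Neuron SDK releases, so treat the exact function names as assumptions and check the Neuron documentation for your version; the model here is a stand-in, not the workshop's Hugging Face model.

```python
import torch
import torch_neuron  # Neuron SDK for Inf1; registers torch.neuron
import neuronperf as npf
import neuronperf.torch

# Toy stand-in model; the workshop uses a Hugging Face NLP model instead.
model = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU()).eval()
example = torch.rand(1, 128)

# Compile for Inferentia; operators Neuron cannot place fall back to CPU.
neuron_model = torch.neuron.trace(model, example_inputs=[example])
neuron_model.save("model_neuron.pt")

# Benchmark latency and throughput of the compiled artifact with NeuronPerf.
reports = npf.torch.benchmark("model_neuron.pt", [example], batch_sizes=[1])
npf.print_reports(reports)
```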

Deploy deep learning models with hyperscale performance on SageMaker

AIM401
In this workshop, explore the hardware and software optimizations available for deep learning model deployment, when and how to use them, and their impact on model performance and cost. Learn how to run thousands of deep learning models using Amazon SageMaker multi-model endpoints and reduce model serving costs (a minimal multi-model endpoint sketch follows the session details below). Learn how to use the AWS Neuron SDK to compile and deploy models on AWS Inferentia chips. Lastly, dive deep into model compilation and optimization techniques using TensorRT and NVIDIA Triton Inference Server features to achieve hyperscale inference on Amazon SageMaker. You must bring your laptop to participate.

Session type: Workshop
Tracks: DOP: DevOps, AIM: SageMaker, AI & ML (AIM)
Area of Interest: Cost Optimization, High Performance Computing, Price Performance
Allocation Tracking: Artificial Intelligence and Machine Learning
Services: Amazon SageMaker
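As a taste of the multi-model endpoint pattern this workshop covers, here is a minimal sketch using the SageMaker Python SDK. The bucket, role, names, and container image are placeholders you would replace with your own, and the payload format depends on your serving container.

```python
import sagemaker
from sagemaker.multidatamodel import MultiDataModel

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder

# One endpoint backed by every model artifact under this S3 prefix.
mme = MultiDataModel(
    name="demo-multi-model",                     # placeholder name
    model_data_prefix="s3://my-bucket/models/",  # prefix holding many model.tar.gz files
    image_uri="<framework-serving-image-uri>",   # placeholder container image
    role=role,
    sagemaker_session=session,
)
predictor = mme.deploy(initial_instance_count=1, instance_type="ml.c5.xlarge")

# Route the request to a specific artifact; it is loaded on demand and cached.
payload = '{"inputs": ["example"]}'  # placeholder; format depends on your container
result = predictor.predict(payload, target_model="model-a.tar.gz")
```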

Deep learning with Amazon SageMaker, AWS Trainium, and AWS Inferentia

AIM212
With Amazon SageMaker, you can build, train, and deploy machine learning models for almost any use case. Amazon EC2 Trn1 instances, powered by AWS Trainium, and EC2 Inf1 instances, powered by AWS Inferentia, deliver the best price performance for deep learning training and inference. In this workshop, walk through training a BERT model for natural language processing on Trn1 instances to save up to 50 percent in training costs over equivalent GPU-based EC2 instances (a minimal Trn1 training sketch follows the session details below). Also learn how to deploy this model for inference on Inf1 instances for up to 2.3 times higher throughput and up to 70 percent lower cost per inference than comparable GPU-based EC2 instances. You must bring your laptop to participate.

Session type: Workshop
Tracks: AIM: SageMaker, AIM: ML Infrastructure & Framework, AI & ML (AIM)
Area of Interest: High Performance Computing, Innovation at AWS
Allocation Tracking: Artificial Intelligence and Machine Learning
Services: Amazon SageMaker, Amazon EC2
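To preview what training on Trn1 looks like, here is a minimal single-step sketch. It assumes the torch-neuronx / PyTorch XLA stack is installed (Trn1 exposes NeuronCores as XLA devices); the model choice and tiny batch are stand-ins, not the workshop's own code.

```python
import torch
import torch_xla.core.xla_model as xm
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = xm.xla_device()  # a NeuronCore on Trn1, exposed as an XLA device

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model = model.to(device)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Tiny stand-in batch; the workshop uses a real NLP dataset.
batch = tokenizer(["an example sentence"], return_tensors="pt", padding=True)
batch = {k: v.to(device) for k, v in batch.items()}
labels = torch.tensor([1], device=device)

optimizer.zero_grad()
loss = model(**batch, labels=labels).loss
loss.backward()
# Steps the optimizer and (with barrier=True) triggers the XLA/Neuron execution.
xm.optimizer_step(optimizer, barrier=True)
```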


Chalk Talks

Optimizing deep learning models with compilation for faster inference

AIM330
As deep learning models get larger and more complex, improving inference latency and throughput to meet business requirements can be challenging. In this chalk talk, learn how to optimize a deep learning model with the Amazon SageMaker Neo inference compiler and deploy it to different cloud and edge hardware platforms, including CPUs, GPUs, and AWS Inferentia. Discover how deep learning compilation works and how it can help accelerate your inference jobs. The talk includes a demonstration of how to compile and deploy a trained PyTorch model for inference; a minimal compilation-job sketch follows the session details below.

Session type: Chalk Talk
Tracks: AI & ML (AIM), AIM: ML Infrastructure & Framework, AIM: SageMaker
Area of Interest: Cost Optimization
Allocation Tracking: Artificial Intelligence and Machine Learning
Services: Amazon SageMaker
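For reference, here is a minimal sketch of kicking off a Neo compilation job for a trained PyTorch model with boto3. The job name, role, S3 paths, and input shape are placeholders; "ml_inf1" targets AWS Inferentia, and other values select CPU, GPU, or edge targets.

```python
import boto3

sm = boto3.client("sagemaker")
sm.create_compilation_job(
    CompilationJobName="demo-neo-job",                       # placeholder
    RoleArn="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    InputConfig={
        "S3Uri": "s3://my-bucket/model.tar.gz",              # trained PyTorch model artifact
        "DataInputConfig": '{"input0": [1, 3, 224, 224]}',   # example input shape
        "Framework": "PYTORCH",
    },
    OutputConfig={
        "S3OutputLocation": "s3://my-bucket/compiled/",
        "TargetDevice": "ml_inf1",  # AWS Inferentia; CPU/GPU/edge targets also exist
    },
    StoppingCondition={"MaxRuntimeInSeconds": 900},
)
```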

Choosing the right ML instance for training and inference on AWS

AIM407
As a data scientist, choosing the right compute instance for your workload on AWS can be challenging. For training, you can choose from CPUs, GPUs, AWS Trainium, and Intel Habana Gaudi accelerators; for inference, you can choose from CPUs, GPUs, and AWS Inferentia. This chalk talk guides you through choosing the right compute instance type on AWS for your deep learning projects. Explore the available options, such as the most performant instance for training, the best instance for prototyping, and the most cost-effective instance for inference deployments. Learn how to use Amazon SageMaker Inference Recommender to help you make the right decision for your inference workloads; a minimal sketch of starting a recommendation job follows the session details below.

Session type: Chalk Talk
Tracks: AIM: ML Infrastructure & Framework, AI & ML (AIM), AIM: SageMaker
Area of Interest: High Performance Computing, Innovation at AWS, Demo
Allocation Tracking: Artificial Intelligence and Machine Learning
Services: Amazon SageMaker
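Here is a minimal sketch of starting an Inference Recommender job with boto3, assuming you have already registered a versioned model package. The job name, role, and model package ARN are placeholders.

```python
import boto3

sm = boto3.client("sagemaker")
sm.create_inference_recommendations_job(
    JobName="demo-recommender-job",  # placeholder
    JobType="Default",  # "Advanced" lets you supply custom traffic patterns and instance lists
    RoleArn="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    InputConfig={
        # Placeholder ARN of a registered, versioned model package to benchmark.
        "ModelPackageVersionArn": "arn:aws:sagemaker:us-east-1:123456789012:model-package/demo/1",
    },
)

# Poll for per-instance latency, throughput, and cost results.
results = sm.describe_inference_recommendations_job(JobName="demo-recommender-job")
```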


Did you miss re:Invent? Check out the recap here: https://repost.aws/articles/ARWg0vtgR7RriapTABCkBnng
