AWS Batch GPU busy or unavailable

I'm trying to deploy a Python app that uses CUDA, packaged in a Docker container, to AWS Batch. When I run a Batch job, I get this error:

RuntimeError: CUDA error: all CUDA-capable devices are busy or unavailable

I'm a bit confused, as I thought AWS Batch would assign an EC2 instance with an available GPU, and I request at least 1 GPU when I submit the job. I haven't had any luck finding anyone with the same issue. It's possible I misconfigured something in my Dockerfile or in AWS Batch, but it looks to me like I'm accessing the GPU correctly and the problem is on the AWS side. Let me know if you need any other info from me.
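For reference, a GPU request at submission time can be sketched roughly as follows. This is a minimal sketch assuming boto3 is used to submit the job; the helper names `build_gpu_job_request` and `submit_gpu_job`, and the job, queue, and job definition names in the usage note, are hypothetical placeholders rather than the actual configuration from this post.

```python
def build_gpu_job_request(job_name, job_queue, job_definition, gpus=1):
    """Build keyword arguments for Batch submit_job that request `gpus` GPUs."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "containerOverrides": {
            # Batch expects the GPU count as a string value.
            "resourceRequirements": [{"type": "GPU", "value": str(gpus)}],
        },
    }


def submit_gpu_job(**request):
    """Submit the request to AWS Batch and return the job ID."""
    # Imported lazily so the builder above can be used without boto3 installed.
    import boto3

    response = boto3.client("batch").submit_job(**request)
    return response["jobId"]
```

Usage would look something like `submit_gpu_job(**build_gpu_job_request("cuda-test", "my-gpu-queue", "my-job-def"))`, with all three names replaced by real ones. The same GPU requirement can also be set in the job definition instead of as an override.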

Docker environment: nvidia/cuda:11.6.0-cudnn8-devel-ubuntu20.04

Compute Environment: p2-family EC2s (not spot instances)
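To narrow this down, one option is to log what the container actually sees before the app touches CUDA. The sketch below assumes `nvidia-smi` is available inside the container (it normally is when the NVIDIA container runtime is in use) and reports each GPU's name, driver version, and compute mode. This error is often associated with a GPU in an exclusive compute mode or one already held by another process, though that may not be the cause here. The function names are illustrative.

```python
import shutil
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=name,driver_version,compute_mode",
    "--format=csv,noheader",
]
FIELDS = ("name", "driver_version", "compute_mode")


def parse_gpu_rows(text):
    """Parse nvidia-smi CSV output into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        parts = [part.strip() for part in line.split(",")]
        if len(parts) == len(FIELDS):
            rows.append(dict(zip(FIELDS, parts)))
    return rows


def query_gpus():
    """Return the visible GPUs, or an empty list if none can be queried."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    gpus = query_gpus()
    if not gpus:
        print("No GPU visible to this container")
    for gpu in gpus:
        print(f"{gpu['name']}: driver {gpu['driver_version']}, "
              f"compute mode {gpu['compute_mode']}")
```

Running this as the first step of the job's command and checking the output in the job's CloudWatch logs would show whether a GPU is exposed to the container at all and which driver version the host provides.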

jlin
Asked 2 years ago, 59 views
No answers
