
Base Docker image / Dockerfile for running R jobs in AWS Batch


Is there a recommended Docker image / Dockerfile starting point for running R scripts in AWS Batch? I have looked at a few examples, and the base image could be any of the following: rocker, python:3.6, ubuntu:16.04, or r-base.

What are the key points to consider in choosing a good starting point for a Dockerfile for R jobs? For this particular workload, my main R packages are data.table and the tidyverse packages (no need for RStudio / Shiny server).

1 Answer

First, decide whether you want Microsoft R or Open R. I believe Microsoft R has some optimizations, but it is a pain to run in Docker: it hard-codes UIDs, which causes conflicts depending on the host node OS. I set up Microsoft R for a client a few years back and it was a bit painful. Unfortunately, I can't share that file or work.

If Open R works for your use case then why not just start from r-base? https://hub.docker.com/_/r-base/

I'd recommend a Dockerfile that is specific to your use case. Installing data.table and other packages after the Batch job has started isn't ideal, so starting FROM r-base we generally copied an install.R script into the Docker image and ran it with Rscript to install everything at build time. We then pushed that image to ECR and referenced it in the Batch jobs.
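A minimal sketch of that approach might look like the Dockerfile below. It is not the file used for the original work: the names install.R and job.R, the /tmp and /opt paths, and the list of system libraries are all assumptions chosen for illustration, and the library list in particular may need adjusting for the packages you install.

```dockerfile
# Sketch only: file names, paths, and the system library list are assumptions.
FROM r-base

# System libraries often needed to compile tidyverse dependencies
# (assumed list; adjust to what your packages actually require).
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        libcurl4-openssl-dev libssl-dev libxml2-dev \
    && rm -rf /var/lib/apt/lists/*

# install.R is a hypothetical script, for example a single line such as:
#   install.packages(c("data.table", "tidyverse"), repos = "https://cloud.r-project.org")
COPY install.R /tmp/install.R
RUN Rscript /tmp/install.R

# job.R is a hypothetical placeholder for the script the Batch job runs.
COPY job.R /opt/job.R
CMD ["Rscript", "/opt/job.R"]
```

Because the packages are installed at build time, each job starts with them already in place. The CMD is only a default; a Batch job definition can supply its own command instead.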

If you are running a lot of jobs, you may also want to look at the ECS cluster behind the compute environment and adjust its image caching, since the images will be quite large.

answered 2 years ago
