Base Docker image / Dockerfile for running R jobs in AWS Batch


Is there a recommended Docker image / Dockerfile starting point for running R scripts in AWS Batch? I have looked at a few here, and the starting image could be any of the following: rocker, python:3.6, ubuntu:16.04 or r-base.

What are the key points to consider in choosing a good starting point for a Dockerfile for R jobs? For this particular workload, my main R packages are data.table and the tidyverse packages (no need for RStudio / Shiny server).

1 Answer

First you have to decide whether you want Microsoft R or Open R. I believe Microsoft R has some optimizations, but it is a pain to get running in Docker (they hard-code the UIDs, which results in conflicts depending on the host node OS). I did Microsoft R for a client a few years back, and it was a bit painful. Unfortunately I can't share that file/work.

If Open R works for your use case then why not just start from r-base? https://hub.docker.com/_/r-base/

I'd recommend a Dockerfile that is specific to your use case. Installing data.table and other packages after the Batch job has started isn't ideal, so we generally started FROM r-base, copied an install.R script into the Docker image, and ran it with Rscript during the build to install everything. We then pushed that image to ECR and referenced it in the Batch jobs.
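To make the last step concrete, here is a minimal sketch of how a job definition could point at the ECR image. It only builds the payload in plain Python. The account ID, region, repository name, tag, job name, and script path are all made-up placeholders, and the resource values are arbitrary; the keys are meant to match what boto3's `register_job_definition` accepts for a container job, but check them against the current AWS Batch documentation before relying on this.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build the image URI for a private ECR repository."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def build_job_definition(name, image_uri, script_path, vcpus=2, memory_mib=4096):
    """Build a container job definition payload that runs an R script.

    Assumes the image already contains R, the required packages,
    and the script at script_path.
    """
    return {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": {
            "image": image_uri,
            # Rscript runs the job script that was baked into the image.
            "command": ["Rscript", script_path],
            # Batch expects resource values as strings.
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpus)},
                {"type": "MEMORY", "value": str(memory_mib)},
            ],
        },
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    image = ecr_image_uri("123456789012", "us-east-1", "r-batch-jobs", "v1")
    job_def = build_job_definition("r-data-table-job", image, "/opt/jobs/run.R")
    print(job_def)
    # With boto3 installed and credentials configured, this could be
    # registered with:
    #   boto3.client("batch").register_job_definition(**job_def)
```

Pinning a specific tag rather than `latest` makes it easier to tell which set of installed packages a given job actually ran with.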

If you are running a lot of jobs, you may also want to look into the ECS cluster behind the compute environment and change how images are cached, since the images will be quite large.

Answered 1 year ago
