Some Dockerfiles for Building R Package Binaries
I went down a strange path recently, trying to compile binaries of R packages for Linux. I’m not sure why — this area is pretty much covered by the RStudio Package Manager. I’ll leave my Dockerfiles here in case they’re of any use to a future wayward R programmer.
The intention here is to build a Docker image that can build an R package binary with the command below. I’m trying to build x86 binaries on my ARM MacBook, so I’m specifying the platform during both build and run.
docker run --platform linux/amd64 -v ~/packages:/packages $IMAGE $PACKAGE $VERSION
This will output the compiled binary into a subdirectory of ~/packages corresponding to the target version of R. These binaries are not portable: they depend very much on the Linux distribution used to build them.
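As a rough sketch of how that command could be wrapped for repeated use, the function below passes the package and an optional version through to the container. The names IMAGE, OUT_DIR and DOCKER, and the default image name r-package-builder, are assumptions made for illustration; they don't come from the Dockerfiles in this post.

```shell
#!/bin/bash
# Hypothetical defaults: adjust to whatever the image was tagged as
IMAGE="${IMAGE:-r-package-builder}"
OUT_DIR="${OUT_DIR:-$HOME/packages}"

# Build one package; the version argument is optional.
build_package() {
    package="$1"
    version="${2:-}"
    # ${version:+...} expands to nothing when no version was given,
    # so the container then receives only the package name.
    # DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
    "${DOCKER:-docker}" run --platform linux/amd64 \
        -v "$OUT_DIR:/packages" "$IMAGE" "$package" ${version:+"$version"}
}
```

Calling build_package glue 1.4.2 would then run the same docker run command as above, and setting DOCKER=echo prints the arguments without starting a container.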
Method 1: conda-build
conda is a package manager mostly associated with Python, but it can also be used for R and other languages.
The Dockerfile below installs Miniconda and conda-build, which it uses to build the R package binaries. These are binaries that must be installed with conda, rather than through R directly.
I use mamba and boa, which provide faster alternatives to conda install and conda build, respectively.
Every time conda/mamba builds an R package, it fetches all dependencies from scratch. To speed this up, I install R in the docker build process so that it’s cached. Finally, I hardcode the script that’s used to build the R package; it branches on whether a version is specified.
ARG OS_IDENTIFIER=ubuntu
ARG OS_TAG=20.04
ARG PLATFORM=linux/amd64
FROM --platform=${PLATFORM} ${OS_IDENTIFIER}:${OS_TAG}
ENV LANG=en_US.UTF-8
RUN apt-get update && apt-get install -y curl
# Install Miniconda and conda-build, which is needed to compile R packages
# for conda-forge
ARG MINICONDA_VERSION=py38_4.9.2
ARG MINICONDA_INSTALLER=Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh
RUN curl -LO https://repo.anaconda.com/miniconda/${MINICONDA_INSTALLER} \
&& bash ${MINICONDA_INSTALLER} -p /miniconda -b \
&& rm ${MINICONDA_INSTALLER}
ENV PATH=/miniconda/bin:${PATH}
RUN conda install -y conda-build
# Mamba is much faster for installing packages, and boa lets us use it
# when building packages
RUN conda install -y -c conda-forge mamba boa
# conda-build (and its mamba equivalent) will always reach out to a repository
# to install dependencies, rather than using pre-installed packages. However,
# by installing r-base now we can cache the required packages, so that R
# doesn't have to be downloaded each time a package is built.
ENV R_VERSION=4.0.3
RUN mamba install -y -c conda-forge r-base=${R_VERSION}
# Compiled packages are outputted to this directory. When this container is run,
# /packages can be used as a target for -v
RUN mkdir -p /packages/R-${R_VERSION}
RUN echo "#!/bin/bash" > build_r_package.sh \
&& echo ' \n\
package=$1 \n\
version=$2 \n\
if [[ -n "$2" ]]; then \n\
echo "Building r-$package-$version" \n\
conda skeleton cran --version $version $package \n\
conda mambabuild --R ${R_VERSION} -c conda-forge --output-folder /packages/R-${R_VERSION} r-$package-$version \n\
else \n\
echo "Building r-$package" \n\
conda skeleton cran $package \n\
conda mambabuild --R ${R_VERSION} -c conda-forge --output-folder /packages/R-${R_VERSION} r-$package \n\
fi ' >> build_r_package.sh \
&& chmod +x build_r_package.sh
ENTRYPOINT ["/build_r_package.sh"]
Even with mamba this is a slow process: it takes over 10 minutes to compile the glue package, which has minimal dependencies.
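The build script that the Dockerfile assembles with echo is easier to read as a standalone file. Here is a minimal sketch of the same two branches written as a shell function. The CONDA variable is an assumption added only so the commands can be dry-run; the conda commands themselves are the ones from the Dockerfile above.

```shell
#!/bin/bash
# Sketch of build_r_package.sh as its own file. R_VERSION is expected
# to come from the image's ENV, as in the Dockerfile.
build_r_package() {
    package="$1"
    version="${2:-}"
    # Hypothetical override: CONDA=echo prints the commands instead
    conda_bin="${CONDA:-conda}"
    out="/packages/R-${R_VERSION}"
    if [ -n "$version" ]; then
        echo "Building r-$package-$version"
        "$conda_bin" skeleton cran --version "$version" "$package"
        "$conda_bin" mambabuild --R "$R_VERSION" -c conda-forge \
            --output-folder "$out" "r-$package-$version"
    else
        echo "Building r-$package"
        "$conda_bin" skeleton cran "$package"
        "$conda_bin" mambabuild --R "$R_VERSION" -c conda-forge \
            --output-folder "$out" "r-$package"
    fi
}
```

To use this as the entrypoint, the file would need a final line calling build_r_package "$@" and would be copied into the image rather than generated with echo.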
Method 2: Just R
Using just R requires a bit more logic. I’ve separated out some R helper scripts, as well as the bash script that does the actual building. I start with a rocker image, which already has R installed. I also need the remotes package to install package dependencies.
FROM rocker/r-ver:4.0.3
RUN apt-get update && apt-get install -y curl
RUN Rscript -e 'install.packages("remotes")'
ENV R_VERSION=4.0.3
RUN mkdir -p /packages/R-${R_VERSION}
RUN mkdir /scripts
COPY helpers.R /scripts/helpers.R
COPY build-R-package.sh /scripts/build-R-package.sh
RUN chmod +x /scripts/build-R-package.sh
ENTRYPOINT ["/scripts/build-R-package.sh"]
The R helper functions I need query CRAN to determine the latest available version of a package. If the desired version is not the latest, then the source needs to be downloaded from the CRAN archives.
cran_version <- function(package) {
  # Fall back to the cloud mirror when no CRAN repository is configured.
  # identical() avoids a length > 1 condition when several repos are set.
  repos <- getOption("repos")
  if (is.null(repos) || identical(unname(repos["CRAN"]), "@CRAN@")) {
    options(repos = c(CRAN = "https://cloud.r-project.org/"))
  }
  available <- as.data.frame(available.packages())
  filtered <- available[available$Package == package, ]
  if (nrow(filtered) != 1) {
    stop(package, " is not available on CRAN")
  }
  filtered$Version
}
cran_source_url <- function(package, version = NULL) {
  if (is.null(version)) {
    version <- cran_version(package)
    latest_version <- TRUE
  } else {
    latest_version <- (version == cran_version(package))
  }
  bundle <- paste0(package, "_", version, ".tar.gz")
  # Only the latest source lives in src/contrib; older ones are archived
  if (latest_version) {
    paste0("https://cran.r-project.org/src/contrib/", bundle)
  } else {
    paste0("https://cran.r-project.org/src/contrib/Archive/", package, "/", bundle)
  }
}
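To make the URL logic concrete without querying CRAN, here is a small shell sketch of the same rule. Unlike the R helper, it takes the latest version as an explicit third argument, which is an assumption made so the example runs offline; the version numbers in the usage note are only illustrative.

```shell
#!/bin/bash
# Shell mirror of the cran_source_url() rule. The caller supplies the
# latest version (hypothetical third argument) instead of querying CRAN.
cran_source_url() {
    package="$1"
    version="$2"
    latest="$3"
    bundle="${package}_${version}.tar.gz"
    base="https://cran.r-project.org/src/contrib"
    if [ "$version" = "$latest" ]; then
        # Current release: served directly from src/contrib
        echo "$base/$bundle"
    else
        # Older release: served from the per-package Archive directory
        echo "$base/Archive/$package/$bundle"
    fi
}
```

For example, cran_source_url glue 1.4.1 1.4.2 prints the Archive URL for glue_1.4.1.tar.gz, while cran_source_url glue 1.4.2 1.4.2 prints the src/contrib URL.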
The bash script calls on the helpers as needed. If no version is specified, the latest version is used. Then the source is downloaded from CRAN and the package is built. It’s also installed, since building and installing are closely related in R. Finally the resulting binary is moved to the packages directory.
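#!/bin/bash
# Stop at the first failing step rather than building from a bad download
set -e
package=$1
version=$2
# Default to the latest version on CRAN when none is given
if [[ -z "$version" ]]; then
    version=$(Rscript -e "source('/scripts/helpers.R');cat(cran_version('$package'))")
fi
url=$(Rscript -e "source('/scripts/helpers.R');cat(cran_source_url('$package', '$version'))")
echo "Downloading $url"
# The script assumes it runs from /, so the tarball lands at /<package>_<version>.tar.gz
curl -fLO "$url"
Rscript -e "remotes::install_deps('/${package}_${version}.tar.gz')"
# R CMD INSTALL --build writes the binary into the current directory
mkdir binary && cd binary
R CMD INSTALL --build "/${package}_${version}.tar.gz"
mv * "/packages/R-${R_VERSION}"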
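Once a binary is sitting in the packages directory, it can be installed on a matching Linux system with R CMD INSTALL. The sketch below looks for a file following the naming pattern that R CMD INSTALL --build produces on Linux (package, version, then _R_ and the platform). The function names and the R_BIN override are assumptions for illustration.

```shell
#!/bin/bash
# Print the first built binary for a package in a directory, if any.
find_binary() {
    dir="$1"
    package="$2"
    # Binaries are named <package>_<version>_R_<platform>.tar.gz
    for f in "$dir/${package}"_*_R_*.tar.gz; do
        if [ -e "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    return 1
}

# Install the binary that find_binary locates.
# R_BIN is a hypothetical override (e.g. R_BIN=echo for a dry run).
install_binary() {
    binary="$(find_binary "$1" "$2")" || {
        echo "no binary for $2 in $1" >&2
        return 1
    }
    "${R_BIN:-R}" CMD INSTALL "$binary"
}
```

For example, install_binary ~/packages/R-4.0.3 glue would install the glue binary built earlier, provided the target machine runs the same distribution and R version as the build image.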
The image at the top of this page is in the public domain
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.3 (2020-10-10)
#> os macOS Big Sur 10.16
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_AU.UTF-8
#> ctype en_AU.UTF-8
#> tz Australia/Melbourne
#> date 2021-04-19
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> callr 3.6.0 2021-03-28 [1] CRAN (R 4.0.3)
#> cli 2.4.0 2021-04-05 [1] CRAN (R 4.0.2)
#> crayon 1.4.1 2021-02-08 [1] CRAN (R 4.0.2)
#> desc 1.3.0 2021-03-05 [1] CRAN (R 4.0.2)
#> devtools 2.3.2 2020-09-18 [1] CRAN (R 4.0.2)
#> digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.2)
#> downlit 0.2.1 2020-11-04 [1] CRAN (R 4.0.2)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.2)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.1)
#> fansi 0.4.2 2021-01-15 [1] CRAN (R 4.0.2)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
#> htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.0.2)
#> hugodown 0.0.0.9000 2021-04-19 [1] Github (r-lib/hugodown@97ea0cd)
#> knitr 1.32 2021-04-14 [1] CRAN (R 4.0.2)
#> lifecycle 1.0.0 2021-02-15 [1] CRAN (R 4.0.2)
#> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.0.2)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.2)
#> pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.0.2)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.2)
#> processx 3.5.1 2021-04-04 [1] CRAN (R 4.0.2)
#> ps 1.6.0 2021-02-28 [1] CRAN (R 4.0.2)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.2)
#> R6 2.5.0 2020-10-28 [1] CRAN (R 4.0.2)
#> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2)
#> rlang 0.4.10 2020-12-30 [1] CRAN (R 4.0.2)
#> rmarkdown 2.7.10 2021-04-19 [1] Github (rstudio/rmarkdown@eb55b2e)
#> rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.0.2)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.2)
#> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.2)
#> testthat 3.0.1 2020-12-17 [1] CRAN (R 4.0.2)
#> usethis 2.0.1 2021-02-10 [1] CRAN (R 4.0.2)
#> vctrs 0.3.7 2021-03-29 [1] CRAN (R 4.0.2)
#> withr 2.4.2 2021-04-18 [1] CRAN (R 4.0.3)
#> xfun 0.22 2021-03-11 [1] CRAN (R 4.0.2)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.2)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library