<%@meta language="R-vignette" content="--------------------------------
%\VignetteIndexEntry{A Future for R: Apply Function to Elements in Parallel}
%\VignetteAuthor{Henrik Bengtsson}
%\VignetteKeyword{R}
%\VignetteKeyword{package}
%\VignetteKeyword{vignette}
%\VignetteKeyword{future}
%\VignetteKeyword{lazy evaluation}
%\VignetteKeyword{synchronous}
%\VignetteKeyword{asynchronous}
%\VignetteKeyword{parallel}
%\VignetteKeyword{cluster}
%\VignetteEngine{R.rsp::rsp}
%\VignetteTangle{FALSE}
--------------------------------------------------------------------"%>
# A Future for R: Apply Function to Elements in Parallel

## Introduction

The purpose of this package is to provide worry-free parallel alternatives to base-R "apply" functions, e.g. `apply()`, `lapply()`, and `vapply()`. The goal is that you can replace any of these with its futurized equivalent and things will just work. For example, instead of doing:

```r
library(datasets)
library(stats)
y <- lapply(mtcars, FUN = mean, trim = 0.10)
```

one can do:

```r
library(future.apply)
plan(multisession) ## Run in parallel on local computer

library(datasets)
library(stats)
y <- future_lapply(mtcars, FUN = mean, trim = 0.10)
```

Reproducibility is part of the core design, which means that fully reproducible parallel random number generation (RNG) is supported regardless of the amount of chunking, the type of load balancing, and the future backend being used. To enable parallel RNG, use argument `future.seed = TRUE`.

## Role

Where does the **[future.apply]** package fit in the software stack? You can think of it as a sibling to **[foreach]**, **[furrr]**, **[BiocParallel]**, **[plyr]**, etc. Just as **parallel** provides `parLapply()`, **foreach** provides `foreach()`, **BiocParallel** provides `bplapply()`, and **plyr** provides `llply()`, **future.apply** provides `future_lapply()`. Below is a table summarizing this idea:

| Package | Functions | Backends |
|---|---|---|
| future.apply | Future versions of common go-to `*apply()` functions available in base R (of the base package): `future_apply()`, `future_by()`, `future_eapply()`, `future_lapply()`, `future_Map()`, `future_mapply()`, `future_.mapply()`, `future_replicate()`, `future_sapply()`, `future_tapply()`, and `future_vapply()`. The following function is not implemented: `future_rapply()` | All future backends |
| parallel | `mclapply()`, `mcmapply()`, `clusterMap()`, `parApply()`, `parLapply()`, `parSapply()`, ... | Built-in and conditional on operating system |
| foreach | `foreach()`, `times()` | All future backends via doFuture |
| furrr | `future_imap()`, `future_map()`, `future_pmap()`, `future_map2()`, ... | All future backends |
| BiocParallel | Bioconductor's parallel mappers: `bpaggregate()`, `bpiterate()`, `bplapply()`, and `bpvec()` | All future backends via doFuture (because it supports foreach) or via BiocParallel.FutureParam (direct BiocParallelParam support; prototype) |
| plyr | `**ply(..., .parallel = TRUE)` functions: `aaply()`, `ddply()`, `dlply()`, `llply()`, ... | All future backends via doFuture (because it uses foreach internally) |
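
The backend-independent RNG described in the introduction can be illustrated with a small sketch. It assumes the **future.apply** package is installed. The fixed integer seed `42L`, the worker count of two, and the helper name `draw()` are choices made for this example only. The text above uses `future.seed = TRUE`; an integer seed is used here so that the two runs can be compared directly.

```r
library(future.apply)

## Hypothetical helper for this example: three normal draws per element
draw <- function(i) rnorm(3)

## Run sequentially with a fixed seed (42L is an arbitrary choice)
plan(sequential)
y_seq <- future_lapply(1:4, FUN = draw, future.seed = 42L)

## Run on two local background R sessions with the same seed
plan(multisession, workers = 2)
y_par <- future_lapply(1:4, FUN = draw, future.seed = 42L)

## Compare the random numbers obtained under the two backends
same <- identical(y_seq, y_par)

## Return to sequential processing, which also shuts down the workers
plan(sequential)
```

If the design goal stated above holds, `same` is `TRUE`: each element gets its own RNG stream, so the result does not depend on which worker processed it or on how the elements were chunked.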