One day you want to compute one quantile across 95 rasters, and suddenly your script takes 7 hours to run, luckily at night. And then you’re facing the truth: you will have to dig into the stars R-package. But to this date, few implementations lie around. So we share here one demonstration of the efficiency of stars with a special big up for the st_apply function.
stars
R package (Pebesma 2021) has been for me for a while the unreachable R fruit, almost a Tantalus-like temptation.
And then I needed to compute the 95th percentile of a stack of 95 rasters. I guessed stars
could be a solution but my particular task - that is summarising at grid cell level a raster stack - was not directly covered in the manual. And stars
is still young and does not benefit of a long stash of StackOverflow questions (1000+ for the r-raster
tag)
Therefore I will share the simple workflow of computing a quantile across a stack of raster in the stars
framework. I will cover the stars
notions of: dimension
, stars_proxy
and st_apply
.
You first load the rasters in a stack named y
:
library(stars)
ls <- list.files(dir, full.names=TRUE)
y <- read_stars(ls, quiet=T, proxy =T, along='attributes')
Here occur our first two tricks.
First we read the rasters in a stars_proxy
object which doesn’t load the rasters in memory.
Second, we tell stars to create one object that stacks all the rasters, in a dimension called attributes
, such that y
is a stars object with three dimensions: x, y, attributes.
Then we compute our grid-cell metric by using st_apply
across this third dimension.
Two last tricks:
PROGRESS =T
shows a very informative progress barFUTURE = T
makes use of the parallel processing offered by the future.apply
R package.Lastly, in order to actually run the computation, we need to tell R to save the output such that it will starts the process:
write_stars(q_95, outdir)
Results: the computation took 2h instead of 7h30.
For attribution, please cite this work as
Darin (2021, Nov. 9). Meet Edith: Efficient computation across multiple rasters. Retrieved from https://edarin.github.io/thatsme/posts/2021-11-09-efficient-computation-across-multiple-rasters/
BibTeX citation
@misc{darin2021efficient, author = {Darin, Edith}, title = {Meet Edith: Efficient computation across multiple rasters}, url = {https://edarin.github.io/thatsme/posts/2021-11-09-efficient-computation-across-multiple-rasters/}, year = {2021} }