Preserve metadata with zip archives #109

Merged on Oct 31, 2024 (13 commits)
9 changes: 5 additions & 4 deletions DESCRIPTION
@@ -34,15 +34,16 @@ RoxygenNote: 7.3.2
 Imports:
     targets (>= 1.8.0),
     rlang (>= 1.1.3),
-    cli (>= 3.6.2)
+    cli (>= 3.6.2),
+    terra (>= 1.7.71),
+    withr (>= 3.0.0),
+    zip
 Suggests:
     crew (>= 0.9.2),
     ncmeta,
     sf,
     stars,
-    terra (>= 1.7.71),
-    testthat (>= 3.0.0),
-    withr (>= 3.0.0)
+    testthat (>= 3.0.0)
 Config/testthat/edition: 3
 URL: https://github.com/njtierney/geotargets, https://njtierney.github.io/geotargets/
 BugReports: https://github.com/njtierney/geotargets/issues
2 changes: 2 additions & 0 deletions NEWS.md
@@ -8,6 +8,8 @@
 * Added the `description` argument to all `tar_*()` functions which is passed to `tar_target()`.
 * Requires GDAL 3.1 or greater to use "ESRI Shapefile" driver in `tar_terra_vect()` (#71, #97)
 * `geotargets` now requires `targets` version 1.8.0 or higher
+* `tar_terra_rast()` gains a `preserve_metadata` option that when set to `"zip"` reads/writes targets as zip archives that include aux.json "sidecar" files sometimes written by `terra` (#58)
+* `terra` (>= 1.7.71), `withr` (>= 3.0.0), and `zip` are now required dependencies of `geotargets` (moved from `Suggests` to `Imports`)

 # geotargets 0.1.0 (14 May 2024)
17 changes: 14 additions & 3 deletions R/geotargets-option.R
@@ -21,6 +21,13 @@
 #' a unique set of creation options. For example, with the default `"GeoJSON"`
 #' driver:
 #' <https://gdal.org/drivers/vector/geojson.html#layer-creation-options>
+#' @param terra_preserve_metadata character. When `"drop"` (default), any
+#' auxiliary files that would be written by [terra::writeRaster()] containing
+#' raster metadata such as units and datetimes are lost (note that this does
+#' not include layer names set with `names() <-`). When `"zip"`, these
+#' metadata are retained by archiving all written files as a zip file upon
+#' writing and unzipping them upon reading. This adds extra overhead and will
+#' slow pipelines.
 #'
 #' @details
 #' These options can also be set using `options()`. For example,
@@ -56,7 +63,8 @@ geotargets_option_set <- function(
     gdal_raster_driver = NULL,
     gdal_raster_creation_options = NULL,
     gdal_vector_driver = NULL,
-    gdal_vector_creation_options = NULL
+    gdal_vector_creation_options = NULL,
+    terra_preserve_metadata = NULL
 ) {
   # TODO do this programmatically with formals() or something? `options()` also accepts a named list
   options(
@@ -67,7 +75,9 @@ geotargets_option_set <- function(
     "geotargets.gdal.vector.driver" = gdal_vector_driver %||%
       geotargets_option_get("gdal.vector.driver"),
     "geotargets.gdal.vector.creation.options" = gdal_vector_creation_options %||%
-      geotargets_option_get("gdal.vector.creation.options")
+      geotargets_option_get("gdal.vector.creation.options"),
+    "geotargets.terra.preserve.metadata" = terra_preserve_metadata %||%
+      geotargets_option_get("terra.preserve.metadata")
   )

 }
@@ -87,7 +97,8 @@ geotargets_option_get <- function(name) {
     "geotargets.gdal.raster.driver",
     "geotargets.gdal.raster.creation.options",
     "geotargets.gdal.vector.driver",
-    "geotargets.gdal.vector.creation.options"
+    "geotargets.gdal.vector.creation.options",
+    "geotargets.terra.preserve.metadata"
   ))

   env_name <- gsub("\\.", "_", toupper(option_name))
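The `env_name` line in `geotargets_option_get()` turns an option name into an upper-case, underscore-separated string. A minimal base R sketch of what that expression yields for the option added in this PR is shown here. `option_to_env_name()` is a hypothetical helper written only for this illustration, and how geotargets goes on to use `env_name` (presumably as an environment variable name) is an assumption, since that part of the function is not visible in the hunk.

```r
# Hypothetical helper, not part of geotargets: it only repeats the
# gsub()/toupper() expression used for `env_name` in the diff.
option_to_env_name <- function(option_name) {
  gsub("\\.", "_", toupper(option_name))
}

env_name <- option_to_env_name("geotargets.terra.preserve.metadata")
print(env_name)

# The option can also be set through base R's options(), which the
# roxygen @details section says is supported.
old <- options("geotargets.terra.preserve.metadata" = "zip")
print(getOption("geotargets.terra.preserve.metadata"))
options(old)  # restore whatever value was there before
```

Inside a pipeline the same setting would normally go through `geotargets_option_set(terra_preserve_metadata = "zip")`, the argument this PR adds.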
76 changes: 64 additions & 12 deletions R/tar-terra-rast.R
@@ -6,6 +6,13 @@
 #' to [terra::writeRaster()]
 #' @param gdal character. GDAL driver specific datasource creation options
 #' passed to [terra::writeRaster()]
+#' @param preserve_metadata character. When `"drop"` (default), any auxiliary
+#' files that would be written by [terra::writeRaster()] containing raster
+#' metadata such as units and datetimes are lost (note that this does not
+#' include layer names set with `names() <-`). When `"zip"`, these metadata
+#' are retained by archiving all written files as a zip file upon writing and
+#' unzipping them upon reading. This adds extra overhead and will slow
+#' pipelines.
 #' @param ... Additional arguments not yet used
 #'
 #' @inheritParams targets::tar_target
@@ -38,6 +45,7 @@ tar_terra_rast <- function(name,
                            pattern = NULL,
                            filetype = geotargets_option_get("gdal.raster.driver"),
                            gdal = geotargets_option_get("gdal.raster.creation.options"),
+                           preserve_metadata = geotargets_option_get("terra.preserve.metadata"),
                            ...,
                            tidy_eval = targets::tar_option_get("tidy_eval"),
                            packages = targets::tar_option_get("packages"),
@@ -53,14 +61,16 @@ tar_terra_rast <- function(name,
                            retrieval = targets::tar_option_get("retrieval"),
                            cue = targets::tar_option_get("cue"),
                            description = targets::tar_option_get("description")) {
-  check_pkg_installed("terra")

   filetype <- filetype %||% "GTiff"

   #check that filetype option is available
   drv <- get_gdal_available_driver_list("raster")
   filetype <- rlang::arg_match0(filetype, drv$name)

+  #currently only "drop" and "zip" are valid options
+  preserve_metadata <- preserve_metadata %||% "drop"
+  preserve_metadata <- rlang::arg_match0(preserve_metadata, c("drop", "zip"))

   #ensure that user-passed `resources` doesn't include `custom_format`
   if ("custom_format" %in% names(resources)) {
@@ -90,19 +100,11 @@ tar_terra_rast <- function(name,
     packages = packages,
     library = library,
     format = targets::tar_format(
-      read = function(path) terra::rast(path),
-      write = function(object, path) {
-        terra::writeRaster(
-          object,
-          path,
-          filetype = filetype,
-          overwrite = TRUE,
-          gdal = gdal
-        )
-      },
+      read = tar_rast_read(preserve_metadata = preserve_metadata),
+      write = tar_rast_write(filetype = filetype, gdal = gdal, preserve_metadata = preserve_metadata),
       marshal = function(object) terra::wrap(object),
       unmarshal = function(object) terra::unwrap(object),
-      substitute = list(filetype = filetype, gdal = gdal)
+      substitute = list(filetype = filetype, gdal = gdal, preserve_metadata = preserve_metadata)
     ),
     repository = repository,
     iteration = "list", #only "list" works right now
@@ -118,3 +120,53 @@ tar_terra_rast <- function(name,
     description = description
   )
 }
+
+tar_rast_read <- function(preserve_metadata) {
+  switch(
+    preserve_metadata,
+    zip = function(path) {
+      tmp <- withr::local_tempdir()
+      zip::unzip(zipfile = path, exdir = tmp)
+      terra::rast(file.path(tmp, basename(path)))
+    },
+    drop = function(path) terra::rast(path)
+  )
+}
+
+tar_rast_write <- function(filetype, gdal, preserve_metadata) {
+  switch(
+    preserve_metadata,
+    zip = function(object, path) {
+      #write the raster in a fresh local tempdir() that disappears when function is done
+      tmp <- withr::local_tempdir()
+      dir.create(file.path(tmp, dirname(path)), recursive = TRUE)
+      terra::writeRaster(
+        object,
+        file.path(tmp, path),
+        filetype = filetype,
+        overwrite = TRUE,
+        gdal = gdal
+      )
+      #package files into a zip file using `zip::zip()`
+      raster_files <- list.files(file.path(tmp, dirname(path)), full.names = TRUE)
+      zip::zip(
+        file.path(tmp, basename(path)),
+        files = raster_files,
+        compression_level = 1,
+        mode = "cherry-pick",
+        root = dirname(raster_files)[1]
+      )
+      #move the zip file to the expected place
+      file.rename(file.path(tmp, basename(path)), path)
+    },
+    drop = function(object, path) {
+      terra::writeRaster(
+        object,
+        path,
+        filetype = filetype,
+        overwrite = TRUE,
+        gdal = gdal
+      )
+    }
+  )
+}
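The `"zip"` branches above rely on a round trip through `zip::zip()` and `zip::unzip()`: every file written next to the raster is archived at the root of one zip file, then unpacked into a temporary directory before reading. The sketch below walks through that round trip with plain text files standing in for the raster and its aux.json sidecar, so it runs without `terra`. The file names and contents are invented for illustration, and the `zip` package is assumed to be installed.

```r
# Stand-in files (hypothetical): a fake raster plus a fake sidecar,
# written side by side as terra::writeRaster() might leave them.
src <- tempfile("write_")
dir.create(src)
writeLines("raster bytes", file.path(src, "target"))
writeLines('{"units": "m"}', file.path(src, "target.aux.json"))

# Archive every file in that directory at the root of a single zip,
# using the same zip::zip() arguments as the PR's writer.
files <- list.files(src, full.names = TRUE)
archive <- tempfile(fileext = ".zip")
zip::zip(
  archive,
  files = files,
  compression_level = 1,
  mode = "cherry-pick",
  root = src
)

# Unpack into a fresh directory, as the reader does, and list the result.
out <- tempfile("read_")
dir.create(out)
zip::unzip(zipfile = archive, exdir = out)
restored <- list.files(out)
print(restored)
```

In a pipeline this path is selected with `tar_terra_rast(..., preserve_metadata = "zip")`; the default `"drop"` writes the raster file alone.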
2 changes: 0 additions & 2 deletions R/tar-terra-sprc.R
@@ -66,7 +66,6 @@ tar_terra_sprc <- function(name,
                            retrieval = targets::tar_option_get("retrieval"),
                            cue = targets::tar_option_get("cue"),
                            description = targets::tar_option_get("description")) {
-  check_pkg_installed("terra")
   #ensure that user-passed `resources` doesn't include `custom_format`
   if ("custom_format" %in% names(resources)) {
     cli::cli_abort("{.val custom_format} cannot be supplied to targets created with {.fn tar_terra_sprc}")
@@ -184,7 +183,6 @@ tar_terra_sds <- function(name,
                           retrieval = targets::tar_option_get("retrieval"),
                           cue = targets::tar_option_get("cue"),
                           description = targets::tar_option_get("description")) {
-  check_pkg_installed("terra")
   #ensure that user-passed `resources` doesn't include `custom_format`
   if ("custom_format" %in% names(resources)) {
     cli::cli_abort("{.val custom_format} cannot be supplied to targets created with {.fn tar_terra_sprc}")
2 changes: 0 additions & 2 deletions R/tar-terra-vect.R
@@ -62,8 +62,6 @@ tar_terra_vect <- function(name,
                            retrieval = targets::tar_option_get("retrieval"),
                            cue = targets::tar_option_get("cue"),
                            description = targets::tar_option_get("description")) {
-  check_pkg_installed("terra")
-
   filetype <- filetype %||% "GeoJSON"
   gdal <- gdal %||% "ENCODING=UTF-8"