| Title: | Interface to the GeoSphere Austria DataHub API (Data Access) |
|---|---|
| Description: | Package to download data from the GeoSphere data hub (Austrian National Weather Service) which is the data provider. Currently solely allows to download station data (no gridded data). More details about the data sets can be found on <https://data.hub.geosphere.at/> as well as in the API documentation available via <https://dataset.api.hub.geosphere.at/v1/docs/>. |
| Authors: | Reto Stauffer [aut, cre] (ORCID: <https://orcid.org/0000-0002-3798-5507>) |
| Maintainer: | Reto Stauffer <[email protected]> |
| License: | GPL-2 |
| Version: | 0.0-8 |
| Built: | 2026-06-07 07:10:53 UTC |
| Source: | https://github.com/retostauffer/gsdata |
Small helper function to handle http requests to the API.
API_GET( URL, config = NULL, query = NULL, expected_class = NULL, verbose = FALSE )API_GET( URL, config = NULL, query = NULL, expected_class = NULL, verbose = FALSE )
URL |
the URL to be called. |
config |
|
query |
|
expected_class |
|
verbose |
logical, shows some additional information if |
Returns the object we get from httr::content() after a successful
API call. If an error is detected, an error with additional details will be displayed.
Reto Stauffer
Just returns the base URL for API requests
gs_baseurl(version = 1L)gs_baseurl(version = 1L)
version |
integer, defaults to 1. |
String, base URL for the API.
Reto Stauffer
The GeoSphere Austria (formerly ZAMG) datahub API provides an endpoint to get available
datasets. This function returns a (possibly pre-filtered) data.frame
containing the dataset type, mode and resource_id needed
to perform the data requests (see e.g., gs_stationdata()) amongst some
additional information.
gs_datasets( type = "station", mode = NULL, version = 1L, config = list(), verbose = FALSE )gs_datasets( type = "station", mode = NULL, version = 1L, config = list(), verbose = FALSE )
type |
|
mode |
|
version |
integer, API version (defaults to 1). |
config |
empty list by default; can be a named list to be fowrarded
to the |
verbose |
logical, if set |
The API provides an enpoint to get all available data sets which
can be filtered using the arguments type and/or mode.
Classical usecase:
Return all data sets where mode == "historical":
gs_datasets(mode = "historical")
Return all data sets where type == "grid":
gs_datasets(type = "grid")
Can be combined (setting both type and mode).
Returns a data.frame with all available data types and
API endpoints. When both type and mode are equal to NULL
all data available via the API will be returned.
The most important information of this return is the type, mode,
as well as resource_id which is used to perform data requests.
Reto Stauffer
gs_stationdata
Downloading Dataset Meta Data
gs_metadata( mode, resource_id, type = NULL, version = 1L, config = list(), expert = FALSE, verbose = FALSE )gs_metadata( mode, resource_id, type = NULL, version = 1L, config = list(), expert = FALSE, verbose = FALSE )
mode |
character, specify mode of data. |
resource_id |
character, specify resource identifier of data. |
type |
|
version |
integer, API version (defaults to |
config |
empty list by default; can be a named list to be fowrarded
to the |
expert |
logical, defaults to |
verbose |
logical, if set |
Named list with a series of information about the dataset. Most importantly this function returns information about the stations for which data is available as well as what parameters are available. Note that the availability of data depends on the station; the meta information only provides an overview of what is possibliy avialable.
$stations: an sf object (spatial feature data frame) containing information
of all stations belonging to this dataset including geographical location,
name, and id (the station identifier) wihch is used when retrieving
data (see e.g., gs_stationdata()).
$parameters: a data.frame containing the name of the parameters
used when retrieving the data (see e.g., gs_stationdata()) as well as a
parameter and description. Only available in German, tough.
In addition, the following information will be returned as separate entries in the list:
$title/$id_type: title/id type of the dataset
$frequency: observation frequency/temporal interval
(see also gs_temporal_interval())
$type: data type (e.g., "station")
$mode: data set mode (e.g., "historical")
$response_formats: formats the API provides
$start_time/$end_time: date/time range of availability of this data set
$url: URL; origin of the data set
Reto Stauffer
## Loading meta information for data set with ## mode == "historical" and resource_id = "tawes-v1-10min" tawes <- gs_metadata("historical", "tawes-v1-10min") ## Uses partial matching, thus this short form can be used in case ## there is only one match (one specific data set). With verbose = TRUE ## a message will tell which meta data set will be requested. synop <- gs_metadata("hist", "synop", verbose = TRUE) ## generic sf plotting; variable 'altitude' plot(synop$stations["altitude"], pch = 19, cex = 2)## Loading meta information for data set with ## mode == "historical" and resource_id = "tawes-v1-10min" tawes <- gs_metadata("historical", "tawes-v1-10min") ## Uses partial matching, thus this short form can be used in case ## there is only one match (one specific data set). With verbose = TRUE ## a message will tell which meta data set will be requested. synop <- gs_metadata("hist", "synop", verbose = TRUE) ## generic sf plotting; variable 'altitude' plot(synop$stations["altitude"], pch = 19, cex = 2)
Accessing the API endpoint v<version>/station,
see https://dataset.api.hub.geosphere.at/v1/docs/getting-started.html.
gs_stationdata( mode, resource_id, parameters = NULL, start = NULL, end = NULL, station_ids, expert = FALSE, version = 1L, drop = TRUE, verbose = FALSE, format = NULL, limit = 2e+05, config = list() )gs_stationdata( mode, resource_id, parameters = NULL, start = NULL, end = NULL, station_ids, expert = FALSE, version = 1L, drop = TRUE, verbose = FALSE, format = NULL, limit = 2e+05, config = list() )
mode |
character, specify mode of data. |
resource_id |
character, specify resource identifier of data. |
parameters |
character vector to define which parameters to process. |
start, end
|
object of class |
station_ids |
integer vector with the station IDs to be processed. |
expert |
logical, defaults to |
version |
integer, API version (defaults to |
drop |
logical, if |
verbose |
logical, if set |
format |
|
limit |
integer, API data request limit. If the request sent by the user exceeds this limit, the request will be split into batches automatically. Set to 2e5 as the limit stated on the API documentation (1e6) will not be accepted. |
config |
empty list by default; can be a named list to be fowrarded
to the |
This function is a convenience function for downloading different sets of station data from the GeoSphere Austria data hub (formerly ZAMG). The API may change and additional resources may be added, for details see https://dataset.api.hub.geosphere.at/v1/docs/user-guide/endpoints.html.
To see what's available call gs_datasets("station").
The API has a limit for the number of elements for one request. The calculation is based on the number of expecte elements (i.e., number of stations times number of parameters times number of time steps). This function will pre-calculate the number of expected elements and split the request into batches along the time dimension - if needed.
If only data for one single station (length(station_ids) == 1) is requested,
a zoo object will be returned if data is available. If no data is available,
NULL will be returned.
When multiple stations are requested a list of zoo object (or NULL if no data
is available) is returned. The name of the list corresponds to the station id requested.
Reto Stauffer
###################################################################### ## Latest observations for two tawes stations in Innsbruck. ## Parameters TL (air temperature 2m above ground), TS (air temperature 5cm ## above ground) and RR (amount of rain past 10 minutes). innsbruck <- gs_stationdata(mode = "current", resource_id = "tawes-v1-10min", parameters = c("TL", "TS", "RR"), station_ids = c(11121, 11320), expert = TRUE) # Air temp sapply(innsbruck, function(x) x$TL) # Precipitation (rain) sapply(innsbruck, function(x) x$RR) ## Not run: ###################################################################### ## Example for synop data ## Loading meta information meta <- gs_metadata(mode = "historical", resource_id = "synop-v1-1h") ## For station information check head(meta$stations) ## For available parameters (for this mode/resource_id) check head(meta$parameters) ## Getting data over 48 hours for one single station ## Note: If expert = FALSE (default) gs_stationdata() ## will internally call gs_metadata() once more to check ## if the requested station_ids as well as the parameters ## exist for the data set specified (mode/resource_id). mayrhofen <- gs_stationdata(mode = "historical", resource_id = "synop-v1-1h", start = "2020-01-01", end = "2020-01-03", parameters = c("T", "Td", "ff"), station_ids = 11330, verbose = TRUE) library("zoo") plot(mayrhofen, screen = c(1, 1, 2), col = c(2, 3, 4)) ## Getting data over 48 hours for three stations simultanously ## Mayrhofen Tirol, Achenkirch Tirol (no data), and Innsbruck Airport Tirol x <- gs_stationdata(mode = "historical", resource_id = "synop-v1-1h", start = "2020-01-01", end = "2020-01-03", parameters = c("T", "Td", "ff"), station_ids = c(11330, 11328, 11120), expert = TRUE) plot(x[["11330"]], screen = c(1, 1, 2), col = c(2, 3, 4)) is.null(x[["11328"]]) plot(x[["11120"]], screen = c(1, 1, 2), col = c(2, 3, 4)) ###################################################################### ## Example for daily climatological records meta <- gs_metadata("historical", "klima-v2-1d") achenkirch <- gs_stationdata(mode = "historical", resource_id = "klima-v2-1d", start = "2020-06-01", end = "2022-12-31", parameters = c("rr", "rr_i", "rr_iii", "so_h"), station_ids = 8807, expert = TRUE) head(achenkirch) plot(achenkirch, type = "h") ###################################################################### ## Example for 10min KLIMA data # meta$parameter contains available parameters, # meta$stations available stations meta <- gs_metadata("historical", "klima-v2-10min") uibk <- gs_stationdata(mode = "historical", resource_id = "klima-v2-10min", start = "2010-11-01", end = "2011-02-01", parameters = c("tl", "ffam", "ffx"), station_ids = 11803, expert = TRUE) plot(uibk, screens = c(1, 2, 2), col = c(2, 4, 8), ylab = c("temperature", "mean wind\nand gusts")) ###################################################################### ## Example for 10min TAWES data ## NOTE/WARNING: ## ! "tawes" is not quality controlled and provides limited ## ! amount of data. Consider to use the "klima-v2-10min" data set which ## ! provides long-term historical data for the same stations with the ## ! same temporal resolution, however, the station IDs and ## ! parameter names (as well as avilavle parameters) will differ ## ! (check meta data). # meta$parameter contains available parameters, # meta$stations available stations meta <- gs_metadata("historical", "tawes-v1-10min") uibk <- gs_stationdata(mode = "historical", resource_id = "tawes-v1-10min", start = Sys.Date() - 30, end = Sys.Date(), parameters = c("TL", "TP", "FFAM", "FFX"), station_ids = 11320, expert = TRUE) plot(uibk, screens = c(1, 1, 2, 2), col = c(2, 3, 4, 8), ylab = c("temperature\nand dewpoint", "mean wind\nand gusts")) ###################################################################### ## Example for annual histalp data gs_metadata("historical", "histalp-v1-1y") bregenz <- gs_stationdata(mode = "historical", resource_id = "histalp-v1-1y", start = "1854-01-01", end = "2022-01-01", parameters = c("R01", "T01"), station_ids = 23, expert = TRUE) plot(bregenz, col = c(4, 2)) ## End(Not run)###################################################################### ## Latest observations for two tawes stations in Innsbruck. ## Parameters TL (air temperature 2m above ground), TS (air temperature 5cm ## above ground) and RR (amount of rain past 10 minutes). innsbruck <- gs_stationdata(mode = "current", resource_id = "tawes-v1-10min", parameters = c("TL", "TS", "RR"), station_ids = c(11121, 11320), expert = TRUE) # Air temp sapply(innsbruck, function(x) x$TL) # Precipitation (rain) sapply(innsbruck, function(x) x$RR) ## Not run: ###################################################################### ## Example for synop data ## Loading meta information meta <- gs_metadata(mode = "historical", resource_id = "synop-v1-1h") ## For station information check head(meta$stations) ## For available parameters (for this mode/resource_id) check head(meta$parameters) ## Getting data over 48 hours for one single station ## Note: If expert = FALSE (default) gs_stationdata() ## will internally call gs_metadata() once more to check ## if the requested station_ids as well as the parameters ## exist for the data set specified (mode/resource_id). mayrhofen <- gs_stationdata(mode = "historical", resource_id = "synop-v1-1h", start = "2020-01-01", end = "2020-01-03", parameters = c("T", "Td", "ff"), station_ids = 11330, verbose = TRUE) library("zoo") plot(mayrhofen, screen = c(1, 1, 2), col = c(2, 3, 4)) ## Getting data over 48 hours for three stations simultanously ## Mayrhofen Tirol, Achenkirch Tirol (no data), and Innsbruck Airport Tirol x <- gs_stationdata(mode = "historical", resource_id = "synop-v1-1h", start = "2020-01-01", end = "2020-01-03", parameters = c("T", "Td", "ff"), station_ids = c(11330, 11328, 11120), expert = TRUE) plot(x[["11330"]], screen = c(1, 1, 2), col = c(2, 3, 4)) is.null(x[["11328"]]) plot(x[["11120"]], screen = c(1, 1, 2), col = c(2, 3, 4)) ###################################################################### ## Example for daily climatological records meta <- gs_metadata("historical", "klima-v2-1d") achenkirch <- gs_stationdata(mode = "historical", resource_id = "klima-v2-1d", start = "2020-06-01", end = "2022-12-31", parameters = c("rr", "rr_i", "rr_iii", "so_h"), station_ids = 8807, expert = TRUE) head(achenkirch) plot(achenkirch, type = "h") ###################################################################### ## Example for 10min KLIMA data # meta$parameter contains available parameters, # meta$stations available stations meta <- gs_metadata("historical", "klima-v2-10min") uibk <- gs_stationdata(mode = "historical", resource_id = "klima-v2-10min", start = "2010-11-01", end = "2011-02-01", parameters = c("tl", "ffam", "ffx"), station_ids = 11803, expert = TRUE) plot(uibk, screens = c(1, 2, 2), col = c(2, 4, 8), ylab = c("temperature", "mean wind\nand gusts")) ###################################################################### ## Example for 10min TAWES data ## NOTE/WARNING: ## ! "tawes" is not quality controlled and provides limited ## ! amount of data. Consider to use the "klima-v2-10min" data set which ## ! provides long-term historical data for the same stations with the ## ! same temporal resolution, however, the station IDs and ## ! parameter names (as well as avilavle parameters) will differ ## ! (check meta data). # meta$parameter contains available parameters, # meta$stations available stations meta <- gs_metadata("historical", "tawes-v1-10min") uibk <- gs_stationdata(mode = "historical", resource_id = "tawes-v1-10min", start = Sys.Date() - 30, end = Sys.Date(), parameters = c("TL", "TP", "FFAM", "FFX"), station_ids = 11320, expert = TRUE) plot(uibk, screens = c(1, 1, 2, 2), col = c(2, 3, 4, 8), ylab = c("temperature\nand dewpoint", "mean wind\nand gusts")) ###################################################################### ## Example for annual histalp data gs_metadata("historical", "histalp-v1-1y") bregenz <- gs_stationdata(mode = "historical", resource_id = "histalp-v1-1y", start = "1854-01-01", end = "2022-01-01", parameters = c("R01", "T01"), station_ids = 23, expert = TRUE) plot(bregenz, col = c(4, 2)) ## End(Not run)
Helper function to extract and return the temporal interval
(in seconds) for different data sets based on the resource_id
identifier. Used to estimate the number of expected values to be
retrieved as there is per-request limit.
gs_temporal_interval(resource_id)gs_temporal_interval(resource_id)
resource_id |
character, name of the |
Returns an integer vector of the same length as the input
vector resource_id with the temporal resolution of the data
set in seconds or NA in case the string could not have been decoded.
Reto Stauffer
gs_temporal_interval("test-1d-string") ds <- gs_datasets() gs_temporal_interval(ds$resource_id)gs_temporal_interval("test-1d-string") ds <- gs_datasets() gs_temporal_interval(ds$resource_id)
gsdata: Interface to the GeoSphere Austria DataHub API (Data Access)This package allows convenient access to data provided by GeoSphere Austria (Austrias federal agency for geology, geophysics, climatology and meteorology) via their data API which exists since around mid 2023.
The API not only provides access to station data (the one thing currently covered by this package; will be extended) but also access to spatial data; a catalogue which has been extended over and over again over the past 10 months. Details about all available data sets and their temporal and spatial extent can be found on their website:
The API has a request limit; a limit to how much data one is allowed to retrieve in one API request. Details on the current limit can be found in the GeoSphere Dataset API Documentation.
This package internally tries to estimate the request size and split the request into multiple batches in case one single request would (likely) exceed these limits.
Thus, one single call to e.g., gs_stationdata() can trigger multiple
API calls. If used without expert = TRUE two initial calls are made to
check if the data set requested does exist, and that the
stations and parameters requested exist in this data set. If the data request
needs to be split in addition, this can cause a series of calls to the API
which also has a limit on number of requests per time.
In the worst case this causes a temporary ban (timeout due to too many requests) from the servers. One way around is to limit the number of requests per time, more details about this in the next section.
Note that each function call can result in multiple API requests which can lead to a timeout (too many requests). To avoid running into timeout issues:
use expert = TURUE where possible as it
lowers the number of calls to the api.
request data for multiple stations at once, especially when requesting short time periods/few parameters as, in the best case, all data can be retrieved on one single call (if below estimated data request limit).
wait between requests using e.g., Sys.sleep(...).
or use the packages own 'cooldown' option. By default,
a cooldown time of 0.1 seconds is used (the minimum
time between two requests. You can set a custom cooldown time
via options('gsdata.cooldown' = 1). Will overwrite the
default and ensure that there will be at least one second
between consecutive API calls. If you have no time critical
requests this is a good way to be nice to the data provider!
Maintainer: Reto Stauffer [email protected] (ORCID)
Useful links:
This function is called whenever httr::GET returns an
http status code out of the 200 range (success).
Shows http_status code information alongside
with additional messages returned by the API (if any).
show_http_status_and_terminate(scode, xtra = NULL)show_http_status_and_terminate(scode, xtra = NULL)
scode |
numeric, http status code. |
xtra |
|
No return, will terminate R.
Reto Stauffer