Cubble provides a data structure for manipulating spatio-temporal
vector data. It arranges variables into two forms: the nested form and
the long form. In the nested form, each row is a site, time-invariant
variables are columns, and time-varying variables are nested into a
list column named ts. In the long form, each row is cross-identified by
site and time, time-varying variables are columns, and time-invariant
variables are stored as an attribute. A cubble can be switched back and
forth between the two forms, so that operations on the spatial and
temporal dimensions are each done in the form that suits them.
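To make the two layouts concrete, here is a minimal sketch that mimics them with plain tidyr on a hypothetical toy table. The data and the names flat, nested_like and long_like are illustrative assumptions. This is only an analogy for the layout, not how cubble is implemented: the results are ordinary tibbles, so the time-invariant variables are simply dropped in the long layout rather than kept as an attribute.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two sites observed on three days each
flat <- tibble(
  id   = rep(c("A", "B"), each = 3),
  lat  = rep(c(-31.9, -12.4), each = 3),
  long = rep(c(116, 131), each = 3),
  date = rep(as.Date("2020-01-01") + 0:2, times = 2),
  tmax = c(31.9, 24.9, 23.2, 33.1, 32.5, 34.0)
)

# Nested-like layout: one row per site, time-varying columns
# packed into a list column called ts
nested_like <- flat |>
  nest(ts = c(date, tmax))

# Long-like layout: one row per site and date, keeping only the
# identifier and the time-varying columns
long_like <- nested_like |>
  select(id, ts) |>
  unnest(ts)
```

In this sketch nested_like has one row per site and long_like has one row per site and date, which mirrors the row structure of the two cubble forms.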
You can install the released version of cubble from CRAN with:
install.packages("cubble")
And the development version from GitHub with:
# install.packages("remotes")
remotes::install_github("huizezhang-sherry/cubble")
as_cubble() creates a cubble in the nested form. It takes the spatial
identifier (key), the temporal identifier (index), and the spatial
coordinates that define each site (coords).
library(cubble)
library(dplyr)
nested <- climate_flat |>
  as_cubble(key = id, index = date, coords = c(long, lat))
nested
#> # cubble: id [5]: nested form
#> # bbox: [115.97, -32.94, 133.55, -12.42]
#> # temporal: date [date], prcp [dbl], tmax [dbl], tmin [dbl]
#>   id            lat  long  elev name           wmo_id ts
#>   <chr>       <dbl> <dbl> <dbl> <chr>           <dbl> <list>
#> 1 ASN00009021 -31.9  116.  15.4 perth airport   94610 <tibble [366 × 4]>
#> 2 ASN00010311 -31.9  117. 179   york            94623 <tibble [366 × 4]>
#> 3 ASN00010614 -32.9  117. 338   narrogin        94627 <tibble [366 × 4]>
#> 4 ASN00014015 -12.4  131.  30.4 darwin airport  94120 <tibble [366 × 4]>
#> 5 ASN00015131 -17.6  134. 220   elliott         94236 <tibble [366 × 4]>
face_temporal() switches a cubble from the nested form to the long
form. The long form is for operations whose output is cross-identified
by both key and index, for example, filtering January records:
long <- nested |>
  face_temporal() |>
  filter(lubridate::month(date) == 1)
long
#> # cubble: date, id [5]: long form
#> # bbox: [115.97, -32.94, 133.55, -12.42]
#> # spatial: lat [dbl], long [dbl], elev [dbl], name [chr], wmo_id [dbl]
#>    id          date        prcp  tmax  tmin
#>    <chr>       <date>     <dbl> <dbl> <dbl>
#>  1 ASN00009021 2020-01-01     0  31.9  15.3
#>  2 ASN00009021 2020-01-02     0  24.9  16.4
#>  3 ASN00009021 2020-01-03     6  23.2  13
#>  4 ASN00009021 2020-01-04     0  28.4  12.4
#>  5 ASN00009021 2020-01-05     0  35.3  11.6
#>  6 ASN00009021 2020-01-06     0  34.8  13.1
#>  7 ASN00009021 2020-01-07     0  32.8  15.1
#>  8 ASN00009021 2020-01-08     0  30.4  17.4
#>  9 ASN00009021 2020-01-09     0  28.7  17.3
#> 10 ASN00009021 2020-01-10     0  32.6  15.8
#> # … with 145 more rows
face_spatial() switches a long cubble back to the nested form. The
nested form is for operations whose output is identified by the key
alone, for example, using mutate() to add each site's average maximum
temperature in January:
long |>
  face_spatial() |>
  mutate(avg_max = mean(ts$tmax, na.rm = TRUE))
#> # cubble: id [5]: nested form
#> # bbox: [115.97, -32.94, 133.55, -12.42]
#> # temporal: date [date], prcp [dbl], tmax [dbl], tmin [dbl]
#>   id            lat  long  elev name           wmo_id ts                avg_max
#>   <chr>       <dbl> <dbl> <dbl> <chr>           <dbl> <list>              <dbl>
#> 1 ASN00009021 -31.9  116.  15.4 perth airport   94610 <tibble [31 × 4]>    31.6
#> 2 ASN00010311 -31.9  117. 179   york            94623 <tibble [31 × 4]>    34.6
#> 3 ASN00010614 -32.9  117. 338   narrogin        94627 <tibble [31 × 4]>    31.4
#> 4 ASN00014015 -12.4  131.  30.4 darwin airport  94120 <tibble [31 × 4]>    32.8
#> 5 ASN00015131 -17.6  134. 220   elliott         94236 <tibble [31 × 4]>    38.5