Global Historical Climatology Network (GHCN) provides daily climate
measures from stations across the world. The dataset
tmax_hist
extracts the maximum temperature for 39
Australian stations in the period 1970 - 1975 and 2016 - 2020 from the
full Australian historical records starting from year 1969.
tmax_hist
is already in a cubble, with id
as
the key, yearmonth
as the index, and
c(longitude, latitude)
as the coordinates. This vignette
shows the basic manipulation of cubble in an example to compares the
maximum temperature in these two periods for weather stations in
Victoria, Australia.
Victoria stations can be subset by the station number: Australia GHCN station number starts with “ASN00” and followed by the Bureau of Meteorology (BOM) station number, where the 2nd and 3rd digit (7th and 8th in the GHCN number) define the state of the station. Victoria stations start from 76 to 90 and filtering Victoria stations is an operation in the spatial dimension. Hence we want to use the nested form:
<- tmax_hist %>%
tmax filter(between(stringr::str_sub(id, 7, 8), 76, 90))
tmax#> # cubble: id [24]: nested form
#> # bbox: [142, -39.13, 149.92, -34.23]
#> # temporal: yearmonth [mth], tmax [dbl]
#> id latitude longitude elevation name wmo_id ts
#> <chr> <dbl> <dbl> <dbl> <chr> <dbl> <list>
#> 1 ASN00076031 -34.2 142. 50 mildura airport 94693 <tibble>
#> 2 ASN00076047 -35.1 142. 50.3 ouyen (post office) 94831 <tibble>
#> 3 ASN00076064 -35.1 142. 105 walpeup research 95831 <tibble>
#> 4 ASN00079028 -36.7 142. 133 longerenong 95835 <tibble>
#> 5 ASN00080015 -36.2 145. 96 echuca aerodrome 94861 <tibble>
#> 6 ASN00080023 -35.7 144. 77.7 kerang 94844 <tibble>
#> 7 ASN00080091 -36.3 145. 105 kyabram dpi 95833 <tibble>
#> 8 ASN00081049 -36.4 145. 114 tatura inst sustain… 95836 <tibble>
#> 9 ASN00082039 -36.1 147. 175 rutherglen research 95837 <tibble>
#> 10 ASN00084016 -37.6 150. 15.2 gabo island lightho… 94933 <tibble>
#> # … with 14 more rows
We would also like to summarise the daily recording into monthly to smooth the maximum temperature and since it is also a time dimension operation, we can keep using the long form:
<- tmax %>%
tmax face_temporal() %>%
group_by(
month = lubridate::month(yearmonth),
group = as.factor(ifelse(lubridate::year(yearmonth) > 2015, "2016 ~ 2020","1971 ~ 1975"))) %>%
summarise(tmax = mean(tmax, na.rm = TRUE))
tmax#> # cubble: month, group, id [24]: long form
#> # bbox: [142, -39.13, 149.92, -34.23]
#> # spatial: latitude [dbl], longitude [dbl], elevation [dbl], name [chr], wmo_id
#> # [dbl]
#> id month group tmax
#> <chr> <dbl> <fct> <dbl>
#> 1 ASN00076031 1 1971 ~ 1975 30.8
#> 2 ASN00076031 1 2016 ~ 2020 34.6
#> 3 ASN00076031 2 1971 ~ 1975 31.0
#> 4 ASN00076031 2 2016 ~ 2020 32.6
#> 5 ASN00076031 3 1971 ~ 1975 27.7
#> 6 ASN00076031 3 2016 ~ 2020 30.1
#> 7 ASN00076031 4 1971 ~ 1975 23.4
#> 8 ASN00076031 4 2016 ~ 2020 25.2
#> 9 ASN00076031 5 1971 ~ 1975 19.4
#> 10 ASN00076031 5 2016 ~ 2020 19.6
#> # … with 542 more rows
A data quality issue with the rnoaa
data is that while
it records the first and last year recorded of each series without the
period of missingness. For example, station ASN00085279
starts it first record in 1943, pauses for a period from 1946 to 1982,
and then resumes it recording till today. We would like to remove
stations like this by examining whether the summarised time series has a
total of 24 months. This is a station-wise operation and
face_spatial()
can be used to convert the cubble back to
the nested form:
<- tmax %>% face_spatial() %>% filter(nrow(ts) == 24)
tmax
tmax#> # cubble: id [22]: nested form
#> # bbox: [142, -39.13, 149.92, -34.23]
#> # temporal: month [dbl], group [fct], tmax [dbl]
#> id latitude longitude elevation name wmo_id ts
#> <chr> <dbl> <dbl> <dbl> <chr> <dbl> <list>
#> 1 ASN00076031 -34.2 142. 50 mildura airport 94693 <tibble>
#> 2 ASN00076047 -35.1 142. 50.3 ouyen (post office) 94831 <tibble>
#> 3 ASN00076064 -35.1 142. 105 walpeup research 95831 <tibble>
#> 4 ASN00079028 -36.7 142. 133 longerenong 95835 <tibble>
#> 5 ASN00080015 -36.2 145. 96 echuca aerodrome 94861 <tibble>
#> 6 ASN00080023 -35.7 144. 77.7 kerang 94844 <tibble>
#> 7 ASN00080091 -36.3 145. 105 kyabram dpi 95833 <tibble>
#> 8 ASN00081049 -36.4 145. 114 tatura inst sustain… 95836 <tibble>
#> 9 ASN00082039 -36.1 147. 175 rutherglen research 95837 <tibble>
#> 10 ASN00084016 -37.6 150. 15.2 gabo island lightho… 94933 <tibble>
#> # … with 12 more rows
In the glyph map we are about to create, the four variables (major
and minor axis for x and y) need to be in the same table. In cubble, you
can move time invariant variable into the long form with
unfold()
:
<- tmax %>% face_temporal() %>% unfold(latitude, longitude)
tmax
tmax#> # cubble: month, group, id [22]: long form
#> # bbox: [142, -39.13, 149.92, -34.23]
#> # spatial: latitude [dbl], longitude [dbl], elevation [dbl], name [chr], wmo_id
#> # [dbl]
#> id month group tmax latitude longitude
#> <chr> <dbl> <fct> <dbl> <dbl> <dbl>
#> 1 ASN00076031 1 1971 ~ 1975 30.8 -34.2 142.
#> 2 ASN00076031 1 2016 ~ 2020 34.6 -34.2 142.
#> 3 ASN00076031 2 1971 ~ 1975 31.0 -34.2 142.
#> 4 ASN00076031 2 2016 ~ 2020 32.6 -34.2 142.
#> 5 ASN00076031 3 1971 ~ 1975 27.7 -34.2 142.
#> 6 ASN00076031 3 2016 ~ 2020 30.1 -34.2 142.
#> 7 ASN00076031 4 1971 ~ 1975 23.4 -34.2 142.
#> 8 ASN00076031 4 2016 ~ 2020 25.2 -34.2 142.
#> 9 ASN00076031 5 1971 ~ 1975 19.4 -34.2 142.
#> 10 ASN00076031 5 2016 ~ 2020 19.6 -34.2 142.
#> # … with 518 more rows
The cubble
package also provides a
geom_glyph()
interface to plot the glyph map. The geometry
asks for x_major
, y_major
,
x_minor
, and y_minor
as the required
aesthetics:
<- ozmaps::abs_ste %>% filter(NAME %in% c("Victoria"))
nsw_vic
ggplot() +
geom_sf(data = nsw_vic, fill = "transparent", color = "grey", linetype = "dotted") +
geom_glyph(data = tmax,
aes(x_major = longitude, x_minor = month,
y_major = latitude, y_minor = tmax,
group = interaction(id, group), color = group),
width = 1, height = 0.4) +
scale_color_brewer(palette = "Dark2") +
theme_bw() +
coord_sf(xlim = c(141, 150)) +
theme(legend.position = "bottom") +
labs(x = "Longitude", y = "Latitude")
Remember that Australia is in the southern hemisphere, so the temperature curves are in U shapes :)