library(card)
library(magrittr)
library(ggplot2)
library(dplyr)
#> Warning: package 'dplyr' was built under R version 4.0.2
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
This vignette demonstrates the use of the circadian rhythm analysis functions. It demonstrates a robust use of circ_sun()
and circ_center()
to create a new dataset for analysis that is rotated around the zeitgeiber of sunrise.
# Data set to be used is included
data("twins")
head(twins)
#> # A tibble: 6 x 23
#> # Groups: patid, hour [6]
#> patid age bmi race smoking hptn dm chf prevchd med_beta_blocke…
#> <dbl> <dbl> <dbl> <fct> <fct> <fct> <fct> <fct> <fct> <fct>
#> 1 1 49 27.4 0 1 1 0 0 0 0
#> 2 1 49 27.4 0 1 1 0 0 0 0
#> 3 1 49 27.4 0 1 1 0 0 0 0
#> 4 1 49 27.4 0 1 1 0 0 0 0
#> 5 1 49 27.4 0 1 1 0 0 0 0
#> 6 1 49 27.4 0 1 1 0 0 0 0
#> # … with 13 more variables: med_antidepr <fct>, beck_total <dbl>,
#> # sad_bin <fct>, sad_cat <fct>, PETdiff_2 <fct>, dyxtime <dttm>, date <date>,
#> # hour <dbl>, rDYX <dbl>, sDYX <dbl>, HR <dbl>, CP <dbl>, zip <chr>
df <-
twins %>%
subset(patid %in% c(1:30))
The research problem is that participants came from around the country for a study. Continuous, 24-hour measures were obtained. However, the patients came in from different time zones and on different dates over the course of 10 years. Findings could reflect perhaps jetlag instead of physiological disturbances. We could use sunrise and sunset times as natural zeitgeibers to standardize the time series data.
The first step is inclusion of geographical locations, along with study date, to calculate sunrise times. This dataset was recorded in Atlanta, GA, but the individuals flew from across the country. The zipcodes are available in the dataset. This can then be converted to latitude and longitude using the {zipcode} package (which is archived on CRAN) - the zipcode data was bundled in this package for simplicity.
# Zipcodes, contained as characters (because of leading 0)
data("zipcode")
head(zipcode)
#> zip city state latitude longitude
#> 1 00210 Portsmouth NH 43.0059 -71.0132
#> 2 00211 Portsmouth NH 43.0059 -71.0132
#> 3 00212 Portsmouth NH 43.0059 -71.0132
#> 4 00213 Portsmouth NH 43.0059 -71.0132
#> 5 00214 Portsmouth NH 43.0059 -71.0132
#> 6 00215 Portsmouth NH 43.0059 -71.0132
# Get the zipcodes merged into to get latitude and longitude
df <- left_join(df, zipcode[c("zip", "latitude", "longitude")], by = "zip")
# Sunrise is dependent on location and date
df$sunrise <- circ_sun(date = df$date, lat = df$latitude, lon = df$longitude)
Another limitation is the issues in R with POSIX. The time zone of the sunrise is demarcated by “UTC”, but is actually corrected for the timezone by location. Next, we have continuous measures of autonomic physiology measured by ECG, called Dyx. These are recorded in a variable called dyxtime
. The recordings start in the afternoon, and continue to the next day. The sunrise in between is likely the best marker to center around. We need to use the circadian centering function for each patient to be able to compare them fairly.
## Time series data
length(df$dyxtime)
#> [1] 317
# Number of participants
length(unique(df$patid))
#> [1] 14
# Unique sunrise times per patient
zeit <-
df %>%
group_by(patid) %>%
arrange(dyxtime) %>%
select(patid, sunrise) %>%
unique() %>%
group_by(patid) %>%
slice(n()) # Sunrise time during study
names(zeit)[2] <- "zeit"
# Add surnsie zeitgeiber back in
df %<>% left_join(., zeit, by = "patid")
x <- df %>%
group_by(patid) %>%
tidyr::nest()
# Slow and steady method for going through all the potential vectors
# Will look to "tidy" this in the future (TODO)
for(i in seq(x$patid)) {
z <- unique(x[[i,2]][[1]]$zeit)
t <- x[[i,2]][[1]]$dyxtime
x[[i,2]][[1]]$zvec <- circ_center(x[[i,2]][[1]]$dyxtime, z)
}
# Visualize data trend
df <- tidyr::unnest(x)
#> Warning: `cols` is now required when using unnest().
#> Please use `cols = c(data)`
summary(df$zvec)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> -22.327 -10.938 -4.856 -4.563 1.629 17.664
# Pseudo-rose plot
ggplot(df, aes(x = hour, y = rDYX, group = hour, fill = hour)) +
geom_boxplot() +
coord_polar(theta = "x", start = 0, direction = 1) +
scale_x_continuous(expand = c(0,0), breaks = seq(0, 24, 1)) +
scale_fill_viridis_c() +
theme_minimal()