Dates, times and timezones can be frustrating, especially when working with environmental time series such as those collected by air and water quality sensors.
Environmental time series data often have a strong diurnal signal and are typically plotted with a time axis displaying local time. However, when data are aggregated into larger collections, it is typical to store data with a universal time axis – UTC.
Problems can arise when parsing and formatting dates and times because R defaults to the system timezone available with Sys.timezone(). Imagine an agency scientist based in Washington, DC, using their laptop to display recent air quality data from Los Angeles while at a conference in Tasmania. The data center processing the data might be in Boulder, but the data processing machine might be set to use UTC. Potential timezones (available with OlsonNames()) relevant to this scenario include:
America/New_York
America/Los_Angeles
Australia/Tasmania
America/Denver
UTC
Which timezone should be used to convert a request for data from “2019-08-08” to “2019-08-15” into POSIXct datetimes?
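To make the ambiguity concrete, here is a minimal base R sketch that parses the same start date as local midnight in each candidate timezone and reports the equivalent UTC time. It relies only on as.POSIXct() and format(); the variable names are illustrative.

```r
# Candidate timezones from the scenario above
zones <- c(
  "America/New_York",
  "America/Los_Angeles",
  "Australia/Tasmania",
  "America/Denver",
  "UTC"
)

# Interpret "2019-08-08" as local midnight in each timezone,
# then express that instant in UTC for comparison
utc_equivalent <- vapply(zones, function(tz) {
  local_midnight <- as.POSIXct("2019-08-08 00:00:00", tz = tz)
  format(local_midnight, format = "%Y-%m-%d %H:%M", tz = "UTC")
}, character(1))

print(utc_equivalent)
# America/New_York    -> "2019-08-08 04:00"
# America/Los_Angeles -> "2019-08-08 07:00"
# Australia/Tasmania  -> "2019-08-07 14:00"
# America/Denver      -> "2019-08-08 06:00"
# UTC                 -> "2019-08-08 00:00"
```

The same date string maps to five different instants spread over 17 hours, which is why the timezone has to be stated rather than inherited from whichever machine runs the code.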
To enforce specification of timezones and to help with the common user interface need to specify a range of dates or times, the MazamaCoreUtils package provides the following functions:
dateRange() – parses and returns POSIXct start and end dates representing full days in the specified timezone
timeRange() – parses and returns POSIXct start and end times in the specified timezone
parseDatetime() – parses and returns a vector of POSIXct values in the specified timezone
The parseDatetime() function is intended as a timezone-requiring replacement for lubridate::parse_date_time().
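The idea of a timezone-requiring date range can be sketched in base R. The helper below, day_range(), is hypothetical and is not the package's implementation: it assumes dates arrive as "YYYY-MM-DD" strings and that the end of the range is the start of the day after the end date, which may differ from what dateRange() actually returns.

```r
# Hypothetical helper: refuses to run without an explicit, valid timezone
day_range <- function(startdate, enddate, timezone) {
  if (missing(timezone) || is.null(timezone)) {
    stop("'timezone' must be specified explicitly")
  }
  if (!timezone %in% OlsonNames()) {
    stop("'timezone' is not a recognized Olson timezone name")
  }

  # Local midnight at the start of the first day
  start <- as.POSIXct(startdate, tz = timezone, format = "%Y-%m-%d")

  # Local midnight at the start of the day after the last day
  next_day <- format(as.Date(enddate) + 1)
  end <- as.POSIXct(next_day, tz = timezone, format = "%Y-%m-%d")

  result <- c(start, end)
  # c() drops the timezone attribute in older versions of R, so restore it
  attr(result, "tzone") <- timezone
  result
}

rng <- day_range("2019-08-08", "2019-08-15", timezone = "America/Los_Angeles")
print(rng)
```

Calling day_range("2019-08-08", "2019-08-15") without a timezone stops with an error instead of silently falling back to Sys.timezone(), which is the behavior the package functions are meant to enforce.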
Enforcing the specification of timezones throughout a body of code is the most robust way to remove timezone-related errors from your software. To help with this type of code review, the package also includes functions for testing whether specific named arguments are used with certain function calls:
lintFunctionArgs_file() – check a single file
lintFunctionArgs_dir() – check an entire directory
To use these functions you must define a set of function:argument rules to be applied, such as:
timezoneLintRules <- list(
"parse_date_time" = "tz",
"with_tz" = "tzone",
"now" = "tzone",
"strftime" = "tz"
)
This is interpreted as:
The parse_date_time() function must use the tz argument explicitly.
The with_tz() function must use the tzone argument explicitly.
While these functions could be used to test for explicit use of any function:argument pair, our concern here is primarily with the specification of timezones. The package includes a detailed list of timezoneLintRules to help with this. As an example, here is the result of linting the dateRange.R source file in this package:
> lintFunctionArgs_file("R/dateRange.R", timezoneLintRules)
# A tibble: 7 x 6
  file        line_number column_number function_name   named_args includes_required
  <chr>             <int>         <int> <chr>           <list>     <lgl>
1 dateRange.R         125            29 with_tz         <chr [1]>  TRUE
2 dateRange.R         128            27 with_tz         <chr [1]>  TRUE
3 dateRange.R         141            18 parse_date_time <chr [2]>  TRUE
4 dateRange.R         142            18 parse_date_time <chr [2]>  TRUE
5 dateRange.R         159            18 parse_date_time <chr [2]>  TRUE
6 dateRange.R         176            18 parse_date_time <chr [2]>  TRUE
7 dateRange.R         188            18 now             <chr [1]>  TRUE
The result shows that the dateRange.R source code is consistent in always explicitly specifying a timezone.
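For readers curious how a check like this might work, here is a rough base R sketch that walks parsed code and reports whether each call to a listed function names its required argument. The function check_named_args() and the exampleRules list are hypothetical illustrations; the package's own linting functions may be implemented quite differently and return more detail (file, line and column numbers).

```r
# Hypothetical rules in the same function:argument form described above
exampleRules <- list(
  "with_tz" = "tzone",
  "now" = "tzone",
  "strftime" = "tz"
)

# Hypothetical checker: finds calls to functions named in 'rules' and
# reports whether the required argument appears by name.
# Limitation: this sketch does not handle calls with empty arguments
# such as x[, 1].
check_named_args <- function(code, rules) {
  exprs <- parse(text = code, keep.source = FALSE)
  found <- list()

  walk <- function(e) {
    if (is.call(e)) {
      fn <- e[[1]]
      # Accept both now() and lubridate::now()
      fn_name <- if (is.name(fn)) {
        as.character(fn)
      } else if (is.call(fn) && length(fn) == 3 &&
                 identical(fn[[1]], as.name("::"))) {
        as.character(fn[[3]])
      } else {
        NA_character_
      }
      if (!is.na(fn_name) && fn_name %in% names(rules)) {
        found[[length(found) + 1]] <<- data.frame(
          function_name = fn_name,
          includes_required = rules[[fn_name]] %in% names(e),
          stringsAsFactors = FALSE
        )
      }
      # Recurse into the function position and all arguments
      for (part in as.list(e)) walk(part)
    }
  }

  for (e in as.list(exprs)) walk(e)
  do.call(rbind, found)
}

code <- '
t1 <- lubridate::now()
t2 <- lubridate::with_tz(t1, tzone = "UTC")
s  <- strftime(t2, "%Y-%m-%d")
'

result <- check_named_args(code, exampleRules)
print(result)
#   function_name includes_required
# 1           now             FALSE
# 2       with_tz              TRUE
# 3      strftime             FALSE
```

Note that the check looks for the argument name, so a timezone passed positionally would still be flagged. That matches the intent of the rules: the timezone should be visible to anyone reading the call.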
Hopefully, this attention to timezones will help your code avoid misunderstandings when it comes to date and time requests.