Handling incidence objects

incidence() objects are easy to work with, and we provide helper functions for both manipulating and accessing the underlying data and attributes. As incidence() objects are subclasses of tibbles, they also integrate well with tidyverse verbs.

Modifying incidence objects

regroup()

Sometimes you may find you’ve created a grouped incidence object but now want to change the internal grouping. Provided the grouping you want is a subset of the one already generated, you can use the regroup() function to get the desired aggregation:

library(outbreaks)
library(dplyr)
library(incidence2)

# load data
dat <- ebola_sim_clean$linelist

# generate the incidence object with 3 groups
inci <- incidence(dat, date_of_onset, groups = c(gender, hospital, outcome), interval = "week")
inci
#> An incidence object: 1,448 x 5
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    date_index gender hospital                                     outcome count
#>        <yrwk> <fct>  <fct>                                        <fct>   <int>
#>  1   2014-W15 f      Military Hospital                            <NA>        1
#>  2   2014-W16 m      Connaught Hospital                           <NA>        1
#>  3   2014-W17 f      <NA>                                         <NA>        1
#>  4   2014-W17 f      <NA>                                         Death       1
#>  5   2014-W17 f      other                                        Recover     2
#>  6   2014-W17 m      other                                        Recover     1
#>  7   2014-W18 f      <NA>                                         Recover     1
#>  8   2014-W18 f      Connaught Hospital                           Recover     1
#>  9   2014-W18 f      Princess Christian Maternity Hospital (PCMH) Death       1
#> 10   2014-W18 f      Rokupa Hospital                              Recover     1
#> # … with 1,438 more rows
# regroup to just two groups
inci %>% regroup(c(gender, outcome))
#> An incidence object: 320 x 4
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    date_index gender outcome count
#>        <yrwk> <fct>  <fct>   <int>
#>  1   2014-W15 f      <NA>        1
#>  2   2014-W16 m      <NA>        1
#>  3   2014-W17 f      <NA>        1
#>  4   2014-W17 f      Death       1
#>  5   2014-W17 f      Recover     2
#>  6   2014-W17 m      Recover     1
#>  7   2014-W18 f      Death       1
#>  8   2014-W18 f      Recover     3
#>  9   2014-W19 f      <NA>        4
#> 10   2014-W19 f      Death       2
#> # … with 310 more rows
# drop all groups
inci %>% regroup()
#> An incidence object: 56 x 2
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    date_index count
#>        <yrwk> <int>
#>  1   2014-W15     1
#>  2   2014-W16     1
#>  3   2014-W17     5
#>  4   2014-W18     4
#>  5   2014-W19    12
#>  6   2014-W20    17
#>  7   2014-W21    15
#>  8   2014-W22    19
#>  9   2014-W23    23
#> 10   2014-W24    21
#> # … with 46 more rows
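
Conceptually, regrouping amounts to summing counts over the groups that are dropped. The following is a minimal sketch of that idea using plain dplyr on a small made-up tibble. The `toy` data, its column names and its values are hypothetical, and this is not necessarily how regroup() is implemented internally:

```r
library(dplyr)

# hypothetical toy counts: two weeks, split by gender and hospital
toy <- tibble(
  week     = c("W01", "W01", "W01", "W02", "W02"),
  gender   = c("f", "f", "m", "f", "m"),
  hospital = c("A", "B", "A", "A", "A"),
  count    = c(1L, 2L, 3L, 4L, 5L)
)

# keep only gender as a grouping: sum counts across hospitals
by_gender <- toy %>%
  group_by(week, gender) %>%
  summarise(count = sum(count), .groups = "drop")

# drop all groups: sum counts within each week
overall <- toy %>%
  group_by(week) %>%
  summarise(count = sum(count), .groups = "drop")
```

In both results the total number of cases is unchanged, which mirrors the constant `cases: 5829` line in the regroup() outputs above.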

cumulate()

We also provide a helper function, cumulate(), to easily generate cumulative incidences:

inci %>% 
  regroup(hospital) %>% 
  cumulate() %>% 
  facet_plot(n_breaks = 4, nrow = 3)
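
A cumulative incidence is a running total of counts over time within each group. As a rough sketch of that idea, assuming a hypothetical `toy` tibble and plain dplyr rather than cumulate() itself:

```r
library(dplyr)

# hypothetical toy counts: three weeks, two hospitals
toy <- tibble(
  week     = c("W01", "W01", "W02", "W02", "W03"),
  hospital = c("A", "B", "A", "B", "A"),
  count    = c(1L, 2L, 3L, 4L, 5L)
)

# order by date, then take a running sum of counts within each hospital
cumulative <- toy %>%
  arrange(week) %>%
  group_by(hospital) %>%
  mutate(count = cumsum(count)) %>%
  ungroup()
```

Each hospital's final row then holds its total case count, which is why cumulative curves never decrease.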

keep_first() and keep_last()

Once your data is grouped by date, you may want to keep only the entries that fall in the first or last few date groupings. You can do this with keep_first() and keep_last():

inci %>% keep_first(3)
#> An incidence object: 6 x 5
#> date range: [2014-W15] to [2014-W17]
#> cases: 7
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>   date_index gender hospital           outcome count
#>       <yrwk> <fct>  <fct>              <fct>   <int>
#> 1   2014-W15 f      Military Hospital  <NA>        1
#> 2   2014-W16 m      Connaught Hospital <NA>        1
#> 3   2014-W17 f      <NA>               <NA>        1
#> 4   2014-W17 f      <NA>               Death       1
#> 5   2014-W17 f      other              Recover     2
#> 6   2014-W17 m      other              Recover     1
inci %>% keep_last(3)
#> An incidence object: 63 x 5
#> date range: [2015-W16] to [2015-W18]
#> cases: 103
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    date_index gender hospital           outcome count
#>        <yrwk> <fct>  <fct>              <fct>   <int>
#>  1   2015-W16 f      <NA>               <NA>        1
#>  2   2015-W16 f      <NA>               Death       7
#>  3   2015-W16 f      <NA>               Recover     1
#>  4   2015-W16 f      Connaught Hospital <NA>        1
#>  5   2015-W16 f      Connaught Hospital Death       5
#>  6   2015-W16 f      Connaught Hospital Recover     3
#>  7   2015-W16 f      Military Hospital  Recover     1
#>  8   2015-W16 f      other              <NA>        1
#>  9   2015-W16 f      other              Death       2
#> 10   2015-W16 f      other              Recover     1
#> # … with 53 more rows
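
Judging from the output above, the selection appears to be made on distinct date values rather than on rows: keep_first(3) returns six rows spanning three weeks. A minimal sketch of that behaviour with plain dplyr, using a hypothetical `toy` tibble (this is an illustration, not the package's own code):

```r
library(dplyr)

# hypothetical toy counts: four weeks, some with more than one row
toy <- tibble(
  week   = c("W01", "W02", "W02", "W03", "W04", "W04"),
  gender = c("f", "f", "m", "m", "f", "m"),
  count  = c(1L, 2L, 1L, 3L, 2L, 4L)
)

# distinct date values, in order
weeks <- sort(unique(toy$week))

# keep every row belonging to the first (or last) two distinct weeks
first_two <- toy %>% filter(week %in% head(weeks, 2))
last_two  <- toy %>% filter(week %in% tail(weeks, 2))
```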

Tidyverse compatibility

incidence2 has been written with tidyverse compatibility (in particular dplyr) at the forefront of our design choices. By this we mean that if a dplyr operation is applied to an incidence object, the returned object will also be an incidence object as long as the invariants of the object (i.e. groups, interval and uniqueness of rows) are preserved. If the invariants are not preserved, a tibble is returned instead.

library(dplyr)

# create incidence object
inci <- incidence(dat, date_of_onset, interval = "week", groups = c(hospital, gender))

# filtering preserves class
inci %>% filter(gender == "f", hospital == "Rokupa Hospital")
#> An incidence object: 48 x 4
#> date range: [2014-W18] to [2015-W18]
#> cases: 210
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    date_index hospital        gender count
#>        <yrwk> <fct>           <fct>  <int>
#>  1   2014-W18 Rokupa Hospital f          1
#>  2   2014-W20 Rokupa Hospital f          1
#>  3   2014-W22 Rokupa Hospital f          1
#>  4   2014-W23 Rokupa Hospital f          1
#>  5   2014-W25 Rokupa Hospital f          1
#>  6   2014-W27 Rokupa Hospital f          1
#>  7   2014-W28 Rokupa Hospital f          4
#>  8   2014-W29 Rokupa Hospital f          2
#>  9   2014-W30 Rokupa Hospital f          1
#> 10   2014-W31 Rokupa Hospital f          1
#> # … with 38 more rows
# slice operations preserve class
inci %>% slice_sample(n = 10)
#> An incidence object: 10 x 4
#> date range: [2014-W23] to [2015-W17]
#> cases: 93
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    date_index hospital          gender count
#>        <yrwk> <fct>             <fct>  <int>
#>  1   2015-W17 Military Hospital f          4
#>  2   2014-W25 <NA>              m          3
#>  3   2014-W23 other             f          2
#>  4   2015-W06 Rokupa Hospital   m          2
#>  5   2014-W45 Military Hospital m         20
#>  6   2015-W15 <NA>              m          7
#>  7   2014-W47 <NA>              m         11
#>  8   2014-W42 other             m         20
#>  9   2014-W38 Rokupa Hospital   m          7
#> 10   2015-W03 <NA>              f         17
inci %>% slice(1, 5, 10)
#> An incidence object: 3 x 4
#> date range: [2014-W15] to [2014-W19]
#> cases: 3
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>   date_index hospital          gender count
#>       <yrwk> <fct>             <fct>  <int>
#> 1   2014-W15 Military Hospital f          1
#> 2   2014-W17 other             m          1
#> 3   2014-W19 <NA>              f          1
# mutate preserves class
inci %>% mutate(future = date_index + 999)
#> An incidence object: 601 x 5
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    date_index hospital                                     gender count   future
#>        <yrwk> <fct>                                        <fct>  <int>   <yrwk>
#>  1   2014-W15 Military Hospital                            f          1 2033-W22
#>  2   2014-W16 Connaught Hospital                           m          1 2033-W23
#>  3   2014-W17 <NA>                                         f          2 2033-W24
#>  4   2014-W17 other                                        f          2 2033-W24
#>  5   2014-W17 other                                        m          1 2033-W24
#>  6   2014-W18 <NA>                                         f          1 2033-W25
#>  7   2014-W18 Connaught Hospital                           f          1 2033-W25
#>  8   2014-W18 Princess Christian Maternity Hospital (PCMH) f          1 2033-W25
#>  9   2014-W18 Rokupa Hospital                              f          1 2033-W25
#> 10   2014-W19 <NA>                                         f          1 2033-W26
#> # … with 591 more rows
# rename preserves class
inci %>% rename(left_bin = date_index)
#> An incidence object: 601 x 4
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    left_bin hospital                                     gender count
#>      <yrwk> <fct>                                        <fct>  <int>
#>  1 2014-W15 Military Hospital                            f          1
#>  2 2014-W16 Connaught Hospital                           m          1
#>  3 2014-W17 <NA>                                         f          2
#>  4 2014-W17 other                                        f          2
#>  5 2014-W17 other                                        m          1
#>  6 2014-W18 <NA>                                         f          1
#>  7 2014-W18 Connaught Hospital                           f          1
#>  8 2014-W18 Princess Christian Maternity Hospital (PCMH) f          1
#>  9 2014-W18 Rokupa Hospital                              f          1
#> 10 2014-W19 <NA>                                         f          1
#> # … with 591 more rows
# select returns a tibble unless all date, count and group variables are preserved
inci %>% select(-1)
#> # A tibble: 601 × 3
#>    hospital                                     gender count
#>    <fct>                                        <fct>  <int>
#>  1 Military Hospital                            f          1
#>  2 Connaught Hospital                           m          1
#>  3 <NA>                                         f          2
#>  4 other                                        f          2
#>  5 other                                        m          1
#>  6 <NA>                                         f          1
#>  7 Connaught Hospital                           f          1
#>  8 Princess Christian Maternity Hospital (PCMH) f          1
#>  9 Rokupa Hospital                              f          1
#> 10 <NA>                                         f          1
#> # … with 591 more rows
inci %>% select(everything())
#> An incidence object: 601 x 4
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    date_index hospital                                     gender count
#>        <yrwk> <fct>                                        <fct>  <int>
#>  1   2014-W15 Military Hospital                            f          1
#>  2   2014-W16 Connaught Hospital                           m          1
#>  3   2014-W17 <NA>                                         f          2
#>  4   2014-W17 other                                        f          2
#>  5   2014-W17 other                                        m          1
#>  6   2014-W18 <NA>                                         f          1
#>  7   2014-W18 Connaught Hospital                           f          1
#>  8   2014-W18 Princess Christian Maternity Hospital (PCMH) f          1
#>  9   2014-W18 Rokupa Hospital                              f          1
#> 10   2014-W19 <NA>                                         f          1
#> # … with 591 more rows
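
The "uniqueness of rows" invariant mentioned above can be pictured as follows: each combination of date and groups should appear at most once. The sketch below illustrates this on a hypothetical `toy` tibble with a hypothetical helper, `rows_unique()`. It is only an illustration of the invariant, not the validation code used by incidence2:

```r
library(dplyr)

# hypothetical toy counts, one row per (week, gender) combination
toy <- tibble(
  week   = c("W01", "W01", "W02"),
  gender = c("f", "m", "f"),
  count  = c(1L, 2L, 3L)
)

# hypothetical helper: TRUE when no (week, gender) combination is repeated
rows_unique <- function(x) !any(duplicated(select(x, week, gender)))

# filtering can only remove rows, so uniqueness is kept
kept_after_filter <- rows_unique(filter(toy, gender == "f"))

# stacking the data on itself repeats every combination
kept_after_stack <- rows_unique(bind_rows(toy, toy))
```

Here `kept_after_filter` is TRUE and `kept_after_stack` is FALSE, which gives some intuition for why row-subsetting verbs such as filter() and slice() can safely return an incidence object.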

Accessing variable information

We provide multiple accessors to easily access information about an incidence() object's structure: