R mapping

John Mount

2022-01-22

rqdatatable re-maps a number of symbols for data.table translation (for rquery/SQL re-mappings, please see here). For instance, note the n() and rank() functions in the following example.

library("rqdatatable")
library("wrapr")

dL <- build_frame(
  "subjectID", "surveyCategory"     , "assessmentTotal"|
    1          , "withdrawal behavior", 5              |
    1          , "positive re-framing", 2              |
    2          , "withdrawal behavior", 3              |
    2          , "positive re-framing", 4              |
    2          , "other"              , 0              )

scale <- 0.237
rquery_pipeline <- local_td(dL) %.>%
  extend_nse(.,
             probability :=
               exp(assessmentTotal * scale)/
               sum(exp(assessmentTotal * scale)),
             count := n(),
             rank := rank(),
             orderby = c("assessmentTotal", "surveyCategory"),
             reverse = c("assessmentTotal"),
             partitionby = 'subjectID')  %.>%
  orderby(., c("subjectID", "probability"))
res <- ex_data_table(rquery_pipeline, tables = list(dL = dL))
knitr::kable(res)
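| subjectID|surveyCategory      | assessmentTotal| probability| count| rank|
|---------:|:-------------------|---------------:|-----------:|-----:|----:|
|         1|positive re-framing |               2|   0.3293779|     2|    2|
|         1|withdrawal behavior |               5|   0.6706221|     2|    1|
|         2|other               |               0|   0.1780446|     3|    3|
|         2|withdrawal behavior |               3|   0.3625035|     3|    2|
|         2|positive re-framing |               4|   0.4594519|     3|    1|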

The common re-mappings can be found in the package-private variable rqdatatable:::data_table_extend_fns.

str(rqdatatable:::data_table_extend_fns)
## List of 6
##  $ ngroup    :List of 2
##   ..$ data.table_version: chr ".GRP"
##   ..$ need_one_col      : logi TRUE
##  $ rank      :List of 2
##   ..$ data.table_version: chr "cumsum(rqdatatable_temp_one_col)"
##   ..$ need_one_col      : logi TRUE
##  $ row_number:List of 2
##   ..$ data.table_version: chr "cumsum(rqdatatable_temp_one_col)"
##   ..$ need_one_col      : logi TRUE
##  $ n         :List of 2
##   ..$ data.table_version: chr "sum(rqdatatable_temp_one_col)"
##   ..$ need_one_col      : logi TRUE
##  $ random    :List of 2
##   ..$ data.table_version: chr "runif(.N)"
##   ..$ need_one_col      : logi FALSE
##  $ rand      :List of 2
##   ..$ data.table_version: chr "runif(.N)"
##   ..$ need_one_col      : logi FALSE

The column rqdatatable_temp_one_col is added to (and later removed from) intermediate data frames as needed.
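To make the role of the temporary column concrete, here is a minimal sketch in plain data.table of how a column of ones can yield both n() and rank(). It is only an illustration of the expressions listed above, not rqdatatable's actual internals. The small table d and the explicit setorderv() call (standing in for the orderby/reverse arguments) are assumptions made for this example.

```r
library("data.table")

# Hypothetical example data, loosely following the table used earlier.
d <- data.table(
  subjectID = c(1, 1, 2, 2, 2),
  assessmentTotal = c(5, 2, 3, 4, 0)
)

# Add the temporary column of ones.
d[, rqdatatable_temp_one_col := 1]

# Assumption: sort within each subject by descending assessmentTotal,
# so that a running sum yields the desired rank.
setorderv(d, c("subjectID", "assessmentTotal"), order = c(1L, -1L))

# n()    corresponds to sum(rqdatatable_temp_one_col)
# rank() corresponds to cumsum(rqdatatable_temp_one_col)
d[, `:=`(
  count = sum(rqdatatable_temp_one_col),
  rank = cumsum(rqdatatable_temp_one_col)
), by = subjectID]

# Remove the temporary column again.
d[, rqdatatable_temp_one_col := NULL]

print(d)
```

Summing a column of ones counts the rows in each group, and the running sum numbers them in their current order, which is why a single helper column can serve n(), rank(), and row_number().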

These mappings allow the same operator pipeline to be used both in R and in a database. For the database mappings please see here.