comperes offers a pipe (%>%) friendly set of tools for storing and managing competition results (hereafter: results). This vignette discusses the long and wide formats of results and conversion between them.
The notion of competition used here is quite general: it is a set of games (abstract events) in which players (abstract entities) gain some abstract scores (typically numeric). Sports results are the most natural example, but not the only one. For example, product rating can be considered a competition between products as “players”. Here a “game” is a customer who reviews a set of products by rating them with a numerical “score” (stars, points, etc.).
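The product-rating reading of a competition can be sketched as follows. The review data below is hypothetical and exists only for illustration: each customer plays the role of a game, each product is a player, and the star rating is the score.

```r
library(comperes)
library(tibble)

# Hypothetical review data: one row per customer-product pair.
reviews_raw <- tibble(
  game = c("customer_1", "customer_1", "customer_2", "customer_2", "customer_2"),
  player = c("tea", "coffee", "tea", "coffee", "juice"),
  score = c(5, 3, 4, 4, 2)
)

# Column names already match the long format, so no renaming is needed.
reviews_long <- as_longcr(reviews_raw)
reviews_long
```

Note that the two customers rate a different number of products, which the one-row-per-pair layout handles without padding.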
We will need the following packages:
library(comperes)
library(tibble)
Results in long format are stored in an object of class longcr. It is considered to be a tibble with one row per game-player pair. It should have at least columns named “game”, “player” and “score”. For example:
cr_long_raw <- tibble(
game = c(1, 1, 1, 2, 2, 3, 3, 4),
player = c(1, NA, NA, 1, 2, 2, 1, 2),
score = 1:8
)
To convert cr_long_raw into a longcr object, use as_longcr():
cr_long <- as_longcr(cr_long_raw)
cr_long
#> # A longcr object:
#> # A tibble: 8 x 3
#> game player score
#> <dbl> <dbl> <int>
#> 1 1 1 1
#> 2 1 NA 2
#> 3 1 NA 3
#> 4 2 1 4
#> 5 2 2 5
#> 6 3 2 6
#> 7 3 1 7
#> 8 4 2 8
By default, as_longcr() repairs its input by applying a set of heuristics to extract relevant data:
tibble(
PlayerRS = "a",
gameSS = "b",
extra = -1,
score_game = 10,
player = 1
) %>%
as_longcr()
#> as_longcr: Some matched names are not perfectly matched:
#> gameSS -> game
#> score_game -> score
#> # A longcr object:
#> # A tibble: 1 x 5
#> game player score PlayerRS extra
#> <chr> <dbl> <dbl> <chr> <dbl>
#> 1 b 1 10 a -1
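Since repairing is only the default, it can presumably be switched off. The sketch below contrasts the two modes on the same input, assuming that the repair argument of as_longcr() set to FALSE skips the name-matching heuristics and only adds the longcr class; check the function's documentation before relying on this.

```r
library(comperes)
library(tibble)

raw <- tibble(
  PlayerRS = "a",
  gameSS = "b",
  extra = -1,
  score_game = 10,
  player = 1
)

# Default behaviour: column names are matched to "game", "player", "score".
repaired <- as_longcr(raw)
names(repaired)

# Assumption: repair = FALSE leaves the columns untouched.
not_repaired <- as_longcr(raw, repair = FALSE)
names(not_repaired)
```

Skipping the repair is reasonable only when the input is already known to follow the long format, because the resulting object is not checked against it.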
Results in wide format are stored in an object of class widecr. It is considered to be a tibble with one row per game and a fixed number of players. Data should be organized in pairs of columns “player”-“score”. The identifier of a pair should go after the respective keyword and consist only of digits. For example: player1, score1, player2, score2. Order doesn't matter.
Extra columns are allowed. Column game for game identifier is optional.
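The naming rule for pair columns can be checked with a regular expression. The helper is_pair_column() below is hypothetical and not part of comperes; it is a base R sketch of the rule as stated above, not the matching logic the package itself uses.

```r
# Hypothetical helper (not part of comperes): flag column names made of the
# keyword "player" or "score" followed only by digits.
is_pair_column <- function(col_names) {
  grepl("^(player|score)[0-9]+$", col_names)
}

col_names <- c("game", "player1", "score1", "player2", "score2", "extra", "player_a")

# Columns that follow the convention; "game", "extra" and "player_a" do not.
col_names[is_pair_column(col_names)]

# Every pair identifier should appear for both keywords.
player_ids <- sub("^player", "", grep("^player[0-9]+$", col_names, value = TRUE))
score_ids <- sub("^score", "", grep("^score[0-9]+$", col_names, value = TRUE))
setequal(player_ids, score_ids)
```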
Example of correct wide format:
cr_wide_raw <- tibble(
player1 = c(1, 1, 2),
score1 = -(1:3),
player2 = c(2, 3, 3),
score2 = -(4:6)
)
To convert cr_wide_raw into a widecr object, use as_widecr():
cr_wide <- cr_wide_raw %>% as_widecr()
cr_wide
#> # A widecr object:
#> # A tibble: 3 x 4
#> player1 score1 player2 score2
#> <dbl> <int> <dbl> <int>
#> 1 1 -1 2 -4
#> 2 1 -2 3 -5
#> 3 2 -3 3 -6
By default, as_widecr() also repairs its input:
tibble(
score = 2,
PlayerRS = "a",
scoreRS = 1,
player = "b",
player1 = "c",
extra = -1,
game = "game"
) %>%
as_widecr()
#> as_widecr: Some matched names are not perfectly matched:
#> player -> player1
#> score -> score1
#> player1 -> player2
#> PlayerRS -> player3
#> scoreRS -> score3
#> as_widecr: Next columns are not found. Creating with NAs.
#> score2
#> # A widecr object:
#> # A tibble: 1 x 8
#> game player1 score1 player2 score2 player3 score3 extra
#> <chr> <chr> <dbl> <chr> <int> <chr> <dbl> <dbl>
#> 1 game b 2 c NA a 1 -1
When applied to widecr and longcr objects respectively, as_longcr() and as_widecr() perform the actual conversion between formats:
as_longcr(cr_wide)
#> # A longcr object:
#> # A tibble: 6 x 3
#> game player score
#> <int> <dbl> <int>
#> 1 1 1 -1
#> 2 1 2 -4
#> 3 2 1 -2
#> 4 2 3 -5
#> 5 3 2 -3
#> 6 3 3 -6
# The number of player-score pairs is determined as
# the maximum number of players observed in any game
as_widecr(cr_long)
#> # A widecr object:
#> # A tibble: 4 x 7
#> game player1 score1 player2 score2 player3 score3
#> <dbl> <dbl> <int> <dbl> <int> <dbl> <int>
#> 1 1 1 1 NA 2 NA 3
#> 2 2 1 4 2 5 NA NA
#> 3 3 2 6 1 7 NA NA
#> 4 4 2 8 NA NA NA NA
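The two conversions can be chained into a round trip. The sketch below rebuilds the wide example from this vignette so that it runs on its own, and inspects only the shape of the results.

```r
library(comperes)
library(tibble)

wide <- as_widecr(tibble(
  player1 = c(1, 1, 2),
  score1 = -(1:3),
  player2 = c(2, 3, 3),
  score2 = -(4:6)
))

# Wide to long: one row per game-player pair.
long <- as_longcr(wide)
nrow(long)

# Long back to wide: one row per game, with a game identifier column.
wide_again <- as_widecr(long)
names(wide_again)
```

The round trip is not expected to return an identical object, because the long format introduces a game column that the original wide input lacked.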
comperes expects data that can be a proper input to as_longcr(), i.e. a longcr object, a widecr object, or raw data aligned with the long format.
The preferred way to do data analysis with comperes is to have three data frames:
competition results in long format;
game-level data (with column game for game identifiers);
player-level data (with column player for player identifiers).
This way one can operate on games between a variable number of players with minimum storage overhead.
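A minimal sketch of this layout follows, assuming the three data frames are results in long format, a table of games keyed by game, and a table of players keyed by player. All data and the venue and team columns are hypothetical; base R merge() is used here only to avoid extra dependencies.

```r
library(comperes)
library(tibble)

# Results: one row per game-player pair.
results <- as_longcr(tibble(
  game = c(1, 1, 2, 2),
  player = c("a", "b", "a", "c"),
  score = c(3, 1, 0, 2)
))

# Hypothetical game-level and player-level attributes.
games <- tibble(game = c(1, 2), venue = c("home", "away"))
players <- tibble(player = c("a", "b", "c"), team = c("red", "blue", "blue"))

# Attach attributes only when an analysis needs them.
results_full <- merge(
  merge(as.data.frame(results), as.data.frame(games), by = "game"),
  as.data.frame(players),
  by = "player"
)
results_full
```

Keeping attributes in separate tables means they are stored once per game or player rather than repeated on every result row.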