Introduction to d3po

The d3po package integrates well with dplyr. All the examples here use the pipe, %>%, both to filter or summarise the data and to create the charts.

Setup

Let’s start by loading packages.

library(dplyr)
library(igraph)
library(d3po)

Pokemon dataset

The included pokemon dataset has the following structure:

glimpse(pokemon)
#> Rows: 151
#> Columns: 15
#> $ id              <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,…
#> $ name            <chr> "Bulbasaur", "Ivysaur", "Venusaur", "Charmander", "Cha…
#> $ height          <dbl> 0.7, 1.0, 2.0, 0.6, 1.1, 1.7, 0.5, 1.0, 1.6, 0.3, 0.7,…
#> $ weight          <dbl> 6.9, 13.0, 100.0, 8.5, 19.0, 90.5, 9.0, 22.5, 85.5, 2.…
#> $ base_experience <int> 64, 142, 236, 62, 142, 240, 63, 142, 239, 39, 72, 178,…
#> $ type_1          <chr> "grass", "grass", "grass", "fire", "fire", "fire", "wa…
#> $ type_2          <chr> "poison", "poison", "poison", NA, NA, "flying", NA, NA…
#> $ attack          <int> 49, 62, 82, 52, 64, 84, 48, 63, 83, 30, 20, 45, 35, 25…
#> $ defense         <int> 49, 63, 83, 43, 58, 78, 65, 80, 100, 35, 55, 50, 30, 5…
#> $ hp              <int> 45, 60, 80, 39, 58, 78, 44, 59, 79, 45, 50, 60, 40, 45…
#> $ special_attack  <int> 65, 80, 100, 60, 80, 109, 50, 65, 85, 20, 25, 90, 20, …
#> $ special_defense <int> 65, 80, 100, 50, 65, 85, 64, 80, 105, 20, 25, 80, 20, …
#> $ speed           <int> 45, 60, 80, 65, 80, 100, 43, 58, 78, 45, 30, 70, 50, 3…
#> $ color_1         <chr> "#78C850", "#78C850", "#78C850", "#F08030", "#F08030",…
#> $ color_2         <chr> "#A040A0", "#A040A0", "#A040A0", NA, NA, "#A890F0", NA…

Box and Whiskers

To compare the distribution of speed by type_1, the pokemon dataset needs no additional aggregation or transformation. Just use the Pokemon name as the grouping variable and, optionally, color_1 as the color variable:

d3po(pokemon) %>%
  po_box(daes(x = type_1, y = speed, group = name, color = color_1)) %>%
  po_title("Distribution of Pokemon Speed by Type")
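To make concrete what a box and whiskers chart summarises, here is a minimal sketch that computes the quartile-based summary per type by hand. The `pokemon_sample` tibble is a stand-in built from the first five rows of the `glimpse()` output, and the exact whisker rule d3po applies is not stated here, so treat the min/max columns as an assumption about the usual definition.

```r
library(dplyr)

# Stand-in for the first five rows of `pokemon` (values from glimpse() above)
pokemon_sample <- tibble(
  type_1 = c("grass", "grass", "grass", "fire", "fire"),
  speed  = c(45L, 60L, 80L, 65L, 80L)
)

# One row per type with the five numbers a box plot is usually built from
speed_summary <- pokemon_sample %>%
  group_by(type_1) %>%
  summarise(
    min    = min(speed),
    q1     = quantile(speed, 0.25, names = FALSE),
    median = median(speed),
    q3     = quantile(speed, 0.75, names = FALSE),
    max    = max(speed)
  )

speed_summary
```

The chart itself needs none of this, since po_box() receives the raw rows; the table is only a way to check what the boxes should look like.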

Bar

Let’s start by counting Pokemon by type:

pokemon_count <- pokemon %>% 
 group_by(type_1, color_1) %>% 
 count()
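As a sketch of an equivalent shortcut, dplyr's count() also accepts the grouping columns directly, and in that form it returns an ungrouped table. The `pokemon_sample` tibble below is a stand-in built from the first five rows of the `glimpse()` output, so the block runs without the full dataset.

```r
library(dplyr)

# Stand-in for the first five rows of `pokemon` (values from glimpse() above)
pokemon_sample <- tibble(
  type_1  = c("grass", "grass", "grass", "fire", "fire"),
  color_1 = c("#78C850", "#78C850", "#78C850", "#F08030", "#F08030")
)

# count() groups by the given columns, tallies rows into `n`, and ungroups
pokemon_sample_count <- pokemon_sample %>%
  count(type_1, color_1)

pokemon_sample_count
```

The group_by() plus count() form used above keeps the grouping on the result, which matters only if you later compute totals across types.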

Now we can create a bar chart by using type_1 both for the x axis and as the group variable, since this data has no year column or similar:

d3po(pokemon_count) %>%
  po_bar(
    daes(x = type_1, y = n, group = type_1, color = color_1)
  ) %>%
  po_title("Count of Pokemon by Type")

Treemap

By using the pokemon_count table created for the bar chart, the logic is exactly the same and we only need to change the function and specify the size instead of x and y:

d3po(pokemon_count) %>%
  po_treemap(
    daes(size = n, group = type_1, color = color_1)
  ) %>%
  po_title("Share of Pokemon by Type")
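The title speaks of shares, while the chart receives raw counts. Assuming tile area scales with the size variable, the share each tile represents can be computed explicitly, as in this sketch. The `counts` tibble is an illustrative two-type stand-in for pokemon_count.

```r
library(dplyr)

# Illustrative counts for two types (stand-in for `pokemon_count`)
counts <- tibble(
  type_1 = c("fire", "grass"),
  n      = c(2L, 3L)
)

# Share of the total that each type contributes
shares <- counts %>%
  mutate(share = n / sum(n))

shares
```

With the pokemon_count table from the bar chart section, call ungroup() before mutate(), because group_by() followed by count() keeps the grouping and sum(n) would otherwise be taken within each group.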

Pie

Use these plots with EXTREME caution, because polar coordinates have major perceptual problems.

The method is exactly the same as for the treemap; only the function changes.

d3po(pokemon_count) %>%
  po_pie(
    daes(size = n, group = type_1, color = color_1)
  ) %>%
  po_title("Share of Pokemon by Type")

Line

Let’s start by obtaining the deciles of Pokemon weight for just the grass, fire and water types:

pokemon_decile <- pokemon %>% 
  filter(type_1 %in% c("grass", "fire", "water")) %>% 
  group_by(type_1, color_1) %>% 
  summarise(
    decile = 0:10,
    weight = quantile(weight, probs = seq(0, 1, by = .1))
  )
#> `summarise()` has grouped output by 'type_1', 'color_1'. You can override using the `.groups` argument.
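The pairing of `decile = 0:10` with `probs = seq(0, 1, by = .1)` works because both have eleven entries. A minimal base R sketch, using only the first nine weights visible in the `glimpse()` output as sample data, shows the shape of what quantile() returns:

```r
# First nine weights from the glimpse() output above
weight_sample <- c(6.9, 13.0, 100.0, 8.5, 19.0, 90.5, 9.0, 22.5, 85.5)

# 0%, 10%, ..., 100%: eleven cut points, one per entry of 0:10
probs <- seq(0, 1, by = 0.1)
deciles <- quantile(weight_sample, probs = probs)

length(deciles)  # same length as 0:10
deciles
```

The 0% and 100% entries are the minimum and maximum of the sample, so the line chart spans the full range of weights for each type.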

Now we can create a line chart by using the type_1 and color_1 columns created above:

d3po(pokemon_decile) %>%
  po_line(
    daes(x = decile, y = weight, group = type_1, color = color_1)
  ) %>%
  po_title("Decile of Pokemon Weight by Type")

Area

Let’s start by obtaining the density for the Pokemon weight:

pokemon_density <- density(pokemon$weight, n = 30)

pokemon_density <- tibble(
 x = pokemon_density$x,
 y = pokemon_density$y,
 variable = "weight",
 color = "#5377e3"
)
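For readers unfamiliar with density(), this sketch shows why its result can be unpacked into x and y columns: it returns the evaluation points and the estimated density at each of them, with `n` controlling how many points are used. The sample is limited to the first nine weights visible in the `glimpse()` output.

```r
# First nine weights from the glimpse() output above
weight_sample <- c(6.9, 13.0, 100.0, 8.5, 19.0, 90.5, 9.0, 22.5, 85.5)

# Kernel density estimate evaluated at 30 equally spaced points
dens <- density(weight_sample, n = 30)

length(dens$x)  # evaluation points along the weight axis
length(dens$y)  # estimated density at each point
```

A small `n` keeps the data sent to the chart compact at the cost of a coarser curve.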

Now we can create an area chart by using the variable and color columns created above:

d3po(pokemon_density) %>%
 po_area(
  daes(x = x, y = y, group = variable, color = color)
 ) %>%
 po_title("Approximated Density of Pokemon Weight")

Scatterplot

Let’s explore the balance between defense and attack by Pokemon type:

pokemon_def_vs_att <- pokemon %>% 
  group_by(type_1, color_1) %>% 
  summarise(
    mean_def = mean(defense),
    mean_att = mean(attack),
    n_pkmn = n()
  )
#> `summarise()` has grouped output by 'type_1'. You can override using the `.groups` argument.

Then we can plot the averages, sizing each point by the number of Pokemon of that type:

d3po(pokemon_def_vs_att) %>%
  po_scatter(
    daes(x = mean_att, y = mean_def, size = n_pkmn, group = type_1, color = color_1)
  ) %>%
  po_title("Average Attack vs Average Defense by Type")
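The summarise() message in this section can be avoided by stating the grouping of the result explicitly. The sketch below uses `.groups = "drop"` on a stand-in `pokemon_sample` tibble built from the first five rows of the `glimpse()` output; it returns an ungrouped table with the same columns.

```r
library(dplyr)

# Stand-in for the first five rows of `pokemon` (values from glimpse() above)
pokemon_sample <- tibble(
  type_1  = c("grass", "grass", "grass", "fire", "fire"),
  color_1 = c("#78C850", "#78C850", "#78C850", "#F08030", "#F08030"),
  attack  = c(49L, 62L, 82L, 52L, 64L),
  defense = c(49L, 63L, 83L, 43L, 58L)
)

sample_def_vs_att <- pokemon_sample %>%
  group_by(type_1, color_1) %>%
  summarise(
    mean_def = mean(defense),
    mean_att = mean(attack),
    n_pkmn = n(),
    .groups = "drop"  # return an ungrouped tibble and skip the message
  )

sample_def_vs_att
```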

Network

This visualization method differs from the rest: it can work with a single data object, or with separate data, nodes and edges objects.

Let’s create an igraph object:

tr <- make_tree(40, children = 3, mode = "undirected")

Visualizing it is as simple as:

d3po(tr) %>% 
  po_layout() # optional

Another option is to work with a data.frame or tibble:

edges <- igraph::as_data_frame(tr, "edges")
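Before plotting, it can help to inspect what the conversion produces. This sketch rebuilds the same tree and checks its size and the columns of the edge table, using only igraph functions:

```r
library(igraph)

# Same tree as above: 40 vertices, each parent with up to 3 children
tr <- make_tree(40, children = 3, mode = "undirected")

# One row per edge, with the endpoints in the `from` and `to` columns
edges <- igraph::as_data_frame(tr, "edges")

vcount(tr)     # number of vertices
ecount(tr)     # number of edges; a tree has one fewer edge than vertices
names(edges)
head(edges, 3)
```

The explicit igraph:: prefix avoids any ambiguity with same-named functions from other attached packages.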

Which is also visualized in a straightforward manner:

d3po() %>% 
  po_edges(data = edges)

Aesthetics

Going back to the treemap example, it is possible to move the labels and also use any font that you like:

d3po(pokemon_count) %>%
  po_treemap(
    daes(size = n, group = type_1, color = color_1, align = "left")
  ) %>%
  po_title("Share of Pokemon by Type") %>% 
  po_labels("left", "top") %>% 
  po_font("Times")