Introduction

Pivot tables are generally used to present raw and summary data. They are generated from spreadsheets and, more recently, also from R (pivottabler).

If we generate pivot tables from our own data, flattabler package is not necessary. But, if we get data in pivot table format and need to represent or analyse it using another tool, this package can be very helpful: It can save us several hours of programming or manual transformation.

flattabler package offers a set of operations that allow us to transform one or more pivot tables into a flat table.

The rest of this document is structured as follows: First, an illustrative example of transforming a pivot table into a flat table is presented. Then, the operations available in flattabler package, classified according to their purpose, are presented. In addition, transformations that can be carried out on a flat table are presented: they are not supported by package flattabler but are alternative or complementary to those of the package, so they are shown here. Finally, the document ends with the conclusions section.

An illustrative example

In this example, given a pivot table and the flat table obtained from it, the transformations performed are presented. Next, a function is defined that groups these transformations. This function is applied to a list of pivot tables to obtain a single flat table. Finally, it is shown how the flat table can be modified using functions from tidyverse package components.

Pivot table

A pivot table contains label rows and columns, and an array of values, usually numeric data. It can contain additional information, such as table header or footer.

	c1	c2	c3	c4	c5	c6	c7	c8	c9
r1	M1		E	D
r2			e1		Total e1	e2		Total e2	Total general
r3	A	B	d1	d2		d1	d2
r4	a1	b1	2,99	1,02	4,01	4,06	1,32	5,38	9,39
r5		b2	3,89	3,65	7,54	5,55		5,55	13,09
r6		b3	2,33		2,33	1,87		1,87	4,2
r7	Total a1		9,21	4,67	13,88	11,48	1,32	12,8	26,68
r8	a2	b1	5,62	1,94	7,56	4,59	2,13	6,72	14,28
r9		b2	3,82	7,72	11,54	4,78	2,94	7,72	19,26
r10		b3	5,36	6,38	11,74	1,69	1,78	3,47	15,21
r11	Total a2		14,8	16,04	30,84	11,06	6,85	17,91	48,75
r12	Total general		24,01	20,71	44,72	22,54	8,17	30,71	75,43

In the table above, columns c1 and c2 contain labels, also rows r2 and r3. The header is made up of row r1 and the intersection between rows and columns of labels (cells of c1 and c2 with r2 and r3). Let us suppose that the content of the cell (r1, c1) is especially relevant because it serves to identify the content of this pivot table with respect to others.

The matrix of values is made up of rows and columns after those containing labels (rows from r4 on, and columns from c3 on). Each value of this matrix is characterized by the combination of labels of the corresponding row and column. It is common to find null values in that matrix because the data is not produced or recorded for the combination of labels that define it.

Flat table

A flat database or flat-file database is a database that only contains a single table. A flat table is a generally denormalized table that is not related to other tables.

page	col1	col2	row1	row2	value
M1	a1	b1	e1	d1	2.99
M1	a1	b1	e1	d2	1.02
M1	a1	b1	e2	d1	4.06
M1	a1	b1	e2	d2	1.32
M1	a1	b2	e1	d1	3.89
M1	a1	b2	e1	d2	3.65
M1	a1	b2	e2	d1	5.55
M1	a1	b3	e1	d1	2.33
M1	a1	b3	e2	d1	1.87
M1	a2	b1	e1	d1	5.62
M1	a2	b1	e1	d2	1.94
M1	a2	b1	e2	d1	4.59
M1	a2	b1	e2	d2	2.13
M1	a2	b2	e1	d1	3.82
M1	a2	b2	e1	d2	7.72
M1	a2	b2	e2	d1	4.78
M1	a2	b2	e2	d2	2.94
M1	a2	b3	e1	d1	5.36
M1	a2	b3	e1	d2	6.38
M1	a2	b3	e2	d1	1.69
M1	a2	b3	e2	d2	1.78

The table above is a flat table whose data has been obtained from the pivot table in the previous section. It only contains raw data and the labels that characterize it. An additional label has been added with the value that identifies the pivot table, the pivot table page. NA values have not been included. Aggregated data can be obtained from raw data.

Transforming a pivot table

The transformation to obtain the flat table from the pivot table has been carried out using flattabler package. The code is shown below. For clarity, the pipe operator syntax provided by tidyr package has been used.

library(flattabler)
library(tidyr)

ft <- pt %>%
  set_page(1, 1) %>%
  define_labels(n_col = 2, n_row = 2) %>%
  remove_top(1) %>%
  fill_labels() %>%
  remove_agg() %>%
  fill_values() %>%
  remove_k() %>%
  replace_dec() %>%
  unpivot()

Starting from the pivot table pt:

We define that the value that identifies it (the pivot table page) is in cell (r1, c1), by means of set_page(1, 1).
The pivot table only contains labels and values, therefore rows or columns with other information have to be removed. The cells between the rows and columns of labels can be ignored. We delete the first row using remove_top(1) because it does not contain labels.
When we only have the cells of the pivot table, we define the number of rows and columns containing labels using define_labels(n_col = 2, n_row = 2). There are two columns and two rows of labels.
The example is a Spanish report that uses thousands and decimal separators in the value matrix. We need to adapt them to the R syntax for numbers. This operation is carried out using remove_k() to remove the thousands separator and replace_dec() to replace the decimal separator.
The array of values has gaps that, instead of having a numeric value, have an empty string. In R it is more appropriate to have NA if the data is not available. This operation is performed through fill_values().
Since there are more than one row or column with labels, the values of the labels of the first row and column have not been repeated. They are filled using fill_labels().
The pivot table contains aggregated data that can be obtained from the raw data. It is removed by remove_agg().
Finally, it is transformed into a flat table by unpivot(): each row corresponds to a value with its combination of labels. NA values have not been considered.

Transforming a list of pivot tables

Once we have defined the necessary transformations for a pivot table, we can apply them to any other with the same structure. Candidate tables can have different number of rows or columns, depending on the number of labels, but they must have the same number of rows and columns of labels, and the same number of header or footer rows, so that the transformations are the same for each table.

To easily perform this operation, we define a function f from the transformations, as shown below.

f <- function(pt) {
  pt %>%
    set_page(1, 1) %>%
    define_labels(n_col = 2, n_row = 2) %>%
    remove_top(1) %>%
    fill_labels() %>%
    remove_agg() %>%
    fill_values() %>%
    remove_k() %>%
    replace_dec() %>%
    unpivot()
}

ft <- flatten_table_list(list_pt_ie, f)

Given a list of pivot tables, list_pt_ie, flatten_table_list() applies the transformation defined by f to each of them, and merges the results into a flat table. In this case, the full result is not shown in this document because it takes up too much space, but a sample is shown below.

page	col1	col2	row1	row2	value
M1	a1	b1	e1	d1	2.99
M1	a1	b2	e1	d1	3.89
M1	a1	b3	e2	d1	1.87
M1	a2	b1	e1	d1	5.62
M1	a2	b1	e1	d2	1.94
M2	a1	b1	e1	d2	12
M2	a1	b3	e1	d1	6
M2	a2	b2	e2	d2	5
M3	a1	b2	e1	d1	54.70
M3	a2	b1	e1	d2	44.62
M3	a2	b2	e2	d1	74.62
M3	a2	b3	e1	d1	70.32
M3	a2	b3	e2	d1	42.25
M4	a1	b1	e2	d2	69.96
M4	a1	b2	e1	d1	970.96
M4	a1	b3	e2	d1	1473.56
M4	a2	b1	e1	d1	3533.70
M4	a2	b1	e2	d1	2455.20
M4	a2	b2	e1	d2	687.32
M4	a2	b2	e2	d1	1927.10
M4	a2	b2	e2	d2	85.26

Flat table transformation

Once we have a flat table, implemented using tibble, we can use tidyverse package components to transform it, as shown below. In this case all results are displayed.

t <- ft %>%
  tidyr::pivot_wider(names_from = page, values_from = value) %>%
  dplyr::rename(A = col1, B = col2, E = row1, D = row2) %>% 
  dplyr::arrange(A, B, E, D)

A	B	E	D	M1	M2	M3	M4
a1	b1	e1	d1	2.99	12	35.88	442.52
a1	b1	e1	d2	1.02	12	12.24	150.96
a1	b1	e2	d1	4.06	31	64.88	1223.64
a1	b1	e2	d2	1.32	7	9.24	69.96
a1	b2	e1	d1	3.89	26	54.70	970.96
a1	b2	e1	d2	3.65	37	52.31	964.57
a1	b2	e2	d1	5.55	19	54.25	586.75
a1	b3	e1	d1	2.33	6	13.98	93.20
a1	b3	e2	d1	1.87	28	52.36	1473.56
a2	b1	e1	d1	5.62	50	111.70	3533.70
a2	b1	e1	d2	1.94	23	44.62	1034.02
a2	b1	e2	d1	4.59	46	105.66	2455.20
a2	b1	e2	d2	2.13	16	34.08	553.80
a2	b2	e1	d1	3.82	42	65.92	1561.52
a2	b2	e1	d2	7.72	18	70.78	687.32
a2	b2	e2	d1	4.78	33	74.62	1927.10
a2	b2	e2	d2	2.94	5	14.70	85.26
a2	b3	e1	d1	5.36	30	70.32	1268.08
a2	b3	e1	d2	6.38	24	71.52	1135.90
a2	b3	e2	d1	1.69	25	42.25	1063.01
a2	b3	e2	d2	1.78	13	23.14	307.94

Pivot table operations

To transform one or more pivot tables into a flat table, the workflow is as follows:

Pivot table import: Import pivot tables into an object or list of objects. We start from data in text or Excel files, or previously imported data in a data frame, and generate pivot table objects from them.
Pivot table definition: Study the structure of the data and define the pivot table. If there are several homogeneous pivot tables, we will focus on one, the definition should be applicable to all of them.
- Define the characteristics of the pivot table: Number of rows and columns with labels, and page value (identifies the pivot table).
- Remove the rows and columns that are not part of the pivot table: It should only contain the rows and columns of labels and an array of values.
Pivot table transformation: Optionally, complete or transform the components (labels and values) of the pivot table.
Flat table generation: Generate the flat table from the definition of the pivot table and the available data. If there are multiple pivot tables, apply the defined operations to all of them and merge the result.

In this section, the operations available in flattabler package, classified according to this workflow, are presented.

Pivot table import

The objective of import operations is to obtain external data that contains one or more pivot tables to transform them. Three formats have been considered: text file, Excel file, and data frame.

In the case of working with files, the situation of jointly treating all the files in a folder has also been considered. In the case of Excel, alternatively, all the sheets in a file can be treated together.

The S3 pivot_table class has been defined in the package. Transform operations are defined for objects of this class. Import operations can be classified into two groups: those that return a pivot_table object, and those that return a list of pivot_table objects. Objects in a list can be transformed together.

Operations that return an object

pivot_table(): Creates a pivot_table object from a data frame. Page information associated with the pivot table can be defined. Data frame data is converted to character type. Example:

df <- data.frame(unclass(pt_m4)[c(1:7)])
pt <- pivot_table(df, page = "M4")

read_text_file(): Reads a text file and creates a pivot_table object. The file is expected to contain one or more pivot tables. Each line in the file corresponds to a row in a table; within each row, columns are defined by a separator character. The file name is included as part of the object attributes. Example:

file <- system.file("extdata", "csv/set_v_ie.csv", package = "flattabler")
pt_set_v_ie <- read_text_file(file)

read_excel_sheet():. Reads an Excel file sheet and creates a pivot_table object. The sheet is expected to contain one or more pivot tables. Each line in the sheet corresponds to a row in a table. The file and sheet names are included as part of the object attributes. Example:

file <- system.file("extdata", "excel/ine2871.xlsx", package = "flattabler")
pt <- read_excel_sheet(file)

Operations that return a list of objects

read_text_folder(): Reads all text files in a folder and creates a list of pivot_table objects, one from each file. Each file is expected to contain a pivot table. Each line in a file corresponds to a row in a table; within each row, columns are defined by a separator character. File name is included as part of each object attributes. Example:

folder <- system.file("extdata", "csvfolder", package = "flattabler")
lpt <- read_text_folder(folder)

read_excel_folder(): Reads one sheet from each of the Excel files in a folder and creates a list of pivot_table objects, one from each sheet or, which is the same in this case, one from each file. Each sheet is expected to contain a pivot table. Each line in a file corresponds to a row in a table. File and sheet names are included as part of each object attributes. Example:

folder <- system.file("extdata", "excelfolder", package = "flattabler")
lpt <- read_excel_folder(folder)

read_excel_file(): Reads sheets from an Excel file and creates a pivot_table object list, one from each sheet. Each sheet is expected to contain a pivot table. Each line in a sheet corresponds to a row in a table. The file and sheet names are included as part of each object attributes. Example:

file <- system.file("extdata", "excel/set_sheets.xlsx", package = "flattabler")
lpt <- read_excel_file(file)

Pivot table definition

Once we have a pivot_table object or list of objects, pivot tables have to be defined. Each object generated by import operations contains a text table, it is expected to contain a pivot table, but may also have more information, generally in the form of a table header or footer. Through this set of operations we transform the text table in the object into a pivot table and define its characteristics.

A pivot_table object should only contain label rows and columns, and an array of values, usually numeric data. Additional information can be used to identify the pivot table relative to other similar tables: can be used to define the pivot table page.

Page: We consider the page of the pivot table as the literal that identifies it with respect to other homogeneous tables; generally it is the value of an attribute (i.e., 2020, 2019,…) or the name of a variable (i.e., amount, profit,…). When multiple pivot tables are integrated into a flat table, the page is essential to distinguish the origin of the data. It is considered as an additional label.

The workflow is generally as follows:

Explore the table to determine its distribution and characteristics.
- If we start from a list of pivot_table objects, we will explore each one of the tables. In order to transform them together, they should have homogeneous structure. We will use member reference (instead of list slicing) to access the objects in the list. Example: pt <- list_pt_ie[[1]].
- Exploration can be carried out using utils::View(). Additionally, view_table_attr() can be used since it also shows the attributes of the object.
In case the text table contains multiple pivot tables, they can be obtained using divide(), which returns a list of pivot_table objects; therefore, we return to the first step.
Define the characteristics of the pivot table: Number of rows and columns with labels, and page value.
Remove the rows and columns that are not part of the pivot table: It should only contain the rows and columns of labels and an array of values.

Therefore, four groups of functions are distinguished: Functions to view the object, to split a text table into several, to define the pivot table characteristics, and to remove the rows and columns that are not part of it.

View an object

view_table_attr(): Based on utils::View() but, unlike it, in addition to the table, it also displays the attributes of the object in another window. Example:

file <- system.file("extdata", "csv/set_v_ie.csv", package = "flattabler")
pt_set_v_ie <- read_text_file(file)

pt_set_v_ie %>% view_table_attr()

Split a text table

divide(): Divides a table into tables separated by some empty row or column. Sometimes multiple pivot tables are placed in a text document or Excel sheet, imported as one text table. This operation recursively divides the initial table into tables separated by some empty row or column. Once a division has been made, it tries to divide each part of the result. An object is generated for each indivisible pivot table. Returns a list of pivot_table objects. Example:

list_pt_ie <- pt_set_v_ie %>% divide()

Define pivot table characteristics

get_page() and set_page(): The page value is defined when importing data, sometimes it is included in the file or spreadsheet name. Using these functions, you can get the defined values and redefine them. The content of a table cell or string can be defined as a page value. Example:

pt <- list_pt_ie[[1]]

pt <- pt %>% set_page(1, 1)

define_labels(): A pivot table should only contain label rows and columns, and an array of values, usually numeric data. This function defines the quantity of rows and columns that contain labels. Example:

pt <- pt %>% define_labels(n_col = 2, n_row = 2)

Remove rows and columns

remove_empty()
remove_rows()
remove_cols()
remove_top()
remove_bottom()
remove_left()
remove_right()

Remove rows and columns that are not part of the pivot table. The most frequent situation will be having to eliminate the header or footer of the table (top and bottom rows), the rest of the functions are defined to try to contemplate all possible cases. Example:

pt <- pt %>% remove_top(1)

Pivot table transformation

Once a pivot_table object only contains pivot table data, and its attributes have been defined, it could be transformed into a flat table. However, we can take advantage of the table structure to modify and complete it. Therefore, optionally, we can complete and transform the components of the pivot table: Labels and values.

Transform labels

fill_labels(): Fills missing values in row and column labels for a pivot table. When there is more than one row or column with labels, the first ones usually do not repeat the values. In the illustrative example, this occurs in column c1 and row r2. In columns they are filled down; in rows to the right. Example:

pt <- pt %>% fill_labels()

remove_agg(): Removes pivot table rows and columns that contain aggregated data. Aggregated data is recognized because the label of the row or column closest to the array of values is empty or has a special value as an indicator. Example:

pt <- pt %>% remove_agg()

extract_labels(): Sometimes a table column includes values of multiple label fields, this is generally known as compact table format. Given a column number and a set of labels, it generates a new column with the labels located at the positions they occupied in the original column and removes them from it. Example:

pt2 <- pt_m4_compact %>%
  extract_labels(col = 1, labels = c("b1", "b2", "b3", "b4", "Total general"))

get_col_values(): To facilitate the study of the labels included in the same column of several tables, this function gets the values of the indicated column in a list of tables. It may be useful to use it before extract_labels(). Example:

df <- get_col_values(list_pt_compact, start_row = 4)
labels <- sort(unique(df$label))

Transform values

fill_values(): The array of values has gaps that, instead of having a numeric value, have an empty string. This operation fills with NA missing values in a pivot table value array. Example:

pt <- pt %>% fill_values()

remove_k(): Values, even though they are numbers, are represented as text and sometimes include a thousands separator that can be removed using this function. Example:

pt <- pt %>% remove_k()

replace_dec(): Values, even though they are numbers, are represented as text and sometimes include a decimal separator different from the one needed; it can be replaced using this function. Example:

pt <- pt %>% replace_dec()

Flat table generation

In order to generate a flat table from a pivot_table object, it is an essential requirement to have properly defined its attributes and that it only contains the pivot table data. Optionally, if the table has a usual structure, we could have transformed the values and labels, if necessary.

We can generate a flat table from a pivot table (a pivot_table object) or from a list of pivot tables (a list of pivot_table objects).

Generation from a pivot table

unpivot(): Transforms a pivot table into a flat table (implemented by a tibble). An additional column with page information can be included. NA values can be excluded from the array of values. Example:

ft_tmp <- pt %>% unpivot()

Generation from a list of pivot tables

flatten_table_list(): Given a list of pivot_table objects and a transformation function that flattens a pivot_table object, transforms each table using the function and merges the results into a flat table. Example:

f <- function(pt) {
  pt %>%
    set_page(1, 1) %>%
    define_labels(n_col = 2, n_row = 2) %>%
    remove_top(1) %>%
    fill_labels() %>%
    remove_agg() %>%
    fill_values() %>%
    remove_k() %>%
    replace_dec() %>%
    unpivot()
}

ft <- flatten_table_list(list_pt_ie, f)

Flat table transformation

In this section, complementary and alternative operations that can be carried out on a flat table are presented. Complementary operations are useful to transform the flat table even after previously transforming the pivot table. Alternative operations are necessary if equivalent operations could not be performed on the pivot table for some reason.

These operations are not supported by package flattabler, they are supported by the components of tidyverse package. Here only a sample is presented. It is recommended to read the R for Data Science book (at least chapters 5 and 12).

Complementary transformations

We start from a flat table ft generated in the previous section through the operation flatten_table_list().

tidyr::pivot_wider(): It is needed when an observation is distributed among several rows. One column contains the name of the observation variables and another contains the values. In the example, we had a pivot table for each variable. The variable name is in page column. Example:

ft <- ft %>% tidyr::pivot_wider(names_from = page, values_from = value)

dplyr::rename(): It allows us to change the name of the columns of the flat table. Example:

ft <- ft %>% dplyr::rename(A = col1, B = col2, E = row1, D = row2)

dplyr::arrange(): It allows us to order the rows of the flat table according to the selected columns. Example:

ft <- ft %>% dplyr::arrange(A, B, E, D)

Alternative transformations

We consider the same data from the illustrative example. This is not the case, but let us suppose a situation in which the transformations cannot be performed on the pivot table (because its structure does not coincide with the assumptions of the operations).

pt <- list_pt_ie[[1]]

ft_at <- pt %>%
  set_page(1, 1) %>%
  define_labels(n_col = 2, n_row = 2) %>%
  remove_top(1) %>%
  unpivot()

The only operations we perform are those for defining the pivot table and generating the flat table. The result obtained is shown below.

page	col1	col2	row1	row2	value
M1	a1	b1	e1	d1	2,99
M1	a1	b1		d2	1,02
M1	a1	b1	Total e1		4,01
M1	a1	b1	e2	d1	4,06
M1	a1	b1		d2	1,32
M1	a1	b1	Total e2		5,38
M1	a1	b1	Total general		9,39
M1		b2	e1	d1	3,89
M1		b2		d2	3,65
M1		b2	Total e1		7,54
M1		b2	e2	d1	5,55
M1		b2		d2
M1		b2	Total e2		5,55
M1		b2	Total general		13,09
M1		b3	e1	d1	2,33
M1		b3		d2
M1		b3	Total e1		2,33
M1		b3	e2	d1	1,87
M1		b3		d2
M1		b3	Total e2		1,87
M1		b3	Total general		4,2
M1	Total a1		e1	d1	9,21
M1	Total a1			d2	4,67
M1	Total a1		Total e1		13,88
M1	Total a1		e2	d1	11,48
M1	Total a1			d2	1,32
M1	Total a1		Total e2		12,8
M1	Total a1		Total general		26,68
M1	a2	b1	e1	d1	5,62
M1	a2	b1		d2	1,94
M1	a2	b1	Total e1		7,56
M1	a2	b1	e2	d1	4,59
M1	a2	b1		d2	2,13
M1	a2	b1	Total e2		6,72
M1	a2	b1	Total general		14,28
M1		b2	e1	d1	3,82
M1		b2		d2	7,72
M1		b2	Total e1		11,54
M1		b2	e2	d1	4,78
M1		b2		d2	2,94
M1		b2	Total e2		7,72
M1		b2	Total general		19,26
M1		b3	e1	d1	5,36
M1		b3		d2	6,38
M1		b3	Total e1		11,74
M1		b3	e2	d1	1,69
M1		b3		d2	1,78
M1		b3	Total e2		3,47
M1		b3	Total general		15,21
M1	Total a2		e1	d1	14,8
M1	Total a2			d2	16,04
M1	Total a2		Total e1		30,84
M1	Total a2		e2	d1	11,06
M1	Total a2			d2	6,85
M1	Total a2		Total e2		17,91
M1	Total a2		Total general		48,75
M1	Total general		e1	d1	24,01
M1	Total general			d2	20,71
M1	Total general		Total e1		44,72
M1	Total general		e2	d1	22,54
M1	Total general			d2	8,17
M1	Total general		Total e2		30,71
M1	Total general		Total general		75,43

The functions that we need to apply to carry out the transformation are the following:

dplyr::mutate_all(): Apply the indicated function to all the columns of the table. We replace empty strings with NA. Example:

ft_at <- ft_at %>% dplyr::mutate_all( ~ dplyr::na_if(., ""))

tidyr::fill(): Fill in the missing values for the columns. The filling direction can be indicated. Example:

ft_at <- ft_at %>% tidyr::fill(col1, row1)

dplyr::filter(): Subset the table, retaining all rows that satisfy the condition. We remove the aggregates (label of the row or column closest to the array of values is empty) and NA values in the array of values. Example:

ft_at <- ft_at %>% dplyr::filter(!is.na(col2) & !is.na(row2) & !is.na(value))

dplyr::mutate_at(): Apply the indicated function to some of the columns of the table. We remove the thousands indicators and change the decimal indicator. Example:

ft_at <- ft_at %>% dplyr::mutate_at(c("value"),
                                    ~ stringr::str_replace_all(., pattern = "\\.", replacement = "")) %>%
  dplyr::mutate_at(c("value"),
                   ~ stringr::str_replace(., pattern = ",", replacement = "\\."))

As in the previous case, they can be combined to define a function and apply the transformations to all the objects in a list.

g <- function(pt) {
  pt %>%
    set_page(1, 1) %>%
    define_labels(n_col = 2, n_row = 2) %>%
    remove_top(1) %>%
    unpivot() %>%
    dplyr::mutate_all(~ dplyr::na_if(., "")) %>%
    tidyr::fill(col1, row1) %>%
    dplyr::filter(!is.na(col2) & !is.na(row2) & !is.na(value)) %>%
    dplyr::mutate_at(c("value"),
                     ~ stringr::str_replace_all(., pattern = "\\.", replacement = "")) %>%
    dplyr::mutate_at(c("value"),
                     ~ stringr::str_replace(., pattern = ",", replacement = "\\."))
}

ft2 <- flatten_table_list(list_pt_ie, g)

The result coincides with the one previously obtained. Below is a sample.

page	col1	col2	row1	row2	value
M1	a1	b2	e1	d1	3.89
M1	a1	b3	e2	d1	1.87
M1	a2	b3	e1	d1	5.36
M2	a1	b2	e1	d1	26
M2	a1	b2	e1	d2	37
M2	a1	b3	e1	d1	6
M2	a2	b1	e1	d1	50
M2	a2	b1	e1	d2	23
M2	a2	b2	e1	d1	42
M2	a2	b2	e1	d2	18
M2	a2	b2	e2	d1	33
M3	a1	b2	e2	d1	54.25
M3	a1	b3	e2	d1	52.36
M3	a2	b2	e1	d2	70.78
M3	a2	b3	e2	d1	42.25
M4	a1	b1	e1	d1	442.52
M4	a1	b1	e1	d2	150.96
M4	a1	b1	e2	d1	1223.64
M4	a1	b2	e1	d2	964.57
M4	a2	b1	e2	d2	553.80
M4	a2	b3	e1	d1	1268.08

Finally, below you can see all the data once transformed.

t2 <- ft2 %>%
  tidyr::pivot_wider(names_from = page, values_from = value) %>%
  dplyr::rename(A = col1, B = col2, E = row1, D = row2) %>% 
  dplyr::arrange(A, B, E, D)

A	B	E	D	M1	M2	M3	M4
a1	b1	e1	d1	2.99	12	35.88	442.52
a1	b1	e1	d2	1.02	12	12.24	150.96
a1	b1	e2	d1	4.06	31	64.88	1223.64
a1	b1	e2	d2	1.32	7	9.24	69.96
a1	b2	e1	d1	3.89	26	54.70	970.96
a1	b2	e1	d2	3.65	37	52.31	964.57
a1	b2	e2	d1	5.55	19	54.25	586.75
a1	b3	e1	d1	2.33	6	13.98	93.20
a1	b3	e2	d1	1.87	28	52.36	1473.56
a2	b1	e1	d1	5.62	50	111.70	3533.70
a2	b1	e1	d2	1.94	23	44.62	1034.02
a2	b1	e2	d1	4.59	46	105.66	2455.20
a2	b1	e2	d2	2.13	16	34.08	553.80
a2	b2	e1	d1	3.82	42	65.92	1561.52
a2	b2	e1	d2	7.72	18	70.78	687.32
a2	b2	e2	d1	4.78	33	74.62	1927.10
a2	b2	e2	d2	2.94	5	14.70	85.26
a2	b3	e1	d1	5.36	30	70.32	1268.08
a2	b3	e1	d2	6.38	24	71.52	1135.90
a2	b3	e2	d1	1.69	25	42.25	1063.01
a2	b3	e2	d2	1.78	13	23.14	307.94

The result obtained is the same. From my point of view, transformations carried out taking advantage of the structure of the pivot table are simpler and more intuitive. However, if necessary, this set of operations can help resolve special situations not yet considered in flattabler package.

Conclusions

flattabler package offers a set of operations that allow us to transform one or more pivot tables into a flat table. Transformation operations have been designed to be intuitive and easy to use. With them, it has been possible to properly transform all the pivot tables found so far by the author.

If an unforeseen situation arises, the proposed operations are also useful and can be supplemented by operations available in the components of tidyverse package, as also shown in this document.

flattabler: Obtaining a Flat Table from Pivot Tables

Jose Samos (jsamos@ugr.es)

2020-06-20

Introduction

An illustrative example

Pivot table

Flat table

Transforming a pivot table

Transforming a list of pivot tables

Flat table transformation

Pivot table operations

Pivot table import

Operations that return an object

Operations that return a list of objects

Pivot table definition

View an object

Split a text table

Define pivot table characteristics

Remove rows and columns

Pivot table transformation

Transform labels

Transform values

Flat table generation

Generation from a pivot table

Generation from a list of pivot tables

Flat table transformation

Complementary transformations

Alternative transformations

Conclusions