Format columns as markdown

This vignette shows how to format flextable columns as markdown.

library(flextable)
library(ftExtra)

Why markdown?

The flextable package is an excellent package that allows fine control over table styling and exports tables to a variety of formats (HTML, MS Word, PDF). In particular, when the output format is MS Word, this package is the best solution in R.

On the other hand, styling text with the flextable package often requires considerable effort. The following example subscripts the numeric values in chemical formulas.

df <- data.frame(Oxide = c("SiO2", "Al2O3"), stringsAsFactors = FALSE)
ft <- flextable::flextable(df)

ft %>%
  flextable::compose(
    i = 1, j = "Oxide",
    value = flextable::as_paragraph(
      "SiO", as_sub("2")
    )
  ) %>%
  flextable::compose(
    i = 2, j = "Oxide",
    value = flextable::as_paragraph(
      "Al", as_sub("2"), "O", as_sub("3")
    )
  )

The above example has two problems:

  1. This is just a manual rewriting of the table.
    • Users have to state explicitly which characters to subscript.
    • For fine formatting, users have to apply compose to each cell one by one.
  2. Users have to learn many functions from the flextable package.
    • compose, as_paragraph, and as_sub in the above example

The first problem can be solved with a for loop; however, the code becomes quite complex.

df <- data.frame(Oxide = c("SiO2", "Fe2O3"), stringsAsFactors = FALSE)
ft <- flextable::flextable(df)

for (i in seq(nrow(df))) {
  ft <- flextable::compose(
    ft, i = i, j = "Oxide",
    value = flextable::as_paragraph(
      list_values = df$Oxide[i] %>%
        stringr::str_replace_all("([2-9]+)", " \\1 ") %>%
        stringr::str_split(" ", simplify = TRUE) %>%
        purrr::map_if(
          function(x) stringr::str_detect(x, "[2-9]+"),
          flextable::as_sub
        )
    )
  )
}
ft

The ftExtra package provides an easy solution by introducing markdown. Because markdown describes formatting in plain text, all users have to do is manipulate character columns with their favorite tools, such as the dplyr and stringr packages.

  1. Preprocess a data frame to decorate texts with markdown syntax.
  2. Convert the data frame into a flextable object with the flextable or as_flextable function.
  3. Format markdown columns with colformat_md.

The following example elegantly simplifies the prior example.

df <- data.frame(Oxide = c("SiO2", "Fe2O3"), stringsAsFactors = FALSE)

df %>%
  dplyr::mutate(Oxide = stringr::str_replace_all(Oxide, "([2-9]+)", "~\\1~")) %>%
  flextable::flextable() %>%
  ftExtra::colformat_md()
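
As a minimal sketch of step 1 in isolation, the substitution can be run on a plain character vector to inspect the markdown text that colformat_md later receives. This sketch swaps in base R's gsub for stringr::str_replace_all purely to avoid extra dependencies; the variable names are illustrative.

```r
# Illustrative input: the same formulas as in the example above.
oxides <- c("SiO2", "Fe2O3")

# Wrap each run of the digits 2-9 in tildes, Pandoc's subscript syntax.
marked <- gsub("([2-9]+)", "~\\1~", oxides)

print(marked)
```

Each digit run ends up between a pair of tildes, which Pandoc's markdown reads as a subscript.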

The colformat_md function is smart enough to detect character columns, so users can start without learning its arguments. Of course, it is also possible to choose columns.
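
A minimal sketch of column selection, assuming the j argument of colformat_md accepts a column name in the same way as dplyr::select. The data frame and its Raw column are hypothetical and exist only to show the contrast.

```r
library(flextable)
library(ftExtra)

# Hypothetical data: both columns contain markdown syntax.
df <- data.frame(
  Oxide = c("SiO~2~", "Fe~2~O~3~"),
  Raw = c("SiO~2~", "Fe~2~O~3~"),
  stringsAsFactors = FALSE
)

# Only the Oxide column is parsed as markdown; Raw keeps its literal text.
ft <- colformat_md(flextable(df), j = "Oxide")
```

Printing ft in an R Markdown chunk should then show subscripts in Oxide and literal tildes in Raw.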

Another workflow is to read a markdown-formatted table from an external file. Again, markdown is plain text by design and can easily be embedded in any format, such as CSV or Excel. So users can do something like

readr::read_csv("example.csv") %>%
  flextable::flextable() %>%
  ftExtra::colformat_md()
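
To illustrate that markdown syntax survives a round trip through a CSV file, here is a small self-contained sketch. It writes a temporary file instead of relying on example.csv, and it uses base R's read.csv in place of readr::read_csv; the column names and cell values are made up for the illustration.

```r
# Write a small markdown-decorated table to a temporary CSV file.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("Oxide,Note",
    "SiO~2~,**major**",
    "Fe~2~O~3~,*minor*"),
  csv_path
)

# Read it back; the markdown markers are ordinary characters in the cells.
df <- read.csv(csv_path, stringsAsFactors = FALSE)
print(df)

# The result can then be passed on as in the example above:
# flextable::flextable(df) %>% ftExtra::colformat_md()
```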

By default, the ftExtra package employs Pandoc’s markdown, which is also employed by R Markdown. This gives a consistent user experience when using the ftExtra package in R Markdown.

Basic examples

The example below shows that the colformat_md() function parses markdown text in the flextable object.

data.frame(
  a = c("**bold**", "*italic*"),
  b = c("^superscript^", "~subscript~"),
  c = c("`code`", "[underline]{.underline}"),
  d = c("*[**~ft~^Extra^**](https://ftextra.atusy.net/) is*",
        "[Cool]{.underline shading.color='skyblue'}"),
  stringsAsFactors = FALSE
) %>%
  as_flextable() %>%
  colformat_md()

The table header can also be formatted by passing part = "header" or part = "all" to colformat_md().
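
A minimal sketch of header formatting, assuming the header label is first replaced with a markdown string via flextable::set_header_labels. The label text and the data are illustrative.

```r
library(flextable)
library(ftExtra)

df <- data.frame(formula = "H~2~O", stringsAsFactors = FALSE)

ft <- flextable(df)
# Replace the header label with a markdown string (label text is illustrative).
ft <- set_header_labels(ft, formula = "**Formula**")
# part = "all" parses markdown in both the header and the body.
ft <- colformat_md(ft, part = "all")
```

With part = "header", only the label would be parsed and the body cell would keep its literal tildes.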

Supported syntax are

Notes:

Footnotes

An easy way to add a footnote is to use an inline footnote.

data.frame(
  package = "ftExtra",
  description = "Extensions for 'Flextable'^[Supports of footnotes]",
  stringsAsFactors = FALSE
) %>%
  as_flextable() %>%
  colformat_md() %>%
  flextable::autofit(add_w = 0.5)

Reference symbols can be configured by footnote_options(). Of course, markdown can be used inside footnotes as well.

data.frame(
  package = "ftExtra^[Short of *flextable extra*]",
  description = "Extensions for 'Flextable'^[Supports of footnotes]",
  stringsAsFactors = FALSE
) %>%
  as_flextable() %>%
  colformat_md(
    .footnote_options = footnote_options(ref = "i",
                                         prefix = '[',
                                         suffix = ']',
                                         start = 2,
                                         inline = TRUE,
                                         sep = "; ")
  ) %>%
  flextable::autofit(add_w = 0.5)

To add multiple footnotes to a cell, use the regular footnote syntax.

data.frame(x = 
"foo[^a]^,^ [^b]

[^a]: aaa

[^b]: bbb",
stringsAsFactors = FALSE
) %>%
  as_flextable() %>%
  colformat_md()

Images

Images can be inserted, optionally with width and/or height attributes. Specifying only one of them scales the other to keep the aspect ratio.

data.frame(
  R = sprintf("![](%s)", file.path( R.home("doc"), "html", "logo.jpg" )),
  stringsAsFactors = FALSE
) %>%
  as_flextable() %>%
  colformat_md() %>%
  flextable::autofit()

The R logo is distributed by The R Foundation with the CC-BY-SA 4.0 license.

Line breaks

By default, soft line breaks become spaces.

data.frame(linebreak = c("a\nb"), stringsAsFactors = FALSE) %>%
  as_flextable() %>%
  colformat_md()

Pandoc’s markdown supports hard line breaks by adding a backslash or double spaces at the end of a line.

data.frame(linebreak = c("a\\\nb"), stringsAsFactors = FALSE) %>%
  as_flextable() %>%
  colformat_md()
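
The escaping in the string above is easy to misread, so the following sketch inspects the characters that actually reach Pandoc. It uses only base R; the variable names are illustrative.

```r
# In R source, "\\" is one backslash and "\n" is one newline.
backslash_break <- "a\\\nb"  # a, backslash, newline, b
two_space_break <- "a  \nb"  # a, two spaces, newline, b

# Count the characters in each string.
print(nchar(backslash_break))
print(nchar(two_space_break))

# cat() prints the text as Pandoc receives it.
cat(backslash_break, "\n")
```

Both strings end the first line with a hard line break marker, so either form should serve as cell content in the example above.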

It is also possible to treat \n as a hard line break by extending Pandoc’s Markdown.

data.frame(linebreak = c("a\nb"), stringsAsFactors = FALSE) %>%
  as_flextable() %>%
  colformat_md(md_extensions = "+hard_line_breaks")

Markdown treats consecutive line breaks as a separator between blocks such as paragraphs. However, the flextable package lacks support for multiple paragraphs in a cell. As a workaround, colformat_md collapses them into a single paragraph, joined by the separator given to .sep (default: \n\n).

data.frame(linebreak = c("a\n\nb"), stringsAsFactors = FALSE) %>%
  as_flextable() %>%
  colformat_md(.sep = "\n\n")
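
As a sketch of a non-default separator, the following assumes .sep accepts an arbitrary string; the value " / " is chosen only for illustration.

```r
library(flextable)
library(ftExtra)

df <- data.frame(linebreak = "a\n\nb", stringsAsFactors = FALSE)

# The two paragraphs are joined with " / " instead of a blank line.
ft <- colformat_md(flextable(df), .sep = " / ")
```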

Citations

Citations are experimentally supported. Note that the table itself does not include a citation list; the list is expected to be produced by R Markdown.

First, create an ftExtra.bib file like the one below.

@Manual{R-ftExtra,
  title = {ftExtra: Extensions for Flextable},
  author = {Atsushi Yasumoto},
  year = {2022},
  note = {https://ftextra.atusy.net},
}

Second, specify it within the YAML front matter.

---
bibliography: ftExtra.bib
---

Finally, cite the references within tables.

data.frame(
  Cite = c("@R-ftExtra", "[@R-ftExtra]", "[-@R-ftExtra]"),
  stringsAsFactors = FALSE
) %>%
  as_flextable() %>%
  colformat_md() %>%
  flextable::autofit(add_w = 0.2)

If a citation style such as Vancouver requires citations to be numbered sequentially and consistently with the body text, offset the number manually, for example with colformat_md(.cite_offset = 5).

Math

The rendering of math is also possible.

data.frame(math = "$e^{i\\theta} = \\cos \\theta + i \\sin \\theta$",
           stringsAsFactors = FALSE) %>%
  as_flextable() %>%
  colformat_md() %>%
  flextable::autofit(add_w = 0.2)

Note that the results may be imperfect. This feature relies on Pandoc’s HTML writer, which by default will

render TeX math as far as possible using Unicode characters
https://pandoc.org/MANUAL.html#math-rendering-in-html

Emoji

Pandoc’s markdown provides an extension, emoji. To use it with colformat_md(), specify md_extensions="+emoji".

data.frame(emoji = c(":+1:"), stringsAsFactors = FALSE) %>%
  as_flextable() %>%
  colformat_md(md_extensions = "+emoji")

Other input formats

colformat_md supports a variety of input formats. Despite the name of the function, the input can even be HTML.

data.frame(
  x = "H<sub>2</sub>O",
  stringsAsFactors = FALSE
) %>%
  as_flextable() %>%
  colformat_md(.from = "html")

Note that multiple paragraphs are not supported if .from is not "markdown". Below is an example with commonmark.

data.frame(
  x = "foo\n\nbar",
  stringsAsFactors = FALSE
) %>%
  as_flextable() %>%
  colformat_md(.from = "commonmark")