In the getting started vignette we introduced a variety of predicate functions for interrogating the contents of a parsed R Markdown document. The family of rmd_has_* functions gives a convenient way of checking explicitly for elements of an Rmd. Often, however, we want to check for multiple elements at the same time and provide consistent output for any detected discrepancies.
To support this type of workflow, parsermd implements the concept of an Rmd template. These are tibble-based representations of Rmd elements which can easily be compared with a parsed document.
hw01
Imagine that a homework assignment has been distributed to students in the form of an Rmd document named hw01.Rmd. This document describes all of the necessary tasks for the student to complete and also includes scaffolding, in the form of Rmd chunks and markdown, that indicates where the students are expected to include their solutions.
(rmd = parse_rmd(system.file("hw01.Rmd", package = "parsermd")))
#> ├── YAML [2 lines]
#> ├── Heading [h3] - Load packages
#> │   └── Chunk [r, 1 opt, 2 lines] - load-packages
#> ├── Heading [h3] - Exercise 1
#> │   ├── Markdown [2 lines]
#> │   └── Heading [h4] - Solution
#> │       └── Markdown [2 lines]
#> ├── Heading [h3] - Exercise 2
#> │   ├── Markdown [2 lines]
#> │   └── Heading [h4] - Solution
#> │       ├── Markdown [4 lines]
#> │       ├── Chunk [r, 2 opts, 5 lines] - plot-dino
#> │       ├── Markdown [2 lines]
#> │       └── Chunk [r, 2 lines] - cor-dino
#> └── Heading [h3] - Exercise 3
#>     ├── Markdown [2 lines]
#>     └── Heading [h4] - Solution
#>         ├── Markdown [4 lines]
#>         ├── Chunk [r, 1 lines] - plot-star
#>         ├── Markdown [2 lines]
#>         └── Chunk [r, 1 lines] - cor-star
We can see examples of this templating by extracting the contents of the markdown in the Exercise 1 > Solution section.
rmd_select(rmd, by_section(c("Exercise 1", "Solution")) & has_type("rmd_markdown")) %>%
as_document()
#> [1] "(Type your answer to Exercise 1 here. This exercise does not require any R code.)"
#> [2] ""
#> [3] ""
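The same selection pattern should extend to other node types. As a hedged sketch, the call below swaps in has_type("rmd_chunk") to pull the scaffolded code chunks out of Exercise 2 > Solution. The "rmd_chunk" type name is an assumption made by analogy with "rmd_markdown", and the example assumes the same hw01.Rmd file ships with the installed version of parsermd.

```r
library(parsermd)

# Parse the assignment distributed with the package (path as used in this vignette)
rmd = parse_rmd(system.file("hw01.Rmd", package = "parsermd"))

# Assumption: code chunks carry the type "rmd_chunk", by analogy with "rmd_markdown"
ex2_chunks = rmd_select(
  rmd,
  by_section(c("Exercise 2", "Solution")) & has_type("rmd_chunk")
)

# Convert the selected nodes back to a character vector of document lines
ex2_chunk_text = as_document(ex2_chunks)
ex2_chunk_text
```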
When a student completes this assignment we want to be able to check that they have included solutions in the appropriate sections. At a minimum this means that we need to check that these sections still exist, and secondarily we might also want to check that the provided content in the solution differs from the provided scaffolding.
We will begin by subsetting the original parsed document to select only the elements that will contain the student's answers. This assumes the other sections and elements are extraneous and contain things like background, instructions, and question text. Below we use rmd_select to select all of the elements of the original document contained within a section matching "Exercise *" and "Solution", which should cover the answers for all three exercises.
(rmd_sols = rmd_select(rmd, by_section(c("Exercise *", "Solution"))))
#> ├── Heading [h3] - Exercise 1
#> │   └── Heading [h4] - Solution
#> │       └── Markdown [2 lines]
#> ├── Heading [h3] - Exercise 2
#> │   └── Heading [h4] - Solution
#> │       ├── Markdown [4 lines]
#> │       ├── Chunk [r, 2 opts, 5 lines] - plot-dino
#> │       ├── Markdown [2 lines]
#> │       └── Chunk [r, 2 lines] - cor-dino
#> └── Heading [h3] - Exercise 3
#>     └── Heading [h4] - Solution
#>         ├── Markdown [4 lines]
#>         ├── Chunk [r, 1 lines] - plot-star
#>         ├── Markdown [2 lines]
#>         └── Chunk [r, 1 lines] - cor-star
Once we have this more limited set of elements, we use the rmd_template function to generate our template. Here we have included keep_content = TRUE in order to keep the scaffolded content for each answer, which will then be compared to the student's answers.
(rmd_tmpl = rmd_template(rmd_sols, keep_content = TRUE))
#> # A tibble: 9 x 5
#>   sec_h3    sec_h4  type     label   content
#>   <chr>     <chr>   <chr>    <chr>   <chr>
#> 1 Exercise… Soluti… rmd_mar… <NA>    "(Type your answer to Exercise 1 here. Thi…
#> 2 Exercise… Soluti… rmd_mar… <NA>    "(The answers for this Exercise are given …
#> 3 Exercise… Soluti… rmd_chu… plot-d… "dino_data <- datasaurus_dozen %>%\n filt…
#> 4 Exercise… Soluti… rmd_mar… <NA>    "And next calculate the correlation betwee…
#> 5 Exercise… Soluti… rmd_chu… cor-di… "dino_data %>%\n summarize(r = cor(x, y))"
#> 6 Exercise… Soluti… rmd_mar… <NA>    "(Add code and narrative as needed. Note t…
#> 7 Exercise… Soluti… rmd_chu… plot-s… ""
#> 8 Exercise… Soluti… rmd_mar… <NA>    "I'm some text, you should replace me with…
#> 9 Exercise… Soluti… rmd_chu… cor-st… ""
Once the template is constructed, we can compare it with a new Rmd document via the rmd_check_template function. Note that we can pass in an rmd_ast or rmd_tibble object directly, or the path to an Rmd, which will then be parsed and compared.
rmd_check_template(system.file("hw01-student.Rmd", package = "parsermd"), rmd_tmpl)
#> x The following required elements were missing in the document:
#> • Section "Exercise 3" > "Solution" is missing required 'markdown text'.
#> • Section "Exercise 3" > "Solution" is missing required 'markdown text'.
#> x The following document elements were unmodified from the template:
#> • Section "Exercise 2" > "Solution" has a 'code chunk' named 'plot-dino'
#> which has not been modified.
#> • Section "Exercise 2" > "Solution" has 'markdown text' which has not been
#> modified.
#> • Section "Exercise 2" > "Solution" has a 'code chunk' named 'cor-dino'
#> which has not been modified.
From the output we can see that there are several issues with the document submitted by the student: they are missing the two expected markdown text entries for Exercise 3, and it appears that they have not entered anything new for the chunks or markdown in Exercise 2.
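As noted above, rmd_check_template is also described as accepting an already parsed document. A minimal sketch of that variant follows; it assumes the version of parsermd described in this vignette, where parse_rmd returns an rmd_ast object, and it rebuilds the template so the snippet stands alone.

```r
library(parsermd)

# Rebuild the template from the earlier steps
rmd = parse_rmd(system.file("hw01.Rmd", package = "parsermd"))
rmd_sols = rmd_select(rmd, by_section(c("Exercise *", "Solution")))
rmd_tmpl = rmd_template(rmd_sols, keep_content = TRUE)

# Parse the student's document once, then check the parsed object
# instead of handing rmd_check_template a file path
student = parse_rmd(system.file("hw01-student.Rmd", package = "parsermd"))
rmd_check_template(student, rmd_tmpl)
```

Parsing once may be convenient when the same submission is checked against several templates, since the parsed object can be reused.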
Let's assume that our original template was a bit too strict, and we would like to revise the feedback it is giving to students.
If we decide that the markdown text for Exercise 3 is not actually necessary, we can remove this requirement by filtering those elements from rmd_sols or from rmd_tmpl. (Generally, the former is the suggested workflow and will always work, while the latter approach is likely to be somewhat fragile to any changes made to the template format in future releases.) Here we use rmd_select with the ! operator to remove these specific markdown elements.
rmd_sols %>%
  rmd_select( !(by_section(c("Exercise 3", "Solution")) & has_type("rmd_markdown")) )
#> ├── Heading [h3] - Exercise 1
#> │   └── Heading [h4] - Solution
#> │       └── Markdown [2 lines]
#> ├── Heading [h3] - Exercise 2
#> │   └── Heading [h4] - Solution
#> │       ├── Markdown [4 lines]
#> │       ├── Chunk [r, 2 opts, 5 lines] - plot-dino
#> │       ├── Markdown [2 lines]
#> │       └── Chunk [r, 2 lines] - cor-dino
#> └── Heading [h3] - Exercise 3
#>     └── Heading [h4] - Solution
#>         ├── Chunk [r, 1 lines] - plot-star
#>         └── Chunk [r, 1 lines] - cor-star
This new AST can then be passed to rmd_template and rmd_check_template to provide the revised feedback:
rmd_sols %>%
  rmd_select( !(by_section(c("Exercise 3", "Solution")) & has_type("rmd_markdown")) ) %>%
  rmd_template(keep_content = TRUE) %>%
  rmd_check_template(system.file("hw01-student.Rmd", package = "parsermd"), .)
#> x The following document elements were unmodified from the template:
#> • Section "Exercise 2" > "Solution" has a 'code chunk' named 'plot-dino'
#> which has not been modified.
#> • Section "Exercise 2" > "Solution" has 'markdown text' which has not been
#> modified.
#> • Section "Exercise 2" > "Solution" has a 'code chunk' named 'cor-dino'
#> which has not been modified.
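The alternative mentioned earlier, filtering rmd_tmpl rather than rmd_sols, could look something like the sketch below. It relies on the column names (sec_h3, type) shown in the printed template and assumes the untruncated values are "Exercise 3" and "rmd_markdown". dplyr is an added dependency introduced only for this illustration, and, as cautioned above, this approach may break if the template format changes.

```r
library(parsermd)
library(dplyr)

# Rebuild the template from the earlier steps
rmd = parse_rmd(system.file("hw01.Rmd", package = "parsermd"))
rmd_sols = rmd_select(rmd, by_section(c("Exercise *", "Solution")))
rmd_tmpl = rmd_template(rmd_sols, keep_content = TRUE)

# Assumption: sec_h3 holds the full heading text and type holds the node type.
# Drop the markdown rows that belong to Exercise 3, keep everything else.
tmpl_relaxed = dplyr::filter(
  rmd_tmpl,
  !(sec_h3 == "Exercise 3" & type == "rmd_markdown")
)
tmpl_relaxed
```

If the filtered tibble keeps whatever class rmd_check_template expects of a template, it could then be passed in place of rmd_tmpl; that is not verified here, which is one more reason to prefer filtering rmd_sols.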