httptest2
makes it easy for you to write tests that don’t require a network connection. With capture_requests()
, you can record responses from real requests so that you can use them later in tests. A further benefit of testing with mocks is that you don’t have to deal with authentication and authorization on the server in your tests—you don’t need to supply real login credentials for your test suite to run. You can have full test coverage of your code, both on public continuous-integration services like GitHub Actions and when you submit packages to CRAN, all without having to publish secret tokens or passwords.
It is important to ensure that the mocks you include in your test suite do not inadvertently reveal private information as well. HTTP responses may contain things you don’t want to share: credentials in the headers, record IDs in the body or even the URL itself, or even personally identifiable information.
httptest2
provides a framework for sanitizing the responses that capture_requests()
records. By default, it redacts some standard ways that auth secrets may appear in HTTP responses. The framework is extensible and allows you to specify custom redaction policies that match how your API accepts and returns sensitive information.
By default, the capture_requests()
context evaluates the redact_cookies()
function on a response object before writing it to disk. redact_cookies()
redacts the Set-Cookie
response header, which may contain auth credentials. Many APIs don’t return anything in the HTTP response that leaks auth secrets, and while you send secrets in your request, the httr2_request
object isn’t saved in the mocks, only the httr2_response
.
What does “redacting” entail? We aren’t the CIA working with classified reports, taking a heavy black marker over certain details. In our case, redacting means replacing the sensitive content with the string “REDACTED”. Your recorded responses will be just as it was “live”. And only the recorded responses will be affected—the actual response you’re capturing in your active R session is not modified, only the mock that is written out.
To illustrate, here’s a request that has a cookie in the response. Let’s record it.
capture_requests({
<- request("http://httpbin.org/cookies/set") %>%
real_resp req_url_query(token = "12345") %>%
# httpbin normally does a 302 redirect after this request,
# but let's prevent that just to illustrate
req_options(followlocation = FALSE) %>%
req_perform()
})
In the actual response object in our R session, the cookie is there:
resp_headers(real_resp)
## <httr2_headers>
## Date: Wed, 29 Dec 2021 18:14:20 GMT
## Content-Type: text/html; charset=utf-8
## Content-Length: 223
## Connection: keep-alive
## Server: gunicorn/19.9.0
## Location: /cookies
## Set-Cookie: token=12345; Path=/
## Access-Control-Allow-Origin: *
## Access-Control-Allow-Credentials: true
But when we load that recorded response in tests later, the cookie won’t appear because it was redacted:
<- "httpbin.org/cookies/set-5b2631.R"
mockfile <- source(mockfile)$value
mock resp_headers(mock)
## <httr2_headers>
## Date: Wed, 29 Dec 2021 18:14:20 GMT
## Content-Type: text/html; charset=utf-8
## Content-Length: 223
## Connection: keep-alive
## Server: gunicorn/19.9.0
## Location: /cookies
## Set-Cookie: REDACTED
## Access-Control-Allow-Origin: *
## Access-Control-Allow-Credentials: true
with_mock_api({
request("http://httpbin.org/cookies/set") %>%
req_url_query(token = "12345") %>%
req_options(followlocation = FALSE) %>%
req_perform() %>%
resp_header("Set-Cookie")
})
## [1] "REDACTED"
Sensitive or personal information is not limited to response cookies or headers. Sometimes identifiers are built into URLs or response bodies. These may be less sensitive than auth tokens, but you may want to conceal or anonymize your data that is included in test fixtures.
Redacting functions can help with this content as well. You can use redactors on any part of the response object, not just the headers and cookies. A redactor is just a function that takes a response as input and returns a response object, so anything is possible if you write a custom redactor.
For example, in the API for Pivotal Tracker, the agile project management tool, the Pivotal project id is built into many of its URLs. As a result, it would appear in mock file paths you record. The id is also often included in the response body.
We’d rather not have that information leak in our test fixtures, so in the pivotaltrackR package, which wraps this API, we need to tell capture_requests()
to scrub this id when we record mocks.
Note:
pivotaltrackR
useshttr
andhttptest
, nothttr2
andhttptest2
, but the redacting behavior is consistent betweenhttptest
andhttptest2
.
To do this, we’ll use set_redactor()
to supply a custom function. The project id is stored in the R session in options(pivotal.project)
, so we can identify it and find-and-replace it with gsub_response()
. The function takes response
as its first argument and then passes the rest to gsub()
, which is called on both the response URL and the response body.
set_redactor(~ gsub_response(., getOption("pivotal.project"), "123"))
Note the formula shorthand: this follows the syntax in the purrr
package for defining anonymous functions. It is equivalent to function(response) gsub_response(response, getOption("pivotal.project"), "123")
.
Valid inputs to set_redactor()
include:
response
, and returning a valid response
object.
as the “response” argumentNULL
, to override the default redact_cookies()
and do no redactingTo see this in action, let’s record a request:
set_redactor(redact_pivotal)
start_capturing()
<- getStories(search = "mnt")
s stop_capturing()
Note that the actual project id appears in the data returned from the search.
1]]$project_id
s[[## [1] "my-project-name"
However, the project id won’t be found in the recorded file. If we load the recorded response in with_mock_api()
, we’ll see the value we substituted in:
with_mock_api({
<- getStories(search = "mnt")
s
})1]]$project_id
s[[## [1] "123"
Nor will the project id appear in the file path: since the redactor is evaluated before determining the file path to write to, if you alter the response URL, the destination file path will be generated based on the modified URL. In this case, our mock is written to “…/projects/123/stories-fb1776.json”, not “…/projects/my-project-name/stories-fb1776.json”.
We can do more response cleaning with custom functions. All of the redactors in httptest2
take the “response” object as their first argument and return the response object modified in some way. This lends them to pipelining, as with the magrittr
package.
Continuing with the pivotaltrackR
example, let’s also prune the domain and API root path from the URLs we’re recording so that we’re making shorter file paths:
function(response) {
require(magrittr, quietly=TRUE)
%>%
response # Shorten the URL
gsub_response("https://www.pivotaltracker.com/services/v5/", "", fixed = TRUE) %>%
# Remove my project ID
gsub_response(getOption("pivotal.project"), "123")
}
If you’re writing a package that wraps an API and you need a custom redactor to safely record API responses, you’ll want to make sure that you always record with that redactor. You don’t want to forget to call set_redactor()
in your R session and end up recording fixtures that contain your auth secrets.
To make sure that your redactor is “always on” for your package, httptest2
enables you to define a package-level redactor. To do this, put a redacting function in inst/httptest2/redact.R
in your package. (In fact, the function in the above example is the package redactor in pivotaltrackR
.)
Any time you record requests while your package is loaded, as when running tests or building vignettes, this function will be called on the response
object before writing it to disk. It’s automatic: set it there once and you never have to remember.
Finally, depending on how long the URLs are in the API requests you make, you may need to programmatically shorten them if you’re planning on submitting your package to CRAN because CRAN requires file names to be 100 characters or less. Long file names throw a “non-portable file paths” message in R CMD check
.
A redactor can help solve this. For example, if all of your API endpoints sit beneath https://language.googleapis.com/v1/
, you could:
set_redactor(function (x) {
gsub_response(x, "https\\://language.googleapis.com/v1/", "api/")
})
This will replace that string in all parts of the mock file that is saved, including in the file path that is written–that is, paths will start with “api/” rather than “language.googleapis.com/v1/”, saving you (in this case) 23 characters. The function will also be called when loading mocks in with_mock_api()
so that the shortened file paths are found.