gh generally sends a Personal Access Token (PAT) with its requests. Some endpoints of the GitHub API can be accessed without authenticating yourself. But once your API use becomes more frequent, you will want a PAT to prevent problems with rate limits and to access all possible endpoints.
This article describes how to store your PAT, so that gh can find it (automatically, in most cases). The function gh uses for this is gh_token()
.
More resources on PAT management:
usethis::gh_token_help()
and usethis::git_sitrep()
help you check if a PAT is discoverable and has suitable scopesusethis::create_github_token()
guides you through the process of getting a new PATgitcreds::gitcreds_set()
helps you explicitly put your PAT into the Git credential storegh::gh()
allows the user to provide a PAT via the .token
argument and to specify a host other than “github.com” via the .api_url
argument. (Some companies and universities run their own instance of GitHub Enterprise.)
However, it’s annoying to always provide your PAT or host and it’s unsafe for your PAT to appear explicitly in your R code. It’s important to make it possible for the user to provide the PAT and/or API URL directly, but it should rarely be necessary. gh::gh()
is designed to play well with more secure, less fiddly methods for expressing what you want.
How are .api_url
and .token
determined when the user does not provide them?
.api_url
defaults to the value of the GITHUB_API_URL
environment variable and, if that is unset, falls back to "https://api.github.com"
. This is always done before worrying about the PAT.gh_token(.api_url)
. That is, the token is looked up based on the host.gh now uses the gitcreds package to interact with the Git credential store.
gh calls gitcreds::gitcreds_get()
with a URL to try to find a matching PAT. gitcreds::gitcreds_get()
checks session environment variables and then the local Git credential store. Therefore, if you have previously used a PAT with, e.g., command line Git, gh may retrieve and re-use it. You can call gitcreds::gitcreds_get()
directly, yourself, if you want to see what is found for a specific URL.
If you see something like this:
#> <gitcreds>
#> protocol: https
#> host : github.com
#> username: PersonalAccessToken
#> password: <-- hidden -->
that means that gitcreds could get the PAT from the Git credential store. You can call gitcreds_get()$password
to see the actual PAT.
If no matching PAT is found, gitcreds::gitcreds_get()
errors.
If you don’t have a Git installation, or your Git installation does not have a working credential store, then you can specify the PAT in an environment variable. For github.com
you can set the GITHUB_PAT_GITHUB_COM
or GITHUB_PAT
variable. For a different GitHub host, call gitcreds::gitcreds_cache_envvar()
with the API URL to see the environment variable you need to set. For example:
On a machine used for interactive development, we recommend:
Store your PAT(s) in an official credential store.
Do not store your PAT(s) in plain text in, e.g., .Renviron
. In the past, this has been a common and recommended practice for pragmatic reasons. However, gitcreds/gh have now evolved to the point where it’s possible for all of us to follow better security practices.
If you use a general-purpose password manager, like 1Password or LastPass, you may also want to store your PAT(s) there. Why? If your PAT is “forgotten” from the OS-level credential store, intentionally or not, you’ll need to provide it again when prompted.
If you don’t have any other record of your PAT, you’ll have to get a new PAT whenever this happens. This is not the end of the world. But if you aren’t disciplined about deleting lost PATs from https://github.com/settings/tokens, you will eventually find yourself in a confusing situation where you can’t be sure which PAT(s) are in use.
On a headless system, such as on a CI/CD platform, provide the necessary PAT(s) via secure environment variables. Regular environment variables can be used to configure less sensitive settings, such as the API host. Don’t expose your PAT by doing something silly like dumping all environment variables to a log file.
Note that on GitHub Actions, specifically, a personal access token is automatically available to the workflow as the GITHUB_TOKEN
secret. That is why many workflows in the R community contain this snippet:
This makes the automatic PAT available as the GITHUB_PAT
environment variable. If that PAT doesn’t have the right permissions, then you’ll need to explicitly provide one that does (see link above for more).
If there is no PAT to be had, gh::gh()
sends a request with no token. (Internally, the Authorization
header is omitted if the PAT is found to be the empty string, ""
.)
What do PAT-related failures look like?
If no PAT is sent and the endpoint requires no auth, the request probably succeeds! At least until you run up against rate limits. If the endpoint requires auth, you’ll get an HTTP error, possibly this one:
GitHub API error (401): 401 Unauthorized
Message: Requires authentication
If a PAT is first discovered in an environment variable, it is taken at face value. The two most common ways to arrive here are PAT specification via .Renviron
or as a secret in a CI/CD platform, such as GitHub Actions. If the PAT is invalid, the first affected request will fail, probably like so:
GitHub API error (401): 401 Unauthorized
Message: Bad credentials
This will also be the experience if an invalid PAT is provided directly via .token
.
Even a valid PAT can lead to a downstream error, if it has insufficient scopes with respect to a specific request.