CRAN Task View: Web Technologies and Services

Maintainer: Mauricio Vargas Sepulveda
Contact: mavargas11 at uc.cl
Version: 2022-01-23
URL: https://CRAN.R-project.org/view=WebTechnologies
Source: https://github.com/cran-task-views/WebTechnologies/
Contributions: Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide.
Citation: Mauricio Vargas Sepulveda (2022). CRAN Task View: Web Technologies and Services. Version 2022-01-23. URL https://CRAN.R-project.org/view=WebTechnologies.
Installation: The packages from this task view can be installed automatically using the ctv package. For example, ctv::install.views("WebTechnologies", coreOnly = TRUE) installs all the core packages or ctv::update.views("WebTechnologies") installs all packages that are not yet installed and up-to-date. See the CRAN Task View Initiative for more details.

This task view contains information about how to use R and the world wide web together. The base version of R does not ship with many tools for interacting with the web. Thankfully, an increasingly large number of add-on packages fill this gap. This task view focuses on packages for obtaining web-based data and information, frameworks for building web-based R applications, and online services that can be accessed from R. A list of available packages and functions is presented below, grouped by the type of activity. The rOpenSci task view: Open Data provides further discussion of online data sources that can be accessed from R.

If you have any comments or suggestions for additions or improvements for this task view, please submit an issue or a pull request in the GitHub repository linked above. If you can’t contribute on GitHub, please send an e-mail to the maintainer address above. If you have an issue with one of the packages discussed below, please contact the maintainer of that package.

Thanks to all contributors to this task view, especially to Scott Chamberlain, Thomas Leeper, Patrick Mair, Karthik Ram, and Christopher Gandrud who maintained this task view up to 2021.

Tools for Working with the Web from R

Core Tools For HTTP Requests

Three main packages should cover most use cases of interacting with the web from R. crul is an R6-based HTTP client that provides asynchronous HTTP requests, a pagination helper, HTTP mocking via webmockr, and request caching for unit tests via vcr. crul targets R developers more than end users. httr provides a more user-facing client for HTTP requests and differs from crul in that it supports OAuth. Note that you can pass additional curl options when you instantiate R6 classes in crul, and through the config parameter in httr. curl is a lower-level package that provides a closer interface between R and the libcurl C library, but is less user-friendly. curl underlies both crul and httr. curl may be useful for operations on web-based XML or for FTP operations (as crul and httr focus primarily on HTTP). curl::curl() is an SSL-compatible replacement for base R’s url() and supports HTTP/2, SSL (https, ftps), gzip, deflate and more. For websites serving insecure HTTP (i.e. using the “http” rather than the “https” prefix), most R functions can extract data directly, including read.table() and read.csv(); this also applies to functions in add-on packages such as jsonlite::fromJSON() and XML::xmlParse(). For more specific situations, the following resources may be useful:
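Returning to the note on curl options above, the following is a minimal sketch of where such options go in each of the three packages. It assumes crul, httr, and curl are installed; the URL and the 10-second timeout are placeholders chosen for illustration, and no request is actually sent.

```r
# Hypothetical endpoint used only for illustration; nothing is requested below.
base_url <- "https://example.org"

# crul: curl options are supplied via `opts` when the R6 client is instantiated.
client <- crul::HttpClient$new(url = base_url, opts = list(timeout = 10))

# httr: the same option is wrapped with config() and later passed to a verb.
cfg <- httr::config(timeout = 10)

# curl: options are set on a handle, which is then given to a request function.
h <- curl::new_handle()
curl::handle_setopt(h, timeout = 10)

# Uncomment to perform the requests (requires network access):
# res_crul <- client$get(path = "/")
# res_httr <- httr::GET(base_url, cfg)
# res_curl <- curl::curl_fetch_memory(base_url, handle = h)
```

The calls are namespace-qualified so that the three packages do not mask each other's functions when used side by side.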

Handling HTTP Errors/Codes

Parsing Structured Web Data

The vast majority of web-based data is structured as plain text, HTML, XML, or JSON (JavaScript Object Notation). Web service APIs increasingly rely on JSON, but XML is still prevalent in many applications. Several packages work specifically with these formats. Their functions can interact directly with insecure web pages or parse locally stored or in-memory web files.
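As a small illustration of parsing in-memory content, the sketch below reads a JSON string with jsonlite and an XML string with xml2. Both packages are assumed to be installed, and the sample strings are invented for this example rather than taken from any real service.

```r
# Invented JSON payload held in memory as a character string.
json_txt <- '{"package": "httr", "depends": ["curl", "jsonlite"], "core": true}'

# fromJSON() returns a named list; the JSON array becomes a character vector.
parsed_json <- jsonlite::fromJSON(json_txt)
depends <- parsed_json$depends

# Invented XML document, also held in memory.
xml_txt <- paste0(
  "<packages>",
  "<package name=\"crul\" core=\"true\"/>",
  "<package name=\"rvest\" core=\"false\"/>",
  "</packages>"
)

# read_xml() accepts literal XML; XPath selects the nodes of interest.
doc <- xml2::read_xml(xml_txt)
pkg_nodes <- xml2::xml_find_all(doc, ".//package")
pkg_names <- xml2::xml_attr(pkg_nodes, "name")
```

The same functions also accept a file path or URL in place of the literal string, which is how they are typically used on downloaded files.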

Tools for Working with URLs

Tools for Working with Scraped Webpage Contents

Security

Other Useful Packages and Functions

Web and Server Frameworks

Web Services

Cloud Computing and Storage

Document and Code Sharing

Data Analysis and Processing Services

Social Media Clients

Web Analytics Services

Web Services for R Package Development

Other Web Services

CRAN packages

Core: crul, curl, httr, shiny, vcr, webmockr, xml2.
Regular: abbyyR, ajv, analogsea, aRxiv, aws.signature, AzureAuth, AzureContainers, AzureCosmosR, AzureGraph, AzureKusto, AzureQstor, AzureRMR, AzureStor, AzureTableStor, AzureVision, AzureVM, beakr, bigrquery, boilerpipeR, boxr, brandwatchR, captr, clarifai, crunch, crunchy, dataone, datarobot, dataverse, discgolf, downloader, duckduckr, europepmc, FastRWeb, fauxpas, fbRads, fiery, ganalytics, geonapi, geosapi, ggmap, gh, gistr, git2r, gitlabr, gmailr, googleAnalyticsR, googleAuthR, googleCloudStorageR, googleComputeEngineR, googleLanguageR, googlesheets4, googleVis, graphTweets, gsheet, gtrendsR, hackeRnews, htm2txt, htmltidy, htmltools, httpcache, httpcode, httping, httpRequest, httptest, httpuv, imguR, instaR, ipaddress, iptools, jqr, js, jsonlite, jsonvalidate, jstor, languagelayeR, longurl, magrittr, mailR, mapsapi, mathpix, Microsoft365R, mime, mscstexta4r, mscsweblm4r, nanonext, ndjson, notifyme, oai, OAIHarvester, openadds, opencage, opencpu, OpenML, osmplotr, osrm, ows4R, paws, pdftables, plotKML, plotly, plumber, postlightmercury, pubmed.mineR, pushoverr, qualtRics, radiant, RAdwords, randNames, rapiclient, rapport, rcoreoa, Rcrawler, rcrossref, RCurl, rdatacite, rdrop2, redcapAPI, REDCapR, repmis, reqres, request, rerddap, restfulr, Rexperigen, Rfacebook, rfigshare, rgeolocate, RgoogleMaps, rhub, rio, rjson, RJSONIO, Rlinkedin, rLTP, roadoi, ROAuth, robotstxt, Rook, rorcid, rosetteApi, routr, rpinterest, rplos, RPushbullet, rrefine, RSclient, rscopus, rsdmx, RSelenium, Rserve, RSiteCatalyst, RSmartlyIO, RStripe, rtweet, rvest, RYandexTranslate, scholar, searchConsoleR, selectr, seleniumPipes, sendmailR, servr, slackr, spiderbar, streamR, swagger, tidyRSS, transcribeR, twitteR, uaparserjs, urltools, V8, vkR, W3CMarkupValidator, WebAnalytics, webreadr, webshot, webutils, whisker, WikidataQueryServiceR, WikidataR, wikipediatrend, WikipediR, WufooR, XML, XML2R, xslt, yhatr, zen4R.
Archived: dash, RGA.

Related links

Other resources