Introduction

The official French open data portal hosts a large quantity of data and exposes a well-structured API. The BARIS package lets you use this API to retrieve the data you need from the portal, directly from R.

Within the portal, a data set contains one or more resources, each of which is a data frame. Throughout this vignette, the term resource therefore means a data frame inside a data set.

The package is available on CRAN, and a development version is hosted on GitHub. To install the CRAN release:


install.packages("BARIS")

# Load the package so that the BARIS_*() functions used below are available
library(BARIS)

Enough talk. Let's dive into a reproducible example.

BARIS_explain()

The BARIS_explain() function returns the description of a data set. It takes one argument, the ID of the data set:


BARIS_explain(datasetId = "5cebfa8306e3e77ffdb31ef5")
#> [1] "Monuments historiques situés sur le territoire de Marseille, avec adresse, numéro de base Mérimée (base de données du Ministère de la Culture recensant les monuments historiques de toute la France) et points de géolocalisation"

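Don't panic if you don't speak French: you can always translate a description with the googleLanguageR package. The description above reads, in English: "Historic monuments located in the territory of Marseille, with address, Mérimée database number (the Ministry of Culture database listing historic monuments across France) and geolocation points."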

Now it's time to list the resources contained in this data set.

BARIS_resources()

The BARIS_resources() function lists the resources (data frames) available within a data set. It takes one argument, the ID of the data set:

BARIS_resources(datasetId = "5cebfa8306e3e77ffdb31ef5")
#> # A tibble: 2 x 6
#>   id         title       format published   url             description         
#>   <chr>      <chr>       <chr>  <chr>       <chr>           <chr>               
#> 1 59ea7bba-~ MARSEILLE_~ csv    2019-05-27~ https://trouve~ Monuments historiqu~
#> 2 6328f8b3-~ Plan des M~ pdf    2019-05-27~ https://trouve~ Edition Janvier 2013

The output above shows that the data set has two resources, a CSV file and a PDF file. Now we've reached the interesting part: extracting the data frame that you'll work on.

BARIS_extract()

Using BARIS_extract(), you can load a resource directly into your R session. Currently, only these formats are supported: json, csv, xls, xlsx, xml, geojson and shp. For any other format, you can always use the url of the resource to download it manually.
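As a rough illustration of that rule, the sketch below splits a resource listing into the rows that could be passed to BARIS_extract() and the rows that need a manual download. The `resources` data frame is a hand-built stand-in that only mirrors the title and format columns printed earlier (titles are kept truncated, as displayed), and `supported_formats` is simply the list from the paragraph above. Whether BARIS_extract() itself compares formats case-insensitively is not stated here, so the `tolower()` call is an assumption.

```r
# Formats listed in this vignette as supported by BARIS_extract()
supported_formats <- c("json", "csv", "xls", "xlsx", "xml", "geojson", "shp")

# Hand-built stand-in for the output of BARIS_resources(); only two of its
# columns are reproduced, with titles truncated as in the printed tibble
resources <- data.frame(
  title  = c("MARSEILLE_~", "Plan des M~"),
  format = c("csv", "pdf"),
  stringsAsFactors = FALSE
)

# Flag each resource according to whether its format is in the supported list
is_supported <- tolower(resources$format) %in% supported_formats

extractable <- resources[is_supported, ]   # candidates for BARIS_extract()
manual      <- resources[!is_supported, ]  # to be downloaded from their url

extractable$format
#> [1] "csv"
manual$format
#> [1] "pdf"
```

For the rows in `manual`, base R's download.file() can save the file locally once you pass it the full value of the url column and a destination path.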

The function takes two arguments: the ID of the resource and its format.

Note that the two kinds of ID look different: the data set ID used above is an unbroken string of letters and digits, whereas a resource ID is split into groups by hyphens.
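If you want to guard against passing the wrong kind of ID, the difference can be checked with a regular expression. The helper below, `id_kind()`, is hypothetical and not part of BARIS. Its two patterns (24 hexadecimal characters for a data set, five hyphen-separated hexadecimal groups for a resource) are inferred only from the two IDs shown in this vignette, so treat them as an assumption rather than a rule documented by the portal.

```r
# Classify an ID by its shape. Both patterns are guesses based on the
# example IDs in this vignette, not on a documented specification.
id_kind <- function(id) {
  dataset_pattern  <- "^[0-9a-f]{24}$"
  resource_pattern <- "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"

  if (grepl(dataset_pattern, id)) {
    "dataset"
  } else if (grepl(resource_pattern, id)) {
    "resource"
  } else {
    "unknown"
  }
}

id_kind("5cebfa8306e3e77ffdb31ef5")
#> [1] "dataset"
id_kind("59ea7bba-f38a-4d75-b85f-2d1955050e53")
#> [1] "resource"
```

A check like this is only a convenience for catching a mixed-up argument early. An ID that matches neither pattern may still be valid on the portal.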


data <- BARIS_extract(resourceId = "59ea7bba-f38a-4d75-b85f-2d1955050e53", format = "csv")

head(data)
#> # A tibble: 6 x 10
#>   n_base_merimee date_de_protection_a~ denomination        adresse   code_postal
#>   <chr>          <chr>                 <chr>               <chr>           <int>
#> 1 PA00081336     Classement : liste d~ Ancienne église de~ "/"             13002
#> 2 PA00081340     Classement: 13/09/19~ Eglise Saint-Laure~ "Esplana~       13002
#> 3 PA00081331     Classement: 29/01/19~ Chapelle et Hospic~ "2, Rue ~       13002
#> 4 PA00081344     Classement: 16/06/19~ Fort Saint-Jean     ""              13002
#> 5 PA00081325     Inscription : 23/11/~ Les deux bâtiments~ "Quai du~       13002
#> 6 PA00081334     Inscription : 07/07/~ Clocher des Accoul~ "Montée ~       13002
#> # ... with 5 more variables: proprietaire_du_monument <chr>,
#> #   epoque_de_construction <chr>, date_de_construction <chr>, longitude <dbl>,
#> #   latitude <dbl>

End of the vignette.