OneMapSGAPI R Package

The OneMapSGAPI package provides useful wrappers for the OneMapSG API client. It allows users to easily query spatial data from the API in a tidy format and provides additional functionalities to allow easy data manipulation.

library(onemapsgapi)

Authentication

This function is a wrapper for the Authentication API endpoint In order to query data, most API endpoints in the OneMapSG API require a token. First-time users can register themselves using the OneMapSG registration form. Subsequently, they can retrieve their tokens using the get_token() function with their email and password, for example:

token <- get_token("user@example.com", "password")
#> Warning in get_token(Sys.getenv("onemap_email"), Sys.getenv("onemap_pw")): The
#> request produced a 401 error

The function will also print a message informing users of the token’s expiry date and time.

Themes

These functions are wrappers for the Themes API endpoints. Themes in the OneMap SG API refer to types of locations, such as kindergartens, parks, hawker centres etc.

Check Theme Status

The get_theme_status() function allows users to check if data associated with a theme has been updated after a certain time. It returns a named logical.

# returns named logical
get_theme_status(token, "kindergartens")
get_theme_status(token, "2020-01-01", "12:00:00", "hotels")
# returns NULL, warning message shows status code
get_theme_status("invalid_token", "blood_bank")
# returns NULL, warning message shows error
get_theme_status(token, "invalid_theme")

Get Theme Information

The get_theme_info() function allows users to get information related to a specific theme. It returns a named character vector with Theme Name and Query Name.

# returns named character vector
get_theme_status(token, "kindergartens")

If an error occurs, the function returns NULL, along with a warning message.

# returns NULL, warning message shows status code
get_theme_status("invalid_token", "blood_bank")
# returns NULL, warning message shows error
get_theme_status(token, "invalid_theme")

Find Themes of Interest

The search_themes() function allows users to find details of themes of interest. It returns a tibble of themes matching user’s search terms. Alternatively, if no search terms are added, a tibble of all themes available through the API is returned. The variable THEMENAME in the output tibble serves as the input for getting theme data.

# return all themes related to "hdb" or "parks"
search_themes(token, "hdb", "parks")
# return all possible themes
search_themes(token)
# return all possible themes, with less variables
search_themes(token, more_info = FALSE)

If an error occurs, the function returns NULL, along with a warning message.

search_themes("my_invalid_token")

Get Theme Data

The function get_theme() returns data related to a particular theme, often location coordinates and other information. It returns the output as a tibble, or prints a warning message when an error is encountered. All tibbles will contain the variables: NAME, DESCRIPTION, ADDRESSPOSTALCODE, ADDRESSSTREETNAME, Lat, Lng, ICON_NAME, and some provide additional information; for example, query hawker centres gives additional information about the completion date of each hawker centre.

# return all hotel data
get_theme(token, "hotels")
# return all monuments data within a bounding area
get_theme(token, "monuments", extents = "1.291789,%20103.7796402,1.3290461,%20103.8726032")
# returns a list of status tibble and output tibble
get_theme(token, "lighting", return_info = TRUE) %>%
  str()
# error: output is NULL, warning message shows status code
get_theme("invalid_token", "hotels")

# error: output is NULL, warning message shows error message from request
get_theme(token, "non-existent-theme")

# error: output is query_info, warning message query did not return any records
get_theme(token, "ura_parking_lot", "1.291789,%20103.7796402,1.3290461,%20103.8726032")

Planning Areas

These functions are a wrapper for the Planning Area API. Planning Area API endpoints allow users to get spatial data and data related to the planning areas in Singapore. This package provides users with the ability to query the data and optionally handles necessary spatial data wrangling on behalf of the user.

Get Planning Areas

The function get_planning_areas() allows users to query all the spatial polygons associated with Singapore’s planning areas, for certain years. The function also optionally helps users transform raw geojson strings into sf or sp objects.

If the parameter read is not specified, the function returns a raw JSON object with planning names and geojson string vectors.

# returns raw JSON object
get_planning_areas(token)
get_planning_areas(token, 2008)

If read is specified, any missing geojson objects will be dropped (this affects the “Others” planning area returned by the API). The returned outputs are NOT projected.

If read = “sf”, the function returns a single “sf” dataframe with 2 columns: “name” (name of planning area) and “geometry”, which contains the simple features.

# returns dataframe of class "sf"
get_planning_areas(token, read = "sf")

If read = “rgdal”, the function returns a SpatialPolygonsDataFrame of “sp” class. The names of each planning area is recorded in the “name” column of the dataframe.

# returns SpatialPolygonsDataFrame ("sp" object)
get_planning_areas(token, read = "rgdal")

Note that if the user specifies a read method but does not have the corresponding package installed, the function will return the raw JSON and print a warning message.

If an error occurs, the function returns NULL and a warning message is printed.

# error: output is NULL, warning message shows status code
get_planning_areas("invalid_token")

Get Planning Area Names

The function get_planning_names() allows users to query all planning area names for certain years. The function returns a tibble with planning area code and planning area name.

# returns tibble
get_planning_names(token)
get_planning_names(token, 2008)

If an error occurs, the function returns NULL and a warning message is printed.

# error: output is NULL, warning message shows status code
get_planning_names("invalid_token")

Get Planning Area Polygon

The function get_planning_polygon() allows users to query a particular planning area polygon containing the specified location point. The function also optionally helps users transform raw geojson string output into sf or sp objects.

If the parameter read is not specified, the function returns a raw JSON object a list containing the planning area name and a geojson string representing the polygon.

# returns raw JSON object
get_planning_polygon(token)
get_planning_polygon(token, 2008)

If read = “sf”, the function returns a 1 x 2 “sf” dataframe: “name” (name of planning area) and “geometry”, which contains the simple feature.

# returns dataframe of class "sf"
get_planning_polygon(token, read = "sf")

If read = “rgdal”, the function returns a SpatialPolygonsDataFrame of “sp” class. The names of the planning area is recorded in the “name” column of the dataframe. If an error occurs, the function returns NULL and a warning message is printed.

# returns SpatialPolygonsDataFrame ("sp" object)
get_planning_polygon(token, read = "rgdal")

Note that if the user specifies a read method but does not have the corresponding package installed, the function will return the raw JSON and print a warning message.

If an error occurs, the function returns NULL and a warning message is printed.

# error: output is NULL, warning message shows status code
get_planning_polygon("invalid_token")
get_planning_polygon(token, "invalidlat", "invalidlon")

Population Query

These functions are a wrapper for the Population Query API. Population Query API endpoints allow users to pull socio-economic datasets by planning area, which each endpoint representing a dataset (e.g. getPopulationAgeGroup provides age group summary statistics by planning area). This package combines querying different Popquery API endpoints into single functions.

Multiple Queries

The function get_pop_queries() allows users to query multiple datasets for multiple towns, years and genders.

The gender parameter is only valid for the getEconomicStatus, getEthnicGroup, getMaritalStatus and getPopulationAgeGroup endpoints. If specified for other endpoints, the parameter will be dropped.

If gender is not specified for endpoints with a gender parameter, records for total, male and female will be returned. The notable exception to this is for the “getEthnicGroup” endpoint, which only returns the total record if gender is not specified. This is because by default, this is the only API endpoint with a gender parameter that does not return gender breakdown by default.

The function returns a tibble with each row representing a town in a particular year for a particular gender, and columns with the variables returned by the API endpoint. If any API call returns no data, the values will be NA but the row will be returned. However, if all data_types do not return data for that town and year, no row will be returned for it.

# example: returns output with no NA
get_pop_queries(token, c("getOccupation", "getLanguageLiterate"), 
                c("Bedok", "Yishun"), "2010")

# example: returns output with no NA and gender field
get_pop_queries(token, c("getEconomicStatus", "getEthnicGroup"), 
                "Yishun", "2010", "female")

If data types requested is a mix of those that accept gender parameters and does that do not, only gender = "Total" rows will have all records. The data types that does not accept gender params will be in gender = Total.

# example: gender not specified
get_pop_queries(token, c("getEconomicStatus", "getOccupation", "getLanguageLiterate"),
                "Bedok", "2010")
# example: gender specified
get_pop_queries(token, c("getEconomicStatus", "getOccupation", "getLanguageLiterate"),
                "Bedok", "2010", gender = "female")

If all data_types do return data for that town and year, no row will be returned for it. A warning message will show data_type/town/year/gender for which an error occurred.

# example: no records for 2012
get_pop_queries(token, c("getEconomicStatus", "getOccupation"),
                "Bedok", c("2010", "2012"))

Finally, to allow for faster computation, API calls can be called in parallel using parallel = TRUE. This is recommended for large requests.

Route Service

These functions are a wrapper for the Route Service API. The Route Service API provides users a way to query the route taken from one point to another. It provides information about the total time and distance taken for the route, route instructions and other infomation e.g. elevation, for a variety of routes (public transport, drive, walk, cycle). This package provides three different functions associated with this API, each serving different purposes.

All Route Information

The get_route() function returns all API output but with standardized column names, which allows for subsequent merging if desired. This is particularly useful as API output variable names may vary depending on parameters (e.g. start point is named differently between route = drive and route = pt).

It returns the full route data in a tibble format, or a list containing a tibble of results and list of status information if desired.

# example: only route data, route = drive
get_route(token, c(1.319728, 103.8421), c(1.319728905, 103.8421581), "drive")
# example: only route data, route = pt
get_route(token, c(1.319728, 103.8421), c(1.319728905, 103.8421581), "pt",
          mode = "bus", max_dist = 300)
# example: returns list of status list and output tibble
get_route(token, c(1.319728, 103.8421), c(1.319728905, 103.8421581), 
          "drive", status_info = TRUE) %>%
  str()
# example: error, warning message shows status code
get_route("invalid_token", c(1.319728, 103.8421), c(1.319728905, 103.8421581), "drive")

# example: error, warning message shows error message from request
get_route(token, c(300, 300), c(400, 500), "cycle")
get_route(token, c(1.319728, 103.8421), c(1.319728905, 103.8421581), "fly")

Total time and distance matrix

The function get_travel() allows the calculation of total travel time and distance for a tibble of start and end points. Users input a tibble of start and end points (and potentially other variables) and the function returns the tibble with additional columns, total_time and total_dist. Recognising that this API is most valuable for calculating total time travelled (as a improved measure of spatial distance compared to Euclidean distance), this function produces a cleaner output containing only the main variables of interest.

The function also accepts multiple arguments for route and pt_mode, allowing users to compare various route options.

Note that if as_wide = TRUE is selected, any columns with identical names as the additional output columns will be overwritten.

# example: create sample df with start and end coordinates
sample <- data.frame(start_lat = c(1.3746617, 1.3567797, 1.3361976, 500),
    start_lon = c(103.8366159, 103.9347695, 103.6957732, 501),
    end_lat = c(1.429443081, 1.380298287, 1.337586882, 601),
    end_lon = c(103.835005, 103.7452918, 103.6973215, 600),
    add_info = c("a", "b", "c", "d"))
# example: multiple routes
get_travel(token, sample[1:3, ],
    "start_lat", "start_lon", "end_lat", "end_lon",
    routes = c("cycle", "walk"))
# example: multiple routes + multiple pt modes 
get_travel(token, sample[1:3, ],
    "start_lat", "start_lon", "end_lat", "end_lon",
    routes = c("drive", "pt"), pt_mode = c("bus", "transit"))

By default, the data appears in a wide format, but users can specify for the output to be in long format.

# example: long format
# no error, long format
get_travel(token, sample[1:3, ],
    "start_lat", "start_lon", "end_lat", "end_lon",
    routes = c("walk", "pt"), pt_mode = c("bus", "transit"),
    as_wide = FALSE)

If an error occurs, the output row will be have NAs for the additional variables, along with a warning message. The warning message will show start/end/route/pt_mode for which an error occurred.

# example: with error
get_travel(token, sample,
    "start_lat", "start_lon", "end_lat", "end_lon",
    routes = c("cycle", "walk"))

Lastly, it recommended for users working with large matrices to set parallel = TRUE for more efficient computation.