Browsing Movebank within R

Marco Smolla, Bart Kranstauber & Anne Scharf

2022-05-31


0.1 Introduction

Movebank (www.movebank.org) is a free online database of animal tracking data, where data owners can manage their data and have the option to share it with colleagues or the public. If the public or a registered user has permission to see a study, the study can be downloaded as a .csv file and imported directly into R (see the “An introduction to the ‘move’ package” vignette for details). Those with Movebank accounts can also log in, browse and download the data they have access to directly from within the move package.
This vignette gives examples of how to log in, search for studies, get sensor, animal and study IDs, download location and non-location data, and access data from the Movebank Data Repository. All results are based on the studies that your account has permission to view or download. Movebank users retain ownership of their data and choose whether and with whom to share it.
For more details on the options of a specific function, see the function’s help file.

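A possible workflow follows the order of the sections below: log in to Movebank, search for a study and its Movebank ID, get information about the study, and then download its data.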


1 Login to Movebank

There are two ways to log in. Either you log in every time you use one of the functions presented in this vignette, or you use the movebankLogin() function to create an object that stores your login information. You can pass this object to every function you use to skip the login process. Use the same username and password that you use to log in on the Movebank website. Note that the password is stored in this object in plain text, so it will also be saved in plain text if you store the session or the object. If you want to hide the password while typing it, have a look at the “getPass” package.

If you do not have a Movebank account, you can register at https://www.movebank.org or download csv files of publicly available studies directly from Movebank and read them into the move package as described in the help file of the move() function.

library("move")
loginStored <- movebankLogin(username="user", password="password")
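
To avoid writing the password into a script, the same login can be sketched with the getPass package mentioned above. This assumes getPass is installed and that the session is interactive; "user" is a placeholder username.

```r
library("move")
# getPass::getPass() prompts for the password and masks the input
# where the console supports it. "user" is a placeholder.
loginStored <- movebankLogin(username = "user",
                             password = getPass::getPass("Movebank password: "))
```

The password is still held in plain text inside loginStored, so the caveat about saving the session or the object applies here as well.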


2 Search for a study name and MovebankID

2.1 Search for a study name with keywords

You can use the searchMovebankStudies() function to search within the study names for a specific study. For example, to find all studies that worked with geese, search for the keyword ‘goose’.

You may prefer to use the search term without its first letter, e.g. ‘oose’ instead of ‘Goose’ or ‘goose’, to find studies with either capitalization.
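
A minimal sketch of such a search, assuming the loginStored object from Section 1 exists in the session:

```r
library("move")
# Search the study names for the fragment "oose", which matches
# both "Goose" and "goose".
searchMovebankStudies(x = "oose", login = loginStored)
```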

2.2 Get the Movebank ID of a study

All of the functions presented here can work with the study’s Movebank ID or the study name to find information within the database. Note that study names can be changed by the user, while the Movebank ID is created by the database and will always remain the same. If you want to work with the short Movebank ID instead of the longer study name, use getMovebankID(). This number can also be found on the ‘Study Details’ page of the study on Movebank.
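
As a sketch, the lookup for the study used as an example in Section 3 could look like this (loginStored is the object from Section 1):

```r
library("move")
# Look up the Movebank ID that belongs to a study name.
getMovebankID("Ocelots on Barro Colorado Island, Panama", login = loginStored)
```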


3 Get information about the study, tags, animals and deployments

3.1 Get general information of a study

Suppose you found a study you are interested in, let’s say ‘Ocelots on Barro Colorado Island, Panama’. To get more information about this study, e.g. its authors, license type, citation and more, use getMovebankStudy().
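
A minimal call, assuming the loginStored object from Section 1:

```r
library("move")
# Study-level metadata such as authors, license type and citation.
getMovebankStudy(study = "Ocelots on Barro Colorado Island, Panama",
                 login = loginStored)
```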

3.2 Get information about sensors

If you want to know which sensor types were used for each tag in this study, you can use getMovebankSensors().
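
For the ocelot study this could be sketched as follows, again assuming loginStored from Section 1:

```r
library("move")
# Sensor types per tag for one study.
getMovebankSensors(study = "Ocelots on Barro Colorado Island, Panama",
                   login = loginStored)
```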

To see all available sensor types on Movebank, use the same function leaving the study argument empty.
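
A short sketch of the call without a study. The object name allSensors is only an illustration:

```r
library("move")
# Without a study argument, the table of all sensor types known to
# Movebank is returned.
allSensors <- getMovebankSensors(login = loginStored)
head(allSensors)
```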

To get a list of the attributes that are available for the sensors of a particular study, use getMovebankSensorsAttributes().
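
A minimal sketch, assuming loginStored from Section 1. Only the first rows are printed because the table can be long:

```r
library("move")
# Attributes available for the sensors of one study.
sensorAttributes <- getMovebankSensorsAttributes(
  study = "Ocelots on Barro Colorado Island, Panama",
  login = loginStored)
head(sensorAttributes)
```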

3.3 Get information about the animals of a study

NOTE: Agreement to license terms: to be able to download data from a study on Movebank in R, you might first have to accept the license terms for the study. For this go to www.movebank.org, search for your study of interest, click on the download tab, and accept the license terms. This only has to be done once per study.

A list of the animals’ names, their tag IDs, the sensors used and more information about each individual is returned by getMovebankAnimals().

Note that some information about animals is stored at the deployment level, for example animal-life-stage, which might vary across multiple deployments of the same individual.
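
A minimal call, assuming loginStored from Section 1 and that the license terms of the study have been accepted:

```r
library("move")
# One table with animals, tags and sensors of the study.
getMovebankAnimals(study = "Ocelots on Barro Colorado Island, Panama",
                   login = loginStored)
```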

3.4 Get all reference data of a study

NOTE: Agreement to license terms: to be able to download data from a study on Movebank in R, you might first have to accept the license terms for the study. For this go to www.movebank.org, search for your study of interest, click on the download tab, and accept the license terms. This only has to be done once per study.

Use getMovebankReferenceTable() to get a table with all information associated with the animals, tags and deployments. This table is equivalent to the one obtained on the Movebank website through the ‘Download Reference Data’ option of the study.
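
As a sketch, with loginStored from Section 1:

```r
library("move")
# Reference data (animals, tags, deployments) of the study.
referenceTable <- getMovebankReferenceTable(
  study = "Ocelots on Barro Colorado Island, Panama",
  login = loginStored)
head(referenceTable)
```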


4 Download the location data of a study as a ‘move/moveStack’ object

NOTE: Agreement to license terms: to be able to download data from a study on Movebank in R, you might first have to accept the license terms for the study. For this go to www.movebank.org, search for your study of interest, click on the download tab, and accept the license terms. This only has to be done once per study.

4.1 Download location data of a study

An entire study can be downloaded with getMovebankData().
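
A minimal sketch for the ocelot study, assuming loginStored from Section 1. The object name ocelots is only an illustration:

```r
library("move")
# Download all location data of the study as a move/moveStack object.
ocelots <- getMovebankData(study = "Ocelots on Barro Colorado Island, Panama",
                           login = loginStored)
```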

4.2 Download location data for selected individuals of a study

Or you can download data for only one or several individuals by specifying the animalName argument.
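
The following sketch shows both variants. The animal names "Animal 1" and "Animal 2" are hypothetical placeholders; replace them with names returned by getMovebankAnimals() for your study.

```r
library("move")
studyName <- "Ocelots on Barro Colorado Island, Panama"

# One individual ("Animal 1" is a placeholder name).
oneAnimal <- getMovebankData(study = studyName, animalName = "Animal 1",
                             login = loginStored)

# Several individuals, passed as a character vector (placeholder names).
twoAnimals <- getMovebankData(study = studyName,
                              animalName = c("Animal 1", "Animal 2"),
                              login = loginStored)
```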

4.3 Download location data for a selected time range

You can also limit your download to a given time range using the timestamp_start and timestamp_end arguments. The timestamps have to be either character strings in the format ‘yyyyMMddHHmmssSSS’ or POSIXct objects, which are converted to character strings using the UTC time zone.
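
To make the timestamp format concrete, the sketch below builds the character strings from POSIXct values by hand and then uses them in a download. The dates are arbitrary illustration values, and the download is only attempted when the loginStored object from Section 1 exists.

```r
# Arbitrary example dates, defined in UTC.
tStart <- as.POSIXct("2010-02-03 00:00:00", tz = "UTC")
tEnd   <- as.POSIXct("2010-03-03 00:00:00", tz = "UTC")

# format() gives 'yyyyMMddHHmmss'; appending "000" adds the milliseconds.
tsStart <- paste0(format(tStart, "%Y%m%d%H%M%S", tz = "UTC"), "000")
tsEnd   <- paste0(format(tEnd,   "%Y%m%d%H%M%S", tz = "UTC"), "000")

# The download needs the login object from Section 1.
if (exists("loginStored")) {
  library("move")
  ocelotsRange <- getMovebankData(
    study = "Ocelots on Barro Colorado Island, Panama",
    login = loginStored,
    timestamp_start = tsStart, timestamp_end = tsEnd)
}
```

Passing tStart and tEnd directly as POSIXct objects should be equivalent, since they are converted to the same UTC strings.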

4.4 Dealing with duplicated timestamps

In case the study contains duplicated timestamps, you can set the argument removeDuplicatedTimestamps=TRUE. This will retain the first of multiple records with the same animal ID and timestamp, and remove any subsequent duplicates.
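
A minimal sketch of this option, assuming loginStored from Section 1:

```r
library("move")
# Keep only the first record of each animal/timestamp pair.
ocelotsNoDup <- getMovebankData(
  study = "Ocelots on Barro Colorado Island, Panama",
  login = loginStored,
  removeDuplicatedTimestamps = TRUE)
```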

Duplicated timestamps can occur for many different reasons. In some cases, one duplicate record might contain more complete information or a better location estimate than the other(s). If you want to control which of the duplicate records are kept and which are deleted, we recommend downloading the data as a .csv file from Movebank or using the function getMovebankLocationData(), finding the duplicates with e.g. getDuplicatedTimestamps(), deciding which of the duplicated records to retain, and then creating a move/moveStack object with the function move(). The flagged duplicate records can also be marked as outliers in Movebank by adding an attribute such as “manually_marked_outlier” to the study. Another option is to edit the records in Movebank and mark the appropriate records as outliers.
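
One possible way to decide which duplicate to keep is sketched below in base R, using duplicated() rather than getDuplicatedTimestamps(). The data frame is invented toy data that only imitates the layout of a table returned by getMovebankLocationData(); the column names and the rule "keep the record with the fewest missing values" are assumptions for illustration.

```r
# Toy data with invented values; animal "A" has two records at 00:00.
locs <- data.frame(
  individual.local.identifier = c("A", "A", "A", "B"),
  timestamp = as.POSIXct(c("2020-01-01 00:00:00", "2020-01-01 00:00:00",
                           "2020-01-01 01:00:00", "2020-01-01 00:00:00"),
                         tz = "UTC"),
  location.long = c(9.1, 9.1, 9.2, 9.5),
  location.lat  = c(47.7, 47.7, 47.8, 47.9),
  height.above.ellipsoid = c(NA, 410, 415, 400)
)

# Sort so that, within each animal/timestamp pair, the record with the
# fewest missing values comes first.
nMissing <- rowSums(is.na(locs))
locs <- locs[order(locs$individual.local.identifier, locs$timestamp, nMissing), ]

# duplicated() flags every record after the first of each pair, so
# dropping the flagged rows keeps the most complete record.
isDup <- duplicated(locs[, c("individual.local.identifier", "timestamp")])
locsClean <- locs[!isDup, ]
```

The cleaned table can then be turned into a move/moveStack object with move(), as described above. Other criteria, such as a location quality attribute, can be used in the sorting step instead.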

5 Download location data of a study as a ‘data.frame’

NOTE: Agreement to license terms: to be able to download data from a study on Movebank in R, you might first have to accept the license terms for the study. For this go to www.movebank.org, search for your study of interest, click on the download tab, and accept the license terms. This only has to be done once per study.

To download the location data from a study you can use the getMovebankLocationData() function. It returns a data.frame. Data from different sensors can be downloaded by specifying the sensor name in the sensorID argument. The valid values for the sensorID argument are those in the ‘name’ or ‘id’ column of the table returned by getMovebankSensors(login=loginStored). Location sensors (e.g. GPS, Radio Transmitter,…) are those marked as ‘true’ in the ‘is_location_sensor’ column of this table.

5.1 Download location data of a study

Download the location data of a specific individual for a specific sensor. A vector of several sensors can be passed to the sensorID argument, and a vector of several individuals to the animalName argument. If the animalName argument is left empty, the data of all individuals are downloaded. If the sensorID argument is left empty, the data of all available location sensors of the study are downloaded.
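
A sketch of both cases. The study name "My study" and the animal name "Animal 1" are hypothetical placeholders, and "GPS" assumes that the study contains GPS data; loginStored is the object from Section 1.

```r
library("move")
# One individual and one sensor (placeholder study and animal names).
gpsOne <- getMovebankLocationData(study = "My study", sensorID = "GPS",
                                  animalName = "Animal 1",
                                  login = loginStored)

# All individuals and all location sensors of the study.
locAll <- getMovebankLocationData(study = "My study", login = loginStored)
```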

5.2 Download location data for a selected time range

Location data can also be downloaded for a specific time range (see Section 4.3 for more details).
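
As a sketch, with placeholder study and animal names and arbitrary example dates in the ‘yyyyMMddHHmmssSSS’ format:

```r
library("move")
# Placeholder names; timestamps are arbitrary illustration values.
gpsRange <- getMovebankLocationData(study = "My study", sensorID = "GPS",
                                    animalName = "Animal 1",
                                    login = loginStored,
                                    timestamp_start = "20130801000000000",
                                    timestamp_end = "20130901000000000")
```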

5.3 Download location data including locations marked as outliers in Movebank

There is also the option to include the locations marked as outliers in Movebank by setting the argument includeOutliers=TRUE. Outliers are identified by the ‘visible’ column of the data.
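
A minimal sketch, again with placeholder study and animal names:

```r
library("move")
# Also return the records that are marked as outliers in Movebank.
gpsWithOutliers <- getMovebankLocationData(study = "My study",
                                           sensorID = "GPS",
                                           animalName = "Animal 1",
                                           login = loginStored,
                                           includeOutliers = TRUE)
```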

6 Download non-location data of a study as a ‘data.frame’

NOTE: Agreement to license terms: to be able to download data from a study on Movebank in R, you might first have to accept the license terms for the study. For this go to www.movebank.org, search for your study of interest, click on the download tab, and accept the license terms. This only has to be done once per study.

To download the non-location data from a study you can use the getMovebankNonLocationData() function. It returns a data.frame. Data from different sensors can be downloaded by specifying the sensor name in the sensorID argument. The valid values for the sensorID argument are those in the ‘name’ or ‘id’ column of the table returned by getMovebankSensors(login=loginStored). Non-location sensors (e.g. Acceleration, Magnetometer,…) are those marked as ‘false’ in the ‘is_location_sensor’ column of this table.

6.1 Download non-location data of a study

Download the non-location data of a specific individual for a specific sensor. A vector of several sensors can be passed to the sensorID argument, and a vector of several individuals to the animalName argument. If the animalName argument is left empty, the data of all individuals are downloaded. If the sensorID argument is left empty, the data of all available non-location sensors of the study are downloaded.
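
A sketch with hypothetical placeholders for the study and animal names. "Acceleration" assumes that the study contains acceleration data; loginStored is the object from Section 1.

```r
library("move")
# One individual and one non-location sensor (placeholder names).
accOne <- getMovebankNonLocationData(study = "My study",
                                     sensorID = "Acceleration",
                                     animalName = "Animal 1",
                                     login = loginStored)
```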

6.2 Download non-location data for a selected time range

Non-location data can also be downloaded for a specific time range (see Section 4.3 for more details).
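
As a sketch, with placeholder names and arbitrary example dates:

```r
library("move")
# Placeholder names; timestamps are arbitrary illustration values.
accRange <- getMovebankNonLocationData(study = "My study",
                                       sensorID = "Acceleration",
                                       animalName = "Animal 1",
                                       login = loginStored,
                                       timestamp_start = "20130801000000000",
                                       timestamp_end = "20130901000000000")
```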

6.3 Download non-location data with the ‘getMovebankData’ function

There is also the option to download non-location data with the getMovebankData() function. With the argument includeExtraSensors=TRUE, the data of all non-location sensors available for that study are downloaded and stored in the unUsedRecords slot. With this option it is not possible to select specific sensors.
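
A minimal sketch with a placeholder study name. The unUsedRecords() accessor is used here to look at the stored non-location records:

```r
library("move")
# Location data plus the data of all non-location sensors.
withExtra <- getMovebankData(study = "My study", login = loginStored,
                             includeExtraSensors = TRUE)

# Non-location records are kept apart from the track itself.
unUsedRecords(withExtra)
```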


7 Download data from the Movebank Data Repository as a ‘move/moveStack’ object

The getDataRepositoryData() function downloads data from the ‘Movebank Data Repository’. It returns a move object (if only one individual is included) or a moveStack (if several individuals are included). If data of non-location sensors are included in the data set, these are stored in the unUsedRecords slot of the move object.

getDataRepositoryData("doi:10.5441/001/1.2k536j54")

Visit the dataset’s repository page by preceding the DOI with https://dx.doi.org/, for example https://dx.doi.org/10.5441/001/1.2k536j54. From there you can view citations and download a README that might contain additional details needed to properly understand and reference the data. When analyzing these published datasets, always consult the related papers and cite both the paper and the dataset. If you are preparing an analysis for publication, also contact the data owner, if possible, and ask for their contribution.