ausplotsR: quickstart guide to basic analysis of TERN AusPlots vegetation data

Greg Guerin & Bernardo Blanco-Martin

2021-11-23

Introduction

TERN AusPlots is a national plot-based terrestrial ecosystem surveillance monitoring method and dataset for Australia (Sparrow et al. 2020). Through ausplotsR, users can directly access AusPlots data collected by on-ground observers on vegetation and soils, including physical sample/voucher details and barcode numbers. The dataset can be downloaded in its entirety or as individual modules, and can be subset by geographic bounding box or species name search. The package also includes a series of bespoke functions for working with AusPlots data, including visualisation, creating tables of species composition, and calculating tree basal area, fractional cover and vegetation cover by growth form, structure or strata.

This is a short guide for getting started with analysis of AusPlots data through the ausplotsR R package. More information on making use of AusPlots data in ausplotsR is available through the package help files and manual. Below, we demonstrate installing the package, accessing some AusPlots data, generating matrices and running simple example analyses.

More comprehensive tutorials on accessing and analysing AusPlots data (Blanco-Martin 2019) are available at: https://github.com/ternaustralia/TERN-Data-Skills/tree/master/EcosystemSurveillance_PlotData

Installing the package and accessing raw data

The latest version of ausplotsR can be installed directly from GitHub using the devtools package, which must be installed first (e.g., with install.packages("devtools")).

library(devtools)
install_github("ternaustralia/ausplotsR", build_vignettes = TRUE, dependencies = TRUE)

Once installed, load the package as follows. Note that the packages vegan, maps and mapdata are required for ausplotsR to load, and functions are also imported from the packages plyr, R.utils, simba, httr, jsonlite, sp, maptools, ggplot2, gtools, jose, curl and betapart. In addition, knitr and rmarkdown are required to build this package vignette (i.e., if ‘build_vignettes’ is set to TRUE above).

library(ausplotsR)

We can now access live data, starting here with basic site information and vegetation point-intercept modules and using a bounding box to spatially filter the dataset to central Australia. All data modules are extracted via a single function, get_ausplots:

#See ?get_ausplots to explore all data modules available
my.ausplots.data <- try(get_ausplots(bounding_box = c(125, 140, -40, -10)))
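Because the call above is wrapped in try, a failed download (for example, when the server cannot be reached) returns an object of class "try-error" rather than stopping the script. The following is a minimal sketch of how the result could be checked before continuing. The objects failed.call and ok.call and the helper download_ok are hypothetical stand-ins used here so that the check runs without network access; they are not part of ausplotsR.

```r
# Hypothetical stand-ins for a failed and a successful download,
# so the check can be shown without network access
failed.call <- try(stop("server unavailable"), silent = TRUE)
ok.call <- try(list(site.info = data.frame()), silent = TRUE)

# Hypothetical helper: TRUE when the wrapped call succeeded,
# FALSE when try() caught an error
download_ok <- function(x) !inherits(x, "try-error")

download_ok(failed.call)
download_ok(ok.call)
```

In a real session the same test would be applied to my.ausplots.data before using its elements.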

The output of the above call is a list with the following $elements:

names(my.ausplots.data)
#> [1] "site.info" "veg.vouch" "veg.PI"    "citation"

The ‘site.info’ table contains basic site and visit details. Here are a selected few of the many fields:

head(my.ausplots.data$site.info[,c("site_location_name", "site_unique", "longitude", "latitude", "bioregion_name")])
#>   site_location_name      site_unique longitude  latitude bioregion_name
#> 1         NTADAC0001 NTADAC0001-53518  130.7779 -13.15835            DAC
#> 2         NTASSD0015 NTASSD0015-53565  135.6168 -25.12393            SSD
#> 3         QDAMII0002 QDAMII0002-53546  138.1606 -20.00789            MII
#> 4         SATSTP0005 SATSTP0005-53513  138.8488 -29.45660            STP
#> 5         SATSTP0005 SATSTP0005-58639  138.8488 -29.45660            STP
#> 6         NTTDAB0001 NTTDAB0001-53580  131.6740 -13.96288            DAB

Each survey is identified by the ‘site_unique’ field, which is a unique combination of site ID (‘site_location_name’) and visit ID (‘site_location_visit_id’). The ‘site_unique’ field therefore links all tables returned from the get_ausplots function.
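As an illustration of how the survey identifier relates to its two components, here is a small sketch using IDs copied from the rows printed above. The hyphen-joined format is inferred from that printed output and is an assumption rather than a documented rule.

```r
# Site and visit IDs copied from the site.info rows printed above
site_location_name <- c("NTADAC0001", "SATSTP0005", "SATSTP0005")
site_location_visit_id <- c(53518, 53513, 58639)

# Assumption: site_unique is the two IDs joined by a hyphen
site_unique <- paste(site_location_name, site_location_visit_id, sep = "-")
site_unique

# Recover the site ID from a survey ID by dropping the trailing visit number
sub("-[0-9]+$", "", site_unique)
```

Note that the same site (SATSTP0005) appears twice with different visit IDs, which is why ‘site_unique’ rather than ‘site_location_name’ is the safe key for linking tables.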

The ‘site.info’ table can be used to identify, subset or group surveys in space and time, for example:

#count plot visits per Australian state:
summary(as.factor(my.ausplots.data$site.info$state))
#>  NT QLD  SA  WA 
#> 164  48 198  32
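The same approach extends to other grouping tasks, such as finding sites that have been visited more than once. The sketch below uses a toy copy of three columns (toy.site.info, a hypothetical name) filled with the rows printed earlier, so it runs without a download.

```r
# Toy copy of three site.info columns, using the rows printed earlier
toy.site.info <- data.frame(
  site_location_name = c("NTADAC0001", "NTASSD0015", "QDAMII0002",
                         "SATSTP0005", "SATSTP0005", "NTTDAB0001"),
  site_unique = c("NTADAC0001-53518", "NTASSD0015-53565", "QDAMII0002-53546",
                  "SATSTP0005-53513", "SATSTP0005-58639", "NTTDAB0001-53580"),
  bioregion_name = c("DAC", "SSD", "MII", "STP", "STP", "DAB"),
  stringsAsFactors = FALSE
)

# Number of visits per site
visits <- table(toy.site.info$site_location_name)

# Sites surveyed more than once
revisited <- names(visits)[visits > 1]
revisited

# Survey IDs belonging to a single bioregion
stp.surveys <- subset(toy.site.info, bioregion_name == "STP")$site_unique
stp.surveys
```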

Map AusPlots sites and visualise data

The package has in-built functions (see ?map_ausplots and ?ausplots_visual) to rapidly map AusPlots over Australia and to visualise the relative cover/abundance of green vegetation, plant growth forms and species. Maps can also be generated manually using the longitude and latitude fields in the $site.info table.

#Sites are coded by IBRA bioregion by default. 
map_ausplots(my.ausplots.data)

Alternatively, the following call generates a pdf with a map of all sites and attribute graphics for selected AusPlots: ausplotsR::ausplots_visual()
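For a manual map from the longitude and latitude fields, a minimal base-graphics sketch might look like the following. The toy.sites data frame is a hypothetical stand-in holding coordinates copied from the site.info rows printed earlier; with real data, my.ausplots.data$site.info would be used instead.

```r
# Coordinates copied from the site.info rows printed earlier
toy.sites <- data.frame(
  longitude = c(130.7779, 135.6168, 138.1606, 138.8488, 131.6740),
  latitude = c(-13.15835, -25.12393, -20.00789, -29.45660, -13.96288),
  bioregion_name = c("DAC", "SSD", "MII", "STP", "DAB"),
  stringsAsFactors = FALSE
)

# One integer colour code per bioregion
region.cols <- as.integer(factor(toy.sites$bioregion_name))

# asp = 1 keeps one degree of longitude and latitude the same length on screen
plot(latitude ~ longitude, data = toy.sites, pch = 20, col = region.cols,
     asp = 1, bty = "l")
text(toy.sites$longitude, toy.sites$latitude,
     labels = toy.sites$bioregion_name, pos = 3, cex = 0.7)
```

A coastline could be drawn underneath with a mapping package such as maps, which is already required by ausplotsR.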

Here is a snippet of the raw point-intercept data that will be used in the following examples to derive vegetation attributes:

head(subset(my.ausplots.data$veg.PI, !is.na(herbarium_determination)))
#>          site_unique site_location_name site_location_visit_id transect
#> 37  NTAMGD0002-53466         NTAMGD0002                  53466    E5-W5
#> 40  NTAMGD0002-53466         NTAMGD0002                  53466    E5-W5
#> 64  NTAMGD0002-53466         NTAMGD0002                  53466    E5-W5
#> 76  NTAMGD0002-53466         NTAMGD0002                  53466    W4-E4
#> 86  NTAMGD0002-53466         NTAMGD0002                  53466    W4-E4
#> 103 NTAMGD0002-53466         NTAMGD0002                  53466    W4-E4
#>     point_number herbarium_determination substrate in_canopy_sky  dead
#> 37            46          Cleome viscosa    Gravel         FALSE FALSE
#> 40            50          Cleome viscosa      Bare         FALSE FALSE
#> 64            87          Cleome viscosa      Bare         FALSE FALSE
#> 76             5          Cleome viscosa      Bare         FALSE FALSE
#> 86            26          Cleome viscosa      Bare         FALSE FALSE
#> 103           50        Portulaca digyna    Gravel         FALSE FALSE
#>     growth_form height veg_barcode standardised_name        family     genus
#> 37         Forb   0.10 NTA  006351    Cleome viscosa    Cleomaceae    Cleome
#> 40         Forb   0.05 NTA  006351    Cleome viscosa    Cleomaceae    Cleome
#> 64         Forb   0.20 NTA  006351    Cleome viscosa    Cleomaceae    Cleome
#> 76         Forb   0.20 NTA  006351    Cleome viscosa    Cleomaceae    Cleome
#> 86         Forb   0.10 NTA  006351    Cleome viscosa    Cleomaceae    Cleome
#> 103        Forb   0.02 NTA  006311  Portulaca digyna Portulacaceae Portulaca
#>     specific_epithet infraspecific_rank infraspecific_epithet taxa_status
#> 37           viscosa               <NA>                  <NA>    Accepted
#> 40           viscosa               <NA>                  <NA>    Accepted
#> 64           viscosa               <NA>                  <NA>    Accepted
#> 76           viscosa               <NA>                  <NA>    Accepted
#> 86           viscosa               <NA>                  <NA>    Accepted
#> 103           digyna               <NA>                  <NA>   Unchecked
#>      taxa_group    genus_species authorship       published_in    rank
#> 37  angiosperms   Cleome viscosa         L.   Sp. Pl. 672 1753 SPECIES
#> 40  angiosperms   Cleome viscosa         L.   Sp. Pl. 672 1753 SPECIES
#> 64  angiosperms   Cleome viscosa         L.   Sp. Pl. 672 1753 SPECIES
#> 76  angiosperms   Cleome viscosa         L.   Sp. Pl. 672 1753 SPECIES
#> 86  angiosperms   Cleome viscosa         L.   Sp. Pl. 672 1753 SPECIES
#> 103 angiosperms Portulaca digyna   F.Muell. Fragm. 1: 170 1859 SPECIES
#>     hits_unique
#> 37     E5-W5 46
#> 40     E5-W5 50
#> 64     E5-W5 87
#> 76      W4-E4 5
#> 86     W4-E4 26
#> 103    W4-E4 50

Note that ‘veg_barcode’ links species hits to the vegetation vouchers module, while the ‘hits_unique’ field identifies the individual point-intercept by transect and point number (see help(ausplotsR) and references for more details on the plot layout and survey method). At each point, plant species (if any), growth form and height are recorded along with substrate type.
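To see why the combined ‘hits_unique’ identifier is needed, consider the transect and point numbers printed above. The space-separated format used here is inferred from that output and should be treated as an assumption.

```r
# Transect and point numbers copied from the veg.PI rows printed above
transect <- c("E5-W5", "E5-W5", "E5-W5", "W4-E4", "W4-E4", "W4-E4")
point_number <- c(46, 50, 87, 5, 26, 50)

# Assumption: hits_unique is transect and point number separated by a space
hits_unique <- paste(transect, point_number)
hits_unique

# Point number 50 occurs on two different transects...
anyDuplicated(point_number) > 0

# ...but the combined identifier keeps the two points distinct
anyDuplicated(hits_unique) > 0
```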

Example 1: latitudinal pattern in proportional vegetation cover

Let’s visualise basic vegetation cover as a function of latitude. First, we call the fractional_cover function on the extracted point-intercept data ($veg.PI). The function converts the raw data to proportional cover of green/brown vegetation and bare substrate. Note the calculation may take a few minutes for many AusPlots, so for this example we will pull out a subset of 100 randomly drawn sites to work with.

sites100 <- my.ausplots.data$veg.PI[which(my.ausplots.data$veg.PI$site_unique  %in% sample(my.ausplots.data$site.info$site_unique, 100)), ]
my.fractional <- fractional_cover(sites100)

head(my.fractional)
#>                       site_unique NA.  bare brown green
#> NTAARP0001-58422 NTAARP0001-58422 0.0  3.27 28.32 68.42
#> NTABRT0001-53616 NTABRT0001-53616 0.0 18.22 33.27 48.51
#> NTABRT0002-58862 NTABRT0002-58862 0.0 14.85 41.29 43.86
#> NTABRT0005-58863 NTABRT0005-58863 0.0 57.82 15.15 27.03
#> NTADAC0002-58039 NTADAC0002-58039 0.2  9.21 24.55 66.04
#> NTAFIN0001-53519 NTAFIN0001-53519 0.0 39.31 53.27  7.43
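To give a feel for what these percentages represent, here is a deliberately simplified toy sketch. It is not the algorithm used by fractional_cover: the classification rule, the column names live.plant and dead.or.litter, and the 10-point plot are all assumptions made for illustration.

```r
# Hypothetical plot with 10 point-intercepts (real plots have many more)
toy.points <- data.frame(
  point = 1:10,
  live.plant = c(TRUE, TRUE, FALSE, FALSE, TRUE,
                 FALSE, FALSE, TRUE, FALSE, FALSE),
  dead.or.litter = c(FALSE, FALSE, TRUE, FALSE, FALSE,
                     TRUE, FALSE, FALSE, TRUE, FALSE)
)

# Simplified rule (an assumption): a live plant hit counts as green;
# otherwise dead plant material or litter counts as brown; otherwise bare
toy.points$fraction <- ifelse(toy.points$live.plant, "green",
                              ifelse(toy.points$dead.or.litter, "brown", "bare"))

# Percentage of points falling in each fraction
toy.cover <- 100 * table(toy.points$fraction) / nrow(toy.points)
toy.cover
```

Each point is assigned to exactly one fraction, so the percentages sum to 100 for the plot.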

Next, we need to merge the fractional cover scores with longitude and latitude coordinates from the site information table. We use the ‘site_unique’ field (unique combination of site and visit IDs) to link tables returned from the get_ausplots function:

my.fractional <- merge(my.fractional, my.ausplots.data$site.info, by="site_unique")[,c("site_unique", "bare", "brown", "green", "NA.", "longitude", "latitude")]

my.fractional <- na.omit(my.fractional)

head(my.fractional)
#>        site_unique  bare brown green NA. longitude  latitude
#> 1 NTAARP0001-58422  3.27 28.32 68.42 0.0  132.2701 -13.55729
#> 2 NTABRT0001-53616 18.22 33.27 48.51 0.0  133.2473 -22.28360
#> 3 NTABRT0002-58862 14.85 41.29 43.86 0.0  133.2506 -22.28367
#> 4 NTABRT0005-58863 57.82 15.15 27.03 0.0  133.6121 -22.29108
#> 5 NTADAC0002-58039  9.21 24.55 66.04 0.2  132.3403 -12.73922
#> 6 NTAFIN0001-53519 39.31 53.27  7.43 0.0  133.4679 -24.12430
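By default, merge keeps only the rows whose key appears in both tables, so surveys without a match are dropped silently. The toy sketch below shows how this could be checked. The data frames toy.cover and toy.sites are hypothetical, built from values printed above, plus one made-up survey ID ("XXXXXX0000-00000") that has no counterpart.

```r
# Toy tables using values printed above
toy.cover <- data.frame(
  site_unique = c("NTAARP0001-58422", "NTABRT0001-53616", "NTAFIN0001-53519"),
  bare = c(3.27, 18.22, 39.31),
  stringsAsFactors = FALSE
)
toy.sites <- data.frame(
  # the last ID is made up and matches nothing in toy.cover
  site_unique = c("NTABRT0001-53616", "NTAARP0001-58422", "XXXXXX0000-00000"),
  latitude = c(-22.28360, -13.55729, NA),
  stringsAsFactors = FALSE
)

# Default merge: only surveys present in both tables are kept
inner <- merge(toy.cover, toy.sites, by = "site_unique")
nrow(inner)

# all.x = TRUE keeps every cover row and fills missing site fields with NA
left <- merge(toy.cover, toy.sites, by = "site_unique", all.x = TRUE)

# Surveys with cover values but no coordinates
unmatched <- left$site_unique[is.na(left$latitude)]
unmatched
```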

Now we can plot the continental relationship between latitude and, for example, the proportion of bare ground (i.e., ground with no vegetation cover of any kind above it).

plot(bare ~ latitude, data=my.fractional, pch=20, bty="l")

There appears to be a hump-backed relationship, with a higher proportion of bare ground in the arid inland at mid-latitudes. We can add a simple quadratic model to test/approximate this:

my.fractional$quadratic <- my.fractional$latitude^2

LM <- lm(bare ~ latitude + quadratic, data=my.fractional)
summary(LM)
#> 
#> Call:
#> lm(formula = bare ~ latitude + quadratic, data = my.fractional)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -43.108  -8.740  -3.239  13.782  45.275 
#> 
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) -172.14145   24.75605  -6.954 4.32e-10 ***
#> latitude     -17.31208    2.04263  -8.475 2.77e-13 ***
#> quadratic     -0.34645    0.03996  -8.670 1.06e-13 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 17.07 on 96 degrees of freedom
#> Multiple R-squared:  0.4428, Adjusted R-squared:  0.4312 
#> F-statistic: 38.14 on 2 and 96 DF,  p-value: 6.449e-13

#generate predicted values for plotting:
MinMax <- c(min(my.fractional$latitude), max(my.fractional$latitude))
ND <- data.frame(latitude=seq(from=MinMax[1], to=MinMax[2], length.out=50), quadratic=seq(from=MinMax[1], to=MinMax[2], length.out=50)^2)
ND$predict <- predict(LM, newdata=ND)
#
plot(bare ~ latitude, data=my.fractional, pch=20, bty="n")
points(ND$latitude, ND$predict , type="l", lwd=2, col="darkblue")
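The fitted coefficients also indicate where the hump sits. For a quadratic b0 + b1*x + b2*x^2, the turning point is at x = -b1/(2*b2), and a negative b2 means it is a maximum. As a small worked sketch, using the coefficients copied from the model summary above (yours will differ, because the 100 sites are drawn at random):

```r
# Coefficients copied from the summary(LM) output above
b1 <- -17.31208  # latitude term
b2 <- -0.34645   # quadratic (latitude^2) term

# Turning point of b0 + b1*x + b2*x^2; a maximum because b2 < 0
peak.latitude <- -b1 / (2 * b2)
round(peak.latitude, 1)
```

For this sample the predicted proportion of bare ground peaks at roughly 25 degrees south. With the model object at hand, the same value could be computed from coef(LM) rather than typed in.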

Example 2: Species by sites table

Aside from ‘gross’ values from plots such as fractional cover, many analyses in community ecology begin with species abundance information. With ausplotsR you can generate this easily from the more complex vegetation point-intercept data. The first step to work with species-level AusPlots data is to create a species occurrence matrix. The species_table function in the ausplotsR package can be used to create this type of matrix. This function takes a data frame of individual raw point-intercept hits (i.e. a $veg.PI data frame) generated using the get_ausplots function and returns a ‘species against sites’ matrix:

#The species_table function below can also take the `$veg.vouch` module as input, but `m_kind="PA"` must be specified to get a sensible presence/absence output.
#The 'species_name' argument below specifies use of the "standardised_name" field to identify species, which is based on herbarium_determination names (i.e., "HD" option in species_name) matched to accepted scientific name according to a standard (http://www.worldfloraonline.org/).
my.sppBYsites <- species_table(my.ausplots.data$veg.PI, m_kind="percent_cover", cover_type="PFC", species_name="SN")

#check the number of rows (plots) and columns (species) in the matrix
dim(my.sppBYsites)
#> [1]  392 1753

#look at the top left corner (as the matrix is large)
my.sppBYsites[1:5, 1:5] 
#>                  Abutilon Abutilon.fraseri Abutilon.halophilum Abutilon.hannii
#> NTAARP0001-58422        0                0                   0               0
#> NTAARP0002-58423        0                0                   0               0
#> NTAARP0003-58424        0                0                   0               0
#> NTABRT0001-53616        0                0                   0               0
#> NTABRT0002-53617        0                0                   0               0
#>                  Abutilon.leucopetalum
#> NTAARP0001-58422                     0
#> NTAARP0002-58423                     0
#> NTAARP0003-58424                     0
#> NTABRT0001-53616                     0
#> NTABRT0002-53617                     0
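Conceptually, a percent cover matrix of this kind counts the points at which each species was hit in each plot and divides by the number of points surveyed. The toy sketch below illustrates that idea only; it is not the implementation of species_table, and the data frame toy.hits, the plot labels and the value of points.per.plot are hypothetical.

```r
# Hypothetical hits: one row per species hit at a point
toy.hits <- data.frame(
  site_unique = c("A-1", "A-1", "A-1", "A-1", "B-1", "B-1", "B-1"),
  hits_unique = c("N1-S1 1", "N1-S1 2", "N1-S1 2", "N1-S1 3",
                  "N1-S1 1", "N1-S1 2", "N1-S1 4"),
  standardised_name = c("Triodia basedowii", "Triodia basedowii",
                        "Cenchrus ciliaris", "Cenchrus ciliaris",
                        "Eulalia aurea", "Eulalia aurea", "Triodia basedowii"),
  stringsAsFactors = FALSE
)
points.per.plot <- 4  # hypothetical; real plots have many more points

# Count each species at most once per point
toy.hits <- unique(toy.hits)

# Sites in rows, species in columns, cells = number of points hit
counts <- table(toy.hits$site_unique, toy.hits$standardised_name)

# Convert counts to percent cover
toy.matrix <- 100 * unclass(counts) / points.per.plot
toy.matrix
```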

We can crudely pull out the 10 highest ranking species in terms of their cumulative percent cover across all the plots in which they occur:

rev(sort(colSums(my.sppBYsites)))[1:10]
#>     Triodia.basedowii    Aristida.holathera    Eucalyptus.obliqua 
#>              651.4038              457.0111              359.5146 
#>     Cenchrus.ciliaris         Eulalia.aurea    Eucalyptus.baxteri 
#>              344.7106              340.4958              335.0025 
#>     Triodia.bitextura       Triodia.pungens       Acacia.shirleyi 
#>              325.2499              291.3861              282.0792 
#> Schizachyrium.fragile 
#>              264.3876
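Cumulative cover favours species that are both widespread and locally dominant, so it can be useful to separate the two components. The following sketch uses a small hypothetical matrix (toy.matrix, with made-up cover values) to show ranking by cumulative cover, by the number of plots occupied, and by mean cover where present.

```r
# Hypothetical species-by-sites matrix of percent cover
toy.matrix <- matrix(
  c(50, 0, 10,
    25, 5,  0,
     0, 0, 40),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("plot1", "plot2", "plot3"),
                  c("Triodia.basedowii", "Eulalia.aurea", "Cenchrus.ciliaris"))
)

# Cumulative cover across plots (same ranking idea as above)
total.cover <- sort(colSums(toy.matrix), decreasing = TRUE)
total.cover

# Number of plots in which each species occurs
occupancy <- colSums(toy.matrix > 0)
occupancy

# Mean cover in the plots where the species is present
mean.where.present <- colSums(toy.matrix) / occupancy
mean.where.present
```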

A simple example of downstream visualisation and analysis of species-level AusPlots data is Rank-Abundance Curves (also known as Whittaker Plots). Rank-Abundance Curves provide a more complete picture of species diversity than a single diversity index. The x-axis represents the abundance rank (from most to least abundant) and the y-axis the relative abundance of each species. Thus, they depict both Species Richness (the number of ranks) and Species Evenness (the slope of the line fitted through the ranks; a steep gradient indicates low evenness and a shallow gradient high evenness).

#Whittaker plots for some selected AusPlots with alternative relative abundance models fitted to the plant community data:
par(mfrow=c(2,2), mar=c(4,4,1,1))
for(i in c(1:4)) {
  plot(vegan::radfit(round(my.sppBYsites[9+i,], digits=0), log="xy"), pch=20, legend=FALSE, bty="l")
  legend("topright", legend=c("Null", "Preemption", "Lognormal", "Zipf", "Mandelbrot"), lwd=rep(1, 5), col=c("black", "red", "green", "blue", "cyan"), cex=0.7, bty="n")
}
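The quantities plotted in a rank-abundance curve can also be computed by hand, which may help when interpreting the fitted models. Here is a minimal sketch with hypothetical cover values for a single plot (toy.cover), using only base R.

```r
# Hypothetical percent cover values for the species columns of one plot
toy.cover <- c(40, 20, 10, 5, 5, 0, 0)

# Species absent from the plot do not enter the curve
present <- toy.cover[toy.cover > 0]

# Order from most to least abundant and assign ranks
abundance <- sort(present, decreasing = TRUE)
abundance.rank <- seq_along(abundance)

# Relative abundance: each species' share of the plot total
relative <- abundance / sum(abundance)

# Log-scaled y-axis, as is conventional for Whittaker plots
plot(abundance.rank, relative, log = "y", type = "b", pch = 20, bty = "l",
     xlab = "Abundance rank", ylab = "Relative abundance")

# Species richness is the number of ranks
length(present)
```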

Example 3: Quick species lists

Perhaps you simply want to browse which plant species have been recorded in AusPlots, without all the associated raw data? Here, the species_list function is your friend:

#The species_list function is designed to take $veg.vouch as input but can also take $veg.PI
#print a list of genus_species-only records from selected plots:
species_list(subset(my.ausplots.data$veg.vouch, site_unique %in% unique(site_unique)[1:2]), grouping="by_site", species_name="GS")
#> $QDAMII0002
#>  [1] Abutilon leucopetalum    Acacia coriacea          Acacia cowleana         
#>  [4] Acacia lysiphloia        Aristida latifolia       Atalaya hemiglauca      
#>  [7] Boerhavia repens         Bonamia media            Bothriochloa ewartiana  
#> [10] Brachiaria subquadripara Bulbostylis barbata      Carissa spinarum        
#> [13] Chrysopogon benthamianus Cleome viscosa           Convolvulus clementii   
#> [16] Corymbia aparrerinja     Corymbia terminalis      Crotalaria medicaginea  
#> [19] Cucumis melo             Dichanthium sericeum     Enneapogon polyphyllus  
#> [22] Enneapogon purpurascens  Enneapogon robustissimus Eucalyptus leucophloia  
#> [25] Eucalyptus pruinosa      Eulalia aurea            Euphorbia tannensis     
#> [28] Gossypium australe       Indigofera colutea       Indigofera linifolia    
#> [31] Ipomoea coptica          Ipomoea polymorpha       Iseilema macratherum    
#> [34] Mnesithea formosa        Paspalidium rarum        Portulaca oleracea      
#> [37] Rhynchosia minima        Salsola kali             Senna notabilis         
#> [40] Sida cleisocalyx         Solanum quadriloculatum  Sporobolus australasicus
#> [43] Tephrosia                Themeda triandra         Triodia pungens         
#> [46] Ventilago viminalis      Vigna                   
#> 
#> $SAASTP0016
#>  [1] Abutilon leucopetalum      Acacia salicina           
#>  [3] Astrebla pectinata         Atriplex angulata         
#>  [5] Atriplex incrassata        Atriplex vesicaria        
#>  [7] Calocephalus platycephalus Calotis hispidula         
#>  [9] Centipeda thespidioides    Cullen australasicum      
#> [11] Cyperus alterniflorus      Digitaria                 
#> [13] Dissocarpus biflorus       Enchylaena tomentosa      
#> [15] Eragrostis leptocarpa      Eragrostis setifolia      
#> [17] Eremophila longifolia      Eriachne ovata            
#> [19] Eulalia aurea              Euphorbia drummondii      
#> [21] Frankenia                  Goodenia lunata           
#> [23] Iseilema vaginiflorum      Leiocarpa leptolepis      
#> [25] Malvastrum americanum      Marsilea                  
#> [27] Minuria rigida             Nitraria billardierei     
#> [29] Osteocarpum dipterocarpum  Panicum decompositum      
#> [31] Plantago drummondii        Podolepis                 
#> [33] Rhagodia spinescens        Schenkia australis        
#> [35] Sclerolaena divaricata     Sclerolaena ventricosa    
#> [37] Sida fibulifera            Stemodia florulenta       
#> [39] Streptoglossa adscendens   Swainsona campylantha     
#> [41] Tecticornia tenuis

#overall species list ordered by family (for demonstration we print only part):
species_list(my.ausplots.data$veg.vouch, grouping="collapse", species_name="SN", append_family=TRUE)[1:20]
#>  [1] Acanthaceae--Dicliptera armata              
#>  [2] Acanthaceae--Dipteracanthus australasicus   
#>  [3] Acanthaceae--Hygrophila ringens var. ringens
#>  [4] Acanthaceae--Nelsonia canescens             
#>  [5] Acanthaceae--Rostellularia adscendens       
#>  [6] Aizoaceae--Carpobrotus rossii               
#>  [7] Aizoaceae--Carpobrotus virescens            
#>  [8] Aizoaceae--Disphyma clavellatum             
#>  [9] Aizoaceae--Gunniopsis                       
#> [10] Aizoaceae--Gunniopsis calcarea              
#> [11] Aizoaceae--Gunniopsis kochii                
#> [12] Aizoaceae--Gunniopsis quadrifida            
#> [13] Aizoaceae--Gunniopsis septifraga            
#> [14] Aizoaceae--Gunniopsis zygophylloides        
#> [15] Aizoaceae--Mesembryanthemum crystallinum    
#> [16] Aizoaceae--Mesembryanthemum nodiflorum      
#> [17] Aizoaceae--Sarcozona praecox                
#> [18] Aizoaceae--Tetragonia                       
#> [19] Aizoaceae--Tetragonia eremaea               
#> [20] Aizoaceae--Tetragonia implexicoma
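The two groupings shown above ("by_site" and "collapse") can be understood with a toy example in base R. This is a sketch of the idea only, not the species_list implementation; the data frame toy.vouch and its column names are assumptions, with species names taken from the lists printed above.

```r
# Hypothetical voucher records, including one duplicate within a site
toy.vouch <- data.frame(
  site_location_name = c("QDAMII0002", "QDAMII0002", "QDAMII0002",
                         "SAASTP0016", "SAASTP0016"),
  genus_species = c("Cleome viscosa", "Acacia coriacea", "Cleome viscosa",
                    "Eulalia aurea", "Acacia salicina"),
  stringsAsFactors = FALSE
)

# One sorted, de-duplicated vector of names per site
by.site <- lapply(split(toy.vouch$genus_species, toy.vouch$site_location_name),
                  function(x) sort(unique(x)))
by.site

# A single list across all sites
collapsed <- sort(unique(toy.vouch$genus_species))
collapsed
```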

Explore TERN AusPlots

In addition to the key site information and vegetation point-intercept modules introduced above, get_ausplots is your gateway to further raw data modules: vegetation structural summaries; vegetation vouchers (covering the full species diversity observed at the plot and including tissue sample details); basal wedge; and soils subsites, bulk density and pit/characterisation (including bulk and metagenomics soil samples).

References

Blanco-Martin, B. (2019) Tutorial: Understanding and using the ‘ausplotsR’ package and AusPlots data. Terrestrial Ecosystem Research Network. Version 2019.04.0, April 2019. https://github.com/ternaustralia/TERN-Data-Skills/

Sparrow, B., Foulkes, J., Wardle, G., Leitch, E., Caddy-Retalic, S., van Leeuwen, S., Tokmakoff, A., Thurgate, N., Guerin, G.R. and Lowe, A.J. (2020) A vegetation and soil survey method for surveillance monitoring of rangeland environments. Frontiers in Ecology and Evolution, 8:157.