Introduction to the benthos-package

Biodiversity measures

Several biodiversity measures have been implemented in the benthos-package. In the sections below, we will demonstrate how to calculate these measures. To simplify things, all analyses will be performed on a single sampling unit:

d <- oosterschelde %>% 
    filter(HABITAT == "Polyhaline-Subtidal", YEAR == 2010, POOLRUN == 1, POOLID == 1) %>%
    select(TAXON, COUNT) %>%
    arrange(TAXON)
d
# A tibble: 41 x 2
   TAXON                  COUNT
   <chr>                  <dbl>
 1 Angulus tenuis             1
 2 Aphelochaeta marioni       1
 3 Aphelochaeta marioni      25
 4 Bathyporeia                2
 5 Capitella capitata         2
 6 Cossura longocirrata       1
 7 Echinocardium cordatum     1
 8 Ensis                      1
 9 Lanice conchilega          4
10 Magelona johnstoni         2
# … with 31 more rows

Measures of species abundance

Total abundance

The total abundance is the total number of individuals in a sampling unit, and is computed by:

d %>% total_abundance(count = COUNT)
[1] 131

Abundance

The abundance is the total number of individuals per taxon in a sampling unit. It can be computed by means of the abundance-function:

d %>% abundance(taxon = TAXON, count = COUNT) %>% as.matrix
                       [,1]
Angulus tenuis            1
Aphelochaeta marioni     26
Bathyporeia               2
Capitella capitata        2
Cossura longocirrata      1
Echinocardium cordatum    1
Ensis                     1
Lanice conchilega         4
Magelona johnstoni        3
Magelona papillicornis    1
Malmgreniella lunulata    1
Mytilus edulis            1
Nemertea                  1
Nephtys                   8
Nephtys hombergii         1
Oligochaeta               8
Ophiura albida            6
Peringia ulvae            1
Poecilochaetus serpens    1
Pseudopolydora pulchra    1
Retusa obtusa             4
Scoloplos armiger        22
Spio martinensis          1
Spiophanes bombyx        17
Spisula                   1
Streblospio benedicti     1
Tellimya ferruginosa      1
Terebellidae              1
Urothoe poseidonis       12
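As a cross-check, the aggregation behind these two measures can be sketched in base R. The small data frame `d_head` below is an assumption for illustration: it is typed in by hand from the first five rows of `d` printed above and is not the full sampling unit. It shows how the two records of Aphelochaeta marioni are merged into one abundance.

```r
# First five records of `d`, copied by hand from the printed tibble above
# (illustrative subset only, not the full sampling unit)
d_head <- data.frame(
    TAXON = c("Angulus tenuis", "Aphelochaeta marioni", "Aphelochaeta marioni",
              "Bathyporeia", "Capitella capitata"),
    COUNT = c(1, 1, 25, 2, 2)
)

# abundance: sum the counts within each taxon
abundance_by_taxon <- tapply(d_head$COUNT, d_head$TAXON, sum)
abundance_by_taxon

# total abundance: sum the counts over all records
total <- sum(d_head$COUNT)
total
```

The two Aphelochaeta marioni records (1 and 25 individuals) add up to 26, which agrees with the corresponding entry in the abundance table.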

Measures of species richness

Species richness

Species richness \(S\) is the number of different species in a (pooled) sample. It can be computed by means of:

d %>% species_richness(taxon = TAXON, count = COUNT)
[1] 29

Margalef’s index of diversity

Species richness \(S\) is strongly dependent on sample size. Margalef’s diversity index \(D_\mathrm{M}\) takes sample size into account. It is given by \[ D_\mathrm{M} = \frac{S-1}{\ln(N)} \] where \(N\) is the total abundance, i.e., the total number of individuals in the sampling unit. In case \(N=1\), this index will be set to zero.

It can be computed for a specific sampling unit by:

d %>% margalef(taxon = TAXON, count = COUNT)
[1] 5.743357
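The formula can also be evaluated by hand. The sketch below uses base R only; the vector `n_i` is an assumption in the sense that it is typed in from the abundance table shown earlier, and the variable names are ours, not part of the benthos-package.

```r
# Abundance per taxon, copied by hand from the abundance table above
n_i <- c(1, 26, 2, 2, 1, 1, 1, 4, 3, 1, 1, 1, 1, 8, 1, 8, 6, 1, 1, 1,
         4, 22, 1, 17, 1, 1, 1, 1, 12)

S <- length(n_i)   # species richness
N <- sum(n_i)      # total abundance

# Margalef's index; set to zero for N = 1, where log(N) = 0
d_m <- if (N == 1) 0 else (S - 1) / log(N)
d_m
```

With \(S = 29\) and \(N = 131\) this gives \(28 / \ln(131)\), in agreement with the value returned by margalef.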

Rygg’s index of diversity

Species richness \(S\) is strongly dependent on sample size. Like Margalef’s diversity index \(D_\mathrm{M}\), Rygg’s index of diversity takes sample size into account (Rygg, 2006). It is given by \[ SN = \frac{\ln{S}}{\ln(\ln(N))} \] where \(N\) is the total abundance, i.e., the total number of individuals in the sampling unit.

It can be computed for a specific sampling unit by:

d %>% rygg(taxon = TAXON, count = COUNT)
[1] 2.125603

Rygg’s index shows some inconsistencies for small \(N\) and \(S\) ((S=2, N=2), (S=2, N=3) and (S=3, N=3)). This is illustrated in the third figure below. For reference, Margalef’s index is also given in the top figure.

The second figure shows a graph based on the adjusted version of Rygg’s index. It is given by:

\[ SNA = \frac{\ln{S}}{\ln(\ln(N+1)+1)} \]

The adjusted version of Rygg’s index can be computed by means of:

d %>% rygg(taxon = TAXON, count = COUNT, adjusted = TRUE)
[1] 1.900244
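To make the behaviour for small \(N\) and \(S\) concrete, both formulas can be evaluated directly. The helper `rygg_index` below is a hypothetical base R function written for this illustration (it is not the rygg function of the benthos-package); it simply implements the expressions for \(SN\) and \(SNA\) given above.

```r
# Hypothetical helper implementing SN and SNA as defined in the text
rygg_index <- function(S, N, adjusted = FALSE) {
    if (adjusted) {
        log(S) / log(log(N + 1) + 1)   # adjusted version (SNA)
    } else {
        log(S) / log(log(N))           # original version (SN)
    }
}

# sampling unit used in this section: S = 29, N = 131
sn  <- rygg_index(29, 131)
sna <- rygg_index(29, 131, adjusted = TRUE)
c(SN = sn, SNA = sna)

# the three small cases mentioned in the text
small <- data.frame(S = c(2, 2, 3), N = c(2, 3, 3))
small$SN  <- rygg_index(small$S, small$N)
small$SNA <- rygg_index(small$S, small$N, adjusted = TRUE)
small
```

For \(N = 2\) we have \(\ln(N) < 1\), so \(\ln(\ln(N))\) is negative and \(SN\) becomes negative. For \(N = 3\), \(\ln(\ln(N))\) is close to zero and \(SN\) becomes very large. The adjusted denominator \(\ln(\ln(N+1)+1)\) is positive for all \(N \geq 1\), so \(SNA\) does not show this behaviour.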

Hurlbert’s \(\mathrm{E}(S_n)\)

Hurlbert (1971) gives the expected number of species in a sample of n individuals selected at random (without replacement) from a collection of N individuals and S species:

\[ \mathrm{E}(S_n) = \sum_{i=1}^S \left[1 - \frac{\binom{N-N_i}{n}}{\binom{N}{n}} \right] \]

Contrary to species richness, this measure is not dependent on the number of individuals. It can be computed for a specific sampling unit by:

d %>% hurlbert(taxon = TAXON, count = COUNT, n = 100)
[1] 24.85011

\(\mathrm{E}(S_n)\) can be computed for \(n \in \{1, 2, \dots, N\}\), where \(N\) is the total abundance. This has been done in the figure below.

Note that \(\mathrm{E}(S_n)\) can be computed for \(n \leq N\). Extrapolation, i.e. \(n > N\), is not possible.
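Hurlbert’s formula translates almost literally into base R by means of `choose`. The function `expected_species` below is a hypothetical helper written for this illustration (not the hurlbert function of the benthos-package), and `n_i` is typed in by hand from the abundance table shown earlier.

```r
# Abundance per taxon, copied by hand from the abundance table above
n_i <- c(1, 26, 2, 2, 1, 1, 1, 4, 3, 1, 1, 1, 1, 8, 1, 8, 6, 1, 1, 1,
         4, 22, 1, 17, 1, 1, 1, 1, 12)
N <- sum(n_i)

# Hypothetical helper: expected number of species in a random subsample
# of n individuals drawn without replacement
expected_species <- function(n_i, n) {
    N <- sum(n_i)
    stopifnot(n >= 1, n <= N)   # extrapolation (n > N) is not possible
    sum(1 - choose(N - n_i, n) / choose(N, n))
}

e_s100 <- expected_species(n_i, n = 100)
e_s100

# boundary cases: one individual is always one species,
# and all N individuals contain all S species
e_s1 <- expected_species(n_i, n = 1)
e_sN <- expected_species(n_i, n = N)
c(e_s1, e_sN)
```

Note that `choose(N - n_i, n)` evaluates to zero whenever \(N - N_i < n\), so abundant species contribute exactly one to the sum.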

Measures of heterogeneity/evenness

Simpson’s Measure of Concentration

Simpson’s Measure of Concentration gives the probability that two individuals selected at random from a sample will belong to the same species. For an infinite sample, Simpson’s index is given by: \[ \lambda = \sum_{i=1}^S \pi_i^2 \] where \(\pi_i\) is the proportion of the individuals in species \(i\). For a finite sample, Simpson’s index is: \[ L = \sum_{i=1}^S \frac{n_i (n_i-1)}{N (N-1)} \] where \(n_i\) is the number of individuals in species \(i\) and \(N\) is the total number of individuals.

The finite sample case can be computed by:

d %>% simpson(taxon = TAXON, count = COUNT)
[1] 0.09935408
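Both versions of Simpson’s index are easily evaluated by hand. The sketch below uses base R only; `n_i` is typed in from the abundance table shown earlier, and the variable names are ours.

```r
# Abundance per taxon, copied by hand from the abundance table above
n_i <- c(1, 26, 2, 2, 1, 1, 1, 4, 3, 1, 1, 1, 1, 8, 1, 8, 6, 1, 1, 1,
         4, 22, 1, 17, 1, 1, 1, 1, 12)
N <- sum(n_i)

# infinite-sample form: sum of squared proportions
lambda <- sum((n_i / N)^2)

# finite-sample form: two individuals drawn without replacement
L <- sum(n_i * (n_i - 1)) / (N * (N - 1))

c(lambda = lambda, L = L)
```

The finite-sample value \(L\) reproduces the result of simpson above, and is slightly smaller than \(\lambda\) because an individual cannot be paired with itself.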

Hurlbert’s Probability of Interspecific Encounter (PIE)

Related to Simpson’s index is Hurlbert’s probability of inter-specific encounter (PIE). It gives the probability that two individuals selected at random (without replacement) from a sample will belong to different species (Hurlbert, 1971, p.579, Eq. 3): \[ \Delta_1 = \sum_{i=1}^S \left(\frac{N_i}{N}\right)\left(\frac{N-N_i}{N-1}\right) = \left(\frac{N}{N-1}\right)\Delta_2 \] where \(\Delta_2\) (Hurlbert, 1971, p.579, Eq. 4) is the probability that two individuals selected at random (with replacement) from a sample will belong to different species: \[ \Delta_2 = 1 - \sum_{i=1}^S \pi_i^2 \] where \(N_i\) is the number of individuals of the \(i\)th species in the community, \(N\) is the total number of individuals in the community, \(\pi_i = N_i/N\), and \(S\) is the number of species in the community.

Hurlbert’s PIE can be computed by means of:

d %>% hpie(taxon = TAXON, count = COUNT)
[1] 0.9006459

Note that it is the complement of Simpson’s Measure of Concentration (for finite sample sizes):

1 - d %>% simpson(taxon = TAXON, count = COUNT)
[1] 0.9006459
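The relations between \(\Delta_1\), \(\Delta_2\) and Simpson’s finite-sample index can be verified numerically. The sketch below uses base R only; `n_i` is typed in from the abundance table shown earlier, and the variable names are ours.

```r
# Abundance per taxon, copied by hand from the abundance table above
n_i <- c(1, 26, 2, 2, 1, 1, 1, 4, 3, 1, 1, 1, 1, 8, 1, 8, 6, 1, 1, 1,
         4, 22, 1, 17, 1, 1, 1, 1, 12)
N <- sum(n_i)

# sampling with replacement
delta2 <- 1 - sum((n_i / N)^2)

# sampling without replacement (PIE)
delta1 <- sum((n_i / N) * ((N - n_i) / (N - 1)))

# Simpson's index for a finite sample
simpson_finite <- sum(n_i * (n_i - 1)) / (N * (N - 1))

c(delta1 = delta1,
  rescaled_delta2 = N / (N - 1) * delta2,
  complement = 1 - simpson_finite)
```

All three elements are equal up to rounding error.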

Shannon’s Index

Shannon’s index (or entropy) is given by:

\[ H' = -\sum_i p_i \log_2 p_i \] where \(p_i\) is the proportion of individuals found in taxon \(i\). It can be computed for a specific sampling unit by:

d %>% shannon(taxon = TAXON, count = COUNT)
[1] 3.818995
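As a check, the formula can be evaluated directly in base R. The vector `n_i` is typed in from the abundance table shown earlier, and the variable names are ours.

```r
# Abundance per taxon, copied by hand from the abundance table above
n_i <- c(1, 26, 2, 2, 1, 1, 1, 4, 3, 1, 1, 1, 1, 8, 1, 8, 6, 1, 1, 1,
         4, 22, 1, 17, 1, 1, 1, 1, 12)

p <- n_i / sum(n_i)          # proportion of individuals per taxon
h_bits <- -sum(p * log2(p))  # Shannon's index, base 2 (as in the formula above)
h_nats <- -sum(p * log(p))   # the same index based on natural logarithms

c(base2 = h_bits, natural = h_nats)
```

The base-2 value reproduces the result of shannon above. The two versions differ only by the constant factor \(\ln(2)\).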

Hill’s Diversity Numbers

According to Hill (1973): ‘a diversity number is figuratively a measure of how many species are present if we examine the sample down to a certain depth among its rarities. If we examine superficially (e.g., by using \(N_2\)) we shall see only the more abundant species. If we look deeply e.g. by using \(N_0\) we shall see all the species present.’. His diversity number is given by: \[ N_a = \left(\sum_{i=1}^S p_i^a\right)^{1/(1-a)} \]

Depending on parameter \(a\), Hill’s numbers gradually give more weight to the rarest species (small \(a\)) or most common species (large \(a\)).

Special cases are:

\(N_0\): total number of species present
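\(N_1\): \(\exp(H')\), where \(H'\) is Shannon’s index computed with the natural logarithm (with the base-2 definition given above, this equals \(2^{H'}\))
\(N_2\): reciprocal of Simpson’s index for an infinite sample (\(\frac{1}{\lambda}\))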

d %>% hill(taxon = TAXON, count = COUNT, a = 0)
[1] 29
d %>% hill(taxon = TAXON, count = COUNT, a = 1)
N_a(a=1) is undefined. Therefore N_a(lim a->1) will be returned
[1] 14.11342
d %>% hill(taxon = TAXON, count = COUNT, a = 2)
[1] 9.413604

or by means of the (efficient) shortcuts:

d %>% hill0(taxon = TAXON, count = COUNT)
[1] 29
d %>% hill1(taxon = TAXON, count = COUNT)
[1] 14.11342
d %>% hill2(taxon = TAXON, count = COUNT)
[1] 9.413604
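The general formula and its limit for \(a \to 1\) can be sketched in base R. The function `hill_number` below is a hypothetical helper written for this illustration (not the hill function of the benthos-package), and `n_i` is typed in from the abundance table shown earlier.

```r
# Abundance per taxon, copied by hand from the abundance table above
n_i <- c(1, 26, 2, 2, 1, 1, 1, 4, 3, 1, 1, 1, 1, 8, 1, 8, 6, 1, 1, 1,
         4, 22, 1, 17, 1, 1, 1, 1, 12)

# Hypothetical helper: Hill's diversity number N_a
hill_number <- function(n_i, a) {
    p <- n_i / sum(n_i)
    if (isTRUE(all.equal(a, 1))) {
        # N_a is undefined at a = 1; use the limit exp(H'),
        # with H' based on natural logarithms
        exp(-sum(p * log(p)))
    } else {
        sum(p^a)^(1 / (1 - a))
    }
}

n0 <- hill_number(n_i, a = 0)   # number of species
n1 <- hill_number(n_i, a = 1)   # exp(H')
n2 <- hill_number(n_i, a = 2)   # 1 / lambda
c(N0 = n0, N1 = n1, N2 = n2)

# N_1 expressed in terms of the base-2 Shannon index
p <- n_i / sum(n_i)
2^(-sum(p * log2(p)))
```

The last expression shows that \(N_1\) equals \(2^{H'}\) when \(H'\) is computed with base-2 logarithms.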

The figure below shows Hill’s Diversity Number as a function of \(a\). From right to left, the focus is more and more on rare species.

Measures of species sensitivity

AZTI Marine Biotic Index (AMBI)

Borja et al. (2000) introduced the Biotic Coefficient. The expression in their paper can be rewritten as: \[ c_\mathrm{b} = \frac{3}{2} \sum_{i=2}^5 (i-1) p_i \] where \(p\) is a vector of length 5 containing the proportions of individuals in the sensitivity classes (I, II, III, IV, V), respectively.

It can be computed for a specific sampling unit by:

d %>% 
    ambi(taxon = TAXON, count = COUNT)
[1] 2.734615
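To illustrate how the Biotic Coefficient responds to the class proportions, the formula can be evaluated for a made-up example. The proportions in `p` below are hypothetical and chosen for illustration only; they are not derived from the sampling unit used in this section.

```r
# Hypothetical proportions in the sensitivity classes I..V (sum to one)
p <- c(I = 0.20, II = 0.30, III = 0.30, IV = 0.10, V = 0.10)
stopifnot(isTRUE(all.equal(sum(p), 1)))

# class index i = 1..5; the term for i = 1 is zero, so summing over
# all five classes equals the sum from i = 2 to 5 in the formula
i <- seq_along(p)
c_b <- 1.5 * sum((i - 1) * p)
c_b

# extremes: all individuals in class I or all in class V
c_b_min <- 1.5 * sum((i - 1) * c(1, 0, 0, 0, 0))
c_b_max <- 1.5 * sum((i - 1) * c(0, 0, 0, 0, 1))
c(c_b_min, c_b_max)
```

The coefficient therefore ranges from 0 (class I only) to 6 (class V only).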

The accuracy of the AMBI depends (among other things) on the number of taxa for which a sensitivity group is available. The has_ambi function indicates whether a sensitivity group has been assigned to a taxon:

d %>%
    mutate(HAS_GROUP = has_ambi(taxon = TAXON))
# A tibble: 41 x 3
   TAXON                  COUNT HAS_GROUP
   <chr>                  <dbl> <lgl>    
 1 Angulus tenuis             1 TRUE     
 2 Aphelochaeta marioni       1 TRUE     
 3 Aphelochaeta marioni      25 TRUE     
 4 Bathyporeia                2 TRUE     
 5 Capitella capitata         2 TRUE     
 6 Cossura longocirrata       1 TRUE     
 7 Echinocardium cordatum     1 TRUE     
 8 Ensis                      1 TRUE     
 9 Lanice conchilega          4 TRUE     
10 Magelona johnstoni         2 TRUE     
# … with 31 more rows

The percentage of the total abundance without an AMBI group is given below:

d %>%
    mutate(HAS_GROUP = has_ambi(taxon = TAXON)) %>%
    summarise(percentage = 100 * sum(COUNT[!HAS_GROUP]) / sum(COUNT)) %>%
    as.numeric
[1] 0.7633588

Infaunal Trophic Index (ITI)

The infaunal trophic index (ITI) is calculated as: \[ \mathrm{ITI} = 100 \sum_{i=1}^3 \frac{(4-i)}{3} p_i \] where \(p_i\) is the proportion of individuals in class \(i\). Class 4 has weight \((4-4)/3 = 0\) and therefore does not appear in the sum. The classes are:

class 1 are suspension feeders (highest quality);
class 2 are interface feeders;
class 3 are surface deposit feeders and
class 4 are subsurface deposit feeders (lowest quality).

See Gittenberger & van Loon (2013) for more information.

We can estimate the ITI by means of:

d %>% 
    iti(taxon = TAXON, count = COUNT)
[1] 22.82051
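The weighting of the trophic classes can be made explicit with a made-up example. The proportions in `p` below are hypothetical and chosen for illustration only; they are not derived from the sampling unit used in this section.

```r
# Hypothetical proportions in the trophic classes 1..4 (sum to one)
p <- c(0.10, 0.20, 0.30, 0.40)
stopifnot(isTRUE(all.equal(sum(p), 1)))

# weights (4 - i) / 3 for i = 1, 2, 3; class 4 has weight zero
w <- (4 - 1:3) / 3
iti_value <- 100 * sum(w * p[1:3])
iti_value

# extremes: only suspension feeders, or only subsurface deposit feeders
iti_max <- 100 * sum(w * c(1, 0, 0))
iti_min <- 100 * sum(w * c(0, 0, 0))
c(iti_min, iti_max)
```

The index therefore ranges from 0 (class 4 only) to 100 (class 1 only).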

The accuracy of the ITI depends (among other things) on the number of taxa for which a sensitivity group is available. The has_iti function indicates whether a sensitivity group has been assigned to a taxon:

d %>%
    mutate(HAS_GROUP = has_iti(taxon = TAXON))
# A tibble: 41 x 3
   TAXON                  COUNT HAS_GROUP
   <chr>                  <dbl> <lgl>    
 1 Angulus tenuis             1 TRUE     
 2 Aphelochaeta marioni       1 TRUE     
 3 Aphelochaeta marioni      25 TRUE     
 4 Bathyporeia                2 TRUE     
 5 Capitella capitata         2 TRUE     
 6 Cossura longocirrata       1 TRUE     
 7 Echinocardium cordatum     1 TRUE     
 8 Ensis                      1 TRUE     
 9 Lanice conchilega          4 TRUE     
10 Magelona johnstoni         2 TRUE     
# … with 31 more rows

The percentage of the total abundance without an ITI group is given below:

d %>%
    mutate(HAS_GROUP = has_iti(taxon = TAXON)) %>%
    summarise(percentage = 100 * sum(COUNT[!HAS_GROUP]) / sum(COUNT)) %>%
    as.numeric
[1] 0.7633588

Dennis Walvoort

2017-11-19