Overview of functClust

Benoît Jaillard

2020-11-30

Aim of functional clustering

An interactive system is understood here as a collection of components that interact when they co-occur, and whose interactions generate an emergent, collective, system-specific performance, function or property. When we subsample different components of the system, we observe that different component assemblages generate different performances. We therefore assume that different performance values are associated with different compositions of component assemblages, that is, with different sets of co-occurring components.

The aim of functional clustering is to identify the role that each component of the system plays in generating the emergent, collective, system-specific performance. To do so, we need a collection of subsamples (sub-systems) that differ in elemental composition, i.e. different assemblages of components, and whose performances have been observed.

A functional clustering groups the components of the interactive system on the basis of their effects on the system-specific performance. A component can induce an effect on assemblage performance when it occurs alone or in combination with other components. The procedure first groups the components that induce similar effects on the performance when they co-occur with the same other components within the system: a functional group clusters together components that are functionally redundant for the performance under consideration.

We call "assembly motif" a combination of functional clusters, or more precisely a combination of components that belong to different functional clusters. Each assembly motif therefore describes a kind of component assemblage. We assume that each assembly motif is associated with a mean value of the observed performances. Clustering components into functional groups generates a classification of component assemblages based on their assembly motif. We evaluate the quality of each component clustering by the coefficient of determination of the performance modelled by this classification of component assemblages. An iterative process then identifies the clustering of components into functional groups that best accounts for the observed performance, i.e. that maximizes the coefficient of determination of the observed performance.
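The evaluation of a given component clustering can be illustrated with a minimal sketch in base R. It is not the code of functClust: the toy data, the helper names assembly_motif and motif_r2, and the use of plain motif means are assumptions made for the illustration only.

```r
# Toy data (invented values): 4 components (A to D) observed in 8 assemblages.
occurrence <- rbind(
  c(1, 0, 0, 0), c(0, 1, 0, 0), c(0, 0, 1, 0), c(0, 0, 0, 1),
  c(1, 0, 1, 0), c(0, 1, 0, 1), c(1, 1, 0, 0), c(0, 0, 1, 1)
)
colnames(occurrence) <- c("A", "B", "C", "D")
performance <- c(2.0, 2.2, 5.0, 5.4, 9.0, 9.6, 2.1, 5.1)

# Hypothetical helper: assembly motif of each assemblage, i.e. the set of
# functional clusters represented among its components.
# `clusters` must be in the same order as the columns of `occurrence`.
assembly_motif <- function(occurrence, clusters) {
  apply(occurrence, 1, function(row) {
    paste(sort(unique(clusters[row == 1])), collapse = "+")
  })
}

# Hypothetical helper: coefficient of determination when each assemblage is
# predicted by the mean performance of its assembly motif.
motif_r2 <- function(occurrence, performance, clusters) {
  motif <- assembly_motif(occurrence, clusters)
  predicted <- ave(performance, motif)   # mean performance within each motif
  ss_residual <- sum((performance - predicted)^2)
  ss_total <- sum((performance - mean(performance))^2)
  1 - ss_residual / ss_total
}

# Two candidate clusterings of the same components
clusters_good <- c(A = 1, B = 1, C = 2, D = 2)   # A ~ B and C ~ D
clusters_poor <- c(A = 1, B = 2, C = 1, D = 2)   # mixes dissimilar components

r2_good <- motif_r2(occurrence, performance, clusters_good)
r2_poor <- motif_r2(occurrence, performance, clusters_poor)
```

In this toy example, the clustering that groups A with B and C with D yields the higher coefficient of determination, because the assemblages sharing a motif then have similar performances.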

The combinatorial approach therefore generates a tree that groups the functionally redundant components of the system for the performance under consideration. This tree is the functional clustering of the components of the interactive system.

Method for clustering functionally redundant components of an interactive system

The functional clustering of the components of an interactive system works in five steps, labelled (a) to (e) in the figure below:

Method for clustering functionally redundant components: (a) Component clustering, (b) Identification of assembly motifs, (c) Classification of sub-systems by assembly motif, (d) Computation of convergence criterion, (e) Iterative adjustment of component clustering
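Steps (a) to (e) can be sketched as a simple greedy search in base R. This is only an illustration under stated assumptions, not the algorithm implemented in functClust: the toy data, the function names motif_r2 and greedy_clustering, the move-one-component strategy and the max_clusters cap are all hypothetical. The cap is needed in this sketch because the coefficient of determination alone always favours finer partitions.

```r
# Toy data (invented values): 4 components, 8 assemblages, one performance.
occurrence <- rbind(
  c(1, 0, 0, 0), c(0, 1, 0, 0), c(0, 0, 1, 0), c(0, 0, 0, 1),
  c(1, 0, 1, 0), c(0, 1, 0, 1), c(1, 1, 0, 0), c(0, 0, 1, 1)
)
colnames(occurrence) <- c("A", "B", "C", "D")
performance <- c(2.0, 2.2, 5.0, 5.4, 9.0, 9.6, 2.1, 5.1)

# Steps (b) to (d): identify motifs, classify assemblages, compute R2.
motif_r2 <- function(occurrence, performance, clusters) {
  motif <- apply(occurrence, 1, function(row) {
    paste(sort(unique(clusters[row == 1])), collapse = "+")
  })
  predicted <- ave(performance, motif)   # mean performance within each motif
  1 - sum((performance - predicted)^2) /
    sum((performance - mean(performance))^2)
}

# Step (e): move one component at a time to the cluster that most improves R2,
# and stop when no single move improves it any more.
greedy_clustering <- function(occurrence, performance,
                              max_clusters = 2, tol = 1e-8) {
  n <- ncol(occurrence)
  clusters <- rep(1, n)                  # step (a): start from a single cluster
  best <- motif_r2(occurrence, performance, clusters)
  repeat {
    best_move <- NULL
    for (i in seq_len(n)) {
      # candidate targets: existing clusters plus one new cluster, up to the cap
      for (k in seq_len(min(max_clusters, max(clusters) + 1))) {
        if (k == clusters[i]) next
        candidate <- clusters
        candidate[i] <- k
        r2 <- motif_r2(occurrence, performance, candidate)
        if (r2 > best + tol) {
          best <- r2
          best_move <- candidate
        }
      }
    }
    if (is.null(best_move)) break        # no improving move: converged
    clusters <- best_move
  }
  names(clusters) <- colnames(occurrence)
  list(clusters = clusters, r2 = best)
}

result <- greedy_clustering(occurrence, performance)
```

On this toy dataset the search ends with A and B in one cluster and C and D in the other. A greedy search of this kind can stop at a local optimum on real data.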

Format of dataset

The dataset consists of a collection of different assemblages of components belonging to the system under consideration, for which the elemental composition is known and one or several emergent, collective, system-specific performances are observed. The system under consideration is composed of all the components listed in the dataset. Each assemblage can have several performances, for instance the same performance observed at different times or under different conditions (monitoring of biomass production over time or at various sites), or performances of different natures for a multi-functional analysis.

The format of the dataset is as follows. The first line is a header that gives, in order, the name of the assemblage-identity column, the names of the components, and the names of the performances. Each following line describes one assemblage: the name of the assemblage, a sequence of 0 (absence) and 1 (presence) coding the occurrence of each component within the assemblage, and a sequence of numeric values, one for each observed performance (over time, over sites, or over performances of different natures).
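As an illustration of this layout, the following base R sketch builds a small dataset by hand and splits it into its occurrence and performance parts. All names and values (toy, n_components, the columns A, B, C, y1 and y2) are invented for the example and are not part of functClust.

```r
# Toy dataset in the format described above (invented names and values).
# Column 1: assemblage identity; columns 2 to 4: component occurrences (0/1);
# remaining columns: observed performances.
toy <- data.frame(
  Assemblage = c("s1", "s2", "s3", "s4"),
  A  = c(1, 0, 1, 1),
  B  = c(0, 1, 1, 0),
  C  = c(0, 0, 0, 1),
  y1 = c(2.0, 3.5, 6.1, 4.2),
  y2 = c(2.2, 3.1, 6.4, 4.0)
)

n_components <- 3   # number of components of the system (hypothetical name)

# Split the table into a 0/1 occurrence matrix and a performance matrix
occurrence  <- as.matrix(toy[, 1 + seq_len(n_components)])
performance <- as.matrix(toy[, (n_components + 2):ncol(toy)])
rownames(occurrence)  <- toy$Assemblage
rownames(performance) <- toy$Assemblage

# Basic sanity checks on the layout
stopifnot(all(occurrence %in% c(0, 1)), is.numeric(performance))

# Names of the components present in each assemblage
composition <- apply(occurrence == 1, 1, function(present) {
  colnames(occurrence)[present]
})
```

Here composition is a list with one entry per assemblage, for instance the components "A" and "B" for assemblage "s3".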

As an example, consider the well-known Biodiversity II experiment conducted at Cedar Creek, University of Minnesota, USA, by David Tilman and his collaborators. The interactive system is a meadow composed of 16 grassland species, observed on 91 plots over 3 years. The dataset includes 91 lines, one line per assemblage, identified by “Plot”. The occurrence of each of the 16 species (identified as “Achmi”, “Agrsm”, “Amocan”, “Andge”, “Asctu”, “Elyca”, “Koecr”, “Lesca”, “Liaas”, “Luppe”, “Monfi”, “Panvi”, “Petpu”, “Poapr”, “Schsc” and “Sornu”) is coded as “0” (absent or FALSE) or “1” (present or TRUE). The assemblage performances are identified as “y2004”, “y2005” and “y2006”.

library(functClust)

# production of biomass in 2004, 2005 and 2006 in Biodiversity II experiment
data(CedarCreek.2004.2006.dat)
dim(CedarCreek.2004.2006.dat)
#> [1] 91 20
colnames(CedarCreek.2004.2006.dat)
#>  [1] "Plot"   "Achmi"  "Agrsm"  "Amocan" "Andge"  "Asctu"  "Elyca" 
#>  [8] "Koecr"  "Lesca"  "Liaas"  "Luppe"  "Monfi"  "Panvi"  "Petpu" 
#> [15] "Poapr"  "Schsc"  "Sornu"  "y2004"  "y2005"  "y2006"
rownames(CedarCreek.2004.2006.dat)
#>  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14"
#> [15] "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28"
#> [29] "29" "30" "31" "32" "33" "34" "35" "36" "37" "38" "39" "40" "41" "42"
#> [43] "43" "44" "45" "46" "47" "48" "49" "50" "51" "52" "53" "54" "55" "56"
#> [57] "57" "58" "59" "60" "61" "62" "63" "64" "65" "66" "67" "68" "69" "70"
#> [71] "71" "72" "73" "74" "75" "76" "77" "78" "79" "80" "81" "82" "83" "84"
#> [85] "85" "86" "87" "88" "89" "90" "91"

# the data for each component assemblage
line <- 10
CedarCreek.2004.2006.dat[line, ]
#>    Plot Achmi Agrsm Amocan Andge Asctu Elyca Koecr Lesca Liaas Luppe Monfi
#> 10   24     0     0      0     0     1     1     0     0     0     0     0
#>    Panvi Petpu Poapr Schsc Sornu    y2004    y2005    y2006
#> 10     1     0     0     1     0 48.94467 54.86133 46.62233

The main functions of functClust

The package functClust contains two groups of main functions. The first group includes 4 functions (fclust, fclust_plot, fclust_write and fclust_read), which perform a basic combinatorial analysis, plot its results, and write them to and read them from files.

The functions fclust and fclust_plot are both controlled by a set of options.

The second group also includes 4 functions (ftest, ftest_plot, ftest_write and ftest_read), which test the significance (i) of the different components that belong to the interactive system, (ii) of the different assemblages that compose the dataset, and (iii) of the different observed performances, if several were observed, and which (iv) evaluate the robustness of the component clustering if several performances were observed.

All these functions are controlled by a set of options. In addition, the package functClust exposes all the functions used to analyse a dataset by functional clustering and to plot the results, so each function can also be called directly for a specific computation or plot.

Note

Note that some computations are time-consuming. To make it easier to monitor their progress, information is written to the Console and graphs are drawn in the Plots panel. This output is enabled or disabled by the “verbose” option.

getOption("verbose")
#> [1] FALSE
# to follow the computations
options(verbose = TRUE)
# to deactivate the option
options(verbose = FALSE)

Availability and installation of functClust

The package functClust (version 0.1.0) is available on :

It can be installed and loaded with:

#install.packages("functClust")
library(functClust)

References

Jaillard, B., Richon, C., Deleporte, P., Loreau, M. and Violle, C. (2018) An a posteriori species clustering for quantifying the effects of species interactions on ecosystem functioning. Methods in Ecology and Evolution, 9:704-715. https://doi.org/10.1111/2041-210X.12920

Jaillard, B., Deleporte, P., Loreau, M. and Violle, C. (2018) A combinatorial analysis using observational data identifies species that govern ecosystem functioning. PLoS ONE 13(8): e0201135. https://doi.org/10.1371/journal.pone.0201135

Jaillard, B., Deleporte, P., Isbell, F., Loreau, M. and Violle, C. (submitted) Identifying plant functional groups that govern ecosystem functioning in a long-term biodiversity experiment.