Blog

Functions implemented in biomonitoR

Here is the list of functions implemented in biomonitoR. Some of them have not been extensively tested, so please check the results carefully!

FUNCTION      DESCRIPTION
abu           total abundance
abuTax        calculates the absolute or relative abundance of a Taxon or of a set of Taxa
aspt          Average Score Per Taxon
bmwp          Biological Monitoring Working Party
combTaxa      returns all the combinations of Taxa at the desired taxonomic resolution
epsi          Empirically-weighted Proportion of Sediment-sensitive Invertebrates index
ept           richness of Ephemeroptera, Plecoptera, Trichoptera
eptd          base 10 logarithm of the abundance of the selected Ephemeroptera, Plecoptera, Trichoptera and Diptera plus 1
famNumb       family richness
genNumb       genus richness
gold          1 minus the relative abundance of Gastropoda, Oligochaeta and Diptera
life          Lotic-invertebrate Index for Flow Evaluation
margalef      Margalef's diversity index
menhinick     Menhinick's diversity index
pielou        Pielou evenness index
psi           Proportion of Sediment-sensitive Invertebrates index
quickRename   compares the user taxa list with the reference database from freshwaterecology.info and suggests correct names if wrong names are present
ricTax        calculates the richness of a Taxon or of a set of Taxa at a user-provided taxonomic level
shannon       Shannon–Wiener index
simpson       Simpson's diversity index
speNumb       species richness
taxNumb       taxa richness
whpt          Walley Hawkes Paisley Trigg index

Apply a list of functions to a biomonitoR object

Sometimes it is useful to apply a list of functions to a biomonitoR object. For example, with a properly set up list of functions you can calculate several indices at once.


library(biomonitoR)

macro <- read.csv(url("http://www.biomonitor.it/wp-content/uploads/2017/10/macro_example.csv"))

macro.asb <- asBiomonitor(macro)

macro.agg <- aggregatoR(macro.asb)


# creating a list of functions funs

myshan <- function(x)(shannon(x))

mybmwpi <- function(x)(bmwp(x, method = "i"))

mybmwpb <- function(x)(bmwp(x, method = "b"))

funs <- list(shannon = myshan, ibmwp = mybmwpi, bmwp = mybmwpb)


# apply the list of functions on macro.agg

indices <- lapply(funs, function(f) f(macro.agg))


# transform the list indices into a data.frame

do.call(cbind.data.frame, indices)
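
The same lapply / do.call pattern can be tried without downloading any data. The sketch below is only an illustration in base R: the abundance table is made up, and toy_richness and toy_shannon are hypothetical stand-ins written for this example, not the biomonitoR functions.

```r
# toy abundance table (made-up numbers): rows are taxa, columns are samples
abund <- data.frame(S1 = c(10, 5, 0),
                    S2 = c(3, 3, 3),
                    row.names = c("Baetidae", "Heptageniidae", "Leuctridae"))

# hypothetical stand-ins for index functions (NOT the biomonitoR ones)
toy_richness <- function(x) colSums(x > 0)
toy_shannon <- function(x) {
  apply(x, 2, function(n) {
    p <- n[n > 0] / sum(n)   # proportions of the taxa that are present
    -sum(p * log(p))
  })
}

# named list of functions: the names become the column names of the result
funs <- list(richness = toy_richness, shannon = toy_shannon)

# apply every function to the same object
indices <- lapply(funs, function(f) f(abund))

# bind the results into a data.frame with one row per sample
res <- do.call(cbind.data.frame, indices)
print(res)
```

Each function must return one value per sample, in the same sample order, otherwise the columns cannot be bound together.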

Retrieve taxonomic information with taxize

Taxize is a very powerful R package that allows the user to search over many taxonomic data sources for species names (scientific and common) and download up and downstream taxonomic hierarchical information, among other things.

Here is an example of retrieving taxonomic information from ITIS:

 

library(taxize)
library(plyr)

# download an example dataset
macro <- read.csv(url("http://www.biomonitor.it/wp-content/uploads/2017/10/macro_example.csv"))

# extract the Taxa column
taxa <-  macro$Taxa

# use the classification function from taxize
# to retrieve taxonomic information
# you will be asked to choose the right
# taxon name from a list of possible candidates
# classification returns a list of data.frames
# data about some taxa could be missing

taxa.cla <- classification(taxa, db = "itis")

# transpose the elements of the list in order to have
# a wide rather than a long dataframe format

taxa.cla.t <- lapply(taxa.cla, function(y){data.frame(t(y[1]), stringsAsFactors = F)})

# change column name of the data.frame

for(i in seq_along(taxa.cla.t)){
  names(taxa.cla.t[[i]]) <- t(taxa.cla[[i]][2])
  if(is.na(colnames(taxa.cla.t[[i]])[1]) && length(names(taxa.cla.t[[i]])) == 1){
  # if taxonomic information is missing for
  # a taxon, add a data.frame with one column
  # called kingdom filled with Animalia
  taxa.cla.t[[i]] <- data.frame(kingdom = "Animalia", stringsAsFactors = F)
  }
}

# use rbind.fill from plyr package to
# reduce the list to a data.frame

taxa.df <- rbind.fill(taxa.cla.t)
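
The reshaping steps can also be tried offline. The sketch below is a simplified base R illustration, not the taxize workflow itself: mock.cla is a made-up list that only mimics the shape of the classification output (a data.frame with name, rank and id columns, or NA for a taxon with no match), to_wide is a hypothetical helper, and the last step is a base R stand-in for rbind.fill.

```r
# made-up stand-in for the list returned by classification();
# "Unknownus" mimics a taxon for which nothing was found
mock.cla <- list(
  Baetis = data.frame(name = c("Animalia", "Insecta", "Baetidae", "Baetis"),
                      rank = c("kingdom", "class", "family", "genus"),
                      id   = c("1", "2", "3", "4"),
                      stringsAsFactors = FALSE),
  Unknownus = NA
)

# hypothetical helper: one long data.frame -> one wide row
to_wide <- function(y) {
  if (!is.data.frame(y)) {
    # missing classification: keep only the kingdom
    return(data.frame(kingdom = "Animalia", stringsAsFactors = FALSE))
  }
  out <- data.frame(t(y$name), stringsAsFactors = FALSE)
  names(out) <- y$rank
  out
}
wide <- lapply(mock.cla, to_wide)

# base R stand-in for rbind.fill: add the missing columns as NA, then bind
all.cols <- unique(unlist(lapply(wide, names)))
filled <- lapply(wide, function(d) {
  for (col in setdiff(all.cols, names(d))) d[[col]] <- NA
  d[all.cols]
})
taxa.df <- do.call(rbind, filled)
print(taxa.df)
```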

 

With a little effort, this data.frame can be modified for use as a custom database in the asBiomonitor function.

Manage species groups in biomonitoR

Sometimes your data contain species groups (e.g. Rhyacophila nubila/obliterata). To handle them, put the problematic species in your custom database (and, of course, in the Taxa column of the database to analyse) and import your data. Then, in R, replace the slash (as in our example) with a space in both the database to analyse and the custom database. This should ensure that your species groups are recognised by asBiomonitor.

 


# use gsub to replace slash with space 
# (macro is the database to analyze and user_db 
# the custom database to add to the reference database)
# for example we want to change species groups like
# Rhyacophila nubila/obliterata to Rhyacophila nubila obliterata

macro$Taxa <- gsub("/", " ", macro$Taxa, fixed = T)
user_db$Taxa <- gsub("/", " ", user_db$Taxa, fixed = T)
user_db$Species <- gsub("/", " ", user_db$Species, fixed = T)

macro.asb <- asBiomonitor(macro, dfref = user_db, overwrite = F)
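
To see what the replacement does before touching real data, here is a minimal sketch on a made-up vector of names. It follows the slash-to-space replacement described in the text; the names in taxa are only examples.

```r
# made-up taxa list with one species group
taxa <- c("Rhyacophila nubila/obliterata", "Baetis rhodani")

# fixed = TRUE treats "/" as a literal character, not as a regular expression
taxa.clean <- gsub("/", " ", taxa, fixed = TRUE)
print(taxa.clean)

# quick checks: no slash left, and the names are still unique
any.slash <- any(grepl("/", taxa.clean, fixed = TRUE))
still.unique <- !anyDuplicated(taxa.clean)
```

The same replacement has to be applied to both the data and the custom database, otherwise the names will no longer match each other.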

 

Reference database

The reference database is a modified version of the freshwaterecology.info database and is structured as follows:

The Taxa column must contain unique elements; in other words, the same name cannot appear more than once. You can add missing taxa by building a data.frame with the same structure as the reference database and passing it to the dfref option:

 


# macro is the data.frame provided by the user

macro.asb <- asBiomonitor(macro, dfref = user_db, overwrite = F)

 

You must make sure that the taxa you add are not already present in the default database.
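
Both requirements (unique names, and no overlap with the default database) can be checked in base R before calling asBiomonitor. The sketch below uses made-up data; default_taxa is a hypothetical vector standing in for the taxa names of the default database, which you would need to obtain yourself.

```r
# made-up custom database with a repeated name
user_db <- data.frame(Taxa = c("Rhyacophila nubila obliterata", "Baetis", "Baetis"),
                      stringsAsFactors = FALSE)

# hypothetical stand-in for the names already in the default database
default_taxa <- c("Baetis", "Leuctra")

# names repeated within the custom database
dup <- unique(user_db$Taxa[duplicated(user_db$Taxa)])

# names that are already present in the default database
already <- intersect(user_db$Taxa, default_taxa)

print(dup)
print(already)
```

Both vectors should be empty before the custom database is passed to dfref.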
Alternatively, you can replace the default database with your custom database by setting the overwrite option:

 


macro.asb <- asBiomonitor(macro, dfref = user_db, overwrite = T)

 

The possibility of using your own reference database is very important. For example, to calculate the Iberian ASPT (IASPT) correctly you have to provide a modified version of the database (ask us for this database). This is because biomonitoR considers Ancylus and Ferrissia as Planorbidae, while the IASPT considers them as belonging to the families Ancylidae and Ferrissidae, respectively.
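
As an illustration of this kind of change, the sketch below reassigns the family of the two genera in a toy table. It is not the IASPT database: the rows are made up, and the Family and Genus column names are assumptions about the database structure.

```r
# toy database (made-up rows; Family and Genus column names are assumed)
user_db <- data.frame(
  Taxa   = c("Ancylus", "Ferrissia", "Planorbis"),
  Family = c("Planorbidae", "Planorbidae", "Planorbidae"),
  Genus  = c("Ancylus", "Ferrissia", "Planorbis"),
  stringsAsFactors = FALSE
)

# families used by the IASPT for the two genera, as stated in the text
iaspt_families <- c(Ancylus = "Ancylidae", Ferrissia = "Ferrissidae")

# overwrite the family only for the genera listed above
hit <- user_db$Genus %in% names(iaspt_families)
user_db$Family[hit] <- unname(iaspt_families[user_db$Genus[hit]])
print(user_db)
```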

You must make sure that the custom database contains the names of all the taxa that you want to analyze with biomonitoR. WARNING: the current version of biomonitoR does not support uncertain classification of taxa (e.g. Rhyacophila nubila/obliterata), but see this post for a workaround.

asBiomonitor

The function asBiomonitor is the core of the biomonitoR package. With the default settings, the user’s dataset is compared and merged with the database taken from freshwaterecology.info: the taxa list of the user’s data.frame is first compared with that of the reference database, and then the two are merged.

 


macro.asb <- asBiomonitor(macro, dfref = NULL, overwrite = F)

 

where macro is the user’s data.frame. This data.frame must have a column named “Taxa” containing the species, genus, family, etc. names, and one or more columns of samples.
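
A quick structural check before calling asBiomonitor might look like the sketch below. check_input is a hypothetical helper written for this example, and requiring the sample columns to be numeric is an assumption, not something stated by the package.

```r
# made-up input in the expected shape: a Taxa column plus sample columns
macro <- data.frame(Taxa = c("Baetis", "Leuctra"),
                    S1 = c(10, 0),
                    S2 = c(3, 7),
                    stringsAsFactors = FALSE)

# hypothetical helper: TRUE if there is a "Taxa" column and at least one
# other column, all of them numeric (the numeric requirement is assumed)
check_input <- function(x) {
  has_taxa <- "Taxa" %in% names(x)
  sample_cols <- setdiff(names(x), "Taxa")
  numeric_samples <- length(sample_cols) > 0 &&
    all(vapply(x[sample_cols], is.numeric, logical(1)))
  has_taxa && numeric_samples
}

ok <- check_input(macro)
# column names are case sensitive: "taxa" is not "Taxa"
bad <- check_input(data.frame(taxa = "Baetis", S1 = 1))
```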
When the user’s taxa list contains spelling errors, asBiomonitor suggests a correction if a similar name exists in the reference database. If no suggestion is provided, the user has to exit from the script by pressing Esc on the keyboard. This behaviour is required to ensure compatibility between the user’s data.frame and the reference database. To overcome this issue, the user can provide their own reference database to extend or replace the existing one.
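
To spot problematic names before running asBiomonitor, the taxa list can be compared with a list of reference names in base R. The sketch below only illustrates the idea with edit distances and is not how asBiomonitor itself builds its suggestions; ref_taxa and user_taxa are made-up vectors.

```r
# made-up reference names and a user list with one misspelling
ref_taxa <- c("Baetidae", "Heptageniidae", "Leuctridae")
user_taxa <- c("Baetidea", "Leuctridae")

# names with no exact match in the reference
unmatched <- setdiff(user_taxa, ref_taxa)

# for each unmatched name, take the reference name with the
# smallest edit distance as a candidate correction
suggest <- vapply(unmatched, function(nm) {
  d <- utils::adist(nm, ref_taxa)
  ref_taxa[which.min(d)]
}, character(1))

print(suggest)
```

A candidate found this way still needs a manual check, since the closest name by spelling is not necessarily the intended taxon.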

Add custom reference data.frame to the default reference database.


macro.asb <- asBiomonitor(macro, dfref = user_db, overwrite = F)

 

Replace the default reference database with the custom database.


macro.asb <- asBiomonitor(macro, dfref = user_db, overwrite = T)

 

The structure of the custom data.frame is the topic of this post.