January 2016 Biodiversity Spotlight

Fri, 2016-01-08 13:58 -- kevinlove

Graygreen Reindeer Moss (Cladonia rangiferina)

Image courtesy of Jim Kuhn

Graygreen Reindeer Moss (Cladonia rangiferina) is a lichen, not a moss. A lichen is composed of two organisms that form a partnership. One partner is a fungus (called the mycobiont), and the other is a photobiont. Photobionts are either green algae or blue-green algae. This close relationship between separate organisms is referred to as symbiosis [1].

Cladonia rangiferina is widespread in the Arctic and boreal regions of North America. Like other lichens, this species has the ability to absorb water and nutrients from the air. This adaptation makes Cladonia rangiferina, and lichens in general, susceptible to pollutants in the environment, which in turn makes them valuable bioindicators [2].

The premise of the digitization project Lichens, Bryophytes, and Climate Change is that lichens serve as sensitive indicators of environmental change. The project is working to provide high-quality digital specimen data to better understand how species distributions change over time. Learn more about the project by visiting their website. Also, Libraries of Life featured Cladonia rangiferina in their recently released mobile app and card set. Visit their website or the Apple App Store to download the app for free. You can visit Island Ecology at CBSP’s website to find more educational resources associated with Cladonia rangiferina (including lesson plans).

Specimen of Cladonia rangiferina from the Academy of Natural Sciences of Drexel University

The iDigBio Portal has 6,910 specimen records of Cladonia rangiferina. If you are interested in examining these records to understand how the species’ distribution has shifted over time due to climate change, you will first need to understand how the museum records themselves vary in distribution over space and time. Read on to learn how to visualize “date collected” using R programming in the Coding Corner below!

Coding Corner

This month’s coding corner will introduce readers to R programming and utilizing the iDigBio Search API endpoints for querying specimen data in aggregate. A PDF version of this lesson can be found here: greygreen-reindeer-lichen-jan2016.pdf. Code examples that you can run in the R console will be formatted like this:

print('Hello World!')

Code Example #1

And the output of the console will be formatted like this:

## [1] "Hello World!"

Code Example #1 Output


Before we begin, we need to prepare our work space and load the packages we will need for this project:
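The package-loading code for this step did not survive on this page. As a rough sketch, and assuming that the toJSON() and fromJSON() calls used below come from the jsonlite package (an assumption; other JSON packages export functions with the same names), the setup might look something like this:

```r
# Assumption: toJSON() and fromJSON() are taken from jsonlite.
# requireNamespace() checks for the package without stopping the script.
have_jsonlite <- requireNamespace("jsonlite", quietly = TRUE)

if (have_jsonlite) {
  library(jsonlite)
} else {
  message("jsonlite is not installed; try install.packages('jsonlite')")
}
```

If the package is missing, the message tells you what to install before moving on.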



The iDigBio API has resources, or “endpoints”, for querying data in the aggregate. To facilitate discovery, some of the endpoints provide summary statistics or summary data: https://github.com/iDigBio/idigbio-search-api/wiki#summary

In this coding corner, we will use the “Date Histogram” to begin our data exploration.

To begin, let’s tell R what API summary endpoint we would like to use by creating a vector representing the endpoint:

apiEndpoint <- "http://search.idigbio.org/v2/summary/datehist/"


We need to set up our query to follow the API query format: https://github.com/iDigBio/idigbio-search-api/wiki/Query-Format . This call to the API will take its arguments in JSON, similar to this example: https://github.com/iDigBio/idigbio-search-api/wiki/Query-Format#searching-for-a-value-within-a-field

rq <- toJSON(list(scientificname="Cladonia rangiferina"))


We can now construct a URL to query the endpoint and assign it to a vector that we will pass to our parsing function:

queryURL <- URLencode(paste(apiEndpoint,"?rq=",rq,sep=""))
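To see what URLencode() is doing here, the following sketch builds the same kind of URL with base R only. The JSON string is typed by hand rather than produced by toJSON(), and the `_demo` names are made up for this illustration:

```r
# Base R only; the JSON query is written out by hand for illustration.
endpoint_demo <- "http://search.idigbio.org/v2/summary/datehist/"
rq_demo <- '{"scientificname":"Cladonia rangiferina"}'

# Characters such as spaces, braces, and quotes are percent-encoded.
url_demo <- URLencode(paste0(endpoint_demo, "?rq=", rq_demo))
print(url_demo)

# Decoding recovers the readable form, which is handy when debugging a query.
print(URLdecode(url_demo))
```

Printing both forms makes it easier to spot a malformed query before it is sent to the API.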


And see what response we get back:

res <- fromJSON(queryURL)
summary(res)
##            Length Class  Mode   
## dates      166    -none- list   
## itemCount    1    -none- numeric
## rangeCount   1    -none- numeric


The API has returned a nested list of years and counts. Let’s create a tidy data frame from the response so that we can create a plot:

df <- data.frame(unlist(res$dates),as.Date(names(res$dates)))
names(df)[1:2] <- c("count","year")
str(df)
## 'data.frame':    166 obs. of  2 variables:
##  $ count: int  1 18 1 1 1 1 1 1 1 1 ...
##  $ year : Date, format: "1746-01-01" "1800-01-01" ...
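If you would like to try the reshaping step without calling the API, here is a small offline sketch. The `mock_dates` list is made-up data that only imitates the general shape of res$dates (a list of counts keyed by date); the real response may be nested differently:

```r
# Made-up stand-in for res$dates: one count per date key.
mock_dates <- list("1746-01-01" = 1L, "1800-01-01" = 18L, "1801-01-01" = 1L)

# unlist() flattens the counts; the list names become the dates.
mock_df <- data.frame(count = unlist(mock_dates),
                      year  = as.Date(names(mock_dates)))
rownames(mock_df) <- NULL  # drop the row names inherited from the list

str(mock_df)
```

The same two-column layout (count, year) is what the plotting step expects.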


Plot time! We’ll make a scatter plot, using R’s base graphics, of the data frame we just created with dates on the x axis and counts on the y axis:

plot(df$year, df$count, xlab="Year", ylab="Count",  # arguments on this line are reconstructed
     main=paste("iDigBio Date Histogram Endpoint (Cladonia rangiferina) \n as of ",Sys.Date(),sep=""))

Now that we have an idea of the distribution of collection dates in the data, let’s take a further look into how these collection events are distributed by locality, using the “ridigbio” package:
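The package-loading code for this part is also missing from the page. The sketch below is a guess at what is needed, based on the functions used in the rest of the lesson: ridigbio for idig_search_records(), ggplot2 for the plots, lubridate for year(), and seas for mkseas(). The last two are assumptions inferred from the function names, not something the lesson states:

```r
# Assumed package list, inferred from the functions called later on.
pkgs <- c("ridigbio", "ggplot2", "lubridate", "seas")

# Check which packages are available without stopping on a missing one.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Attach the ones that are present and report the rest.
for (p in pkgs[installed]) library(p, character.only = TRUE)
if (any(!installed)) {
  message("Missing packages: ", paste(pkgs[!installed], collapse = ", "))
}
```

Any package reported as missing can be installed with install.packages() before continuing.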



Let’s query the iDigBio API for a response that contains locality information, along with our collection dates, and restrict it to our species of interest:

lichenData <- idig_search_records(rq = list(scientificname="Cladonia rangiferina"), fields = c("datecollected","country","countrycode","institutioncode","uuid"))


We’re going to want to add some dimension to our plots, so let’s calculate the “season” in which each specimen was collected:

lichenData$seasons <- mkseas(as.Date(lichenData$datecollected),"DJF")
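The "DJF" argument asks for meteorological seasons, where winter is December through February. To make that grouping concrete, here is a base R sketch of the same idea. The met_season() helper is hypothetical and written only for this illustration; it is not part of any package and is not how mkseas() is implemented:

```r
# Hypothetical helper: assign each date to a meteorological season.
met_season <- function(dates) {
  m <- as.integer(format(as.Date(dates), "%m"))  # month number, 1 to 12
  labels <- c("DJF", "MAM", "JJA", "SON")
  # (m %% 12) sends December to 0, so Dec/Jan/Feb land in the first group.
  factor(labels[(m %% 12) %/% 3 + 1], levels = labels)
}

demo_seasons <- met_season(c("2015-01-15", "2015-04-01", "2015-07-04",
                             "2015-10-31", "2015-12-25"))
print(demo_seasons)
```

Note that the January and December dates fall in the same season, which is why the factor has four levels rather than following calendar quarters.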


Plot the result, starting with a histogram:

ggplot(lichenData,aes(x=year(datecollected),fill=as.factor(country))) + 
        geom_histogram() +
        labs(x="Year Collected",y="Count", title="Cladonia rangiferina in iDigBio")

Keep only the countries whose share of records is above the 90th percentile of all country shares, and create a new data frame:

tt <- as.data.frame(table(lichenData$country))
tt$Pct <- tt$Freq / sum(tt$Freq)
tt <- tt[tt$Pct>quantile(tt$Pct, .9),]
df2 <- lichenData[lichenData$country %in% tt$Var1,]
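It is worth pausing on what this filter keeps. It compares each country's share of records against the 90th percentile of those shares, so it selects the few best-represented countries rather than the set of countries that together hold 90% of the records. A toy example with made-up counts (the `toy` data frame is invented for this sketch) shows the effect:

```r
# Made-up data: seven countries with 50, 30, 10, 5, 3, 1 and 1 records.
toy <- data.frame(country = rep(c("A", "B", "C", "D", "E", "F", "G"),
                                times = c(50, 30, 10, 5, 3, 1, 1)))

toy_tt <- as.data.frame(table(toy$country))
toy_tt$Pct <- toy_tt$Freq / sum(toy_tt$Freq)

# The cutoff is the 90th percentile of the seven country shares.
cutoff <- quantile(toy_tt$Pct, 0.9)
kept <- toy_tt[toy_tt$Pct > cutoff, ]

print(cutoff)
print(kept)
```

With these counts only country A passes the cutoff, even though B alone holds 30% of the records. Lowering the quantile lets more countries through.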


Create a violin plot from this 90th percentile data, with each record overlaid as a jittered point colored by season:

ggplot(df2, aes(x=country, y=year(datecollected)))+geom_violin()+
        geom_point(aes(color=seasons),  # reconstructed layer; the original line was lost
                    position = position_jitter(width = .2))+
        labs(x="Country",y="Year Collected", title=paste("Cladonia rangiferina in iDigBio\n(90th Percentile)\n as of ",Sys.Date(),sep=""))+
        theme(legend.title = element_text(size=12, face="bold"))+
        scale_color_discrete(name="Meteorological \nSeasons", labels=c("Winter","Spring","Summer","Fall"))