SPNHC 2013 - Special Feature: iDigBio all-day symposium sponsored by iDigBio and the Natural Science Collections Alliance

by Deb Paul

From June 17-21, seven members of iDigBio (Gil Nelson, Pam Soltis, Joanna McCaffrey, Larry Page, Bruce MacFadden, Kevin Love and Deborah Paul) participated in SPNHC 2013, the annual meeting of the Society for the Preservation of Natural History Collections (http://www.spnhc.org/), fondly referred to as “spinach.” If you are new to SPNHC, the society focuses on the development and dissemination of best practices for the care of physical collections, as well as working on governmental policy issues (scientific specimen transport, for example) and educating the public about collections. According to the SPNHC website, their mission is to “improve the preservation, conservation and management of natural history collections to ensure their continuing value to society.”

At this year's meeting, iDigBio presented a full-day two-part symposium designed to share the mission of NSF’s ADBC program with the SPNHC audience, which included honors undergraduates, graduate students, taxonomists, systematists, collection managers, curators and museum administrators representing all collection types and a variety of organismal groups. This meeting brought together a near-perfect group for iDigBio to share the work we are doing to facilitate collections imaging and digitization and outline how to get involved.


Morning Session: Introduction to Digitization and Dissemination of Natural History Data: iDigBio, BISON and Other Initiatives (click to view audio/video recording)

Presentation Title | Presenter
Alphabet Soup: Overview of NIBA – ADBC – iDigBio – TCNs | Larry Page - iDigBio
NSCA and the State of Collections | Larry Page - iDigBio
Intro to iDigBio, Survey of TCNs, PENs, RDCN | Joanna McCaffrey - iDigBio
Introduction to Digitization: Metadata & Standards, Workflows, Photography, and Applied Training | Gil Nelson - iDigBio
Integrating Augmented OCR and Georeferencing in Natural History Collection (NHC) Digitization | Deborah Paul - iDigBio
Engaging downstream users of paleocollections through iDigBio E&O (Education and Outreach) | Bruce MacFadden - iDigBio
Introduction to a TCN: PaleoNICHES | Una Farrell - University of Kansas
Introduction to the Biodiversity Information Serving Our Nation (BISON) project and the Interagency Working Group on Scientific Collections (IWGSC) | Elizabeth Martin - U.S. Geological Survey / BISON (by video)

Symposium topics flowed from a general overview of the origins of the iDigBio project to the specific work of various iDigBio and other working groups. We started with Alphabet Soup where iDigBio Lead PI, Larry Page, carefully explained the project history, taking us on a journey from NIBA to ADBC to iDigBio & the TCNs. Larry followed this with an update on the current State of Collections.

  • American Institute of Biological Sciences (AIBS)
  • Network Integrated Biocollections Alliance (NIBA)
  • Advancing Digitization of Biodiversity Collections (ADBC)
  • Integrated Digitized Biocollections (iDigBio)
  • Thematic Collection Networks (TCN)
  • Partners to Existing Networks (PEN)
  • Regional/Related/Relevant/Relational/Registered/Recognized Data Collection Network (RDCN)
  • Natural Science Collections Alliance (NSCA)
  • Biodiversity Information Serving Our Nation (BISON)
  • Interagency Working Group on Scientific Collections (IWGSC)
  • Research Coordination Networks (RCN)

Joanna McCaffrey shared more detail about iDigBio and each of the related TCNs, PENs, and RDCNs. From this point, the symposium presentations went into more specific issues surrounding digitization, including:

  • Digitization workflows, data standards and biodiversity informatics (Gil Nelson),
  • Working groups studying uses for optical character recognition (OCR) and georeferencing legacy data (Deb Paul),
  • Engaging the public with fossil collections (Bruce MacFadden),
  • What the PaleoNICHES TCN is and the complexities of georeferencing a fossil (Una Farrell), and
  • An introduction to the BISON project to mobilize observational and specimen data and the work of the IWGSC (Elizabeth Martin by video).

After learning so many new acronyms and more about these concerted efforts to digitize and image natural history collections specimens, we were hungry! Now well-fed, we returned for the afternoon symposium and panel discussion.


Afternoon Session: Diverse Uses for Natural History Collections (click to view audio/video recording)

Presentation Title | Presenter
Understanding the use and users of natural history collections data: Why this matters | Elizabeth Martin - U.S. Geological Survey / BISON (by video)
Creative ways to use botanical specimens in climate change research | Richard Primack - Boston University
Mining natural history collections for invasive species data in New York | Jennifer Dean - New York Natural Heritage Program
Plant fossils and plastid genomes: Integrating molecular and morphological data sets for reconstructing phylogeny and biogeography in Icacinaceae | Greg Stull - Florida Museum of Natural History, University of Florida
Challenges to making paleontological data usable for a broader audience | Pat Holroyd - Museum of Paleontology, University of California - Berkeley
Inside zoological collections: Perspectives of the academic (re)user | Ixchel Faniel - OCLC Research
Herbarium data in support of biodiversity research: Opportunities and challenges | Michael Denslow - NEON
Increasing research use of biodiversity collections through ontology-based data integration across biodiversity databases | Hank Bart - Tulane University
The Botanical Information and Ecology Network (BIEN): A research and collections collaboration to investigate the ecological impacts of global climate change on plant biodiversity | Barbara Thiers - New York Botanical Garden

Highlights included Richard Primack's compelling examples of how good specimen data can support climate change prediction. For example, Richard showed that one arrives at the same predictive formula for estimating flowering, fruiting, and leafing times whether one uses extant specimens in the arboretum with current weather information or weather and specimen label data gleaned from plants dried long ago. Intriguingly, Richard also showed how old and new photos can be compared to obtain this kind of predictive climate change data, and how climate and morphology observations recorded in the journals of long-departed naturalists can be put to use.

Other memorable and fascinating bits include talks about work being done by several groups to engage citizen scientists via education and outreach. For example:

  • Bruce MacFadden described plans to create a national network from existing fossil clubs.
  • Pat Holroyd succinctly identified a key issue in our community: the need for elegant, insightful visualization of data when working to make the data useful to a broader audience. To quote Pat, “…Geologic Time confuses people…” (including this writer)! How do we find those with the gifts to create powerful, persuasive graphics from millions of columns of digital data? How do we successfully communicate geologic time?
  • Greg Stull showed us how he is combining plastid genome molecular data with plant fossil specimen morphology data to illuminate the phylogeny of the Icacinaceae plant family and its biogeographical distribution through time.

Significant issues raised in various talks included how one might deal with presence-only data (Michael Denslow), inherent biases in natural history collections data (Primack, Thiers, Denslow), and the many ways natural history collections data is reused by scientists (Faniel), as evidenced in papers on phenology, niche modeling, and species ranges (Denslow) and in invasive species monitoring (Jennifer Dean). Michael Denslow pointed out several key issues in his summary: researchers don't always know how to get data from many places, hence the great need for mobilization and exposure of the data; outreach is needed to educate researchers about the biases, limitations, and strengths of the data; and fewer new (herbarium) specimens may be getting deposited in collections because of a decline in plant collection and education.

Barbara Thiers explained that the central goal of the Botanical Information and Ecology Network (BIEN) is to bring together data from throughout the Americas on the distribution, abundance and traits of plant species, with the ultimate goal of predicting and mitigating the effects of climate change. The BIEN database provides a common schema for merging georeferenced observations of individuals and species from herbarium specimens, vegetation inventories, and regional checklists with measurements of species-level traits such as size, growth form, wood density, and specific leaf area. Associated with the database is a series of tools for standardizing and scrubbing data, most notably the Taxonomic Name Resolution Service. Currently the BIEN database includes about 12 million observational records and nine million specimen records for about 200,000 species of vascular plants and bryophytes from throughout the Americas. Research projects completed or currently underway using the BIEN database include an exploration of latitudinal patterns of range size and species richness of New World woody plants, quantitative assessment of species range size, and understanding spatial variation in biodiversity along latitudinal and elevation gradients. The BIEN project envisions an active partnership between data users and data providers, with a strong commitment to acknowledgement, compliance with data use policies, and feedback on data use statistics and data improvements for the more than 300 data providers.
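For readers who think in code, the idea of a common schema can be sketched in a few lines of Python. This is only an illustration of the concept described above, not BIEN's actual schema or tooling: the `Observation` fields, the input keys (`name`, `lat`, `lon`), and the function names are all hypothetical, and the toy `standardize_name` helper only tidies whitespace and capitalization, which is far less than a real name resolution service does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One georeferenced record in a hypothetical common schema."""
    species: str
    latitude: float
    longitude: float
    source_type: str  # e.g. "specimen", "inventory", or "checklist"


def standardize_name(raw: str) -> str:
    """Toy stand-in for name scrubbing: collapse spaces, fix capitalization."""
    parts = raw.split()
    if not parts:
        return ""
    # Genus is capitalized; the remaining epithets are lowercased.
    return " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])


def to_common_schema(record: dict, source_type: str) -> Observation:
    """Map a raw provider record (assumed keys: name, lat, lon) to the schema."""
    return Observation(
        species=standardize_name(record["name"]),
        latitude=float(record["lat"]),
        longitude=float(record["lon"]),
        source_type=source_type,
    )


def attach_traits(observations, traits_by_species):
    """Pair each observation with its species-level traits (None if unknown)."""
    return [(obs, traits_by_species.get(obs.species)) for obs in observations]


# Illustrative placeholder records, not real BIEN data.
raw_specimen = {"name": "acer   RUBRUM", "lat": "42.3", "lon": "-71.1"}
raw_inventory = {"name": "Acer rubrum", "lat": "35.6", "lon": "-83.5"}
merged = [
    to_common_schema(raw_specimen, "specimen"),
    to_common_schema(raw_inventory, "inventory"),
]
with_traits = attach_traits(merged, {"Acer rubrum": {"growth_form": "tree"}})
```

Once records from different kinds of sources share one shape and one cleaned-up name, joining them to species-level traits becomes a simple lookup, which is the practical payoff of standardizing before merging.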

Hank Bart described how biodiversity collections, wherever they are housed, are taxonomically organized and effectively siloed by the standards and practices governing the different taxonomic groups. This creates a challenge to using the data effectively in a broader context across different domains. He described the work of the CollectionsWeb RCN in exploring ways of achieving greater integration and efforts to network biodiversity databases. He cautioned that the effort of the workshop stopped short of recommending an approach to integration. He also explored how ontologies can be used across different groups of organisms to achieve broad data integration and to increase the research use of biodiversity collections.

Ixchel Faniel, from Online Computer Library Center, Inc. (OCLC) Research, provided us with a unique viewpoint. Focusing on zoologists, Ixchel uses interviews, surveys, notes, observations, and field notebooks to study how scientists re-use existing data collected by other scientists. Her work gives us a potentially essential awareness of how scientists decide what data to re-use (or not) and why. Scientists are not only looking at the publications of those whose data they are considering, but also at such primary sources as a collector's field notebook when deciding what data to include in their research.

A short panel discussion followed this absorbing line-up of talks, moderated by iDigBio co-PI, Pam Soltis. We could have gone on much longer – but, we got hungry – again. We briefly touched on the need for specimen-use metrics for reporting, the use of field notes (see above reference to Ixchel's talk), the need (requirement?) for data to be cited (attributed) and the need to stress the difference between copyright and attribution in the broader community, and the underlying theme of getting from accessible specimen data to computable specimen data.

We'd like to say a special thank you to the SPNHC 2013 organizers and Sally Shelton for the wonderful welcome to South Dakota. It was the first trip for many of us and we enjoyed seeing bison in their element with prairie dogs for neighbors, wildflowers blooming, Wind Cave, Mount Rushmore, and …