To understand the ocean - we need to get wet

Anisogammarus pugettensis

by Deborah Paul (iDigBio), Gustav Paulay (UF, FLMNH) from #DigIn Workshop

An enormous amount of data amassed by worldwide collections digitization efforts provides everyone with new research opportunities to better understand our planet. As an added benefit of this global undertaking, we can also begin to better understand and express the data gaps (e.g. taxonomic, geographic, temporal, collector, ...). Although our planet is an astonishingly wet one, data on marine life are sparse in currently available biocollections resources (GBIF, iDigBio, etc.). About 75% of all described marine organisms are invertebrates, so understanding the biodiversity of the ocean critically hinges on access to information on invertebrate biodiversity. Clearly, mobilizing marine invertebrate data is crucial.

In a packed 2-day workshop, over 30 participants gathered to talk about marine invertebrate zoology (IZ) collections, challenges for marine invert wet collections data capture, nomenclature, taxonomy, genetic information, georeferencing, imaging, and ideas for what kinds of research might be possible if these data were available. Representatives from 17 collections gave brief overviews of their accessions, and their presentations (a few linked below) reveal an ocean in jars.

AMNH Chris Johnson, Lily Berniker

ANSP Gary Rosenberg

AUMNH Briane Varnerin

CASIZ Christina Piotrowski

CMN Jean-Marc Gagnon

FLMNH Amanda Bemis

FMNH Rudiger Bieler

FWRI Paul Larson

MCZ Adam Baldinger

NCMNS Megan McCuller

NHMLA Trina Roberts, Libby Ellwood

NMNH William Moser

RBCM Henry Choong

TAMU Mary Wicksten

VIMS Jennifer Dreyer

Gustav Paulay summarized marine IZ data needs and availability (or lack thereof) in his introduction: DigIn Overview. Participants contributed their knowledge, expertise, and curiosity to the Marine Invertebrate Digitization Wiki (DigIn).

An important part of mobilizing and being able to use marine IZ data is the consistent use of relevant taxonomic names (including so-called “temporary names”). WoRMS provides a powerful backbone for described marine species that should be reflected in collections data. Participants discussed how to achieve this, including options and activities going on at WoRMS and Catalog of Life+ and related upcoming symposia at Biodiversity Next. Find out more about WoRMS taxonomy in the presentation by Bart Vanhoorne (WoRMS software developer).

Adriana Radulovici shared how BoLD works, explaining their data mobilization and how Barcode Index Numbers (BINs) help separate cryptic species and reveal the individual species in what are currently species complexes. (Link to Adriana’s talk). BIN data give us the opportunity to expand our knowledge of just what is in those jars, and the chance to use that data to recreate communities in time and space.

Marine invertebrates are usually held in lots of one or more individuals per species in a jar. Beyond these lots, however, there is often a community in the jar as well. James Carlton suggests we consider the need to capture the epibiota that comes along with many specimens. Of course, just how we might all do that was another part of this conversation. Think of GLOBI and how amazing it would be to add this marine data to that resource. To do so, we still need new ways (and potentially new data standards) to capture and share trait data such as species associations.

No discussion of digitization would be complete without talking about collections management software. Our participants use a variety of systems, including Specify, Arctos, Access, Axiell EMu, FileMaker Pro, and a few home-grown options. Edward Gilbert introduced this marine invertebrate-focused audience to Symbiota. Beyond an online (web-based) collections management system, this software incorporates generation of interactive keys, checklists, quizzes, a web portal for data access, image linking, and submission of data to GenBank. Two examples of marine data in Symbiota can be seen in the Macroalgae Portal and in the Smithsonian Tropical Research Institute (STRI) Bocas del Toro: Species Database (available in English and Spanish).

Please visit the wiki to find out more about our conversations on georeferencing, genetic data and linking genetic data, various marine IZ workflows shared (from FLMNH IZ, NMNH, and CMN), imaging, digitization, data mobilization, and challenges for marine IZ collection storage.

Our participants worked hard to think about what marine resources are needed and also how to increase human diversity and inclusion in any efforts to mobilize wet collections specimen data. At the end of day 2, our focus changed to how to manage such data mobilization projects and what skills, temperament, and knowledge are needed to do this job well.

Next, the organizers and participants plan to form several working groups to take up some of these subjects, such as comparing wet collections data capture methods. If you’d like to know more, please get in touch with Gustav Paulay at FLMNH.

Teamwork made this workshop: many people contributed to the content, the organization, and the implementation. We all look forward to everyone's future efforts to mobilize this ocean of collections data. If you've got marine invert data to add, be sure to let Gustav Paulay know and make sure your collection is in the US Collections Resource.