Digitization of Biological Collections: A Global Focus

Photo: Minky Faber (ALA)

iDigBio’s Summit IV, held in November 2014, produced opportunities for collaboration well beyond initial expectations. Subsequent conversations between international invitees John LaSalle and Nicole Fisher of the Atlas of Living Australia (ALA), Keping Ma of the National Specimen Information Infrastructure of China (NSII), and iDigBio representatives Larry Page, Pam Soltis, David Jennings, and Gil Nelson resulted in plans for an international summit focused on data sharing and strategies for leveraging common digitization practices and protocols. ALA agreed to host the 13-17 April event at CSIRO (Commonwealth Scientific and Industrial Research Organisation), headquartered at Black Mountain Laboratories in Canberra. Other key attendees included CSIRO’s director of collections Andrew Young and research operations manager Beth Mantle, CSIRO’s Dan Gledhill and Changming Sun, ALA’s Peter Doherty, iDigBio PIs Jose Fortes and Greg Riccardi, iDigBio senior scientist Austin Mast, Vince Smith of the Natural History Museum in London, Paul Flemons of the Australian Museum, Alexis Tindall of the South Australian Museum, Alison Vaughan of the Royal Botanic Gardens in Melbourne, Smithsonian’s Paul Kimberly, and the Tasmania Biodiversity Hub’s Dan Gledhill.

First-day events focused on overviews from represented institutions and programs, followed by collections tours of the Australian National Insect Collection and Australian National Herbarium. Tuesday attracted a larger audience of Australian representatives and served as a showcase of digitization, data sharing, public participation, outreach, and data visualization, with time allotted for questions and discussion. Tuesday’s agenda included:

  • Larry Page: Overview of iDigBio, its genesis, development, funding, and future,
  • Pam Soltis: Role in advancing scientific research with digitized data,
  • Gil Nelson, Austin Mast, and Pam Soltis: iDigBio - Education and outreach,
  • Greg Riccardi: iDigBio's role in facilitating and enabling digitization,
  • Jose Fortes: Cyberinfrastructure, portal and data,
  • David Jennings: Project management,
  • Austin Mast: Onsite public engagement in digitization,
  • Gil Nelson: Community building, digitization workflow development, training and product workshops,
  • Paul Kimberly: Smithsonian Institution's National Museum of Natural History: Rapid capture techniques,
  • Keping Ma: National Specimen Information Infrastructure of China (NSII),
  • Vince Smith: NHM Digital Collections Programme and the NHM Data Portal,
  • Paul Flemons: DigiVol: Taking crowdsourcing to the next level with structured volunteering,
  • Alexis Tindall & Alison Vaughan: Community coordination for Australian herbaria and zoological collections,
  • Dan Gledhill: Digital tools for taxonomy: maps, images, x-rays and the ALA,
  • Changming Sun: Feature extraction from insect wings (dragonfly example), and
  • Stuart Anderson: Putting 3D Insect Scans to Work.

Wednesday and Thursday featured short presentations followed by extended, 45- to 90-minute discussion sessions centered on specific topics. Volunteer scribes recorded notes in a publicly accessible Google document, augmented by attendees. Topics included advancing scientific research, cyberinfrastructure, workflow, worldwide engagement in support of digitization, and education and outreach. Friday included a tour of the Australian National Wildlife Collection and visits to several field sites and related facilities, which afforded important opportunities for informal conversation about the week’s activities and the way forward for further collaboration and joint initiatives.

Of principal interest to iDigBio were access to additional specimen-based records to facilitate research and outreach, and access to or information about existing or emerging tools from international partners to help iDigBio meet its research and outreach goals.

Principal objectives for ALA and other participants were to gain a better understanding of the digitization workflows, working groups, and community-building strategies developed by iDigBio, and to gain a better understanding of the ADBC funding model for potential replication.

Workshop participants reached five important conclusions, each with associated action items. First, we agreed that data sharing across international boundaries is a top priority and that a jointly developed roadmap and vision for achieving this goal is imperative. All represented projects and institutions are dedicated to improving our understanding of biodiversity and agree that sharing data maximizes the potential for research and increases data security by fostering redundancy.

Second, we agreed that search portals and data stores could benefit from the greater uniformity to be achieved through the development or enhancement of interoperable APIs that facilitate data access and sharing. We also recognized several data-sharing challenges, including methods for keeping all versions of a dataset up to date in all of the various portals through which it might be served as well as in the databases of its providers, methods for facilitating data attribution, and methods for ensuring that value added to datasets (e.g. georeferences, taxonomic annotations, genomic conclusions) is shared with data providers and across data aggregators.

Third, we recognized the pressing need to document the important uses of digitized specimen data in its various forms, partly to help institutions obtain funding for digitization and partly to demonstrate the value of data aggregation. Some documented examples already exist, notably in NIBA documents and published papers. Annotated compilations of these references and citations are important, as are explorations of new emphases, such as extracting trait data and other information from specimen images.

Fourth, attendees agreed to produce and publish a series of papers. The first, to appear in a high-profile journal, would focus on the importance of global collaboration for biodiversity digitization and data mobilization. The remainder, to appear in a special edition of another journal, would focus on developing and sharing tools for enhancing digitization, biodiversity research utilizing digital data, greater coordination among iDigBio, Smithsonian, and ALA, strategies for encouraging citizen science and public participation, issues related to sustainability, delineation of research questions, and the coordination of access to internationally applicable digitization and research workflows.

Fifth, workshop participants agreed to form the nucleus of a potentially worldwide consortium of leading digitizing institutions and aggregators to share knowledge, methods, and strategies for enhancing global accessibility of biodiversity collections data. Areas of concentration would include methods for demonstrating to governments the value of biodiversity and, consequently, the value of digitization of biodiversity information. Current plans call for a 2016 meeting in China to formalize a charter. Potential products from the Australia and China meetings include a global strategic plan for digitization of collections, initiation and oversight of activities to demonstrate capabilities, solidification of the willingness to work together and demonstrate benefits, exploration of how research uses of digital data might stimulate global engagement, and an outline of strategies for linking individual goals in fundable ways.