Data Without Borders Symposium at XXV International Congress of Entomologists ICE 2016

by Deb Paul

Every four years, entomologists worldwide gather for the International Congress of Entomologists (ICE). Over 6682 entomologists from 102 countries traveled to Orlando, Florida for ICE 2016, the 25th such meeting. iDigBio took part in this historic event through an exhibit booth, participation in the ICE 2016 Insect EXPO, two invited talks, and our Symposium: Data Without Borders, convened by iDigBio, the Australian National Insect Collection (ANIC)*, and the Australian Museum Research Institute**.

Read more about iDigBio at ICE 2016, the Exhibit Booth, Insect EXPO, and invited talks in Molly Phillips' blog post.

Data Without Borders talks centered on producers and consumers of collections data. Topics included current trends in collecting and vouchering of specimens and field data; examining speciation hypotheses; species conservation; ecology of species; DNA barcoding; methods, models, and tools for digitizing specimen data; and the tools and skills needed for producing, consuming, and transforming this data. Presentations included examples of the data in use and highlighted the need for international collaboration in biodiversity research. About 60 people attended our Data Without Borders session, and folks found it tough to choose between it and the concurrent symposium, Building the Biodiversity Knowledge Graph.

All Data Without Borders presentations were recorded. To listen, please go to the Vimeo Data Without Borders ICE 2016 recordings.

Summary of Session. Data Without Borders symposium wiki

First, Pam Soltis (iDigBio PI) presented an overview and some research examples of natural history collections data-in-action. She stressed that “perhaps the greatest value of such data-enabled science will lie in the unanticipated patterns that emerge,” and that we have “access to new data sources providing unparalleled opportunities for mobilizing and integrating massive amounts of information from organismal biology, ecology, genetics, climatology, and other disciplines.”

Derek Woller is interested in the speciation of the Puer Group (PG) from the grasshopper genus Melanoplus and uses specimen data to study species ranges as part of this work. Next, Robert Kula described work to compare contemporary data with historical insect data records (n=1049) from the Hopkins Notes and Records System for insects found in association with Castanea dentata (American Chestnut), Castanea mollissima (Chinese Chestnut), and red oak trees. Jonathan Koch shared how bee specimen data provides historic abundance and distribution of bee species and shows an alarming decline. While bee data across multiple institutions informs bee conservation, Jonathan also noted some limitations and biases of digital specimen data that must be considered when characterizing bee communities.

Sarah Schmits and Derek Sikes both discussed ways in which producers of specimen data might enhance the data mobilization process. Sarah’s talk showed how using the collecting event as the centerpiece of the digitization effort speeds up digitization of specimens, reduces error, and provides a way to share absence data. You can read more about it on their Short Research Group website and look at the database: Collection Resources for Aquatic Coleoptera (CreAC). Derek discussed the legacy data generation issue and urged everyone to take care not to make the problem worse. He suggested we all adopt the “database before you label” approach. This method, used at the University of Alaska Museum Insect Collection, is similar to those established by Costa Rica's INBio in the 1990s.

From Derek: Digitization of millions of historic entomology specimens remains an enormous challenge. Our community should not make this challenge worse by generating newly collected, undigitized specimens. Entomologists in North America currently generate many tens of thousands of new specimens annually, that get added to our undigitized backlog.

Neil Cobb clearly illustrated the importance and scope of Derek’s insight above. Neil showed us data indicating that we “must go [digitize] 5x faster” if we are to catch up and keep up with the pace of collecting and meet the goal of “digitizing all [insect] specimens by 2050.” To succeed, Neil emphasized, we will need enhancements in collaboration, technology, and social networking. Nicole Fisher (ANIC) and Vladimir Blagoderov (NHM-UK) provided more hard data about the realities of collections digitization. Vlad’s talk focused on scaling up insect digitization, sharing the industrial processes now in use at the Natural History Museum in London and building on the knowledge gathered from the Digital Collections Programme at NHM-UK. From Vlad:

Success is impossible without an organised approach to project management, staff buy-in and administrative support on all levels. Key elements of industrial digitisation are: detailed yet flexible workflows which can accommodate different kinds of digitised material; automation through software and hardware; appropriate staff management; and community involvement.

Nicole not only shared the digitization methods currently employed at ANIC, but also highlighted the importance of digitization’s role in mobilizing key data, such as pollinator data, that benefit international biodiversity efforts. Her talk emphasized the need to show how digitizing this data “boosts the impact of collections for global research and society through improved access.” Nicole highlighted that:

both ANIC and ALA have called for and put a policy in place to end legacy data, and that all future data coming into the museums must be born digital.

Andrei Sourakov shared his vision for the importance of DNA barcoding as part of our digitization efforts. He stressed DNA barcoding as the “best taxonomic tool developed in the last 100 years for resolving current taxonomic conundrums, for revealing cryptic species and for describing biodiversity.”

Tracy Teal (Data Carpentry) and Amber Budden (DataONE) presented talks on data skills. Tracy focused on the challenges of using ever larger datasets and the need for data skills to effectively integrate collections data in entomological research.

From Tracy’s talk: research shows “scientists want skills more than money!” (from EMBL Australia Bioinformatics Resource – Community Survey 2013).

What about managing data across all stages of the data life cycle? DataONE’s Amber Budden took the long view and presented information on how to think about and plan for taking the very best care of data to give it the longest life possible, and on how to enhance the potential for its re-use. Norman Johnson was not able to present his talk; we hope he can do so in the future. Norm would have shared his insights on current issues with digitizing and using collections data for taxonomic / systematic research, including a discussion of the costs. As for skills needs, Norm shares:

“the introduction and adoption of informatics technologies has altered the needed skill set for the next generation of taxonomists, but this is not yet reflected in formal training programs.”

What about caring for the digital assets we are creating? Listen to Lawrence Gall’s (YPM) talk for insights on different ways to manage and archive digital assets and why it’s important.

Kari Harris (Arkansas State U) inspired us all, with the audience clapping spontaneously three times! Listen as Kari describes her story and her vision for revitalizing collections, with students leading the way.

We are all looking forward to ESA 2017 in Denver, CO. Those who organized symposia at ICE 2016 would like to combine efforts next year, including Nico Franz, Katja Seltmann, Miles Zhang, Ana Dal Molin, Barbara Sharanowski, Deborah Paul, Nicole Fisher, Pam Soltis, and Paul Flemons.

Many thanks to Nicole Fisher, Pam Soltis, and Paul Flemons for their work to pull together this symposium, and to Kevin Love for recording the talks. And kudos to Nicole, who, it seems, can read minds. Muchas gracias, Nicole!

All presentations were recorded! To listen, please go to the Vimeo Data Without Borders ICE 2016 recordings.

*ANIC is part of the National Research Collections Australia (NRCA), in the Commonwealth Scientific and Industrial Research Organisation (CSIRO).

**Australian Museum Research Institute, Australian Museum, 1 William Street, Sydney NSW 2010, Australia