Join iDigBio and colleagues from GBIF, Université de Montréal, New York Botanical Garden, and NHM London in a colloquium at the Botany 2021 conference highlighting the impact of, and potential for, community data curation on herbaria.
Data aggregation over the last 20 years has made an impressive amount of collection data available for public use, but this represents only a small percentage of the specimens currently in museums. While digitization of these collections must remain a focus, there is a growing realization that the quality of data already digitized can be improved. Demand for high-quality data is clear: over 5,000 peer-reviewed publications have used data mediated by the Global Biodiversity Information Facility (GBIF) in the past 10 years, and use increases every year. However, some aggregated data lack the level of metadata and precision required for rigorous scientific use. The most common ways for researchers to give data providers feedback about potential data errors are directly by email or through data aggregator helpdesk mechanisms such as GitHub, which are indirect and slow. Aggregated herbarium data are used for many research purposes, and after download researchers undertake several rounds of cleaning to improve fitness for purpose. Unfortunately, there is no clear way to roundtrip these cleaned data back to the data provider, so the added value of the work is lost and the work is doomed to be repeated. Several recent national and international reports and strategies have emphasized the need to better utilize collection data to provide science-based strategies for protecting biodiversity from climate change and other human impacts. Improving the quality of the data extracted from collections is the quickest, highest-impact first step in enhancing collection data use. Currently, data are held in thousands of independent collection management systems and aggregated at multiple geographic and taxonomic levels. This system is not well integrated, and it precludes simple data annotations by workers outside the immediate data-holding institution.
This colloquium will describe the work of several initiatives to build expert, community-curated annotation projects and systems that improve data quality. These include plant taxonomists and ecologists curating expert taxonomies and occurrence distributions for a particular clade of interest, and initiatives by data aggregators to improve data quality as it enters and exits their domains. The colloquium will also outline progress on recent work toward an integrated global annotation system built around persistent unique identifiers in the extended/digital specimen framework, as well as current data integrations performed by iDigBio and GBIF. A goal of the colloquium is to inform botanists of this progress and encourage participation in the process.
This colloquium is organized by Joe Miller (Global Biodiversity Information Facility), Gil Nelson (University of Florida, Florida Museum of Natural History), Erica Krimmel (Florida State University, iDigBio), Anne Bruneau (Université de Montréal), Barbara Thiers (New York Botanical Garden), and Sandy Knapp (NHM London).
Talks included in the colloquium are:
- Identifying and clustering duplicate vouchers across herbarium collections using GBIF tools (Miller, Joe; Nicolson, Nicky; Robertson, Tim)
- Data access and enhancement: thoughts from a specialist (Rabeler, Richard)
- Integrating human expertise and automated processes to enhance herbarium specimen data at the aggregate level (Krimmel, Erica; Rejack, Nicholas; Mast, Austin)
- The Legume Data Portal, a community effort to facilitate sharing and collaboration in Leguminosae (Bruneau, Anne; Sinou, Carole; Miller, Joe; Le Roux, Margaretha Marianne; Hughes, Colin E; Borges, Leonardo; Javadi, Firouzeh; de la Estrella, Manuel; Høfft, Morten; Raymond, Mélianie; Robertson, Tim)
- A collective effort to update the legume checklist (Le Roux, Margaretha Marianne; Govaerts, Rafael; Miller, Joe; Bruneau, Anne; Lewis, Gwilym; Sinou, Carole)
- Herbaria for understanding diversity and distribution - does size matter? (Delves, Jay; Knapp, Sandra; Moonlight, Peter)
- Extending U.S. Biodiversity Collections to Address National Challenges (Monfils, Anna; Thiers, Barbara M.)