Workshop Report: Georeferencing for Paleo

The Paleo Digitization Working Group held a virtual workshop, Georeferencing for Paleo: Refreshing the approach to fossil localities, on April 28-29, 2020, with 52 participants based at 29 institutions. These participants represented a mix of institutions (museums, university collections, and agencies) as well as careers (collections professionals, researchers, informatics professionals) and experience with digitization under ADBC funding (40% of participants had never been involved with a TCN). We also had a range of georeferencing experience, with 30% of participants representing collections that began georeferencing prior to 2010, and another 30% representing collections that have yet to begin georeferencing.

Screenshot of our virtual workshop participants, including work-from-home companions.

The original impetus for the workshop was the recognition that a significant amount of digitization work remains to be done and that georeferencing is one of the next big roadblocks, both within the United States and for our colleagues around the world. Across all collection types, there are major issues with the quality of georeference data currently available on biodiversity data aggregators such as iDigBio and GBIF. For paleo collections, there are additional issues related to applying existing georeferencing workflows in the paleontological context, as well as to sharing georeference data publicly. We identified three broad goals for a workshop related to these issues:

  1. Address the lack and poor quality of specimen georeference data shared on biodiversity aggregators, e.g. the iDigBio Portal or GBIF, by determining recommendations for the paleo collections community on best practices and workflows for generating and sharing this data.
  2. Identify technical barriers to implementing these recommendations and discuss a strategy for communicating them to standards organizations, aggregators, collection management software solutions, and georeferencing software tools.
  3. Disseminate the findings of this workshop widely, both within the paleo collections community (including to collectors) and as a resource discoverable by other domains. Findings will include a “toolkit” to share the recommendations on best practices and workflows determined by this workshop.

Two online listening sessions were hosted in January 2020 to gauge interest in specific topics for the workshop. During these sessions, we asked about the biggest challenges related to creating, managing, sharing, and using georeference data for paleo specimens. The resulting discussion was rich and wide-ranging, as documented in summaries of the shared notes and conversation from the listening sessions.

Coronavirus had a significant impact on the format of and participation in this workshop. While we had planned on hosting a three-day, in-person event at the Natural History Museum of Utah in Salt Lake City, in late March we recognized that the workshop would need to move online. The format was adjusted to provide an extended time for asynchronous participation (three weeks) followed by a compressed time for synchronous interaction (two 90-minute Zoom meetings over two days). The majority (90%) of workshop participants were able to attend both of the synchronous sessions. Four participants who were scheduled to attend the in-person workshop were unable to attend the virtual workshop due to a mix of employment furloughs and personal circumstances. However, we also gained 14 new participants by moving to a virtual venue.

During the asynchronous participation period, we requested that workshop participants share any georeferencing resources (internal or external) that they were aware of. In particular, we encouraged participants to share screenshots of their database's locality fields, as well as protocols and workflows for georeferencing in their own collections. We also solicited brief recorded presentations from participants. Twenty people contributed content during this period, and it was collated and disseminated via an online resource hub hosted in collaboration with the TDWG Earth Sciences and Paleobiology Interest Group. In a post-workshop survey, 78% of respondents said that receiving access to content (prerecorded talks, documents, etc.) prior to the synchronous sessions was a very effective or extremely effective component of this virtual workshop. In addition, having these resources available in a central place is an ongoing benefit to the paleo collections community.

The two synchronous sessions were designed to be a mix of webinar-style content delivery and subsequent discussion about the content. The workshop organizers decided to focus on baseline content that we felt everyone would benefit from knowing and that we could provide ourselves. This differed significantly from the design of the in-person workshop, where we had asked various experts who were attending as participants to also lead certain sections. Once the workshop moved to a virtual format and everyone was affected by coronavirus precautions, this ask seemed unreasonable, so we shifted the burden of content to the workshop organizers. We also set up a collaborative notes document in Google Docs to facilitate live, text-based interaction as a complement to presentations and discussion in Zoom. This document can be accessed for the full scope of the discussions described below.

Day 1 of the synchronous sessions included Holly Little presenting a high-level overview of data standards (Darwin Core, ABCD) relevant to georeferencing, as well as Erica Krimmel presenting on how these standards are currently in use for paleo specimens that are mobilized on the iDigBio portal. Discussion centered on challenges and solutions specific to using standard fields for sharing paleo georeference data, e.g. those related to regulations on masking coordinates. One of the workshop participants is based in the Bureau of Land Management and was able to offer his perspective on that agency's recent policy discussions about masking coordinates. Tracking land ownership, particularly as it changes over time, was also a major point of discussion. This issue is particularly relevant to paleo collections, where specimens collected on federal land fall under the Paleontological Resources Preservation Act (PRPA) regulations.
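As a rough illustration of what masking coordinates can look like in standard fields, the sketch below rounds the coordinates of a Darwin Core-style record before sharing it. The field names (decimalLatitude, decimalLongitude, geodeticDatum, coordinateUncertaintyInMeters, dataGeneralizations, informationWithheld) are Darwin Core terms, but the record values, the function name generalize_coordinates, the one-decimal rounding, and the way added uncertainty is estimated are all assumptions made for this example. It is not a workflow recommended by the workshop or required by any agency.

```python
import math

# Approximate metres per degree of latitude; an assumption for this rough sketch.
METERS_PER_DEGREE = 111_320


def generalize_coordinates(record, decimals=1):
    """Return a copy of a Darwin Core-style record with rounded coordinates."""
    public = dict(record)  # shallow copy so the internal record is untouched
    public["decimalLatitude"] = round(record["decimalLatitude"], decimals)
    public["decimalLongitude"] = round(record["decimalLongitude"], decimals)

    # Rounding moves a point by at most half a grid cell on each axis.
    # Treat the half-cell diagonal as extra uncertainty (conservative for longitude).
    half_cell_m = 0.5 * 10 ** -decimals * METERS_PER_DEGREE
    added_m = half_cell_m * math.sqrt(2)
    original_m = record.get("coordinateUncertaintyInMeters", 0)
    public["coordinateUncertaintyInMeters"] = math.ceil(original_m + added_m)

    # Document what was done so data users know the coordinates are generalized.
    public["dataGeneralizations"] = (
        f"Coordinates rounded to {decimals} decimal place(s)."
    )
    public["informationWithheld"] = "Precise coordinates available on request."
    return public


# Invented locality values, for illustration only.
internal = {
    "decimalLatitude": 38.5734,
    "decimalLongitude": -109.5498,
    "geodeticDatum": "EPSG:4326",
    "coordinateUncertaintyInMeters": 30,
}

public = generalize_coordinates(internal)
print(public)
```

Keeping the precise record internally and publishing only the generalized copy, with dataGeneralizations and informationWithheld filled in, is one way to tell data users that the shared point is deliberately imprecise rather than poorly georeferenced.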

Slide from presentation on Day 2, highlighting the opportunity we have for getting together as a paleo collections community and deciding on guidelines for how to use data standards.

Day 2 began with another presentation by Holly and Erica, this time looking at how community guidelines can help people implement data standards. Discussion centered on ways that we might be able to move forward on creating such community guidelines. Guidelines specific to paleo collections are important because the geologic context of a fossil is essential to its research value, and information about geologic context intersects with georeferencing. In addition to needing georeferencing guidelines for ourselves as collections professionals, we also need them to communicate best practices to external partners like researchers and mitigation companies who have repository agreements with us. There was also talk about how different collection management databases handle the same information, and how understanding the ways that each other's databases work can help all of us ask for improvements. Participants recognized that many of their own problems are shared, and there was a general sense of wanting to continue working on common solutions.

Across industries and around the world, we are all figuring out how to transition more of our activities to online spaces, and this workshop was in large part an experiment. Given that, we asked respondents in a post-workshop survey to provide constructive criticism. The feedback was overwhelmingly positive, with 87% of respondents saying that the live Zoom session was very or extremely effective, and 91% agreeing that there were sufficient opportunities for questions/interaction. Providing opportunities for interaction can be challenging in a virtual space. We attempted to address this not only by setting up the collaborative notes document, but also by using the polling feature in Zoom and by posing a daily icebreaker question in the collaborative notes document. Participants responded favorably to these efforts, both live and as indicated by their survey responses. One survey respondent pointed out that for learning content, they actually preferred the virtual format. That said, many participants expressed regret that they had little to no time for peer-to-peer conversation and casual networking. Adding virtual breakout sessions to our workshop might have helped to facilitate more of that.

In answer to the statement “I felt comfortable participating in this workshop environment,” 83% of respondents said they agreed or strongly agreed. However, 17% said they neither agreed nor disagreed. We need to be vigilant about considering inclusivity and opportunities to increase the comfort level of our participants in online spaces. One survey respondent pointed out that even in a 90-minute session, scheduled breaks are essential for those trying to balance work-from-home distractions.

Although the majority (87%) of survey respondents agreed that the depth of coverage in this workshop was appropriate given the revised format, this was definitely a short-changed version of what the workshop was originally intended to be, and 91% of survey respondents expressed interest in possibly participating in a future in-person workshop. We envision such a workshop building not only on the conversations and work resulting from the January 2020 listening sessions and April 2020 virtual workshop, but also on ongoing participation via a new “Paleo Digitization Happy Hour” biweekly call and Google Group (anyone who is interested can subscribe here). The Happy Hour is a recurring time for conversation between colleagues working on digitization in paleo collections. These conversations are much more informal than this workshop was, but the workshop organizing team has suggested themes for the first five Happy Hour sessions based on topics identified as being of interest during the workshop and pre-workshop listening sessions. Join us for the next Paleo Digitization Happy Hour on June 18th, where the topic is “tools for georeferencing.”

This workshop was organized by Carrie Levitt-Bussian (Natural History Museum of Utah), Holly Little (Smithsonian National Museum of Natural History), Talia Karim (University of Colorado Boulder), Erica Krimmel (iDigBio), and Deborah Paul (iDigBio). We sincerely thank the Natural History Museum of Utah for being our host prior to the workshop moving to an entirely virtual format. Stay tuned for a potential in-person extension of this workshop in 2021!