Data Sharing Data Standards and Demystifying the IPT

Quick Links for Data Sharing Data Standards and Demystifying the IPT
Data Sharing Data Standards and Demystifying the IPT Agenda
Data Sharing Data Standards and Demystifying the IPT Biblio Entries
Data Sharing Data Standards and Demystifying the IPT Report

This wiki supports the Data Sharing, Data Standards and Demystifying the IPT Workshop, held simultaneously at iDigBio at the University of Florida and at the Canadian Biodiversity Information Facility (CBIF) on 13-14 January 2015. It is the second in a series of biodiversity informatics workshops planned for the 2014-2015 fiscal year. The first was Data Carpentry; the next is Field to Database (9-12 March 2015).

General Information

Description and Overview of Workshop. Are you a taxonomist collecting biological specimens in your research and vouchering them in collections? How does your specimen data get published? How does it get into collections databases? Are you a collection manager or data manager who would like to use the GBIF Integrated Publishing Toolkit v.2 (IPT) to publish your collection's datasets?

This workshop is for you if you:

  • have a dataset and need/want to get it into a standard format for sharing
  • manage data for a museum collection and would like to learn how to use the GBIF IPT
  • want to understand more about Darwin Core and Data Sharing Standards
  • would like to understand just what is meant by "Darwin Core Archive file (DwC-A)"
  • want to learn how to create or update Darwin Core Archive files using the IPT
  • would like to understand just a bit more about where data goes and how it gets there once it leaves your collection
  • are a taxonomist with an occurrence dataset who would like to publish your dataset as a DwC-A with your related taxonomic publication
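
For readers wondering what a Darwin Core Archive looks like in practice: it is a zip file that bundles one or more delimited data files with a meta.xml descriptor that maps each column to a Darwin Core term, usually alongside an EML metadata document. The sketch below is only an illustration, not output from the IPT. It builds a tiny archive in memory and reads the core file back using the column mapping in meta.xml. The file name occurrence.txt, the sample records, and the helper names build_example_archive and read_core are assumptions made up for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace of the Darwin Core text guidelines, used by meta.xml.
DWC_TEXT_NS = "http://rs.tdwg.org/dwc/text/"

# A minimal, hand-written descriptor (illustrative only).
META_XML = r"""<?xml version="1.0" encoding="UTF-8"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/" metadata="eml.xml">
  <core encoding="UTF-8" fieldsTerminatedBy="\t" linesTerminatedBy="\n"
        ignoreHeaderLines="1" rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/eventDate"/>
  </core>
</archive>
"""


def build_example_archive():
    """Return the bytes of a tiny, made-up Darwin Core Archive."""
    occurrence = (
        "occurrenceID\tscientificName\teventDate\n"
        "urn:example:1\tQuercus alba\t2014-05-01\n"
        "urn:example:2\tPinus palustris\t2014-06-15\n"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("meta.xml", META_XML)
        zf.writestr("eml.xml", "<eml/>")  # placeholder, not valid EML
        zf.writestr("occurrence.txt", occurrence)
    return buf.getvalue()


def read_core(archive_bytes):
    """Read the core data file, naming columns from the meta.xml mapping."""
    ns = {"dwc": DWC_TEXT_NS}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        core = ET.fromstring(zf.read("meta.xml")).find("dwc:core", ns)
        location = core.find("dwc:files/dwc:location", ns).text
        raw = core.get("fieldsTerminatedBy")
        delimiter = "\t" if raw == r"\t" else raw  # meta.xml escapes the tab
        skip = int(core.get("ignoreHeaderLines", "0"))
        # Column index -> short term name (last segment of the term URI).
        columns = {int(core.find("dwc:id", ns).get("index")): "id"}
        for field in core.findall("dwc:field", ns):
            columns[int(field.get("index"))] = field.get("term").rsplit("/", 1)[-1]
        text = zf.read(location).decode(core.get("encoding", "UTF-8"))
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))[skip:]
    return [{name: row[i] for i, name in columns.items()} for row in rows]


if __name__ == "__main__":
    for record in read_core(build_example_archive()):
        print(record)
```

Note that the column names come from meta.xml rather than from the header row of the data file, which is why the descriptor matters when sharing data in this format.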

We'll focus on the concepts, skills, and tools needed to share biodiversity occurrence data and related data such as genomics and media. Datasets will be taken from organismal and evolutionary biology, biodiversity science, ecology, and environmental science. The workshop format includes lectures and hands-on work, so participants are required to bring their own laptops. We will provide information and instructions on software installation. A pre-workshop online meeting is required for participants wishing to install the IPT software on their own laptops.

Note that this workshop does not focus on installing and setting up an IPT instance, but rather on using the installed software for data sharing and data publishing. A pre-workshop webinar on 7 January 2015, from 12 noon to 2 PM EST, will cover an overview of the considerations and steps for IPT installation and setup.

To Do For You: We encourage people attending this webinar to submit questions before the event so we can prepare answers ahead of time. Note: this is not an installation demonstration. We do plan to make an installation video available that shows one use case: a sample (typical) installation process.

Updates will be posted to this website as they become available.

Pre-Workshop WEBINAR on IPT Install / Set up

Those interested in knowing more about the installation and configuration of the IPT will have an opportunity to be introduced to the topic during this 2-hour webinar. The event is open to anyone interested, so feel free to share the information with others. The official announcement and webinar link are available here.

The webinar will be held on January 7, 2015, from 12 noon to 2 PM EST, through the iDigBio Adobe Connect platform.

Webinar Resources

GBIF IPT Install and Set up Webinar

Planning Team

Collaboratively brought to you by (in alphabetical order): Reed Beaman (NSF), Cathy Bester (iDigBio), Kyle Braak (GBIF), Matt Collins (ACIS - iDigBio), Shari Ellis (iDigBio), Alberto González-Talaván (GBIF), Chris Lewis (CBIF), Anissa Lybaert (CBIF), Kevin Love (iDigBio), James Macklin (CBIF), Derek Masaki (USGS - BISON), Andrea Matsunaga (ACIS - iDigBio), Joanna McCaffrey (iDigBio), Deborah Paul (FSU - iDigBio), Bénédicte Rivière (Canadensys), Laura Russell (VertNet), David Shorthouse (Canadensys), Dan Stoner (ACIS - iDigBio), Alex Thompson (ACIS - iDigBio)


Instructors (iDigBio): Laura Russell (VertNet), Derek Masaki (USGS), Alberto González-Talaván (GBIF), Deborah Paul (FSU - iDigBio)

Instructors (CBIF/Canadensys): David Shorthouse (Canadensys), James Macklin (CBIF)

Assistants (iDigBio): Andrea Matsunaga (ACIS - iDigBio), Joanna McCaffrey (iDigBio), Dan Stoner (ACIS - iDigBio), Matt Collins (iDigBio)

Assistants (CBIF/Canadensys): Christian Gendreau (Canadensys), Heather Cole (CBIF), Chris Lewis (CBIF), Anissa Lybaert (CBIF), Joel Sachs (CBIF), Allan Jones (CBIF), Glen Newton (CBIF), Satpal Bilkhu (CBIF)

Who: This workshop is for you if you manage data for biological specimen collections, regardless of title (for example, Data Manager or Collection Manager), have a dataset, and want to learn about data sharing and standards and how to share data using the DwC-A format that the IPT produces. If you are a taxonomist and would like to publish your specimen data in DwC-A format with your taxonomic publications, you're also welcome to apply.

Skill Level: While we don't expect experts, we do expect computer skills commensurate with a Data Manager / Collection Manager.

Where: iDigBio in Gainesville, FL, and CBIF in Ottawa, ON, Canada (via teleconference using Adobe Connect)

Requirements: Participants must bring a laptop.

Contact (iDigBio Participants): Please email Deb Paul with questions or for information not covered here.

Contact (CBIF, Ottawa Participants): Please email James Macklin


Tuition for the course is free, but there is an application process and spots are limited. Currently, the workshop is full.

Software Installation Details

A laptop and a web browser are required for participants. Individual IPT instances will be installed on networked servers and made available to all participants for the duration of the workshop. These installations will not be publicly accessible and will not persist after the workshop ends. We use Adobe Connect extensively in this workshop. Please perform the systems test using the link below. You will also need to install the Adobe Connect Add-In to participate in the workshop.

  • Adobe Connect Systems Test
  • Note: when you follow the link to install and perform the test, some software will install, but it may not look like anything happened. To check, simply re-run the test.

If enterprising participants would like to install the IPT on a laptop prior to the workshop, a script is available to help streamline the necessary steps in a virtual machine. It requires a pre-existing install of, and experience with, VirtualBox (or libvirt) and Git.

The pre-workshop webinar focuses on installation and setup of the IPT on a server.


Schedule - subject to change.

Course Overview - Day 1 - Tuesday January 13th
8:30-8:45 Check-in, name tags, log in, connect to wireless and Adobe Connect, Workshop Participant List. All (both locations)
8:45-9:00 Local logistics, etiquette for questions Deb Paul, iDigBio (Gainesville); James Macklin, CBIF (Ottawa)
9:00-9:20 1A: Introductions, Workshop Overview, Adobe Connect, Goals of the Workshop Deb Paul, David Shorthouse, James Macklin
9:20-10:15 1B: Theory: publishing basic primary biodiversity data: IPT and other methods.
  • Benefits: data papers, Nature data descriptors
  • Standards: Darwin Core, Darwin Core Archive, TDWG and ratification process
  • Workflow: GBIF registry, harvesting, presentation
Alberto González-Talaván (Gainesville); James Macklin (Ottawa); Laura Russell (Gainesville)
10:15-10:45 Break (during this break, please verify you can log in to your IPT instance). All
10:45-11:40 1C: Theory: What are the metadata?
  • Ecological Metadata Language (EML)
  • Licensing, norms
  • Identifiers

-Demo: instructor creates a resource in IPT and fills out metadata while everyone follows along
-Exercise: Participants create resource and metadata

David Shorthouse (Derek Masaki, Deb Paul)
11:40-12:00 1D: Theory: publishing with the IPT
  • data sources
  • data quality
  • character encoding
  • mapping (discuss the different cores)
Laura Russell (David Shorthouse)
12:00-1:00 Lunch (provided)
1:00-1:30 1D cont'd:

-Demo: adding a data source and mapping data
-Exercise: data sources and mapping

  • Adding data sources
  • Mapping terms to DwC in IPT
Laura Russell (David Shorthouse)
1:30-2:30 1E: Theory: Complex primary biodiversity data
  • What are extensions to DwC?
  • Audubon Core & multimedia
  • Determination histories
  • Genomics extension (GGBN)

-Demo: multimedia (Audubon Core) extension

Deb Paul, Derek Masaki, Laura Russell (David Shorthouse)
2:30-3:00 Break
3:00-4:30 1E cont'd:

-Demo: determination histories extension
-Exercise: complex primary biodiversity data

  • adding data sources
  • determination histories
  • Audubon Core & multimedia
Laura Russell, Deb Paul (David Shorthouse)
4:30-5:00 1F: wrap-up and review for tomorrow
  • need to have pairs set up for tomorrow
6:00-8:30 Night at the Museum reception (Gainesville only): meet in Holiday Inn Hotel Lobby for bus pick up at 6 PM
Course Overview - Day 2 - Wednesday January 14th
8:30-9:00 Check-in, log in, pairing, connect to wireless...
9:00-10:15 2A: Open Practical session (Gainesville participants choose format, ex. work in pairs: admin + data manager)
  • participant data
  • break-outs as required
10:15-10:45 Break
10:45-12:00 2B: Open Practical session
  • publishing a dataset & producing a DwC-A file
  • participant data
  • break-outs as required
12:00-1:00 Lunch (provided)
1:00-2:30 2C: Administration functions and user management in the IPT (roles, permissions), registering data sets with GBIF (production mode instance)
  • time-to-live for data publication
  • feedback with regard to data quality
  • expectations from aggregator
  • expectations from providers / collectors
  • points of view: Canadensys, VertNet, iDigBio, GBIF, USGS BISON
  • lightning talk from participants*
Laura Russell (David Shorthouse)
2:30-3:00 Break
3:00-4:00 2D: Collaboration and the way forward:
  • Where to find IPT help: mailing lists, interest group, (web) resources, personalized support through Participant nodes.
  • What are the current limitations of the IPT and DwC-A + Future of the IPT
  • Upgrading the IPT when new versions are released
Alberto González-Talaván (David Shorthouse)
4:00-4:30 2E: Summary of the webinar and workshop, evaluation and feedback, next steps (participants present at their own institutions - and report back/share presentation), wrap-up.

Link to Workshop Report


Adobe Connect Access

Adobe Connect will be used to give participants at the Canadian Biodiversity Information Facility (CBIF) in Ottawa, ON, Canada access to instruction from iDigBio in Gainesville. Some instruction may come from CBIF to Gainesville. Workshop participants in both locations will be required to log in to the Adobe Connect room so that they can communicate with each other. Already registered and accepted?

  • Link to Adobe Room for Registered Participants

Workshop Documents, Presentations, and Links

  1. Rick Levy, Database Associate, Denver Botanic Garden on "occurrenceID" and "recordID"
  2. Shelley James, Botanist/Collections Manager PCMB, and Holly Bolick, Collections Manager - Invertebrate Zoology, Bishop Museum
  3. Nadia Cavallin, Herbarium Curator, Royal Botanical Gardens, Burlington
  4. Jennifer Wilkinson, Assistant Herbarium Curator Mycology, AAFC, Ottawa

Pre-Workshop Reading List

Links beneficial for review

Workshop Recordings

Day 1

Day 2

  • 8:30am-10:15am (not recorded, open discussion, see Google Doc for topics discussed)
  • 10:45am-12:00pm (not recorded, open discussion, see Google Doc for topics discussed)
  • 1:00pm-2:30pm
  • 3:00pm-5:00pm

Resources and Links

Digitization Training Workshops Wiki Home