Specify Paleo Collections Workshop - training, new technology, and data models

guest blog by Talia Karim (University of Colorado) and Una Farrell (University of Kansas)


Last September at the Paleontology Digitization workshop, held in New Haven, CT, several attendees indicated that they would be interested in a paleo-specific Specify workshop. Many paleontology collections across the world are now using Specify, and with a proposal on the table to change the way stratigraphy is handled within the Specify data model, the time seemed right. The iDigBio team put things in motion, and the first paleontology-specific Specify workshop, led by Andy Bentley, was held at the University of Kansas last month (May 20th-23rd, 2014).

Attendance at the workshop was great, with 17 onsite and 23 remote participants representing a variety of skill levels, from seasoned Specify users to total novices. Remote participants came from all over the world, including representatives from Germany and Colombia. Yousef Alfahd al Moathi from the Natural History Museum, Sultanate of Oman, wins the prize for most distant remote participant! We were all impressed that participants from Oman were still awake and asking really great questions during the afternoon sessions!

Day 1 started out with an introduction to iDigBio by Deb Paul and a brief overview of the PaleoNiches TCN Project by PI Bruce Lieberman. Then we dove straight into Specify, starting at the very beginning with the Specify Installation Wizard. Andy Bentley provided a great walkthrough of all the menus, types of information required for each one, and where people might need to get help from their server administrator. 

Next we launched Specify on our laptops using a virtual machine that we had downloaded before we arrived. The idea of using Specify in this way, especially in workshop settings, was discussed at the iDigBio summit last November by Renato Figueiredo (iDigBio). His team at the Advanced Computing and Information Systems Laboratory (ACIS) created a virtual appliance that contained MySQL, Specify, a preloaded dataset, and some other pieces of software that made the whole package run on any machine. The virtual appliance worked great, except for one glitch with Windows 8, and saved the group a lot of time that would otherwise have gone into troubleshooting why MySQL wouldn’t install properly on everyone’s laptops, which had been an issue at previous workshops.

Want to try out the Specify appliance? Find out more at Collaborating to Make it Easier to Start Digitizing!

We spent the rest of Day 1 and most of Day 2 going through all of the menus and forms in Specify, what they do, and how to use them effectively. Even those of us who have been using Specify for years learned a lot of new tricks and ways to make data entry and searching more efficient. We ended both days with some time to play around with the test dataset and put questions directly to Theresa Miller (Specify) and Andy. This was very helpful, especially for some of the newer users who came to the workshop with a list of “how do I do this?” or “can I do this in Specify 6?” type questions!

Day 3 was spent discussing more advanced topics, such as form view customization, using iReports to make labels and loan forms, and how the container and relationship features work. We finished Day 3 by calling in to the NSF EarthCube C4P workshop in Washington DC to talk about proposed changes to stratigraphy in the Specify data model. The topic has been under discussion for a number of years between the Specify team and paleontologists from various institutions and working groups. Discussion about the Specify 6 data model can be found in the documentation section of the Specify website at “Specify 6's Approach to Stratigraphy”.

The issue at hand is whether the “Paleo Context” (PC), i.e. geological data such as geologic age and lithostratigraphy, should be associated with the collection object, the collecting event, or the locality. It is generally agreed that the current model, where PC is attached to the collection object, is not the most efficient, but there is considerable debate over the ideal solution. We heard from several people at C4P, where many considered that PC belonged, to some extent, with locality. At the Specify workshop, we were leaning towards attaching PC to collecting event as the most flexible and efficient method. Both groups prepared a summary document, which was forwarded to the Specify team. Subsequently, on June 10th, Jim Beach presented a webinar to the iDigBio Paleo Digitization group, where he discussed the perspective of the software developers and summarized some of the proposals on the table (you can find a recording, and some associated documents, here: https://www.idigbio.org/wiki/index.php/Paleo_Digitization_Working_Group). As it stands, no firm decision has been made, but in the end there may be an option for each institution to adopt the data model of their choice.
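To make the three options a little more concrete, here is a minimal Python sketch of where a Paleo Context record could hang. It is only an illustration of the debate described above, not Specify's actual schema: the class names, field names, the fallback order in `resolve_paleo_context`, and the sample values are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PaleoContext:
    # Hypothetical fields standing in for "geologic age and lithostratigraphy".
    geologic_age: str
    lithostratigraphy: str


@dataclass
class Locality:
    name: str
    paleo_context: Optional[PaleoContext] = None  # option: PC on locality


@dataclass
class CollectingEvent:
    locality: Locality
    date: str
    paleo_context: Optional[PaleoContext] = None  # option: PC on collecting event


@dataclass
class CollectionObject:
    catalog_number: str
    collecting_event: CollectingEvent
    paleo_context: Optional[PaleoContext] = None  # current model: PC on the object


def resolve_paleo_context(obj: CollectionObject) -> Optional[PaleoContext]:
    """Return the most specific PC available: object, then event, then locality.

    The fallback order is an assumption for illustration only.
    """
    event = obj.collecting_event
    for holder in (obj, event, event.locality):
        if holder.paleo_context is not None:
            return holder.paleo_context
    return None


# Made-up sample data: one PC record entered once on the collecting event
# and shared by every specimen collected during that event.
shared_context = PaleoContext(geologic_age="Ordovician",
                              lithostratigraphy="Example Formation")
quarry = Locality(name="Example Quarry")
trip = CollectingEvent(locality=quarry, date="2014-05-20",
                       paleo_context=shared_context)
specimen_a = CollectionObject("EX-0001", trip)
specimen_b = CollectionObject("EX-0002", trip)

print(resolve_paleo_context(specimen_a) is resolve_paleo_context(specimen_b))
```

In this toy version, the stratigraphic data is typed in once for the collecting event and both specimens pick it up, which is the efficiency argument for moving PC off the collection object. Attaching it to the locality instead would share it even more widely, at the cost of flexibility when one locality spans several horizons.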

Day 4 of the workshop concluded with explanations of how to export data for the Specify web portal, iDigBio, and GBIF. We had some interesting discussions of how to merge data exports if you manage more than one collection within Specify but would like all of the collections to be searchable via a single web portal. Laura Russell (VertNet) and Ben Anhalt (Specify) also came over to discuss data publishing and the new Specify 7 Thin Client with our group. Lots of exciting developments were presented for sharing data and using Specify via a web browser. The workshop finished up with a visit from Jim Beach, with whom we were able to talk about the proposed data model change, some other new Specify developments, and a new Specify iPad app that will hopefully be released in the fall.

We all came away with a lot to think about, plenty of practical ideas to go home and put into action, and a better understanding of how we can share and talk about data in a standardized way across our collections. Many thanks to Andy Bentley and iDigBio for another great workshop!