by Deborah Paul for Nelson E. Rios
If you need to georeference hundreds of thousands of localities for millions of specimens, how will you get it done? On May 1st, 2014, 88 people logged in to find out how to use CoGe (Collaborative Georeferencing), a suite of tools developed at the Tulane Biodiversity Institute.
How do you manage a large georeferencing staff, track their progress, and visualize their data points to check data quality? How do you get these georeferences back into your database? These questions and more were the focus of the iDigBio Webinar: Collaborative Georeferencing Demo on Thursday, May 1st, 2014. GEOLocate software developer Nelson Rios guided us through the use of a suite of tools designed to manage a large georeferencing project. People logged in (and some stayed up very late) to join us from as far away as New Zealand! Using data from the FISHNet2 project, Nelson gave us a walk-through of what’s possible with current software and web services. The FISHNet2 project exceeded its goals, georeferencing over 900,000 localities for millions of specimens.
Georeferencing can raise unique issues for localities from different organismal groups. In an informal poll, we asked our webinar attendees what types of specimen locality data they are working on georeferencing. (See screenshot at right.)
With the CoGe website you can create a community and then:
- upload your data for that community to georeference
- portion out different sections of the data to each community member
- easily visualize community-wide and individual progress
- provide a simple way for users to join the group.
Some useful CoGe suite features:
- a tool to map your locality data csv file to standard fields
- the upload process detects potential duplicates before you begin your project, possibly saving you hours up front.
- the security model provides three distinct roles you assign: user, admin, and reviewer. Users georeference and can see their own progress. Admins portion out data for users to georeference and track users' work. Reviewers have a blanket view of the entire community.
- visualize your georeferences in the History and Review tabs to see your prior work. You can move a record back to the Workbench if needed, to re-georeference.
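To make the field-mapping and duplicate-detection features above more concrete, here is a minimal Python sketch of the general idea. It is not CoGe's actual implementation: the column headers, the `FIELD_MAP` dictionary, the helper names, and the sample rows are all made up for illustration, and the standard field names are borrowed from Darwin Core as an assumption.

```python
import csv
import io
from collections import defaultdict

# Hypothetical mapping from a collection's own column headers to standard field names.
FIELD_MAP = {"Loc": "locality", "Ctry": "country", "State": "stateProvince", "Cnty": "county"}

def map_fields(rows, field_map=FIELD_MAP):
    """Rename each row's columns to the standard names; unmapped columns are kept as-is."""
    return [{field_map.get(k, k): v for k, v in row.items()} for row in rows]

def normalize(text):
    """Lowercase and collapse whitespace so trivially different strings compare equal."""
    return " ".join(text.lower().split())

def find_duplicates(rows, keys=("country", "stateProvince", "county", "locality")):
    """Group row indices by normalized locality key; return groups with more than one row."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(normalize(row.get(k, "")) for k in keys)].append(i)
    return [idxs for idxs in groups.values() if len(idxs) > 1]

# Made-up sample data, for illustration only.
SAMPLE = """Loc,Ctry,State,Cnty
Pearl River at Hwy 26,USA,Louisiana,Washington
pearl river  at hwy 26,USA,Louisiana,Washington
Bogue Chitto River,USA,Louisiana,Washington
"""

rows = map_fields(list(csv.DictReader(io.StringIO(SAMPLE))))
duplicates = find_duplicates(rows)
print(duplicates)  # row indices that appear to describe the same locality
```

Here the first two rows differ only in capitalization and spacing, so they end up in the same group and would need to be georeferenced only once.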
For more details, check out the recording and the [discussion] that followed. Software developers and representatives from Symbiota (Ed Gilbert), Specify (Andrew Bentley), Scio Qualis (Robin Schroeder), and Silver Biology (Michael Giddens) were present to answer questions about the georeferencing features and options in their software. Nelson Rios and all the software folks look forward to your input. The iDigBio Georeferencing Working Group (GWG) is ready and waiting for your questions. If you need georeferencers or trainers, check out the GWG TTT 1 and TTT 2 graduates.
* Use CoGe to track technicians' progress. Look for those who, after mastering the software, are fast (and accurate too, of course).
* The georeferencing technician's familiarity with the geographic region of localities to be georeferenced is key to accuracy and speed.
* Reasonably sized datasets (not millions of records) can be exported as .kml files for visualization with Google Earth at any time. To visualize millions of data points you'll need GIS software.
* Yes! You can use GEOLocate to visualize data you have already georeferenced.
- Your points will show up as the default “best” points.
- GEOLocate will re-georeference and you will see those points too.
* GEOLocate assumes the WGS84 datum.
* Township-Range-Section (TRS) data is recognized in GEOLocate, but it must be in the locality string.
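For readers curious what a .kml file of georeferenced points looks like, the following is a rough Python sketch that builds one from a handful of coordinates. It does not show how CoGe or GEOLocate produce their exports; the function name `points_to_kml` and the sample point are hypothetical, and the coordinates are assumed to be WGS84 decimal degrees.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def points_to_kml(points):
    """Build a KML string from (name, latitude, longitude) tuples in decimal degrees."""
    ET.register_namespace("", KML_NS)  # serialize with KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lat, lon in points:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML writes coordinates as longitude,latitude (not latitude,longitude).
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")

# Made-up sample point, for illustration only.
kml_text = points_to_kml([("Sample locality", 30.5, -89.5)])
print(kml_text)
```

Saving the returned string to a file with a .kml extension should give something Google Earth can open. As the list above notes, this approach suits modest datasets rather than millions of points.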
Highlights and on-going issues.
Specify 7, a future web-based version, may support more GIS layers. Specify 6 stores one georeference for each locality record. Symbiota can store multiple georeferences for a locality. Scio Qualis also provides the ability to store a georeferencing history. Symbiota and Specify are working on plans to automate the export > georeference > re-import process soon. And that's a good thing because of...
Data (re)integration. Often, georeferencing is still done on legacy data after data entry for all records is done. And the georeferencing may be done outside the specimen database. Getting these georeferences back into the original provider database can be challenging. This might mean the georeferences are available in an aggregator's database, but not in the original provider's database. Developers are aware of the difficulties in re-integrating georeferences. Symbiota and Specify are both working on means to make this simpler, more automated, and error-free. If you have this problem, contact us in the GWG to see if we can help you sort out how best to move forward.
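As a rough illustration of the re-import step, the Python sketch below copies georeferences back onto provider records that share an identifier and reports any that cannot be matched. This is not how Symbiota or Specify do it. The `reintegrate` function, the `id` key, and the sample records are hypothetical, the coordinate field names are borrowed from Darwin Core as an assumption, and the WGS84 default simply follows the GEOLocate note above.

```python
def reintegrate(provider_records, georeferences, key="id"):
    """Copy coordinates onto provider records that share a key; return unmatched keys."""
    by_key = {rec[key]: rec for rec in provider_records}
    unmatched = []
    for geo in georeferences:
        target = by_key.get(geo[key])
        if target is None:
            # No provider record with this identifier: flag it for manual follow-up.
            unmatched.append(geo[key])
            continue
        target["decimalLatitude"] = geo["decimalLatitude"]
        target["decimalLongitude"] = geo["decimalLongitude"]
        # Assumption: fall back to WGS84 when the georeference carries no datum.
        target["geodeticDatum"] = geo.get("geodeticDatum", "WGS84")
    return unmatched

# Made-up records, for illustration only.
provider = [
    {"id": "A1", "locality": "Pearl River at Hwy 26"},
    {"id": "A2", "locality": "Bogue Chitto River"},
]
georefs = [
    {"id": "A1", "decimalLatitude": 30.5, "decimalLongitude": -89.5},
    {"id": "Z9", "decimalLatitude": 31.0, "decimalLongitude": -90.0},
]

unmatched = reintegrate(provider, georefs)
print(provider[0])
print(unmatched)
```

The key design point is a stable identifier that survives the export > georeference > re-import round trip. Without one, matching has to fall back on locality strings, which is far more error-prone.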
Stay tuned for more GWG Webinars – suggest a topic, send us an email, or click the feedback button!