Georeferencing for Research Use: Difference between revisions

 
(35 intermediate revisions by 4 users not shown)
Line 1: Line 1:
= Post Workshop Publication =
Organizers and participants co-wrote a summary of the lessons learned and key observations from this workshop and published the results as:
*Seltmann K, Lafia S, Paul D, James S, Bloom D, Rios N, Ellis S, Farrell U, Utrup J, Yost M, Davis E, Emery R, Motz G, Kimmig J, Shirey V, Sandall E, Park D, Tyrrell C, Thackurdeen R, Collins M, O'Leary V, Prestridge H, Evelyn C, Nyberg B (2018) Georeferencing for Research Use (GRU): An integrated geospatial training paradigm for biocollections researchers and data providers. Research Ideas and Outcomes 4: e32449. https://doi.org/10.3897/rio.4.e32449


== iDigBio - CCBER GWG Georeferencing for Research Use, a short course  ==
Line 8: Line 11:
!colspan="2" style="background:#D58B28;text-align:center;font-size:9pt" | Quick Links for GWG Second Train the Trainers Workshop
|-
|[[Georeferencing_for_Research_Use#Schedule_of_Events_-_Agenda|Georeferencing for Research Use - link to agenda]]
|-
|Biblio entries<br>
|-
|[https://www.idigbio.org/content/georeferencing-and-visualizing-biodiversity-data-research Georeferencing for Research Use, short course report]
|}
|}
[[Category:Workshop]][[Category:Georeferencing]][[Category:Research]]
[[File:Capture.PNG|200px|thumb|right|hotel and NCEAS map]]
October 4 - 7, 2016 at [https://www.nceas.ucsb.edu/ NCEAS], Santa Barbara, California
Line 22: Line 25:
After the workshop, we will encourage participants to share use cases and any training materials they develop, and to offer workshops, webinars, talks, or other events that promote best practices for georeferencing legacy locality data, for capturing locality data from future biological and paleontological collecting and sampling events, and for using the data in research.

Anticipated course content includes discussion and activities on georeferencing integration, georeferenced data visualization, and georeferences for modeling and research.


=== Logistics: ===
Line 30: Line 33:


=== Course Instructor List ===
(''in alphabetical order'') David Bloom, Matt Collins, Una Farrell, Shelley James, Sara Lafia, Deborah Paul, Marcy Revelez, Nelson Rios, Katja Seltmann, Jessica Utrup, Mike Yost
 
=== Meet the Participants: ===
* Participant list
=== Bring your Datasets and Laptops:  ===
'''Participants are strongly encouraged to bring representative datasets''' from their collections or research that need georeferencing. This exposes everyone to the variety of locality data georeferencing issues and gives the experts and participants a chance to work together on any challenges.
Line 58: Line 58:


== Goals of the Workshop:  ==
*Best practices for researchers creating new locality data in the field and georeferencing legacy data.
**Tools (hardware and software) and standards (what to document, datum, etc.).
**How to repatriate data, or best practices for depositing data in a repository if it can’t be repatriated (what the obstacles are and how to minimize data loss).
*How to evaluate already georeferenced data. Current tools for visualization and evaluation.
**Metrics to look for
**Current tools for georeferencing
**Online tools
**R
**QGIS
*Researchers give input on the challenges of georeferencing and of using existing georeferences.
*Workflow review of some research uses of georeferenced data (Katja, Shelley, ...)
Ultimate goal: Each participant can point to aspects they learned during the workshop (tool, standard, etc.) and can indicate how they will use them for their research goal or purpose (present or future).
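
One of the "metrics to look for" when evaluating already georeferenced data is how much uncertainty each record carries. The following is a minimal Python sketch, not workshop material: it assumes records are dictionaries keyed by the Darwin Core term <code>coordinateUncertaintyInMeters</code>, and the bin edges (100 m, 1 km, 10 km) are hypothetical thresholds chosen only for illustration.

```python
from collections import Counter

# Hypothetical bin edges in meters; pick thresholds that suit your research question.
BINS = [
    (100, "0-100 m"),
    (1000, "100 m-1 km"),
    (10000, "1-10 km"),
]


def uncertainty_bin(value):
    """Return a bin label for one coordinateUncertaintyInMeters value."""
    try:
        meters = float(value)
    except (TypeError, ValueError):
        return "missing or invalid"
    # Rejects zero, negative values, and NaN in one comparison.
    if not meters > 0:
        return "missing or invalid"
    for upper, label in BINS:
        if meters <= upper:
            return label
    return ">10 km"


def summarize_uncertainty(records):
    """Count how many records fall into each uncertainty bin."""
    return Counter(
        uncertainty_bin(record.get("coordinateUncertaintyInMeters"))
        for record in records
    )
```

A summary like this can help decide whether a dataset is fit for a given analysis before any mapping is done; records in the "missing or invalid" bin usually deserve a closer look rather than silent removal.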


== Workshop Objectives:  ==
'''Topics to be covered'''<br>
''Pre-workshop materials''<br>
*Introductory information about datums, mapping, coordinate systems
*Basic georeferencing how-to
''During workshop''<br>
*Data standards, DwC terminology and fields (e.g. lat, long, datum), differences among disciplines (neo- and paleontological fields)
*Georeferencing toolkit and workflow examples (GEOLocate, maps, other resources, pros and cons)
*Best practices for field collection of data (locality strings and GPS units, precision, datum)
*How best to record and store georeferencing notes and other data sources (database/CMS dependent)
*Best practices for georeferencing of legacy data given:
**Varied research requirements for accuracy and precision
**Project and collection management limitations
**Uncertainty data - polygon vs. point radius, description and metadata, etc.
**Datum - georectify to a standard versus verbatim
*Workflows for incorporating data into different collections databases
**Best practice syntax in locality descriptions for use in automation vs verbatim strings
**Database limitations
**Multiple geopoint values and storage (verbatim, automated-non-vetted value, nearest named place, update to more accurate value, etc.)
*Downloading datasets - sources, different mechanisms
**Assessing data quality
**Uncertainty data - availability in data sources and interpretation
*Tools for aggregating, cleaning, visualizing and analyzing data
**R, QGIS, OpenRefine
**Creating maps
**Spatial analyses
**Automated, online tools and applications using geospatial data (e.g. LifeMapper)
*Difficult cases, such as geopolitically fluid locations over time, offshore localities
*Hands-on practice & case studies
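
To make the Darwin Core (DwC) fields mentioned above concrete, here is a minimal Python sketch of a per-record check. It is an illustration under stated assumptions, not a tool used in the course: records are assumed to be dictionaries keyed by the DwC terms <code>decimalLatitude</code>, <code>decimalLongitude</code>, <code>geodeticDatum</code>, and <code>coordinateUncertaintyInMeters</code>, and the function name and problem messages are hypothetical.

```python
def check_georeference(record):
    """Return a list of problems found in one Darwin Core style record (a dict)."""
    try:
        lat = float(record.get("decimalLatitude", ""))
        lon = float(record.get("decimalLongitude", ""))
    except (TypeError, ValueError):
        # Without usable coordinates the remaining checks are not meaningful.
        return ["missing or non-numeric coordinates"]

    problems = []
    if not -90 <= lat <= 90:
        problems.append("latitude out of range")
    if not -180 <= lon <= 180:
        problems.append("longitude out of range")
    if lat == 0 and lon == 0:
        problems.append("coordinates are 0, 0 (often a placeholder)")
    # A coordinate pair without a datum or an uncertainty is hard to reuse.
    if not str(record.get("geodeticDatum") or "").strip():
        problems.append("geodeticDatum not recorded")
    if not str(record.get("coordinateUncertaintyInMeters") or "").strip():
        problems.append("coordinateUncertaintyInMeters not recorded")
    return problems
```

An empty list means only that the record passed these basic checks; it says nothing about whether the point actually matches the verbatim locality.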


 
== Schedule of Events - Agenda ==
Breakfast, lunch, and dinner each day are on our own (not provided).
=== Day 1, Tuesday October 4th  ===
[https://vimeo.com/album/2163673/video/192472653 Recording Day 1]
{| cellspacing="2" cellpadding="5" border="1"
{| cellspacing="2" cellpadding="5" border="1"
|-
|-
Line 175: Line 188:


=== Day 2, Wednesday October 5th  ===
[https://vimeo.com/album/2163673/video/192472654 Recording Day 2]
{| cellspacing="2" cellpadding="5" border="1"
{| cellspacing="2" cellpadding="5" border="1"
|+
|+
Line 273: Line 288:


=== Day 3, Thursday October 6th  ===
[http://s.idigbio.org/idigbio-downloads/a69d1541-4726-465d-84ad-50c7ed556eee.zip Download zipped dataset] This dataset contains specimens in the family Carabidae that are from California and have geocoordinates, about 25,000 records in total.<br/>
[https://vimeo.com/album/2163673/video/192472656 Recording Day 3]
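
The dataset parameters above (family Carabidae, California, has geocoordinates) can also be written as a search query. The sketch below only builds a query URL and makes no network request. It rests on assumptions that should be checked against the current iDigBio search API documentation: the endpoint URL, the <code>rq</code> and <code>limit</code> parameters, and the field names <code>family</code>, <code>stateprovince</code>, and <code>geopoint</code>. The function names are hypothetical.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed endpoint of the iDigBio v2 search API; confirm against the current docs.
SEARCH_ENDPOINT = "https://search.idigbio.org/v2/search/records"


def build_record_query(family, state_province):
    """Build a record query: a family, a state, and a geopoint that exists."""
    return {
        "family": family.lower(),
        "stateprovince": state_province.lower(),
        "geopoint": {"type": "exists"},
    }


def build_search_url(record_query, limit=10):
    """Encode the record query as JSON in the rq parameter of a search URL."""
    params = {"rq": json.dumps(record_query), "limit": limit}
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def decode_record_query(url):
    """Recover the record query from a search URL, useful for documenting a download."""
    return json.loads(parse_qs(urlparse(url).query)["rq"][0])


url = build_search_url(build_record_query("Carabidae", "California"))
print(url)
```

Saving the query alongside the downloaded file is one way to track how a dataset was produced, so the download can be repeated or cited later.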
 
{| cellspacing="2" cellpadding="5" border="1"
{| cellspacing="2" cellpadding="5" border="1"
|-
|-
Line 292: Line 309:
* [https://github.com/iDigBio/idigbio-search-api/wiki/Data-Quality-Flags List of iDigBio Flags]
* Walk through steps of download, but provide dataset.
* iDigBio Data set: http://s.idigbio.org/idigbio-downloads/a69d1541-4726-465d-84ad-50c7ed556eee.zip
|Matthew Collins (remote), Katja Seltmann, Shelley James<br>
|-
|-
Line 318: Line 335:
|-
| 13:00<br> 
| [https://www.idigbio.org/sites/default/files/workshop-presentations/georef-research-use/GRU_spreadsheetsRefine6Oct2016.pptx Cleaning Datasets: Spreadsheets, Open Refine, tracking your work] (2)<br> 
| Deb Paul, Nelson Rios, Katja Seltmann<br>
|-
|-
Line 347: Line 364:


=== Day 4, Friday October 7th  ===
[https://ucsb.box.com/s/5qqiiqw237jr5mb7ip8hm5yspl4b8hcn Download zipped QGIS project] The project to the point we completed on Day 3 is available for download in the same folder as the auxiliary data. Launch the QGIS project from the '''Tutorial.qgs''' file. <br/>
[https://vimeo.com/album/2163673/video/192472655 Recording Day 4]
 
{| cellspacing="2" cellpadding="5" border="1"
{| cellspacing="2" cellpadding="5" border="1"
|-
|-
Line 379: Line 399:
| 11:00<br> 
| Exploring datasets: Uncertainty
* Bin points based on uncertainty rank
* Symbolize uncertainty by collector, data quality score - systematic error
| Sara Lafia<br>
Line 390: Line 410:
|-
| 12:00<br> 
| Lunch on our own.
| <br>
|-
Line 437: Line 457:
<br/>
<br/>
Some software [http://www.datacarpentry.org/workshop-template/install.html install instructions] from Data and Software Carpentry
== Requests for the Future ==
* Scripts/tools for repeated cleaning/analysis
* Using the iDigBio API (API for dummies)
* Inselect (note: we provided workshop participants with links to more on this tool; see the [https://docs.google.com/document/d/1m9cdERGtJkukb3EHUXPmCg58G08WWMA2HyBv28k6PUo/edit?usp=sharing Google Doc])
* Automated data cleaning - iDigBio and VertNet activities
* What to do with quantified uncertainties & polygons - Jorge Soberon (KU team, others in the fitness for use GBIF working group - see [https://www.gbif.org/document/82612/report-of-the-task-group-on-gbif-data-fitness-for-use-in-distribution-modelling Final Report of the Task Group on GBIF Data Fitness for Use in Distribution Modelling])
* QGIS layers - use cases (e.g. elevation)
* Detailed Workflows - for georeferencing, when not to georeference (see  iDigBio Georeferencing Working Group - https://www.idigbio.org/wiki/index.php/IDigBio_Working_Groups#Georeferencing_Working_Group_.28GWG.29), cleaning
* Documentation for tutorials
* Standards/possibility for storing multiple georeferences (and other possibilities such as annotations within iDigBio)
* QGIS tutorial as a Software/Data Carpentry format
* QGIS working group
* GEOLocate with R webinar (follow-on from the Symbiota webinar https://www.idigbio.org/content/symbiota-webinar-geolocate-toolkit and https://www.idigbio.org/content/coge-collaborative-georeferencing-demo-webinar)


== Trained Georeferencers ==
Line 532: Line 567:
### '''GPS Status''': available for [https://play.google.com/store/apps/details?id=com.eclipsim.gpsstatus2&hl=en Android] and [https://itunes.apple.com/us/app/gps-status/id378085995?mt=8 iOS] devices.
### '''Geopaparazzi''': [https://play.google.com/store/apps/details?id=eu.hydrologis.geopaparazzi&hl=en Android] only
== Updates  ==