Transcription Hackathon

[[Category:Transcription Hackathon]][[Category:Workshop]]


'''Notes from Nature/iDigBio Hackathon to Further Enable Public Participation in the Online Transcription of Biodiversity Specimen Labels'''  


December 16–20 at the University of Florida, Gainesville  
{| class="wikitable" style="float:right;"
! colspan="2" style="background:#D58B28;width:200px;font-size:10pt" | Digitizing the Past and Present for the Future
|-
| colspan="2" style="text-align:center;font-size:7pt" | <!--YOU CAN INSERT A NEW IMAGE FOR THE LOGO BETWEEN THE COLON AND THE PIPE-->[[Image:IDigBio Logo RGB.png|center|300px|iDigBio Logo RGB.png]]<br />
|-
!colspan="2" style="background:#D58B28;text-align:center;font-size:9pt" | Quick Links for Transcription Hackathon Workshop
|-
|[https://docs.google.com/document/d/1TyluwM1rMcq7O_nidy8CLJFMW4FrOPjsHkrLVho5cVU/edit?usp=sharing Transcription Hackathon Workshop Agenda]
|-
|[https://www.idigbio.org/biblio?f%5bkeyword%5d=274 Transcription Hackathon Workshop Biblio Entries]
|-
|[https://www.idigbio.org/content/citscribe-hackathon Transcription Hackathon Workshop Report]
|}


== Agenda and Logistics  ==
*[https://www.facebook.com/media/set/?set=a.645283388848944.1073741833.215120891865198&type=1 Citscribe Hackathon Facebook Album]
*Twitter: @iDigBio, @NfromN; hashtag #CITScribe
==Report==
*[https://www.idigbio.org/content/citscribe-hackathon Citscribe Hackathon Report]


== Coordination  ==
*Joshua Campbell, iDigBio: [https://www.idigbio.org/sites/default/files/workshop-presentations/citscribe/CampbelliDigBioCrowdsourcingHackathon2013.pdf Herbarium Labels Transcription Crowdsourcing Consensus]
*Yonggang Liu, ACIS iDigBio: [https://www.idigbio.org/sites/default/files/workshop-presentations/citscribe/Yonggang_image_ingestion_appliance.pdf iDigBio Image Ingestion Appliance]
*Paul Kimberly, Smithsonian: [https://www.idigbio.org/sites/default/files/workshop-presentations/citscribe/SI_Center.pdf Smithsonian Transcription Center]
*William Ulate, Missouri Botanical Garden: [[Media:Purposeful_Gaming_BHL_Dec_2013.pdf|Purposeful Gaming and BHL]]


== Development Resources  ==
* Gold images from the aOCR Hackathon:
** CSV files with URLs for the images on the iDigBio beta server (uploaded by the Image Ingestion Appliance): [http://www.acis.ufl.edu/~yonggang/idigbio/recordset/gold/ent.csv ent], [http://www.acis.ufl.edu/~yonggang/idigbio/recordset/gold/herb.csv herb], [http://www.acis.ufl.edu/~yonggang/idigbio/recordset/gold/lichens.csv lichens].
* Code from the aOCR Hackathon:
** HandwritingDetection (https://github.com/idigbio-aocr): an algorithm that sorts images into three sets (no handwriting; little handwriting, with mostly typed or printed text; and lots of handwriting) based on the noise generated by the OCR software. [http://manuscripttranscription.blogspot.com/2013/02/detecting-handwriting-in-ocr-text.html Read more at Ben's blog]. This could be used to rank which images are most in need of human transcription.
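
The idea behind HandwritingDetection can be illustrated with a minimal sketch. This is not the code from the idigbio-aocr repository: the noise measure (the share of OCR tokens that do not look like plain words or numbers), the two thresholds, and all function names below are assumptions chosen only to show how OCR noise could sort images into the three sets and rank them for human transcription.

```python
import re

# Hypothetical cut-offs, chosen for illustration only; they are not
# taken from the HandwritingDetection code.
LITTLE_THRESHOLD = 0.2
LOTS_THRESHOLD = 0.5

# A token counts as "clean" if it is a word of two or more letters or a
# number, optionally followed by one punctuation mark. This is a crude
# assumption: real labels contain abbreviations and single letters.
CLEAN_TOKEN = re.compile(r"^[A-Za-z]{2,}[.,;:]?$|^\d+[.,;:]?$")


def noise_score(ocr_text):
    """Return the fraction of whitespace-separated tokens that look noisy."""
    tokens = ocr_text.split()
    if not tokens:
        # No OCR output at all: treated as noise-free in this sketch.
        return 0.0
    noisy = sum(1 for token in tokens if not CLEAN_TOKEN.match(token))
    return noisy / len(tokens)


def classify(ocr_text):
    """Sort one image's OCR text into 'none', 'little', or 'lots' of handwriting."""
    score = noise_score(ocr_text)
    if score < LITTLE_THRESHOLD:
        return "none"
    if score < LOTS_THRESHOLD:
        return "little"
    return "lots"


def rank_for_transcription(ocr_by_image):
    """Return image ids ordered from noisiest to cleanest OCR output."""
    return sorted(
        ocr_by_image,
        key=lambda image_id: noise_score(ocr_by_image[image_id]),
        reverse=True,
    )
```

Given a mapping from image identifiers to their OCR text, <code>rank_for_transcription</code> puts the images with the noisiest OCR output first, which is one way the ranking suggested above could be produced.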
* Hi all - (Paul Flemons).
**I have uploaded a number of files:
***[[Media:OpenRefine_procedures_for_EVENTS_1212a.pdf|a description of Open Refine procedures used for matching BVP fields to EMu EVENTS]]
***[[Media:Preparing_BVP_data_for_import_into_EMu_-_process_1212a.pdf|Detailed process of preparing BVP data for EMu]]
***[[Media:Preparing_BVP_data_for_import_into_EMu_-_overview.pdf|Overview of preparing BVP data for EMu]]
***[[Media:VisioDiagramofProcess.JPG|Diagram of the process of preparing data from BVP for EMu]]


*From Steve Raden: some background on Zooniverse's design