Title | iConference2013: The Apiary Project: a workflow for text extraction and parsing for herbarium specimens |
Publication Type | Presentation |
Year of Publication | 2013 |
Authors | Best, Jason |
Keywords | Augmenting Optical Character Recognition, collaboration, EF-1115210, Hackathon, iConference2013, Machine Learning, Natural Language Processing, parsing algorithms, Research, Transcription, workflow |
Abstract | The Apiary Project combines OCR technology, OCR output from herbarium specimen images or other images containing museum specimen data, well-developed regular expressions for parsing that output into Darwin Core fields, and humans in the digitization loop. An elegant, sophisticated user interface supports a workflow designed to maximize the value of human interaction, minimize steps, and speed data throughput. |
URL | https://www.idigbio.org/sites/default/files/workshop-presentations/aocr-hackathon/JBest-Apiary-iConference.ppt |
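
The abstract mentions regular expressions that parse OCR output into Darwin Core fields. A minimal sketch of that idea follows. The patterns, the sample label text, and the `parse_label` helper are hypothetical illustrations and are not taken from the Apiary Project itself; only the field names (`recordedBy`, `recordNumber`, `verbatimEventDate`, `county`) are standard Darwin Core terms.

```python
import re

# Hypothetical patterns for illustration only; these are NOT the
# Apiary Project's actual expressions. Keys are Darwin Core term names.
PATTERNS = {
    "recordedBy": re.compile(
        r"^[ \t]*coll(?:ector)?\.?:?[ \t]*(?P<value>[^\n]+?)[ \t]*$",
        re.IGNORECASE | re.MULTILINE,
    ),
    "recordNumber": re.compile(
        r"\bNo\.?[ \t]*(?P<value>\d+)\b",
        re.IGNORECASE,
    ),
    "verbatimEventDate": re.compile(
        r"(?P<value>\d{1,2}[ \t]+"
        r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?"
        r"[ \t]+\d{4})",
        re.IGNORECASE,
    ),
    "county": re.compile(
        r"^[ \t]*(?P<value>[A-Z][A-Za-z .'-]+?)[ \t]+County\b",
        re.MULTILINE,
    ),
}

# Invented OCR text standing in for a transcribed specimen label.
SAMPLE = """PLANTS OF TEXAS
Quercus stellata Wangenh.
Tarrant County
Coll.: J. Smith
No. 1234
12 May 1987
"""


def parse_label(ocr_text):
    """Return a dict of Darwin Core terms found in the OCR text.

    Terms whose pattern does not match are omitted, so a human
    reviewer can see which fields still need to be filled in.
    """
    record = {}
    for term, pattern in PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            record[term] = match.group("value").strip()
    return record


if __name__ == "__main__":
    for term, value in parse_label(SAMPLE).items():
        print(f"{term}: {value}")
```

Leaving unmatched terms out of the result, rather than guessing, is one way such a parser could hand off to the human review step the abstract describes.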