IDigBio Augmenting OCR Workshop

From iDigBio
Revision as of 02:01, 22 September 2012 by Dpaul (Talk | contribs)

Overview Augmenting OCR Workshop

The Augmenting OCR Working Group, the iDigBio IT staff, and invited guests meet October 1-2, 2012, in Gainesville, Florida, for an intensive two-day workshop with three goals: to plan a hackathon and concurrent workshop, to assemble iDigBio Wiki content from our collective knowledge about OCR in digitization workflows, and to learn about the latest developments in OCR and NLP from all invited participants.

Karl-Heinz Steinke from Hannover, Germany, and the Herbar Digital project is our key speaker. Karl-Heinz's group has spent the last five years improving OCR algorithms, both for recognizing handwriting and in general, as part of making the digitization of herbarium specimens more efficient. See Feature recognition for herbarium specimens (Herbar-Digital) to learn more about this project's work.

A hackathon is planned for February 2013 at the Botanical Research Institute of Texas (BRIT). There we aim to make strides in what OCR, ML, and NLP can do to make our digitization efforts more efficient, both by producing data faster and by producing data that is fit for use. We'll choose the hackathon focus and design the hackathon together with the iDigBio IT staff at the upcoming October workshop.

As part of our outreach efforts, the working group is participating in the upcoming iSchools Conference in Fort Worth, Texas, in February 2013 in three ways: submitting a poster, submitting a notes paper, and hosting a half-day workshop to showcase our work and seek out potential collaborators. The iSchools 2013 theme is Data-Innovation-Wisdom, which lines up well with the goals of the ADBC, iDigBio, and the TCNs. This conference is concurrent with our hackathon at BRIT.


  • Find details here for lodging, meals, maps, reimbursement information, hotel shuttle details and more.


See a list of meeting participants including working group members, iDigBio staff, invited guests and remote participants.

Attending Remotely

via Adobe Connect at Augment OCR

Workshop Materials

  • Conceptual Agenda
  • Agenda by Day
  • Hackathon Topic List - "our wish list"
  • Priority Issues Outlined

Taking Notes

Hackathon Issues

NESCent White Paper (ppt)