Hackathon FAQ

From iDigBio
Revision as of 01:15, 11 January 2013 by Dpaul (Talk | contribs)

Frequently Asked Questions

Questions and Answers

  1. Evaluation Process: What is the format of the parsing evaluation? What is the test?

The test is how closely your parsed output matches the human-parsed gold and silver standard CSV files.

  2. Do we each generate output at the hackathon, or bring completed data with us?

Yes, bring your results. It will still be possible to generate new parsed output from existing OCR text and rerun the evaluation software while at the hackathon. It is also possible to run partial sets of images back through the OCR software and parse them again. Given the 2-day agenda, however, it is probably not feasible to run OCR and output algorithms on all 10,000 images in a dataset at the hackathon.

  3. What if I parse and refine the data further than required, even parsing out more fields than are set in the parameters?

That is fine. Any extra columns in the CSV files output by participants (columns not in the currently specified set) are okay and do not affect the metrics.
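To make the idea of matching against a gold standard concrete, here is a minimal sketch of a per-field comparison between a participant CSV and a gold CSV. It is not the hackathon's evaluation software: the exact-match scoring rule, the `id` key column, the field names, and the function names are all assumptions made for illustration. Only the columns present in the gold file are scored, so extra participant columns have no effect on the result.

```python
import csv
import io


def read_rows(text, key):
    """Index CSV rows by the value in the (hypothetical) key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def field_accuracy(gold_text, parsed_text, key="id"):
    """Fraction of gold fields that the parsed output matches exactly.

    Assumption: a case-insensitive, whitespace-trimmed exact match.
    The real evaluation may use a different metric.
    """
    gold = read_rows(gold_text, key)
    parsed = read_rows(parsed_text, key)
    if not gold:
        return 0.0
    # Score only the columns defined by the gold file, excluding the key.
    fields = [f for f in next(iter(gold.values())) if f != key]
    total = matched = 0
    for record_id, gold_row in gold.items():
        parsed_row = parsed.get(record_id, {})
        for field in fields:
            total += 1
            expected = (gold_row.get(field) or "").strip().lower()
            actual = (parsed_row.get(field) or "").strip().lower()
            if expected == actual:
                matched += 1
    return matched / total if total else 0.0


# Hypothetical sample data; the participant file has an extra "habitat" column.
GOLD = "id,collector,country\n1,A. Gray,USA\n2,J. Torrey,Canada\n"
PARSED = (
    "id,collector,country,habitat\n"
    "1,A. Gray,USA,bog\n"
    "2,J. Torrey,Mexico,forest\n"
)

print(field_accuracy(GOLD, PARSED))
```

In this sample, three of the four gold fields match, and dropping the extra `habitat` column from the participant file would leave the score unchanged.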