*For the hackathon there will be at least 600 examples of OCR text, in 3 groups of 200, that have previously been properly classified/parsed by humans.
**This parsed text may be used for training learning algorithms.
**This set will also be used to evaluate the performance of parsing algorithms.
*Overfitting is a potential problem, so at the time of the hackathon we may provide additional testing records for evaluation.
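
One way to reduce the overfitting risk noted above is to train on some of the groups and evaluate on the group left out. The page does not say how the 3 groups of 200 will be used, so the following is only a minimal sketch under that assumption; the <code>Record</code> shape, the <code>leave_one_group_out</code> helper, and the placeholder data are all hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple

# Hypothetical record shape: (ocr_text, human_parsed_label).
Record = Tuple[str, str]


def leave_one_group_out(
    groups: Sequence[List[Record]],
) -> Iterator[Tuple[List[Record], List[Record]]]:
    """Yield (train, held_out) pairs, holding out each group exactly once."""
    for i, held_out in enumerate(groups):
        # Training data is every record from every other group.
        train = [rec for j, g in enumerate(groups) if j != i for rec in g]
        yield train, held_out


# Placeholder data standing in for the 3 groups of 200 parsed examples.
groups = [
    [(f"ocr text {g}-{n}", "label") for n in range(200)]
    for g in range(3)
]

splits = list(leave_one_group_out(groups))
for train, held_out in splits:
    # A parser would be fitted on `train` and scored on `held_out` here.
    print(len(train), len(held_out))
```

A parser that scores well on its training groups but poorly on the held-out group is likely overfitting; any additional testing records provided at the hackathon would serve as a further check.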
== Scope ==