Hackathon Challenge: Difference between revisions

:::; ''Perfect OCR text files'' : Hand-transcribed from each image, these text files reproduce exactly what is in the image. They are meant to show what the output would look like if the OCR understood all the data in the image, including the handwriting.
:::; Gold CSV files : These files use Darwin Core element names as column headers, with the data parsed into the appropriate columns. The data comes from the hand-transcribed gold text files.
:::; Silver CSV files : These files use the same Darwin Core element column headers, with the data parsed into the appropriate columns, but the data comes from the OCR output "as is." The same data from the same images, including any OCR errors, is captured in each silver CSV.
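
Because the gold and silver CSV files share the same column headers, they can be compared column by column. The following is a minimal sketch of such a comparison, not part of the challenge materials. It assumes the rows of both files appear in the same image order. The function name <code>field_match_rates</code> and the sample rows are hypothetical, invented only for illustration. The column names are real Darwin Core terms.

```python
import csv
import io


def field_match_rates(gold_text, silver_text):
    """Return, per column, the fraction of rows where silver equals gold.

    Assumption: both CSVs list the same images in the same row order.
    """
    gold_rows = list(csv.DictReader(io.StringIO(gold_text)))
    silver_rows = list(csv.DictReader(io.StringIO(silver_text)))
    if not gold_rows:
        return {}

    def cell(row, col):
        # DictReader yields None for missing trailing cells; treat as empty.
        return (row.get(col) or "").strip()

    rates = {}
    for col in gold_rows[0].keys():
        matches = sum(
            1
            for g, s in zip(gold_rows, silver_rows)
            if cell(g, col) == cell(s, col)
        )
        rates[col] = matches / len(gold_rows)
    return rates


# Hypothetical sample data: the silver rows contain typical OCR confusions
# ("l" read as "1", "0" read as "O").
GOLD = (
    "scientificName,recordedBy,eventDate\n"
    "Quercus alba,J. Smith,1923-05-14\n"
    "Acer rubrum,A. Jones,1931-08-02\n"
)
SILVER = (
    "scientificName,recordedBy,eventDate\n"
    "Quercus a1ba,J. Smith,1923-05-14\n"
    "Acer rubrum,A. Jones,1931-08-O2\n"
)

rates = field_match_rates(GOLD, SILVER)
for column, rate in rates.items():
    print(f"{column}: {rate:.0%} exact match")
```

Exact string matching is the strictest possible measure. A character-level distance would give partial credit for near misses, which may be more useful when judging OCR quality.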


== Parameters ==