Hackathon Challenge: Difference between revisions

m
Line 24: Line 24:
**host server name: aocr1.acis.ufl.edu
**host server name: aocr1.acis.ufl.edu
**user name and password given to you at our first meeting and via email.
**user name and password given to you at our first meeting and via email.
*Sample of what you will see there:
*Sample of what you will see there for Set 1 (LBCC TCN lichen bryophyte packet labels):
<pre>human hand-parses the image (no errors) into a text file == gold.txt
<pre>human hand-parses the image (no errors) into a text file == gold.txt
     sample: /home/aocr/egilbert/dataset/gold/outputs
     sample: ~/egilbert/dataset/gold/outputs
human (parses) gets the data out of the gold.txt files into a csv file (darwin core fields) == gold.csv
human (parses) gets the data out of the gold.txt files into a csv file (darwin core fields) == gold.csv
     sample: /home/aocr/egilbert/dataset/gold/parsed
     sample: ~/egilbert/dataset/gold/parsed
OCR (of choice, ABBYY, TESSERACT, GOCR/JOCR, OCRopus, Omnipage) run on these images = output to silver.txt files
OCR (of choice, ABBYY, TESSERACT, GOCR/JOCR, OCRopus, Omnipage) run on these images = output to silver.txt files
     sample: /home/aocr/egilbert/dataset/gold/parsed
     sample: ~/egilbert/dataset/silver/outputs
3a. human (parses) the "dirty" OCR out of these silver.txt in to darwin core fields ==silver.csv
3a. human (parses) the "dirty" OCR out of these silver.txt in to darwin core fields ==silver.csv
     sample: /home/aocr/egilbert/dataset/silver/parsed</pre>
     sample: ~/egilbert/dataset/silver/parsed</pre>
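As a rough illustration of the directory layout in the newer revision, the four sample locations could be addressed from one dataset root as in the minimal Python sketch below. The names `LAYOUT`, `dataset_dirs`, and `list_files` are hypothetical helpers introduced here for illustration only. The sketch assumes the subdirectories sit under a single root such as `~/egilbert/dataset` and makes no assumption about the file names inside them.

```python
from pathlib import Path

# Subdirectories named in the sample listing, relative to the dataset root.
# The dictionary keys are illustrative labels, not names used on the server.
LAYOUT = {
    "gold_txt": "gold/outputs",      # hand-transcribed label text (gold.txt)
    "gold_csv": "gold/parsed",       # hand-parsed Darwin Core fields (gold.csv)
    "silver_txt": "silver/outputs",  # raw OCR output (silver.txt)
    "silver_csv": "silver/parsed",   # Darwin Core fields parsed from OCR text (silver.csv)
}


def dataset_dirs(root):
    """Map each stage label to its directory under the given dataset root."""
    root = Path(root).expanduser()
    return {name: root / sub for name, sub in LAYOUT.items()}


def list_files(directory):
    """Return sorted file names in a directory, or an empty list if it is missing."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_file())
```

Calling `dataset_dirs("~/egilbert/dataset")` and passing each resulting path to `list_files` would show what is present at each stage without failing when a directory does not exist.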


== Parameters ==
== Parameters ==