Dataset Errata



::Gold label NY01075763_lg.txt has "Pyrenidium actinellurn"; it should be "Pyrenidium actinellum".  Gold Parsed copies the error verbatim (as it should) and needs to be corrected if the .txt file is corrected.
::::/home/aocr/datasets/lichens/gold/outputs/human/NY01075763_lg.txt fixed dp
::::/home/aocr/datasets/lichens/gold/parsed/human/NY01075763_lg.csv fixed dp


::Inconsistent precision in Gold Parsed decimalLatitude and decimalLongitude in many labels.  Both fields are omitted entirely from the NYBG lichens and Tennessee lichens.  Gold Parsed WIS-L-0011728_lg.csv has decimalLatitude and decimalLongitude rounded to 3 decimal digits (e.g. 60.467).  WIS-L-0011729_lg.csv has decimalLatitude rounded to 2 decimal digits (60.15) and decimalLongitude rounded to 1 decimal digit (-152.6).  This is typical of the variations found throughout the files.  It's possible that trailing zeros were simply stripped off, but the inconsistency makes it impossible for a parsing program to match every label exactly.
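One possible workaround on the evaluation side is to compare a parser's coordinate against the gold value at whatever precision the gold value happens to carry. The sketch below is only an illustration: the function names `decimal_places` and `coords_match` are hypothetical, and it assumes the gold values were produced by rounding (half up) or by stripping trailing zeros, which the notes above do not confirm.

```python
from decimal import Decimal, ROUND_HALF_UP


def decimal_places(value: str) -> int:
    """Count the digits after the decimal point in a numeric string."""
    exponent = Decimal(value).as_tuple().exponent
    return max(0, -exponent)


def coords_match(gold: str, candidate: str) -> bool:
    """Compare candidate to gold at the precision of the gold value.

    Assumption: gold was rounded half up or had trailing zeros stripped.
    """
    places = decimal_places(gold)
    quantum = Decimal(1).scaleb(-places)  # e.g. 0.001 for 3 places
    rounded = Decimal(candidate).quantize(quantum, rounding=ROUND_HALF_UP)
    return rounded == Decimal(gold)


# Values taken from the WIS-L-0011728 and WIS-L-0011729 examples above;
# the candidate strings are made up for illustration.
print(coords_match("60.467", "60.4670"))
print(coords_match("60.15", "60.150"))
print(coords_match("-152.6", "-152.600"))
print(coords_match("60.467", "60.468"))
```

With a comparison like this, a parser that emits 60.150 would still be counted as matching a gold value of 60.15. It does not help if the gold values were truncated rather than rounded, and it cannot recover precision that the gold file never recorded.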