Line 62: | Line 62: | ||
== Right single Quote == | == Right single Quote == | ||
The following files contain the Unicode character U+201D, Right Double Quotation Mark | The following files contain the Unicode character U+201D, Right Double Quotation Mark | ||
datasets/lichens/gold/ocr/WIS-L-0012053_lg.txt | * datasets/lichens/gold/ocr/WIS-L-0012053_lg.txt | ||
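A check like the one that produced this list could be sketched as follows. This is a minimal illustration, not the script actually used: the helper name `find_char` is hypothetical, and it assumes the OCR text has already been read into a string.

```python
# Hypothetical helper: locate every U+201D (Right Double Quotation Mark) in a text.
RIGHT_DOUBLE_QUOTE = "\u201d"


def find_char(text, char=RIGHT_DOUBLE_QUOTE):
    """Return 1-based (line, column) pairs for each occurrence of char."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == char:
                hits.append((line_no, col))
    return hits
```

Running it over the contents of each gold OCR file and keeping the files with a non-empty result would give a list of the kind shown above.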
== Parse file errors == | |||
::Gold Parsed decimalLatitude and decimalLongitude are inconsistent in many labels. Both are omitted from all NYBG lichens and Tennessee lichens. Gold Parsed WIS-L-0011728_lg.csv has decimalLatitude & decimalLongitude rounded to 3 decimal digits (e.g. 60.467). WIS-L-0011729_lg.csv has decimalLatitude rounded to 2 decimal digits (60.15) and decimalLongitude rounded to 1 decimal digit (-152.6). This is typical of the variations found throughout the files. It's possible that trailing zeros were just stripped off, but this inconsistency makes it impossible to match all the labels with a parsing program. | ::Gold Parsed decimalLatitude and decimalLongitude are inconsistent in many labels. Both are omitted from all NYBG lichens and Tennessee lichens. Gold Parsed WIS-L-0011728_lg.csv has decimalLatitude & decimalLongitude rounded to 3 decimal digits (e.g. 60.467). WIS-L-0011729_lg.csv has decimalLatitude rounded to 2 decimal digits (60.15) and decimalLongitude rounded to 1 decimal digit (-152.6). This is typical of the variations found throughout the files. It's possible that trailing zeros were just stripped off, but this inconsistency makes it impossible to match all the labels with a parsing program. | ||
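One possible workaround on the scoring side, sketched here only as an illustration, is to compare a parsed coordinate against the gold value at whatever precision the gold value happens to carry. The function names `decimal_places` and `coords_match` are hypothetical, and the half-up rounding mode is an assumption, since the rounding used when the gold files were produced is not known.

```python
from decimal import Decimal, ROUND_HALF_UP


def decimal_places(value: str) -> int:
    """Number of digits after the decimal point in a numeric string."""
    exponent = Decimal(value).as_tuple().exponent
    return max(0, -exponent)


def coords_match(gold: str, parsed: str) -> bool:
    """True if parsed equals gold once rounded to gold's own precision.

    Assumption: gold values were rounded half-up (or had trailing zeros
    stripped); the actual procedure behind the gold files is unknown.
    """
    places = decimal_places(gold)
    quantum = Decimal(1).scaleb(-places)  # e.g. 0.01 for two decimal places
    rounded = Decimal(parsed).quantize(quantum, rounding=ROUND_HALF_UP)
    return rounded == Decimal(gold)
```

With this comparison, a parser output of `60.150` would match the gold `60.15`, and `-152.600` would match `-152.6`, while a genuinely different value such as `-152.7` would not. It does not help with the labels where the coordinates are omitted from the gold files entirely.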