Dataset Errata: Difference between revisions



datasets/lichens/gold/ocr/WIS-L-0012040_lg.txt: Longitude recorded as L49 (capitalized for clarity) instead of 149
== Right Single Quote ==
The following files contain the Unicode character U+2019, Right Single Quotation Mark:
datasets/lichens/gold/ocr/NY01075760_lg.txt
datasets/lichens/gold/ocr/NY01075761_lg.txt
datasets/lichens/gold/ocr/NY01075761_lg.txt
datasets/lichens/gold/ocr/NY01075762_lg.txt
datasets/lichens/gold/ocr/NY01075764_lg.txt
datasets/lichens/gold/ocr/NY01075768_lg.txt
datasets/lichens/gold/ocr/NY01075768_lg.txt
datasets/lichens/gold/ocr/NY01075770_lg.txt
datasets/lichens/gold/ocr/NY01075771_lg.txt
datasets/lichens/gold/ocr/NY01075771_lg.txt
datasets/lichens/gold/ocr/NY01075771_lg.txt
datasets/lichens/gold/ocr/NY01075776_lg.txt
datasets/lichens/gold/ocr/NY01075777_lg.txt
datasets/lichens/gold/ocr/NY01075779_lg.txt
datasets/lichens/gold/ocr/NY01075779_lg.txt
datasets/lichens/gold/ocr/NY01075781_lg.txt
datasets/lichens/gold/ocr/NY01075785_lg.txt
datasets/lichens/gold/ocr/NY01075785_lg.txt
datasets/lichens/gold/ocr/NY01075786_lg.txt
datasets/lichens/gold/ocr/NY01075786_lg.txt
datasets/lichens/gold/ocr/NY01075787_lg.txt
datasets/lichens/gold/ocr/NY01075787_lg.txt
datasets/lichens/gold/ocr/NY01075788_lg.txt
datasets/lichens/gold/ocr/NY01075788_lg.txt
datasets/lichens/gold/ocr/NY01075789_lg.txt
datasets/lichens/gold/ocr/NY01075789_lg.txt
datasets/lichens/gold/ocr/NY01075797_lg.txt
datasets/lichens/gold/ocr/NY01075798_lg.txt
datasets/lichens/gold/ocr/NY01075812_lg.txt
datasets/lichens/gold/ocr/NY01075817_lg.txt
datasets/lichens/gold/ocr/NY01075818_lg.txt
datasets/lichens/gold/ocr/NY01075819_lg.txt
datasets/lichens/gold/ocr/NY01075820_lg.txt
datasets/lichens/gold/ocr/NY01075821_lg.txt
datasets/lichens/gold/ocr/NY01075821_lg.txt
datasets/lichens/gold/ocr/NY01075822_lg.txt
datasets/lichens/gold/ocr/NY01075828_lg.txt
datasets/lichens/gold/ocr/NY01075829_lg.txt
datasets/lichens/gold/ocr/NY01075830_lg.txt
datasets/lichens/gold/ocr/NY01075831_lg.txt
datasets/lichens/gold/ocr/TENN-L-0000059_lg.txt
datasets/lichens/gold/ocr/TENN-L-0000073_lg.txt
datasets/lichens/gold/ocr/WIS-L-0011728_lg.txt
datasets/lichens/gold/ocr/WIS-L-0011730_lg.txt
datasets/lichens/gold/ocr/WIS-L-0011736_lg.txt
datasets/lichens/gold/ocr/WIS-L-0012033_lg.txt
datasets/lichens/gold/ocr/WIS-L-0012035_lg.txt
datasets/lichens/gold/ocr/WIS-L-0012039_lg.txt
datasets/lichens/gold/ocr/WIS-L-0012082_lg.txt
== Right Double Quote ==
The following files contain the Unicode character U+201D, Right Double Quotation Mark:
datasets/lichens/gold/ocr/WIS-L-0012053_lg.txt
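
A list like the ones above could be regenerated with a short script. The following is only a sketch: the function names are hypothetical, and it assumes the gold OCR files are UTF-8 encoded, which this page does not state.

```python
from collections import Counter
from pathlib import Path

# The two characters reported on this page; extend the mapping to check others.
TYPOGRAPHIC_QUOTES = {
    "\u2019": "RIGHT SINGLE QUOTATION MARK",
    "\u201d": "RIGHT DOUBLE QUOTATION MARK",
}


def count_quotes(text):
    """Count occurrences of each typographic quote character in text."""
    return Counter(ch for ch in text if ch in TYPOGRAPHIC_QUOTES)


def scan_directory(ocr_dir, encoding="utf-8"):
    """Map each .txt file name in ocr_dir to its (non-empty) quote counts.

    The encoding is an assumption; undecodable bytes are replaced, not fatal.
    """
    hits = {}
    for path in sorted(Path(ocr_dir).glob("*.txt")):
        counts = count_quotes(path.read_text(encoding=encoding, errors="replace"))
        if counts:
            hits[path.name] = counts
    return hits


def to_ascii_quotes(text):
    """Replace the typographic quotes with their plain ASCII counterparts."""
    return text.translate(str.maketrans({"\u2019": "'", "\u201d": '"'}))


if __name__ == "__main__":
    for name, counts in scan_directory("datasets/lichens/gold/ocr").items():
        for char, n in sorted(counts.items()):
            print(f"{name}: U+{ord(char):04X} {TYPOGRAPHIC_QUOTES[char]} x{n}")
```

A parser that should treat these characters as ordinary quotes could pass the OCR text through to_ascii_quotes before matching. Whether the gold files themselves should be normalized is a separate decision for the dataset maintainers.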


::The Gold Parsed decimalLatitude and decimalLongitude values are inconsistent across many labels. Both fields are omitted entirely from the NYBG lichens and Tennessee lichens. In Gold Parsed WIS-L-0011728_lg.csv, decimalLatitude and decimalLongitude are rounded to 3 decimal digits (e.g. 60.467). In WIS-L-0011729_lg.csv, decimalLatitude is rounded to 2 decimal digits (60.15) and decimalLongitude to 1 decimal digit (-152.6). Such variations are typical throughout the files. It is possible that trailing zeros were simply stripped, but the inconsistency makes it impossible for a parsing program to match every label.
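
One way an evaluation script might tolerate the varying precision is to compare a parsed coordinate with the gold value at the coarser of the two precisions. This is a minimal sketch, not part of the dataset tooling: the function names are hypothetical, values are passed as strings so their written digits are preserved, and half-up rounding is an assumption because the page does not say how the gold values were rounded.

```python
from decimal import Decimal, ROUND_HALF_UP


def decimal_places(value):
    """Number of digits written after the decimal point in a numeric string."""
    return max(0, -Decimal(value).as_tuple().exponent)


def matches_at_shared_precision(parsed, gold):
    """Compare two coordinate strings at the coarser of their two precisions.

    Rounding mode (half up) is an assumption, not something the gold files
    document.
    """
    places = min(decimal_places(parsed), decimal_places(gold))
    quantum = Decimal(1).scaleb(-places)  # e.g. places=2 gives 0.01
    parsed_rounded = Decimal(parsed).quantize(quantum, rounding=ROUND_HALF_UP)
    gold_rounded = Decimal(gold).quantize(quantum, rounding=ROUND_HALF_UP)
    return parsed_rounded == gold_rounded


if __name__ == "__main__":
    # Gold values are the ones quoted above; parsed values are made up.
    print(matches_at_shared_precision("60.4667", "60.467"))
    print(matches_at_shared_precision("-152.5833", "-152.6"))
    print(matches_at_shared_precision("60.150", "60.15"))
    print(matches_at_shared_precision("60.467", "60.15"))
```

The trade-off is that a coarsely recorded gold value accepts a wide range of parsed values, so this check is looser than exact string matching and may hide genuine parsing errors.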