Hackathon Challenge: Difference between revisions

This is open to debate, but I think elevation should be a pure numeric field, assumed to be in meters. It should therefore be expressed as "750", not "750 m". verbatimElevation, of course, should retain the "m" if it was present on the label. (Note that Darwin Core does not have a field called "elevation"; it has minimumElevationInMeters and maximumElevationInMeters, both numeric.) I am not sure whether this is something to change on the labels, but it is worth being aware of. I think parsing programs should generate the Darwin Core fields. (Daryl)
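As a rough illustration of the point above, a parser could keep the label text in verbatimElevation and emit the numeric Darwin Core fields separately. This is only a sketch: the function name `parse_elevation`, the accepted input shapes (a single value or a "low-high" range, optionally followed by "m"), and the choice to return `None` for anything else are assumptions, not something the gold data prescribes.

```python
import re

# Assumed input shapes: "750", "750 m", "750m.", "600-750 m".
# Anything else is left unparsed rather than guessed at.
_ELEVATION_RE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*(?:-\s*(\d+(?:\.\d+)?))?\s*(?:m\.?)?\s*$",
    re.IGNORECASE,
)


def parse_elevation(verbatim):
    """Split a label elevation into verbatim and numeric Darwin Core fields.

    Hypothetical helper: values without a unit are assumed to be meters.
    """
    result = {
        "verbatimElevation": verbatim,
        "minimumElevationInMeters": None,
        "maximumElevationInMeters": None,
    }
    match = _ELEVATION_RE.match(verbatim)
    if match is None:
        return result
    low, high = match.groups()
    result["minimumElevationInMeters"] = float(low)
    # A single value is used for both the minimum and the maximum.
    result["maximumElevationInMeters"] = float(high) if high else float(low)
    return result
```

With this sketch, `parse_elevation("750 m")` keeps "750 m" as the verbatim value and yields 750.0 for both numeric fields, while text it does not recognise leaves the numeric fields empty.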


Inconsistency in the Gold Parsed labels for Country. When a US state is given as the state, the label doesn't always name the country, though it is obviously the USA. Some Gold Parsed results leave it blank; others fill it in with "USA" or "United States", though neither is on the label. I think it is valid to fill it in, but it should be consistent. (Daryl)


Many Gold Parsed Tennessee lichen labels have country errors.  Examples:
-- Gold Parsed TENN-L-0000001_lg.csv lists country as "USA", but on the .txt label it is "U.S.A." (with periods).


-- Gold Parsed TENN-L-0000005_lg.csv leaves country blank, but the label shows it as "USA". (Daryl)
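One way to make the country field consistent is to normalise it in a single place. The sketch below is illustrative only: the canonical value "USA", the function name `normalize_country`, and the deliberately tiny state list are all assumptions. Whether the gold data should hold the normalised or the verbatim value is exactly the open question raised above.

```python
# Spellings treated as the United States once periods are removed (assumed list).
US_VARIANTS = {"usa", "us", "united states", "united states of america"}

# Deliberately partial, for illustration; a real list would cover every state.
US_STATES = {"tennessee", "north carolina", "virginia"}


def normalize_country(verbatim_country, state_province=""):
    """Return one consistent country value (hypothetical helper).

    "U.S.A.", "USA" and "United States" all map to "USA". A blank country
    is filled in only when the state is a known US state.
    """
    key = " ".join(verbatim_country.replace(".", "").lower().split())
    if key in US_VARIANTS:
        return "USA"
    if not key and state_province.strip().lower() in US_STATES:
        return "USA"
    # Anything else is passed through unchanged apart from outer whitespace.
    return verbatim_country.strip()
```

Applied to the two examples above, "U.S.A." and a blank country with state "Tennessee" both come out as "USA", so the gold files would at least agree with each other.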




-- TENN-L-0000017_lg.csv omits dateIdentified, though it is on the label as 3 Feb. 1963


-- TENN-L-0000019_lg.csv has 1954-Aug-8, but the label reads "8 Aug 1954"; again, this is neither verbatim nor Darwin Core (1954-08-08). (Daryl)
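For the date issues above, a parser could convert the label text to the ISO form (1954-08-08) and keep the verbatim string separately. This is a minimal sketch, not a complete date parser: the function name `to_iso_date` and the short list of accepted formats are assumptions based only on the examples quoted here.

```python
from datetime import datetime

# Formats seen in the examples above; the list is an assumption and far from
# exhaustive. %b matches English month abbreviations under the default locale.
_DATE_FORMATS = ("%d %b %Y", "%d %b. %Y", "%d %B %Y", "%Y-%b-%d")


def to_iso_date(verbatim):
    """Return YYYY-MM-DD for a recognised label date, else None (hypothetical helper)."""
    text = " ".join(verbatim.split())  # collapse stray whitespace
    for fmt in _DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    # Unrecognised dates are left for manual review rather than guessed.
    return None
```

Both "8 Aug 1954" and the gold value "1954-Aug-8" convert to "1954-08-08", and "3 Feb. 1963" converts to "1963-02-03".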


== Parameters ==