Data Ingestion Guidance: Difference between revisions

Line 44:
*plus an RSS feed for ready access and update is recommended, otherwise email the files to us
*metadata:  
** each specimen record needs to have a unique (within the dataset) identifier in the occurrenceID field.
**each specimen record needs to have a unique (within the dataset) identifier in the occurrenceID field.
** name the fields as close to Darwin Core as possible, in XML style, e.g., '''dwc:continent''', and additionally use the [[MISC-Authority-File-Working-Group#Data_Element_Lists_by_Data_Model_Concept|MISC field names]] (local iDigBio extensions to Darwin Core)
**name the fields as close to Darwin Core as possible, in XML style, e.g., '''dwc:continent''', and additionally use the [[MISC-Authority-File-Working-Group#Data_Element_Lists_by_Data_Model_Concept|MISC field names]] (local iDigBio extensions to Darwin Core)
*you need to have permission to submit the data
*data recommendations for '''optimal searchability''':
*data recommendations for '''optimal searchability''' in exported data:
**put dates in [http://www.w3.org/TR/NOTE-datetime  ISO 8601 ] format, i.e., YYYY-MM-DD, e.g., 2014-06-22
**put elevation in meters in the elevation fields, as numbers without the units (the fields minimumElevationInMeters and maximumElevationInMeters already assume the numeric values are in meters, so there is no need to include the units with the data)
Line 53:
**no '0' in fields to represent no value, e.g., lat or lon
**lat and lon coordinates need to be in decimal degrees, with no N, S, E, W (use negative values for south and west)
**parse out genus, species, infraspecific epithet if already aggregated into a scientific name
**parse genus, species, infraspecific epithet if already aggregated into a scientific name
**include parsed higher taxonomy
**save the data in UTF-8 format when exporting for ingestion, to preserve diacritics in people and place names
**save the data in UTF-8 format to preserve diacritics in people and place names
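
A few of the searchability recommendations above (ISO 8601 dates, elevation without units, signed decimal coordinates, no '0' placeholders) can be sketched as a small pre-export cleanup step. This is only an illustrative sketch, not an iDigBio tool: the function names, the assumed source date format (MM/DD/YYYY), and the rule that a lat/lon pair of 0 and 0 means "no value" are all assumptions made for the example.

```python
import re
from datetime import datetime

# Hypothetical helpers for illustration only; names and accepted
# input formats are assumptions, not part of any iDigBio software.

_COORD = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*°?\s*([NSEW])?\s*$", re.IGNORECASE)
_ELEV = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*(?:m|meters|metres)?\.?\s*$", re.IGNORECASE)


def to_iso_date(text, source_format="%m/%d/%Y"):
    """Convert a date in an assumed source format to YYYY-MM-DD."""
    return datetime.strptime(text.strip(), source_format).strftime("%Y-%m-%d")


def to_signed_decimal(text):
    """Turn '82.32 W' or '29.65N' into a signed decimal number.

    Returns None for an empty value. South and west become negative.
    """
    if text is None or text.strip() == "":
        return None
    match = _COORD.match(text)
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    value = float(match.group(1))
    hemisphere = (match.group(2) or "").upper()
    if hemisphere in ("S", "W"):
        value = -abs(value)
    return value


def clean_coordinate_pair(lat_text, lon_text):
    """Clean a lat/lon pair; treat 0,0 as a 'no value' placeholder (assumption)."""
    lat = to_signed_decimal(lat_text)
    lon = to_signed_decimal(lon_text)
    if lat == 0 and lon == 0:
        return (None, None)
    return (lat, lon)


def elevation_in_meters(text):
    """Strip a trailing meters unit, e.g. '120 m' -> 120.0.

    Values in other units (such as feet) are rejected, not converted.
    """
    match = _ELEV.match(text)
    if match is None:
        raise ValueError(f"elevation is not a plain meters value: {text!r}")
    return float(match.group(1))


if __name__ == "__main__":
    print(to_iso_date("06/22/2014"))
    print(clean_coordinate_pair("29.65 N", "82.32 W"))
    print(elevation_in_meters("120 m"))
```

A genuine coordinate of exactly 0, 0 is rare in specimen data but possible, so treating it as a placeholder is a choice to review against the dataset rather than a rule.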


===Packaging for images/media objects===