Data Ingestion Guidance: Difference between revisions

Line 205: Line 205:
*Columns are defined in the meta.xml so the column headers in the multimedia file itself are a convenience but not actually significant to the meaning or processing of the column.


A pristine sample of a minimally-populated AC CSV published via an extension in a Darwin Core Archive:
A pristine sample of a minimally-populated Audubon Core (AC) CSV published via an extension in a Darwin Core Archive:


<pre>coreid, identifier, type, format, accessURI, rights, owner, creator, metadataLanguage
<pre>coreid, identifier, type, format, accessURI, rights, owner, creator, metadataLanguage
Line 218: Line 218:




The columns are defined in the accompanying meta.xml.
The column mappings are defined in the accompanying meta.xml.
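
As a rough illustration of how the column mappings can be read from meta.xml, the sketch below parses an extension block and lists which term each column index maps to. The embedded meta.xml fragment is hypothetical (a real archive also declares a core file), and the helper name <code>extension_columns</code> is an assumption made for this example, not part of any ingestion tool.

```python
import xml.etree.ElementTree as ET

# Namespace used by Darwin Core Archive meta.xml descriptors.
DWC_TEXT_NS = {"dwc": "http://rs.tdwg.org/dwc/text/"}

# Hypothetical fragment: only the Audubon Core extension is shown.
SAMPLE_META = """
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <extension rowType="http://rs.tdwg.org/ac/terms/Multimedia"
             fieldsTerminatedBy="," ignoreHeaderLines="1">
    <files><location>multimedia.csv</location></files>
    <coreid index="0"/>
    <field index="1" term="http://purl.org/dc/terms/identifier"/>
    <field index="2" term="http://purl.org/dc/elements/1.1/type"/>
    <field index="3" term="http://purl.org/dc/elements/1.1/format"/>
    <field index="4" term="http://rs.tdwg.org/ac/terms/accessURI"/>
  </extension>
</archive>
"""


def extension_columns(meta_xml: str) -> dict[int, str]:
    """Map each column index of the first extension to its term URI."""
    root = ET.fromstring(meta_xml)
    ext = root.find("dwc:extension", DWC_TEXT_NS)
    if ext is None:
        raise ValueError("no <extension> element found in meta.xml")
    columns = {}
    coreid = ext.find("dwc:coreid", DWC_TEXT_NS)
    if coreid is not None:
        # coreid is not a Darwin Core or Audubon Core term, so it has no URI.
        columns[int(coreid.get("index"))] = "coreid"
    for field in ext.findall("dwc:field", DWC_TEXT_NS):
        index = field.get("index")
        if index is not None:  # skip fields that carry no column index
            columns[int(index)] = field.get("term")
    return columns


columns = extension_columns(SAMPLE_META)
for index in sorted(columns):
    print(index, columns[index])
```

Because the mapping comes from the index attributes, renaming or dropping the header row in the CSV would not change how a column is interpreted.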


*If submitting media records with specimen data records, here are the critical fields to fill in:
If submitting media records with specimen data records, here are the critical fields to fill in:
**'''coreid''' - If media data are being provided via an extension, the coreid field in the Audubon Core extension file is what links the media record to the specimen record. "coreid" is not a term defined by Darwin Core or Audubon Core. The value in the extension coreid column will link to a value in the core file "id" column (normally column 0). Examples: <pre>urn:catalog:institutionCode:collectionCode:catalogNumber</pre><pre>urn:uuid:32e5da5d-c747-435c-a368-07d989259bf4</pre><pre>123456</pre>
*'''coreid''' - If media data are being provided via an extension, the coreid field in the Audubon Core extension file is what links the media record to the specimen record. "coreid" is not a term defined by Darwin Core or Audubon Core. The value in the extension coreid column will link to a value in the core file "id" column (normally column 0). Examples: <pre>urn:catalog:institutionCode:collectionCode:catalogNumber</pre><pre>urn:uuid:32e5da5d-c747-435c-a368-07d989259bf4</pre><pre>123456</pre>
**'''identifier''' ([http://purl.org/dc/terms/identifier dcterms:identifier] or [http://purl.org/dc/elements/1.1/identifier dc:identifier]) = The persistent and unique id of the media record within the Audubon Core file. It may be tempting to use the URL of the media as the identifier. However, we have seen multiple cases where media have moved, making the identifier not persistent. If you have multiple types of identifiers for a media record, put the least stable here and the most stable in ac:providerManagedID. Examples: <pre>urn:uuid:84fb24fa-fd15-476a-99a6-a7f876b87d08</pre>
*'''identifier''' ([http://purl.org/dc/terms/identifier dcterms:identifier] or [http://purl.org/dc/elements/1.1/identifier dc:identifier]) = The persistent and unique id of the media record within the Audubon Core file. It may be tempting to use the URL of the media as the identifier. However, we have seen multiple cases where media have moved, making the identifier not persistent. If you have multiple types of identifiers for a media record, put the least stable here and the most stable in ac:providerManagedID. Examples: <pre>urn:uuid:84fb24fa-fd15-476a-99a6-a7f876b87d08</pre>
**'''format''' ([http://purl.org/dc/elements/1.1/format dc:format]) = Media Type / MIME Type (from the http://www.iana.org/assignments/media-types/media-types.xhtml controlled vocabulary if possible). Examples: <pre>image/jpeg</pre><pre>audio/mpeg</pre>
*'''format''' ([http://purl.org/dc/elements/1.1/format dc:format]) = Media Type / MIME Type (from the http://www.iana.org/assignments/media-types/media-types.xhtml controlled vocabulary if possible). Examples: <pre>image/jpeg</pre><pre>audio/mpeg</pre>
**'''accessURI''' ([http://rs.tdwg.org/ac/terms/accessURI ac:accessURI]) = Direct HTTP link to the media file. Note that the media type (format) '''must''' match the media type of the resource at the target end of this accessURI. For example, if the format is "image/jpeg" then accessURI '''must''' link to an image, not a web page. Examples: <pre>http://example.com/IMAGES/00000001.jpg</pre><pre>http://example.com/objects/987654321</pre>
*'''accessURI''' ([http://rs.tdwg.org/ac/terms/accessURI ac:accessURI]) = Direct HTTP link to the media file. Note that the media type (format) '''must''' match the media type of the resource at the target end of this accessURI. For example, if the format is "image/jpeg" then accessURI '''must''' link to an image, not a web page. Examples: <pre>http://example.com/IMAGES/00000001.jpg</pre><pre>http://example.com/objects/987654321</pre>
**'''providerManagedID''' ([http://rs.tdwg.org/ac/terms/providerManagedID ac:providerManagedID]) = (Optional) If you have a stable GUID (such as a UUID) for your media records and you have populated "dc:identifier" with a different type of identifier, place the GUID in the optional ac:providerManagedID field. Examples: <pre>urn:uuid:32e5da5d-c747-435c-a368-07d989259bf4</pre>
*'''providerManagedID''' ([http://rs.tdwg.org/ac/terms/providerManagedID ac:providerManagedID]) = (Optional) If you have a stable GUID (such as a UUID) for your media records and you have populated "dc:identifier" with a different type of identifier, place the GUID in the optional ac:providerManagedID field. Examples: <pre>urn:uuid:32e5da5d-c747-435c-a368-07d989259bf4</pre>
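
The coreid linkage described in the list above can be sketched as follows. This is a minimal illustration, assuming small in-memory CSVs with made-up rows; it reads columns by header name only for brevity, whereas real processing relies on the indexes declared in meta.xml. The function name <code>link_media</code> is hypothetical.

```python
import csv
import io

# Hypothetical core file: the "id" column is what extension rows point to.
CORE_CSV = """id,catalogNumber
urn:uuid:32e5da5d-c747-435c-a368-07d989259bf4,12345
"""

# Hypothetical Audubon Core extension file; the second row has no matching core record.
MEDIA_CSV = """coreid,identifier,format,accessURI
urn:uuid:32e5da5d-c747-435c-a368-07d989259bf4,urn:uuid:84fb24fa-fd15-476a-99a6-a7f876b87d08,image/jpeg,http://example.com/IMAGES/00000001.jpg
123456,urn:uuid:00000000-0000-0000-0000-000000000001,image/jpeg,http://example.com/IMAGES/00000002.jpg
"""


def link_media(core_text, media_text):
    """Group media identifiers by specimen id and collect unlinked media."""
    core_ids = {row["id"] for row in csv.DictReader(io.StringIO(core_text))}
    linked = {}
    orphans = []
    for row in csv.DictReader(io.StringIO(media_text)):
        if row["coreid"] in core_ids:
            linked.setdefault(row["coreid"], []).append(row["identifier"])
        else:
            # The coreid value does not appear in the core file "id" column.
            orphans.append(row["identifier"])
    return linked, orphans


linked, orphans = link_media(CORE_CSV, MEDIA_CSV)
print("linked:", linked)
print("orphans:", orphans)
```

Running a check like this before publishing can surface media rows whose coreid does not match any specimen record.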


'''Note:''' dcterms:format and dc:type should match the type of the object returned by ac:accessURI, especially where ac:furtherInformationURL is used as an alternative to ac:accessURI. If ac:accessURI is not present, dcterms:format and dc:type should not be present either. Media embedded in a webpage is considered a webpage and thus will not be treated as media. accessURI should point to the media file itself.
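
One cheap pre-submission check for the rule in the note above is to compare the declared format against the media type implied by the accessURI file extension. The sketch below uses Python's standard <code>mimetypes</code> module and is only a heuristic: the authoritative answer is the type of the resource the server actually returns, which this example does not fetch. The function name <code>format_matches_uri</code> is an assumption for illustration.

```python
import mimetypes


def format_matches_uri(declared_format, access_uri):
    """Compare a declared media type with the type guessed from the URI.

    Returns True or False when the URI has a recognisable extension,
    and None when no guess is possible (for example, an opaque object URL).
    """
    guessed, _encoding = mimetypes.guess_type(access_uri)
    if guessed is None:
        return None
    return guessed == declared_format


# A direct image link agrees with the declared format.
print(format_matches_uri("image/jpeg", "http://example.com/IMAGES/00000001.jpg"))
# A web page declared as an image is a mismatch.
print(format_matches_uri("image/jpeg", "http://example.com/specimen.html"))
# An extensionless URL cannot be judged from its name alone.
print(format_matches_uri("image/jpeg", "http://example.com/objects/987654321"))
```

A None result is not an error; it only means the URL has to be checked some other way, such as inspecting the response from the server.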