Data Ingestion Guidance: Difference between revisions

Revision as of 13:31, 28 May 2015

No change in size, 28 May 2015

→‎Instructions on Changing Identifiers

Joanna

@@ Line 197: / Line 197: @@
 All updates for iDigBio should be sent to us using the method by which you originally published your data. For most data systems, this will mean generating a whole new export of your data periodically. iDigBio will examine the new data file, and convert it into an update-only dataset on our end. For publishers using RSS feeds, we automatically harvest these updates daily, and process them in about a week unless there are interruptions in our data ingestion workflow, such as system maintenance or your update getting stuck behind a very large ingestion run. If you remove any records from your data export, iDigBio will flag those records as deleted in our system, and remove them from our indexes, but they will still be available via our data API to those who know the identifiers of the records.
-==Instructions on Changing Identifiers==
+==Instructions on changing identifiers==
 If iDigBio has already ingested your data, and you decide to reformat or replace your specimen identifiers (occurrenceIDs) without giving us a record identifier (recordID) with each record, you will need to add the following to your Darwin Core Archive:
 * Include the resource relationship extension in your archive and document the relationship using the OWL 'sameAs' relationship (http://www.w3.org/TR/owl-ref/#sameAs-def). A trivial example archive can be found at: [http://www.idigbio.org/sites/default/files/sites/default/files/DarwinCoreExamples/sameAs.zip sameAs Archive]
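
As a rough illustration of the item above, the following Python sketch builds the rows of a resource relationship extension file that links each new identifier to the one it replaces. Several details are assumptions rather than iDigBio requirements: the column layout (<code>coreid</code> followed by the Darwin Core terms <code>resourceID</code>, <code>relatedResourceID</code>, and <code>relationshipOfResource</code>), the direction of the relationship (new identifier as the resource, old identifier as the related resource), the literal value <code>sameAs</code>, and the function names, which are hypothetical. The sameAs example archive linked above should be treated as the reference for the real layout.

```python
import csv
import io

# Assumed literal for the relationship; check the example archive for the exact value expected.
SAME_AS = "sameAs"

# Assumed column layout: the core record id, then three Darwin Core ResourceRelationship terms.
FIELDNAMES = ["coreid", "resourceID", "relatedResourceID", "relationshipOfResource"]


def same_as_rows(id_map):
    """Build one extension row per changed identifier.

    id_map maps each OLD occurrenceID to its NEW occurrenceID (hypothetical input shape).
    """
    rows = []
    for old_id, new_id in sorted(id_map.items()):
        rows.append({
            "coreid": new_id,              # the record as it appears in the new export
            "resourceID": new_id,          # assumption: the new identifier is the subject
            "relatedResourceID": old_id,   # assumption: the old identifier is the object
            "relationshipOfResource": SAME_AS,
        })
    return rows


def write_relationship_csv(id_map, out):
    """Write the extension rows as CSV to any text file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(same_as_rows(id_map))


if __name__ == "__main__":
    buf = io.StringIO()
    write_relationship_csv({"old-0001": "new-0001"}, buf)
    print(buf.getvalue())
```

The resulting file would still need to be declared as an extension in the archive's <code>meta.xml</code>, which this sketch does not generate.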