Data Ingestion Guidance: Difference between revisions

403 bytes added, 28 May 2015

@@ Line 195: / Line 195: @@
 ==Sending updates to iDigBio==
-If you are using IPT on an RSS feed, we will automatically pick up your data changes. Our harvester is operating on a weekly basis, unless there are local interruptions. If you are not using IPT, and thus not on an RSS feed, then you should create a replacement recordset. We use an all-or-nothing strategy. If you were to only give us the changes, we would assume that the missing records had been deleted.
+All updates to iDigBio should be sent to us using the method by which you originally published your data. For most data systems, this means periodically generating a complete new export of your data. iDigBio will examine the new data file and convert it into an update-only dataset on our end. For publishers using RSS feeds, we harvest these updates automatically every day and process them in about a week, unless our data ingestion workflow is interrupted, for example by system maintenance or by your update being queued behind a very large ingestion run. If you remove any records from your data export, iDigBio will flag those records as deleted in our system and remove them from our indexes, but they will remain available via our data API to anyone who knows the identifiers of the records.
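To illustrate why a complete export is needed, here is a minimal sketch of how a full export could be compared against the previous one to derive an update-only set. It is not iDigBio's actual ingestion code: the function name `diff_exports`, the in-memory dictionaries, and the sample identifiers are all hypothetical, and it assumes each record is keyed by a stable identifier.

```python
from typing import Dict, Set, Tuple

# Hypothetical record shape: a flat mapping of field name to value.
Record = Dict[str, str]


def diff_exports(
    previous: Dict[str, Record], current: Dict[str, Record]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Compare two full exports keyed by record identifier.

    Returns the identifiers that were added, changed, and deleted.
    """
    prev_ids = set(previous)
    curr_ids = set(current)
    added = curr_ids - prev_ids
    # Anything present before but absent now is treated as deleted.
    deleted = prev_ids - curr_ids
    changed = {
        rid for rid in prev_ids & curr_ids if previous[rid] != current[rid]
    }
    return added, changed, deleted


# Illustrative data only; these identifiers are made up.
previous = {
    "urn:catalog:A:1": {"scientificName": "Quercus alba"},
    "urn:catalog:A:2": {"scientificName": "Acer rubrum"},
    "urn:catalog:A:3": {"scientificName": "Pinus strobus"},
}
current = {
    "urn:catalog:A:1": {"scientificName": "Quercus alba"},
    "urn:catalog:A:2": {"scientificName": "Acer rubrum L."},
    "urn:catalog:A:4": {"scientificName": "Betula nigra"},
}

added, changed, deleted = diff_exports(previous, current)
```

Under this kind of comparison, an export containing only the changed records would make every unchanged record look deleted, which is why the whole export should be sent each time.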
 ==Instructions on Changing Identifiers==
 If you have already had your data ingested by iDigBio, and you decide to revamp your identifiers, thereby replacing them all, here is what you should add to your Darwin Core Archive: