Talk:Data Ingestion Guidance

From iDigBio
Revision as of 19:33, 9 January 2014 by Dpaul (talk | contribs)

--Dpaul 16:49, 9 January 2014 (EST)

  1. Regarding data@idigbio.org: now that dp has a gator link, can I be added to the mailing list that "sees" data@idigbio.org?--Dpaul
  2. Where do users send email if they have a question:
    1. data@idigbio.org
    2. the feedback (or both)?--Dpaul (talk) 17:48, 9 January 2014 (EST)
  3. At Morphbank, to keep this straightforward, all email requests for help go to morphbank@scs.fsu.edu
    1. We do not have two separate paths for help or issue requests.
    2. I am assuming that clicking to send "feedback" generates a Redmine ticket (efficient and transparent), but data@idigbio.org is not transparent.--Dpaul
  4. About this Section: Registering Your Collection in Preparation for Data Ingestion
    1. I suggest a different order. See next.--Dpaul

When your data are ready for ingestion, please see the next steps.

  1. Get an iDigBio account for yourself (if you don't have one yet). https://www.idigbio.org/auth/login.php
    1. These are the only login credentials you will need.
  2. Log in with your iDigBio account username and password. https://www.idigbio.org/auth/login.php
  3. Register your collection. http://portal.idigbio.org/register OR
  4. Register your collection at GRBIO
    1. Repository: http://grbio.org/find-biorepositories OR
    2. Institutional Collections: http://grbio.org/find-institutional-collections
  5. If you are already on the portal page, the 'Register A Collection' option is in the menu under your login name in the upper right of the page.

  1. About this next section: Data Requirements--Dpaul (talk) 18:33, 9 January 2014 (EST)
  2. I would avoid the word "ownership" if at all possible, to help the community get around this issue (eventually). The word reinforces misconceptions and adds to confusion about data, media, copyright, intellectual property, etc. Something like this for number 2:
    1. You have permission to contribute this dataset to iDigBio.
  3. For number 3, do we need to explain or justify? How about:
Data Format choices
Darwin Core Archive format OR
CSV files mapped to Darwin Core (and other relevant standards, for example Audubon Core)
Data Transfer
Darwin Core Archive files harvested via IPT and RSS
CSV files via (...)
  1. For number 4, please add a UTF-8 reference. Something like:
UTF-8 encoding preferred (should be required).
Validate (or verify) that "special characters" (diacritics such as umlauts, tildes, and cedillas) are correct in your dataset.
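
To make the "validate" suggestion concrete, here is a minimal Python sketch of what a contributor could run on a CSV export before sending it. This is only an illustration, not an iDigBio tool: the function name, the marker list, and the file name in the usage note are all my own assumptions. It checks that the bytes decode as UTF-8 and flags lines containing characters that commonly appear when UTF-8 text has been mis-read as Latin-1. Flagged lines are only suspects for a human to review, because characters like "Ã" are legitimate in some languages.

```python
# Hypothetical pre-submission check; names and markers are illustrative assumptions.

# Characters that often show up when UTF-8 text was decoded as Latin-1
# (e.g. "Müller" turning into "MÃ¼ller"), plus the Unicode replacement character.
MOJIBAKE_MARKERS = ("Ã", "Â", "\ufffd")


def find_encoding_problems(raw: bytes) -> list:
    """Return a list of human-readable problems found in the raw file bytes."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # The file is not valid UTF-8 at all; report where decoding failed.
        return [f"not valid UTF-8: byte offset {err.start}"]

    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Valid UTF-8 can still carry garbled diacritics from an earlier bad conversion.
        if any(marker in line for marker in MOJIBAKE_MARKERS):
            problems.append(f"line {line_no}: possible mis-encoded text: {line!r}")
    return problems
```

Usage would be something like `find_encoding_problems(open("occurrences.csv", "rb").read())`, where `occurrences.csv` is a placeholder file name; an empty list means nothing suspicious was found.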

See http://www.morphbank.net/About/Manual/imagePhilosophy.php for how we worded the "permissions" issue at Morphbank (revolving around images).--Dpaul (talk) 18:33, 9 January 2014 (EST)


  1. The following links (at the bottom of the page in review) are, to me, not Data Requirements. They are Image/Media Issues or Image/Media Guidance.--Dpaul (talk) 18:33, 9 January 2014 (EST)
    1. Additional info about image format is here: https://www.idigbio.org/content/idigbio-image-file-format-requirements-and-recommendations
    2. If you need to learn about acceptable Creative Commons licenses in iDigBio: https://www.idigbio.org/content/idigbio-intellectual-property-policy
  2. Next, General Information (nothing to do with Data Requirements, etc.)
    1. If you are contemplating writing a proposal (e.g., to NSF) and want to coordinate your data with iDigBio: https://www.idigbio.org/content/collaborating-idigbio-grant-proposals
    2. If you are brand new to iDigBio and looking for some entry-level info about the project, try here:

At some future date, I hope we have enough staff to revisit and rewrite our ingestion documentation.--Dpaul (talk) 18:33, 9 January 2014 (EST) I would like us to avoid a page that has a link that leads to a page with largely repetitive content, which leads to yet another page (often a PDF or .doc) that also has links in it. If I had to follow all of this as a novice, I would be drowning. I think we'd need one person to do this (so that it's consistent), and it would need to be at least 50% of their job until done. It's better, but I still think we can streamline it more.--Dpaul (talk) 18:33, 9 January 2014 (EST)