Talk:Data Ingestion Guidance: Difference between revisions

From iDigBio
Line 1: Line 1:
--[[User:Dpaul|Dpaul]] ([[User talk:Dpaul|talk]]) 16:49, 9 January 2014 (EST): Regarding data@idigbio.org
--[[User:Dpaul|Dpaul]] 16:49, 9 January 2014 (EST)
now that DP has a gator link, does this mean I can be added to the mailing list that "sees" data@idigbio.org
:Regarding data@idigbio.org now that DP has a gator link, does this mean I can be added to the mailing list that "sees" data@idigbio.org--[[User:Dpaul|Dpaul]] ([[User talk:Dpaul|talk]]) 17:48, 9 January 2014 (EST)
:Where do users send email if they have a question
::data@idigbio.org
::the feedback (or both)?--[[User:Dpaul|Dpaul]] ([[User talk:Dpaul|talk]]) 17:48, 9 January 2014 (EST)
:(At Morphbank, to keep this straightforward, all email requests for help go to morphbank@scs.fsu.edu We do not have 2 separate paths for help / issues requests. I am assuming (I think) that clicking to send "feedback" generates a Redmine ticket (efficient and transparent). But, data@idigbio.org is not transparent.--[[User:Dpaul|Dpaul]] ([[User talk:Dpaul|talk]]) 17:48, 9 January 2014 (EST)


Where do users send email if they have a question
:About this Section
data@idigbio.org
the feedback (or both)?
 
(At Morphbank, to keep this straightforward, all email requests for help go to morphbank@scs.fsu.edu We do not have 2 separate paths for help / issues requests. I am assuming (I think) that clicking to send "feedback" generates a Redmine ticket (efficient and transparent). But, data@idigbio.org is not transparent.
 
About this Section


When your data are ready for ingestion, please register your collection here. You will need to have an established login already, and be logged in. Here are the steps to take:
 
*http://portal.idigbio.org/register
    http://portal.idigbio.org/register
 
Log into iDigBio if you have not already done so, your regular iDigBio.org login credentials are sufficient, there is no special login for the portal:
 
*https://www.idigbio.org/auth/login.php
    https://www.idigbio.org/auth/login.php
 
Once you are logged in, go to this link:
 
*http://portal.idigbio.org/register
    http://portal.idigbio.org/register


If you are already on the portal page, the 'Register A Collection' is in the menu under your login name in the upper right of the page.  
Line 54: Line 47:
     just like the ownership of catalog records, the media records need to be provided freely and with permission, and each record needs to have at least Creative Commons permission = "CC BY"


:I would avoid the word ownership, if at all possible to help the community get around this issue. This reinforces ideas / misconceptions about data (and copyright, and intellectual property, etc). Something like this for number 2 below.
:I would avoid the word ownership, if at all possible to help the community get around this issue (eventually). This reinforces ideas / misconceptions about data (and copyright, and intellectual property, etc). Something like this for number 2 below.
::You have permission to contribute this dataset to iDigBio.


Line 67: Line 60:
for number 4, please add UTF-8 reference. something like:
:UTF-8 encoding preferred (should be required).
::validate that "special characters" (diacritics like umlauts, tilde, cedilla) are correct in your dataset.
::validate (or verify) that "special characters" (diacritics like umlauts, tilde, cedilla) are correct in your dataset.


----

Revision as of 18:48, 9 January 2014

--Dpaul 16:49, 9 January 2014 (EST)

Regarding data@idigbio.org: now that DP has a gator link, does this mean I can be added to the mailing list that "sees" data@idigbio.org? --Dpaul (talk) 17:48, 9 January 2014 (EST)
Where do users send email if they have a question:
data@idigbio.org,
the "feedback" link, or both? --Dpaul (talk) 17:48, 9 January 2014 (EST)
(At Morphbank, to keep this straightforward, all email requests for help go to morphbank@scs.fsu.edu. We do not have two separate paths for help and issue requests. I am assuming that clicking to send "feedback" generates a Redmine ticket, which is efficient and transparent. But data@idigbio.org is not transparent.) --Dpaul (talk) 17:48, 9 January 2014 (EST)
About this Section

When your data are ready for ingestion, please register your collection here. You will need to have an established login already, and be logged in. Here are the steps to take:

    http://portal.idigbio.org/register

Log into iDigBio if you have not already done so. Your regular iDigBio.org login credentials are sufficient; there is no special login for the portal:

    https://www.idigbio.org/auth/login.php

Once you are logged in, go to this link:

    http://portal.idigbio.org/register

If you are already on the portal page, the 'Register A Collection' item is in the menu under your login name in the upper right of the page.


--Dpaul (talk) 17:22, 9 January 2014 (EST) When your data are ready for ingestion, please see the next steps.

  1. Get an iDigBio account for yourself (if you don't have one yet). https://www.idigbio.org/auth/login.php
    1. These are the only login credentials you will need.
  2. Log in with your iDigBio account username and password. https://www.idigbio.org/auth/login.php
  3. Register your collection. http://portal.idigbio.org/register OR
  4. Register your collection at GRBIO
    1. Repository: http://grbio.org/find-biorepositories OR
    2. Institutional Collections: http://grbio.org/find-institutional-collections
  5. If you are already on the portal page, the 'Register A Collection' is in the menu under your login name in the upper right of the page.

--Dpaul (talk) 17:22, 9 January 2014 (EST) About this next section Data Requirements

   For all data records
   all specimen records need to have a GUID in each digital record: a persistent globally unique identifier
   you need to have ownership of the data if you are its source; if you are an aggregator, you need to have the owner's permission to send it to us.
   we would like the data to be available to our harvester via IPT and RSS if possible; otherwise, a CSV file in DarwinCore format would work too.
   dates in ISO 8601 format, i.e., YYYY-MM-DD
   take care to preserve diacritics in people and place names.
   For all images/media objects
   each media record needs to have a GUID: a persistent globally unique identifier
   we need an Audubon Core metadata file with one record for each media record, and we can provide coaching to help you create that file. The more fully you describe each image, the more retrievable it will be.
   just like the ownership of catalog records, the media records need to be provided freely and with permission, and each record needs to have at least Creative Commons permission = "CC BY"
I would avoid the word "ownership" if at all possible, to help the community get around this issue (eventually). The word reinforces misconceptions about data (and copyright, intellectual property, etc.). Something like this for number 2 below:
You have permission to contribute this dataset to iDigBio.

For number 3, do we need to explain or justify? How about:

Data Format choices
  DarwinCore archive format OR
  CSV files mapped to Darwin Core (and other relevant standards, for example Audubon Core)
Data Transfer
  Darwin Core Archive files harvest via IPT and RSS
  CSV files via (...)

For number 4, please add a UTF-8 reference. Something like:

UTF-8 encoding preferred (should be required).
validate (or verify) that "special characters" (diacritics like umlauts, tilde, cedilla) are correct in your dataset.
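
A rough idea of what "validate (or verify)" could look like for a data provider, as a minimal Python sketch. It is not an iDigBio tool; the function name check_utf8_lines and the sample column names are made up for illustration. It only reports lines that do not decode as UTF-8 or that already contain the U+FFFD replacement character, so it cannot confirm that a diacritic is the correct one, only that the bytes are well formed.

```python
def check_utf8_lines(raw_bytes):
    """Return (line_number, problem) tuples for lines that are not clean UTF-8.

    Hypothetical helper for illustration; line numbers start at 1.
    """
    problems = []
    for line_number, raw_line in enumerate(raw_bytes.splitlines(), start=1):
        try:
            text = raw_line.decode("utf-8")
        except UnicodeDecodeError:
            problems.append((line_number, "not valid UTF-8"))
            continue
        # U+FFFD usually means an earlier conversion already lost a character.
        if "\ufffd" in text:
            problems.append((line_number, "contains U+FFFD replacement character"))
    return problems


if __name__ == "__main__":
    sample = "recordedBy,locality\nJ. Müller,São Paulo\n"
    # The same text saved as UTF-8 passes; saved as Latin-1 it is flagged.
    print(check_utf8_lines(sample.encode("utf-8")))
    print(check_utf8_lines(sample.encode("latin-1")))
```

For a real file, the bytes would come from something like open(path, "rb").read(); a visual spot check of names with umlauts, tildes, and cedillas is still worthwhile after the file passes.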

Data Requirements

Data Records

  1. all specimen records need to have a GUID in each digital record: a persistent globally unique identifier
  2. you need to have ownership of the data if you are its source; if you are an aggregator, you need to have the owner's permission to send it to us.
  3. we would like the data to be available to our harvester via IPT and RSS if possible; otherwise, a CSV file in DarwinCore format would work too.
  4. dates in ISO 8601 format, i.e., YYYY-MM-DD
  5. take care to preserve diacritics in people and place names.
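
Items 1 and 4 in the list above lend themselves to a quick pre-submission check on a CSV file. The following is a minimal Python sketch under stated assumptions: it assumes the identifier is in a column named occurrenceID and the date in a column named eventDate (both are Darwin Core term names, but this page does not say which columns iDigBio expects), and check_records and is_iso_date are hypothetical helper names. Checking for duplicates inside one file cannot prove that an identifier is globally unique or persistent; it only catches missing and repeated values.

```python
import csv
import io
import re
from collections import Counter
from datetime import datetime

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_iso_date(value):
    """True only for a zero-padded YYYY-MM-DD string that is a real calendar date."""
    if not ISO_DATE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def check_records(csv_text, id_column="occurrenceID", date_column="eventDate"):
    """Return (row_number, problem) tuples; the header counts as row 1.

    Column names are assumptions made for this sketch.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    ids = [(row.get(id_column) or "").strip() for row in rows]
    counts = Counter(ids)
    problems = []
    for row_number, (row, identifier) in enumerate(zip(rows, ids), start=2):
        if not identifier:
            problems.append((row_number, "missing identifier"))
        elif counts[identifier] > 1:
            problems.append((row_number, "duplicate identifier"))
        date_value = (row.get(date_column) or "").strip()
        # Empty dates are skipped; only non-empty values are checked.
        if date_value and not is_iso_date(date_value):
            problems.append((row_number, "date not YYYY-MM-DD"))
    return problems


if __name__ == "__main__":
    sample = (
        "occurrenceID,eventDate\n"
        "id-1,2014-01-09\n"
        "id-1,01/09/2014\n"
        ",2014-02-30\n"
    )
    for row_number, problem in check_records(sample):
        print(row_number, problem)
```

The regular expression is combined with strptime on purpose: strptime alone would accept unpadded values such as 2014-1-9, and the pattern alone would accept impossible dates such as 2014-02-30.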