iDigBio API v1 Specification

Revision as of 13:41, 20 May 2014


API Version Information

This is the specification for v1 of the iDigBio API. Previous versions of the API continue to exist but should be considered deprecated. API users should migrate to the current version of the API. This document supersedes the iDigBio API v0 Specification.

iDigBio Data and Schema

Data elements generally conform to the Darwin Core and Audubon Core standards maintained by Biodiversity Information Standards (also known as the Taxonomic Databases Working Group, or TDWG).

The iDigBio Data Ingestion Requirements and Guidelines may be useful for understanding how data becomes available in iDigBio.

Endpoints

Unless otherwise noted, successful responses from the API will return a JSON-formatted document.

Most of the provided examples include a JSON formatter (such as json_pp) to make the output easier for humans to read. Additional usage examples, as well as information on JSON formatting and the "curl" command, are available in iDigBio API Examples.

There are two major types of API endpoints:

  • Collection - A group endpoint that returns a list of multiple records. These URLs are of the form <base url>/<version>/<type>, such as http://api.idigbio.org/v1/mediarecords/ . A collection endpoint also accepts two optional query parameters: ?limit sets the number of records returned in the collection (default 1000), and ?offset sets the number of records to skip before returning a set of records (default 0). If a collection request matches more records than the limit, the response includes a "next page" link to retrieve the next set of records in the collection. See the endpoint properties section for more information on properties returned.
  • Entity - A single-item endpoint that returns all of the data available about an object. These URLs are of the form <base url>/<version>/<type>/<id>, such as http://api.idigbio.org/v1/mediarecords/00000230-01bc-4a4f-8389-204f39da9530 .
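
The two URL forms described above can also be assembled programmatically. The following is a minimal Python sketch and not part of any iDigBio tooling: the helper names collection_url and entity_url are hypothetical, while the base URL, version string, and query parameters are the ones given in this document.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.idigbio.org"  # base URL used throughout this document
API_VERSION = "v1"


def collection_url(record_type, limit=None, offset=None):
    """Build a collection URL of the form <base url>/<version>/<type>."""
    url = f"{BASE_URL}/{API_VERSION}/{record_type}"
    # Only send the optional parameters the caller supplied; the API
    # applies its own defaults (limit 1000, offset 0) for anything omitted.
    params = {}
    if limit is not None:
        params["limit"] = limit
    if offset is not None:
        params["offset"] = offset
    return f"{url}?{urlencode(params)}" if params else url


def entity_url(record_type, entity_id):
    """Build an entity URL of the form <base url>/<version>/<type>/<id>."""
    return f"{BASE_URL}/{API_VERSION}/{record_type}/{entity_id}"


print(collection_url("mediarecords", limit=100, offset=100))
```

Leaving limit and offset unset keeps the URL identical to the plain collection form, so the server defaults apply.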

Examples:

collection:
"http://api.idigbio.org/v1/mediarecords"
collection w/ optional query parameters:
"http://api.idigbio.org/v1/mediarecords?limit=100&offset=100"
entity:
"http://api.idigbio.org/v1/mediarecords/00000230-01bc-4a4f-8389-204f39da9530"



GET /

Description
Returns a list of top-level api_version or service URLs
Resource URL
http://api.idigbio.org/
Optional Parameters
None
Sample Usage
$ curl -s http://api.idigbio.org/ | json_pp
{
   "v1" : "http://api.idigbio.org/v1/",
   "check" : "http://api.idigbio.org/check",
   "v0" : "http://api.idigbio.org/v0/"
}

GET /{api_version}

Description
Returns a list of top-level API feature types for a particular version of the API
Resource URL
http://api.idigbio.org/v1
Optional Parameters
None
Sample Usage
$ curl -s http://api.idigbio.org/v1 | json_pp
{
   "aggregates" : "http://api.idigbio.org/v1/aggregates",
   "records" : "http://api.idigbio.org/v1/records",
   "mediaaps" : "http://api.idigbio.org/v1/mediaaps",
   "taxa" : "http://api.idigbio.org/v1/taxa",
   "people" : "http://api.idigbio.org/v1/people",
   "organizations" : "http://api.idigbio.org/v1/organizations",
   "recordsets" : "http://api.idigbio.org/v1/recordsets",
   "mediarecords" : "http://api.idigbio.org/v1/mediarecords"
}

Notes
Some of the listed feature types may be deprecated. Deprecations are noted elsewhere in this specification.
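
Because the version listing still includes deprecated feature types, a client may want to filter them out before use. The following Python sketch assumes the response shown above; the set of deprecated names is copied from the endpoint descriptions in this document ("Deprecated, do not use."), and active_feature_types is a hypothetical helper name.

```python
import json

# Feature types that this specification marks "Deprecated, do not use."
DEPRECATED = {"aggregates", "mediaaps", "organizations", "people", "taxa"}


def active_feature_types(listing_json):
    """Return the non-deprecated entries of a GET /v1 response as a dict."""
    listing = json.loads(listing_json)
    return {name: url for name, url in listing.items() if name not in DEPRECATED}


# Sample copied from the GET /v1 output above.
sample = """{
   "aggregates" : "http://api.idigbio.org/v1/aggregates",
   "records" : "http://api.idigbio.org/v1/records",
   "mediaaps" : "http://api.idigbio.org/v1/mediaaps",
   "taxa" : "http://api.idigbio.org/v1/taxa",
   "people" : "http://api.idigbio.org/v1/people",
   "organizations" : "http://api.idigbio.org/v1/organizations",
   "recordsets" : "http://api.idigbio.org/v1/recordsets",
   "mediarecords" : "http://api.idigbio.org/v1/mediarecords"
}"""

print(sorted(active_feature_types(sample)))
# prints ['mediarecords', 'records', 'recordsets']
```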

GET /v1/aggregates

Description
Deprecated, do not use.

GET /v1/mediaaps

Description
Deprecated, do not use.

GET /v1/mediarecords

Description
Returns a collection (list) of Media Record IDs.
Resource URL
http://api.idigbio.org/v1/mediarecords
Optional Parameters
parameter | valid values | detailed description
limit | | Controls the number of records returned by a collection URL. Large values may cause HTTP requests to time out.
offset | | Controls the starting record offset for paging through the API. Large offsets are extremely inefficient, so combinations of small limits and large offsets may cause requests to fail.
Sample Usage

Request the first 5 media record entity ids:

$ curl -s "http://api.idigbio.org/v1/mediarecords?limit=5" | json_pp
{
   "idigbio:errors" : [],
   "idigbio:links" : {
      "idigbio:nextPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=5"
   },
   "idigbio:items" : [
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/000003cd-0cca-421b-8f26-f557a26b0393"
         },
         "idigbio:uuid" : "000003cd-0cca-421b-8f26-f557a26b0393",
         "idigbio:version" : 1,
         "idigbio:etag" : "ce3e2f7272ec996bb479c87549ba90c15ba96426",
         "idigbio:dateModified" : "2014-04-21T22:19:27.436Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00000728-ffb3-4a68-9f93-137f19961121"
         },
         "idigbio:uuid" : "00000728-ffb3-4a68-9f93-137f19961121",
         "idigbio:version" : 3,
         "idigbio:etag" : "ef2cac326a60d89d8cb9005abaa82068bfa83565",
         "idigbio:dateModified" : "2014-04-24T05:03:56.782Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00000b03-e208-4d22-983b-506ad2842f7c"
         },
         "idigbio:uuid" : "00000b03-e208-4d22-983b-506ad2842f7c",
         "idigbio:version" : 2,
         "idigbio:etag" : "bc118a7ea53e004c82ab9b7e813e1010ae5f8e17",
         "idigbio:dateModified" : "2014-04-20T05:16:20.389Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/000010bc-a4d4-483d-b71d-0dbdd4fd2d5a"
         },
         "idigbio:uuid" : "000010bc-a4d4-483d-b71d-0dbdd4fd2d5a",
         "idigbio:version" : 0,
         "idigbio:etag" : "68c441bd3c49507bf930f3b278f2c58f9cb792ec",
         "idigbio:dateModified" : "2014-04-20T21:38:46.679Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/000012f9-d288-4a14-b898-77430e0a137a"
         },
         "idigbio:uuid" : "000012f9-d288-4a14-b898-77430e0a137a",
         "idigbio:version" : 1,
         "idigbio:etag" : "cf49416750fdb9bdb808c334a74b84f27bb8160b",
         "idigbio:dateModified" : "2014-04-23T02:43:08.344Z"
      }
   ],
   "idigbio:itemCount" : "2342880"
}

Of interest here is that "idigbio:itemCount" contains the number of items of this type in the API. In this case, we have 2,342,880 mediarecords total.
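
A collection response like the one above can be unpacked with any JSON library. The Python sketch below is illustrative only: summarize_collection is a hypothetical helper, and the sample is a trimmed copy of the response above with one item instead of five. Note that "idigbio:itemCount" appears as a quoted string in the sample output, so the sketch converts it to an integer.

```python
import json


def summarize_collection(response_text):
    """Extract the UUIDs, total item count, and next-page link from a collection response."""
    doc = json.loads(response_text)
    uuids = [item["idigbio:uuid"] for item in doc["idigbio:items"]]
    # The sample output shows itemCount as a quoted string, so convert it.
    total = int(doc["idigbio:itemCount"])
    # Assumption: the nextPage link may be absent (for example on a last page),
    # so look it up defensively rather than indexing directly.
    next_page = doc.get("idigbio:links", {}).get("idigbio:nextPage")
    return uuids, total, next_page


# Trimmed copy of the sample response above.
sample = """
{
   "idigbio:errors" : [],
   "idigbio:links" : {
      "idigbio:nextPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=5"
   },
   "idigbio:items" : [
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/000003cd-0cca-421b-8f26-f557a26b0393"
         },
         "idigbio:uuid" : "000003cd-0cca-421b-8f26-f557a26b0393",
         "idigbio:version" : 1,
         "idigbio:etag" : "ce3e2f7272ec996bb479c87549ba90c15ba96426",
         "idigbio:dateModified" : "2014-04-21T22:19:27.436Z"
      }
   ],
   "idigbio:itemCount" : "2342880"
}
"""

uuids, total, next_page = summarize_collection(sample)
print(total)  # prints 2342880
```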

A link to the next "page" of records is also provided:

   "idigbio:links" : {
      "idigbio:nextPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=5"
   }

The next page of records can be requested by adding the "offset" parameter:

$ curl -s "http://api.idigbio.org/v1/mediarecords?limit=5&offset=5" | json_pp
{
   "idigbio:errors" : [],
   "idigbio:links" : {
      "idigbio:nextPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=10",
      "idigbio:prevPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=0"
   },
   "idigbio:items" : [
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00001478-c150-4faf-a617-439a838d4377"
         },
         "idigbio:uuid" : "00001478-c150-4faf-a617-439a838d4377",
         "idigbio:version" : 1,
         "idigbio:etag" : "30f602e4eb47ebb2ceb265f64217e3cf5664f517",
         "idigbio:dateModified" : "2014-03-21T23:09:39.752Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00001a91-189b-4002-b56e-a770a55951a0"
         },
         "idigbio:uuid" : "00001a91-189b-4002-b56e-a770a55951a0",
         "idigbio:version" : 0,
         "idigbio:etag" : "647e82d17ee435fb14f0f8607dabe88dfc3a1944",
         "idigbio:dateModified" : "2014-04-25T04:49:32.359Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00002091-4fb3-410a-9307-bd3e917dfcca"
         },
         "idigbio:uuid" : "00002091-4fb3-410a-9307-bd3e917dfcca",
         "idigbio:version" : 0,
         "idigbio:etag" : "90d98d48d9e7e07eab9064bd9b6e22ce6502c07f",
         "idigbio:dateModified" : "2014-05-03T18:45:47.112Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00002c32-ae3a-41ed-9bd9-f6c50d3e35fb"
         },
         "idigbio:uuid" : "00002c32-ae3a-41ed-9bd9-f6c50d3e35fb",
         "idigbio:version" : 3,
         "idigbio:etag" : "d1ded90d06e93876b1badd01222905add93e8806",
         "idigbio:dateModified" : "2014-04-19T00:25:59.471Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00002dbd-6415-463b-8cae-38f548415ffa"
         },
         "idigbio:uuid" : "00002dbd-6415-463b-8cae-38f548415ffa",
         "idigbio:version" : 2,
         "idigbio:etag" : "4e298045b496146f5c51e331c9887fd7afde4deb",
         "idigbio:dateModified" : "2014-04-21T20:29:39.531Z"
      }
   ],
   "idigbio:itemCount" : "2342880"
}

This response includes links to both the previous page and the next page:

      "idigbio:nextPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=10",
      "idigbio:prevPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=0"

These links use offsets of 0 (previous page) and 10 (next page).

DO NOT expect to be able to page through the entire iDigBio dataset this way. See iDigBio API Performance if you find yourself trying to page through large amounts of data.
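
With that caveat in mind, a small number of pages can be walked by following the "idigbio:nextPage" link in each response. The Python sketch below is one possible approach, not an official client: iter_pages and fetch_json are hypothetical names, and the max_pages cap is an assumption added here to keep the loop from drifting into the large offsets warned about above.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode the JSON body (performs a real HTTP request)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def iter_pages(first_url, max_pages, fetch=fetch_json):
    """Yield up to max_pages collection responses, following idigbio:nextPage links."""
    url = first_url
    for _ in range(max_pages):
        page = fetch(url)
        yield page
        # Stop early when the response carries no link to a further page.
        url = page.get("idigbio:links", {}).get("idigbio:nextPage")
        if url is None:
            break
```

For example, iter_pages("http://api.idigbio.org/v1/mediarecords?limit=5", max_pages=3) would issue at most three requests. The fetch argument exists so the loop can be exercised against canned responses without network access.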


GET /v1/mediarecords/{ID}

Description
Returns a Media Record with the specific entity ID
Resource URL
http://api.idigbio.org/v1/mediarecords/{ID}
Optional Parameters
parameter | valid values | detailed description
version | Integer values from 0 to the maximum version of a particular record | Media records may be updated over time (changes submitted by data publishers). The "version" parameter retrieves a previous version of a record. A value of "-1" returns the most recent version of a record, as does omitting the "version" parameter.


Sample Usage
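
A request URL for a specific record version could be constructed as follows. This is a Python sketch rather than an official sample: mediarecord_url is a hypothetical helper, and the validation simply mirrors the valid values described above (0 and up, plus -1 for the most recent version).

```python
from urllib.parse import urlencode


def mediarecord_url(entity_id, version=None):
    """Build a /v1/mediarecords/{ID} URL, optionally pinned to a record version."""
    url = f"http://api.idigbio.org/v1/mediarecords/{entity_id}"
    if version is None:
        # Omitting "version" returns the most recent version of the record.
        return url
    if not isinstance(version, int) or version < -1:
        # Documented values are 0..maximum version, plus -1 for "most recent".
        raise ValueError("version must be an integer >= -1")
    return f"{url}?{urlencode({'version': version})}"


# UUID taken from the collection sample earlier in this document.
print(mediarecord_url("000003cd-0cca-421b-8f26-f557a26b0393", version=0))
```

The upper bound cannot be checked client-side, because the maximum version is only known from the record itself ("idigbio:version" in the collection output).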



GET /v1/mediarecords/{ID}/media

Description
Returns an image file (JPEG) associated with the specific entity ID. Omitting the "quality" parameter returns the full-size image specified in the accessURI field of the source data. For many use cases, including the "quality" parameter is recommended.
Resource URL
http://api.idigbio.org/v1/mediarecords/{ID}/media
Optional Parameters
parameter | valid values | detailed description
quality | "thumbnail", "webview" | Specify the quality of the image returned from the API. Omitting quality returns the full-size, high-quality original image from the source provider. The values "thumbnail" and "webview" return images of width 260 and 600 pixels respectively.
Sample Usage
# CURL SOMETHING with -L to follow redirects
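
The "quality" parameter can be validated before a request is made. The following Python sketch is an assumption-laden illustration, not an official sample: media_url and download_media are hypothetical helpers, and the redirect behaviour is inferred only from the "-L" note above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Documented values for "quality" and the image widths they return, in pixels.
QUALITY_WIDTHS = {"thumbnail": 260, "webview": 600}


def media_url(entity_id, quality=None):
    """Build a /v1/mediarecords/{ID}/media URL with an optional quality setting."""
    url = f"http://api.idigbio.org/v1/mediarecords/{entity_id}/media"
    if quality is None:
        return url  # no quality given: full-size original image
    if quality not in QUALITY_WIDTHS:
        raise ValueError(f"quality must be one of {sorted(QUALITY_WIDTHS)}")
    return f"{url}?{urlencode({'quality': quality})}"


def download_media(entity_id, path, quality="thumbnail"):
    """Save the image to path (performs a real HTTP request).

    urlopen follows HTTP redirects by default, which plays the role of curl's -L.
    """
    with urlopen(media_url(entity_id, quality)) as response, open(path, "wb") as out:
        out.write(response.read())
```

Defaulting download_media to "thumbnail" follows the recommendation above to include the quality parameter for many use cases.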

GET /v1/records

Description
Returns a collection of Specimen Record IDs
Resource URL
http://api.idigbio.org/v1/records
Optional Parameters
parameter | valid values | detailed description
limit | | Controls the number of entities returned by a collection URL. Large values may cause HTTP requests to time out.
offset | | Controls the starting entity offset for paging through the API. Large offsets are extremely inefficient, so combinations of small limits and large offsets may cause requests to fail.
Sample Usage
# CURL SOMETHING

GET /v1/records/{ID}

Description
Returns a record with the specific entity ID
Resource URL
http://api.idigbio.org/v1/records/{ID}
Optional Parameters
parameter | valid values | detailed description
version | Numeric values from 0 to the maximum version of a particular record | The API normally returns the "latest" or most recent version of a particular record. Records may be updated over time. The version parameter can be used to retrieve previous versions of a record.
Sample Usage
# CURL SOMETHING

GET /v1/records/{ID}/media

Description
Returns an image (JPEG) associated with the specific entity ID (via the relationship to a mediarecord). If multiple mediarecords are associated with a specimen record, the particular image returned is non-deterministic.
Resource URL
http://api.idigbio.org/v1/records/{ID}/media
Optional Parameters
parameter | valid values | detailed description
quality | "thumbnail", "webview" | Specify the quality of the image returned from the API. Omitting quality returns the full-size, high-quality original image from the source provider. The values "thumbnail" and "webview" return images of width 260 and 600 pixels respectively.
Sample Usage
# CURL SOMETHING with -L to watch redirects

GET /v1/publishers

Description
Returns a collection of publisher IDs
Resource URL
http://api.idigbio.org/v1/publishers


Optional Parameters
  • something goes here
Sample Usage
# CURL SOMETHING

GET /v1/organizations

Description
Deprecated, do not use.

GET /v1/people

Description
Deprecated, do not use.

GET /v1/publishers/{ID}

Description
Returns a publisher with specific entity ID
Resource URL
http://api.idigbio.org/v1/publishers/{ID}
Optional Parameters
  • something goes here
Sample Usage
# CURL SOMETHING

GET /v1/recordsets

Description
Returns a collection of recordset IDs
Resource URL
http://api.idigbio.org/v1/recordsets
Optional Parameters
  • something goes here
Sample Usage
# CURL SOMETHING

GET /v1/recordsets/{ID}

Description
Returns information about a recordset with specific entity ID
Resource URL
http://api.idigbio.org/v1/recordsets/{ID}
Parameters
  • something goes here
Sample Usage
# CURL SOMETHING

GET /v1/recordsets/{ID}/mediarecords

Description
Returns a collection of mediarecord IDs that belong to the recordset of the specified entity ID
Resource URL
http://api.idigbio.org/v1/recordsets/{ID}/mediarecords
Optional Parameters
  • something goes here
Sample Usage
# CURL SOMETHING

GET /v1/recordsets/{ID}/records

Description
Returns a collection of record IDs that belong to the recordset of the specified entity ID
Resource URL
http://api.idigbio.org/v1/recordsets/{ID}/records
Optional Parameters
  • something goes here
Sample Usage
# CURL SOMETHING

GET /v1/taxa

Description
Deprecated, do not use.

Search

Elasticsearch is an open source distributed document-oriented NoSQL search system. Although not technically part of the API, iDigBio exposes a public Elasticsearch interface for programmers to access advanced search functionality of iDigBio data.

The following are external links to Elasticsearch reference documentation and should be considered prerequisite reading before attempting to use the iDigBio Elasticsearch interface.

There is also an Elasticsearch Google Group available.

The iDigBio search index provides two document types to query on: Records (specimen records) and Media Records (media metadata). Search results are returned as JSON-formatted documents.

Each type can be queried through the following respective URLs:

Query Type | Description | Search URL
Records | specimen records | https://search.idigbio.org/idigbio/records/_search
Media Records | media metadata records | https://search.idigbio.org/idigbio/mediarecords/_search

Examples specific to iDigBio are available in iDigBio API Examples.

Elasticsearch - Records

Specimen Records Query URL:

https://search.idigbio.org/idigbio/records/_search

The following terms are currently available in the index for Records queries to Elasticsearch:

"barcodevalue"
"catalognumber"
"class"
"collectioncode"
"collectionid"
"collectionname"
"collector"
"commonname"
"continent"
"country"
"county"
"datecollected"
"datemodified"
"etag"
"family"
"fieldnumber"
"genus"
"geopoint"
"hasImage"
"highertaxon"
"infraspecificepithet"
"institutioncode"
"institutionid"
"institutionname"
"kingdom"
"locality"
"maxdepth"
"maxelevation"
"mediarecords"
"mindepth"
"minelevation"
"municipality"
"occurenceid"
"order"
"phylum"
"recordset"
"scientificname"
"specificepithet"
"stateprovince"
"typestatus"
"uuid"
"verbatimlocality"
"version"
"waterbody"

The values stored in these terms are converted to lowercase, so searches based on terms should use the all-lowercase version of the string.

For example, searching for "Arkansas" in stateprovince will return no records.

$ curl -s "http://search.idigbio.org/idigbio/records/_search?q=stateprovince:Arkansas" | json_pp | grep scientificname | wc -l
0

Searching for "arkansas" will return multiple records.

$ curl -s "http://search.idigbio.org/idigbio/records/_search?q=stateprovince:arkansas" | json_pp | grep scientificname | wc -l
10
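
Since indexed values are lowercase, a client can normalize the search value before building the query URL. The following Python sketch is a hypothetical helper (term_query_url is not an iDigBio function); it lowercases only the value and leaves the term name untouched, because term names such as "hasImage" are not all lowercase.

```python
from urllib.parse import urlencode

SEARCH_BASE = "https://search.idigbio.org/idigbio"


def term_query_url(doc_type, term, value):
    """Build a q=term:value search URL, lowercasing the value to match the index."""
    query = f"{term}:{value.lower()}"
    # Keep ":" unescaped so the URL has the same shape as the curl examples above.
    return f"{SEARCH_BASE}/{doc_type}/_search?{urlencode({'q': query}, safe=':')}"


print(term_query_url("records", "stateprovince", "Arkansas"))
# prints https://search.idigbio.org/idigbio/records/_search?q=stateprovince:arkansas
```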

See the iDigBio API Examples page for more Elasticsearch examples that are specific to iDigBio.

Elasticsearch - Media Records

Media Records Query URL:

https://search.idigbio.org/idigbio/mediarecords/_search

There are no useful search terms for Media Records queries using Elasticsearch at this time.