IDigBio API: Difference between revisions

Revision as of 15:03, 23 February 2015


iDigBio API Overview

This document serves as the starting page of official documentation for the iDigBio Application Programming Interface (API).

Integrated Digitized Biocollections (iDigBio) is the National Resource for Advancing Digitization of Biodiversity Collections (ADBC) funded by the National Science Foundation. Through ADBC, data and images for millions of biological specimens are being made available in electronic format for the research community, government agencies, students, educators, and the general public. iDigBio is a data aggregator. This means that data is provided to iDigBio through various publishing mechanisms.

Many consumers of the iDigBio aggregated data will choose to use the iDigBio Portal web site. Additionally, to facilitate integration of iDigBio data with other web sites, services, or research uses, iDigBio provides an API.

The iDigBio API is an abstraction layer for retrieving data from the iDigBio back-end data systems. This abstraction allows reuse and mashup of aggregated data without needing to understand the complex underlying details of the back-end data storage. Currently, the public API supports HTTP GET requests for data read operations only. The iDigBio API is a RESTful web service that delivers data primarily as JSON documents.

Specification

iDigBio is transitioning from version 1 to version 2 (v1 to v2) of the API. We plan to keep the v1 endpoints in production for the foreseeable future. However, most programmers who are interested in accessing data directly in iDigBio via an API are going to be interested in the new v2 features. The v2 API is also known as the iDigBio Search API.

The iDigBio API v2 / iDigBio Search API specification (https://github.com/idigbio/idigbio-search-api/wiki) includes detailed information about the Search API endpoints, parameters, query format, values, and returns. It is available in the GitHub wiki associated with the code repository (https://github.com/iDigBio/idigbio-search-api). The iDigBio Search API is currently in beta with a target release to production by June 2015.

The iDigBio API v1 Specification includes detailed information about the v1 API endpoints, parameters, values, and returns. In most cases, the only reason to use the v1 API is to retrieve raw, unprocessed versions of a record (e.g. no iDigBio data quality enhancements) or to access previous versions of a particular record.

Examples

The iDigBio API Examples page contains many more examples of the iDigBio API in action.

Quick Start

iDigBio API endpoints follow the general form:

http://api.idigbio.org/{api_version}{endpoint}{optional_parameters}

In nearly all cases, a successful API request returns data as a JSON-formatted document.
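As a minimal sketch of the general form above, the following Python helper assembles a request URL from its parts. The helper name build_url is illustrative and not part of any iDigBio library, and treating the optional parameters as a URL query string is an assumption; the specification pages define the parameters each endpoint actually accepts.

```python
from urllib.parse import urlencode

API_ROOT = "http://api.idigbio.org"


def build_url(api_version, endpoint, params=None):
    """Join the pieces of the general endpoint form into one URL.

    `endpoint` is expected to start with "/", as in "/records/<uuid>".
    `params` is an optional dict, encoded here as a query string
    (an assumption about how optional parameters are passed).
    """
    url = f"{API_ROOT}/{api_version}{endpoint}"
    if params:
        url += "?" + urlencode(params)
    return url


# Reproduces the v1 record URL used in the Quick Start example.
print(build_url("v1", "/records/354210ae-4aa3-49d2-8a66-78a86b019c7b"))
```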

Quick Start Example - Family Curculionidae

Let us say that we have already located the specimen record for a particular Curculionidae specimen (a family of weevils). The specimen record for our particular example is identified by the following iDigBio UUID:

"idigbio:uuid" : "354210ae-4aa3-49d2-8a66-78a86b019c7b"

To retrieve a specimen record from v1 of the API with the above iDigBio UUID, we issue an HTTP "GET" request to the following endpoint:

http://api.idigbio.org/v1/records/354210ae-4aa3-49d2-8a66-78a86b019c7b

and receive the following JSON document from the API (in this case, formatted for readability):

{
   "idigbio:uuid" : "354210ae-4aa3-49d2-8a66-78a86b019c7b",
   "idigbio:etag" : "02736fd7318eafed62a4a5ff35175a27fa63983e",
   "idigbio:links" : {
      "mediarecord" : [
         "http://api.idigbio.org/v1/mediarecords/59141135-813a-4db1-a527-009ae6d17101"
      ],
      "owner" : [
         "872733a2-67a3-4c54-aa76-862735a5f334"
      ],
      "recordset" : [
         "http://api.idigbio.org/v1/recordsets/69037495-438d-4dba-bf0f-4878073766f1"
      ]
   },
   "idigbio:version" : 2,
   "idigbio:createdBy" : "872733a2-67a3-4c54-aa76-862735a5f334",
   "idigbio:recordIds" : [
      "urn:uuid:b036a012-ba1e-41e0-a39a-76fc253640cf"
   ],
   "idigbio:dateModified" : "2014-04-22T07:33:16.129Z",
   "idigbio:data" : {
      "dwc:day" : "16",
      "dwc:identifiedBy" : "CPMAB",
      "idigbio:recordId" : "urn:uuid:b036a012-ba1e-41e0-a39a-76fc253640cf",
      "dwc:catalogNumber" : "NAUF4A0013309",
      "dwc:locality" : "Box Cyn. Santa Rita Mts.",
      "dwc:occurrenceID" : "1063507",
      "dwc:year" : "1967",
      "dwc:recordedBy" : "C.D. Johnson",
      "dwc:scientificName" : "Curculionidae",
      "dwc:basisOfRecord" : "PreservedSpecimen",
      "dwc:family" : "Curculionidae",
      "symbiotaverbatimScientificName" : "Curculionidae",
      "dwc:collectionCode" : "NAUF",
      "dcterms:modified" : "2013-12-20 13:00:36",
      "dwc:country" : "USA",
      "dcterms:references" : "http://symbiota4.acis.ufl.edu/scan/portal/collections/individual/index.php?occid=1063507",
      "dwc:eventDate" : "1967-08-16",
      "dwc:scientificNameAuthorship" : "Latreille, 1802",
      "dwc:collectionID" : "urn:uuid:c87a0756-fdd7-4cb6-9921-ca5774f8330e",
      "dwc:minimumElevationInMeters" : "1524",
      "dwc:verbatimElevation" : "5000'",
      "dwc:startDayOfYear" : "228",
      "dwc:month" : "8",
      "dwc:rights" : "http://creativecommons.org/licenses/by-nc-sa/3.0/",
      "dwc:stateProvince" : "Arizona",
      "dwc:genus" : "Curculionidae",
      "dwc:institutionCode" : "NAU",
      "dwc:county" : "Pima"
   }
}
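The request and response above can be handled from a script. The Python sketch below uses only the standard library; the function names fetch_record and summarize are illustrative, and the sketch assumes the v1 endpoint still responds as shown in this example. The usage lines run offline against a trimmed copy of the document above, so nothing here depends on the service being reachable.

```python
import json
from urllib.request import urlopen


def fetch_record(uuid, api_root="http://api.idigbio.org/v1"):
    """Issue an HTTP GET for one specimen record and decode the JSON body."""
    with urlopen(f"{api_root}/records/{uuid}", timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(record):
    """Pull a few fields out of a record shaped like the example document."""
    data = record.get("idigbio:data", {})
    return {
        "uuid": record.get("idigbio:uuid"),
        "version": record.get("idigbio:version"),
        "family": data.get("dwc:family"),
        "locality": data.get("dwc:locality"),
        "media": record.get("idigbio:links", {}).get("mediarecord", []),
    }


# Offline usage: a trimmed copy of the response shown above.
sample = json.loads("""
{
   "idigbio:uuid" : "354210ae-4aa3-49d2-8a66-78a86b019c7b",
   "idigbio:links" : {
      "mediarecord" : [
         "http://api.idigbio.org/v1/mediarecords/59141135-813a-4db1-a527-009ae6d17101"
      ]
   },
   "idigbio:version" : 2,
   "idigbio:data" : {
      "dwc:locality" : "Box Cyn. Santa Rita Mts.",
      "dwc:family" : "Curculionidae"
   }
}
""")
print(summarize(sample))
```

To query the live service instead, call fetch_record("354210ae-4aa3-49d2-8a66-78a86b019c7b") and pass the result to summarize.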

The iDigBio API Examples page shows how we might use Search features to find specimen records that match specific criteria such as Family, Genus, or Scientific Name.

Searching iDigBio

Search Portal and Bulk Record Downloads

The recommended method for searching iDigBio is to use the Portal search, not the API. The Portal also provides bulk download capabilities for acquiring larger sets of data. See: https://www.idigbio.org/portal

Search API

The iDigBio Search API is part of the v2 API and is currently in beta testing. The next release of the iDigBio Portal will make use of the v2 API and the new search features provided.

Elasticsearch Overview

At present iDigBio provides a public-facing interface to the back-end Elasticsearch system. This is the same interface that is used by the iDigBio Portal search but allows experienced programmers to search for data in ways beyond what is provided by the Portal.

Note: Direct queries to the iDigBio Elasticsearch service should be considered an advanced operation.

According to the Elasticsearch project site, Elasticsearch is a "flexible and powerful open source, distributed, real-time search and analytics engine." Elasticsearch provides a RESTful web service interface and returns data in JSON format.

More detailed information on iDigBio Elasticsearch capabilities is available in iDigBio API v1 Specification#Search.

See iDigBio API Examples for Elasticsearch query examples that are specific to iDigBio.
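For orientation, Elasticsearch queries are themselves JSON documents written in the Elasticsearch Query DSL. The Python sketch below only builds such a request body; it does not send it, because this page does not state the address of the iDigBio Elasticsearch interface. The function name family_query and the indexed field name "family" are assumptions made for illustration, so check the specification and examples pages for the real field names.

```python
import json


def family_query(family, size=10):
    """Build an Elasticsearch Query DSL body that matches one field.

    The field name "family" is a hypothetical example; the names
    actually indexed by iDigBio may differ.
    """
    return {
        "size": size,  # maximum number of hits to return
        "query": {"match": {"family": family}},
    }


# Serialize the body as it would be sent to a search endpoint.
body = json.dumps(family_query("Curculionidae"), indent=2)
print(body)
```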

Client Libraries

Client libraries, packages, and modules are pieces of software that make it easier to interface with the iDigBio API. The following list includes the known iDigBio libraries that are currently available.

ridigbio R Package for Search API

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.

An R package is coming soon. A draft is available at https://github.com/fmichonneau/ridigbio