iDigBio Download API

From iDigBio
Revision as of 13:33, 27 April 2015 by Dstoner


Overview

Note: While the download API is currently used by the production portal, it should be considered highly unstable for non-iDigBio consumers. The next revision of the API will most likely be a total rewrite of both the back end and the front end.

The download API may not provide "friendly" error messages at this time.


The Download API works by performing the requested query and building a Darwin Core Archive. Once archive generation has begun, the status endpoint can be polled to determine if the generation has been completed. Once the archive generation is completed, the API provides a link to the file for download. If the optional email parameter is supplied on the query request, an email notification will be sent.
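The submit, poll, download cycle described above can be sketched in Python as follows. This is a minimal sketch, not an official client: the helper names `fetch_json` and `wait_for_download`, the polling interval, and the poll limit are all illustrative assumptions. The field names "complete" and "download_url" are taken from the status examples later on this page.

```python
import json
import time
import urllib.request


def fetch_json(url):
    """Fetch a URL and decode its JSON body (performs a real HTTP request)."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_download(status_url, fetch=fetch_json, interval=60.0,
                      max_polls=120, sleep=time.sleep):
    """Poll status_url until "complete" is true, then return "download_url".

    interval and max_polls are illustrative defaults, not documented limits.
    """
    for attempt in range(max_polls):
        status = fetch(status_url)
        if status.get("complete"):
            return status["download_url"]
        # Pause between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"download not complete after {max_polls} polls")
```

The `fetch` and `sleep` arguments are injectable so the loop can be exercised without touching the network. Since large archives can take hours, a long interval is probably the polite choice.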

Large queries, and the large archive files they produce, can take several hours to complete.


GET requests

A query submitted as a GET request must be URL-encoded.

POST requests

A query submitted as a POST request must be supplied as JSON in the request body, and the request must set the "Content-Type: application/json" header.
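As a rough illustration, a POST request of this kind could be prepared with Python's standard library as shown below. The layout of the JSON body is an assumption: this page does not say whether the query is sent bare or wrapped, so the sketch mirrors the GET parameter names ("query", "email"). The helper name `build_post_request` is hypothetical.

```python
import json
import urllib.request

DOWNLOAD_URL = "https://csv.idigbio.org/"


def build_post_request(query, email=None):
    """Prepare (but do not send) a JSON POST request for the download service."""
    # Assumed body layout: keys mirror the GET parameters "query" and "email".
    payload = {"query": query}
    if email is not None:
        payload["email"] = email
    return urllib.request.Request(
        DOWNLOAD_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_post_request({"genus": "acer"})
# urllib.request.urlopen(req) would actually send the request.
```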

Query Endpoint

The download service URL:

https://csv.idigbio.org/?query={Query in iDigBio query format}[&email={valid email address}]

See iDigBio query format for more information on writing queries.

Query Example - genus acer

Consider the following query:

{ "genus" : "acer"}

Without specifying an email address, we could request a download with the following URL:

https://csv.idigbio.org/?query=%7B%22genus%22%3A%22acer%22%7D
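The encoded URL above can be reproduced with Python's standard library. This is a small sketch; the helper name `build_download_url` and the address user@example.org are placeholders, not part of the API.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://csv.idigbio.org/"


def build_download_url(query, email=None):
    """Return a GET URL with the query serialized as JSON and URL-encoded."""
    # Compact separators avoid encoding unnecessary spaces into the URL.
    params = {"query": json.dumps(query, separators=(",", ":"))}
    if email is not None:
        params["email"] = email
    return BASE_URL + "?" + urlencode(params)


print(build_download_url({"genus": "acer"}))
# https://csv.idigbio.org/?query=%7B%22genus%22%3A%22acer%22%7D
```

Passing `email="user@example.org"` appends `&email=user%40example.org` to the URL.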

Requesting this URL, for example with curl, returns a JSON response.

The general form of the request is:

https://csv.idigbio.org/?query=<query>&email=<email, optional>

where <query> is written in the iDigBio query format (https://github.com/iDigBio/idigbio-search-api/wiki/Query-Format). Not all query types are available yet.

Example:

https://csv.idigbio.org/?query={%22genus%22:%22acer%22}

The resulting JSON includes a "status_url" field, which links to the status page for this download query.

When the download file is generated, the "complete" field will be set to "true" and the "download_url" field will include a link to the available file.

Query Example - genus acer with email

Status Endpoint

A successful request to the query endpoint returns a JSON document with a number of fields, including "complete", which is a status flag, and "download_url", which contains a link to the generated download file once generation has completed.
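A status document of this shape could be interpreted along the following lines. This is only a sketch: the helper name `download_link` is hypothetical, and it assumes that "download_url" is meaningful only when "complete" is true, as the examples on this page suggest.

```python
import json


def download_link(status_body):
    """Return the download URL from a status document, or None if not ready."""
    status = json.loads(status_body)
    # Only trust "download_url" once the service reports completion.
    if status.get("complete") is True:
        return status.get("download_url")
    return None
```

Returning None for an unfinished archive lets the caller decide whether to poll again or give up.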

Status Endpoint Example

Given the following query JSON:

{ "scientificname" : "puma concolor" }

submitted to the query endpoint as:

https://csv.idigbio.org/?query=%7B%22scientificname%22%3A%22puma+concolor%22%7D

the response contains a "status_url" of the form:

http://csv.idigbio.org/status/<download id>

For example: http://csv.idigbio.org/status/995ed58b-01fd-4c98-893e-e0cbdfadc8fe

Requesting the status URL for this query initially returns:


{
  "complete": false,
  "status_url": "http://csv.idigbio.org/status/cba4ae0f-da2b-42ec-b763-132a209c3251",
  "expires": "2015-04-28T11:46:54.562842",
  "query_hash": "5921ce268fe0d911196a4564eea8ce9ffc2e2420",
  "query": {"scientificname": "puma concolor"},
  "task_status": "PENDING"
}

After waiting a while and requesting the status again, the response changes:


{
  "complete": true,
  "status_url": "http://csv.idigbio.org/status/cba4ae0f-da2b-42ec-b763-132a209c3251",
  "expires": "2015-04-28T11:46:54.401498",
  "download_url": "http://s.idigbio.org/idigbio-downloads/cba4ae0f-da2b-42ec-b763-132a209c3251.zip",
  "query_hash": "5921ce268fe0d911196a4564eea8ce9ffc2e2420",
  "query": {"scientificname": "puma concolor"},
  "task_status": "SUCCESS"
}