Difference between revisions of "IDigBio Download API"

Revision as of 12:33, 27 April 2015


Overview

Note: While the download API is currently used by the production portal, it should be considered highly unstable for non-iDigBio consumers. The next revision of the API will most likely be a total rewrite, backend and front.

The download API may not provide "friendly" error messages at this time.


The Download API works by performing the requested query and building a Darwin Core Archive. Once archive generation has begun, the status endpoint can be polled to determine if the generation has been completed. Once the archive generation is completed, the API provides a link to the file for download. If the optional email parameter is supplied on the query request, an email notification will be sent.

Large queries (and thus large archive file creation) can take multiple hours to complete.


GET requests

A query submitted as a GET request must be URL-encoded.

POST requests

A query submitted as a POST request must be supplied as JSON in the content body and specify the "Content-Type: application/json" request header.
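
As a rough illustration of the POST requirements above, the following Python sketch prepares (but does not send) such a request using only the standard library. The function name build_post_request is hypothetical, and the sketch assumes that the request body is the query JSON itself and that POSTs go to the same service URL as GETs; neither detail is confirmed on this page.

```python
import json
import urllib.request


def build_post_request(query, url="https://csv.idigbio.org/"):
    """Prepare a POST request carrying the query as a JSON body.

    Assumption: the body is the bare query object and the target is the
    download service URL; this page does not spell out either detail.
    """
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # The header required for POST requests by the Download API.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_post_request({"genus": "acer"})
# Sending would be: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the headers and body before contacting the service.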

Query Endpoint

The download service URL:

https://csv.idigbio.org/?query={Query in iDigBio query format}[&email={valid email address}]

See iDigBio query format for more information on writing queries.
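
The URL pattern above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the function name build_download_url and the example email address are made up for illustration, and compact JSON separators are chosen only so that the output matches the URLs shown in the examples on this page.

```python
import json
from urllib.parse import urlencode

DOWNLOAD_SERVICE_URL = "https://csv.idigbio.org/"


def build_download_url(query, email=None):
    """Return a GET URL for the download service with a URL-encoded query.

    query is a dict in iDigBio query format; email is the optional
    notification address.
    """
    # Compact separators avoid encoding unnecessary whitespace.
    params = {"query": json.dumps(query, separators=(",", ":"))}
    if email is not None:
        params["email"] = email
    # urlencode percent-encodes the JSON and turns spaces into "+".
    return DOWNLOAD_SERVICE_URL + "?" + urlencode(params)


print(build_download_url({"genus": "acer"}))
print(build_download_url({"genus": "acer"}, email="user@example.org"))
```

The first call prints the same URL as the genus acer example; the second appends the email parameter with the "@" percent-encoded.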

Query Example - genus acer

Consider the following query:

{ "genus" : "acer"}

Without specifying an email address, we could request a download with the following URL:

https://csv.idigbio.org/?query=%7B%22genus%22%3A%22acer%22%7D

Using curl, we can see the response:


https://csv.idigbio.org/?query=<iDigBio query format; not all query types are available yet>&email=<email, optional>

Example:

https://csv.idigbio.org/?query={%22genus%22:%22acer%22}

The resulting JSON will include the "status_url" field which is a link to a status page for this download query.

When the download file is generated, the "complete" field will be set to "true" and the "download_url" field will include a link to the available file.

Query Example - genus acer with email

Status Endpoint

A successful request to the query endpoint returns a JSON document with a number of fields, including "complete", which is a status flag, and "download_url", which, once generation is completed, holds a link to the generated download file.

Status Endpoint Example

Given the following query JSON:

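{ "scientificname" : "puma concolor" }

which is submitted with the following URL:

https://csv.idigbio.org/?query=%7B%22scientificname%22%3A%22puma+concolor%22%7D

After submitting the above query, the status of the download can be checked at: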

http://csv.idigbio.org/status/<download id> (returned as "status_url" from the original download query)

Ex. http://csv.idigbio.org/status/995ed58b-01fd-4c98-893e-e0cbdfadc8fe


{
  "complete": false,
  "status_url": "http://csv.idigbio.org/status/cba4ae0f-da2b-42ec-b763-132a209c3251",
  "expires": "2015-04-28T11:46:54.562842",
  "query_hash": "5921ce268fe0d911196a4564eea8ce9ffc2e2420",
  "query": {"scientificname": "puma concolor"},
  "task_status": "PENDING"
}

After waiting a while and requesting the status again, the response changes:


{
  "complete": true,
  "status_url": "http://csv.idigbio.org/status/cba4ae0f-da2b-42ec-b763-132a209c3251",
  "expires": "2015-04-28T11:46:54.401498",
  "download_url": "http://s.idigbio.org/idigbio-downloads/cba4ae0f-da2b-42ec-b763-132a209c3251.zip",
  "query_hash": "5921ce268fe0d911196a4564eea8ce9ffc2e2420",
  "query": {"scientificname": "puma concolor"},
  "task_status": "SUCCESS"
}
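
The polling pattern shown in this example could be automated along the following lines. This is a sketch under stated assumptions rather than a reference client: the names fetch_json and wait_for_download, the polling interval, and the attempt limit are all hypothetical choices, and only the "complete" and "download_url" fields documented above are relied upon.

```python
import json
import time
import urllib.request


def fetch_json(url):
    """Perform a GET request and decode the JSON response body."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_download(status_url, fetch=fetch_json, interval=60.0, max_attempts=120):
    """Poll status_url until "complete" is true, then return "download_url".

    fetch is injectable so the loop can be exercised without network access.
    Raises TimeoutError if the archive is not ready after max_attempts polls.
    """
    for attempt in range(max_attempts):
        status = fetch(status_url)
        if status.get("complete"):
            return status["download_url"]
        # Sleep between polls, but not after the final attempt.
        if attempt < max_attempts - 1:
            time.sleep(interval)
    raise TimeoutError("download not complete after %d polls" % max_attempts)
```

Because large archives can take multiple hours to generate, a long interval between polls (or simply relying on the email notification) is likely kinder to the service than polling frequently.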