IDigBio API Performance

From iDigBio
Revision as of 13:28, 9 May 2014 by Dstoner (Talk | contribs)


If you wish to download a large number of records or a complete recordset, use the Download system available in the search portal (

If the search portal does not meet your needs, consider using the Elasticsearch interface.

If neither of these meets your needs, please contact iDigBio.

The following information about API performance has been left here for historical reference. If you find yourself running into the performance issues described below because you are paging iteratively over large offsets, you should consider one of the alternatives mentioned above.

Understanding the performance limits of fetching lists of records from the iDigBio API

While fetching a single record does not require computation, fetching a list of record IDs (endpoints) requires an amount of computation that varies with the number of records in the response and the starting record offset, which leads to certain performance limits. The graphs below illustrate these limits so that you can code in the most efficient manner for your needs. All graphs show the average response time of a request (blue lines; left axis) and the number of requests that terminate with a timeout exception (red lines; right axis).

Varying the number of parallel requests shows that iDigBio can easily handle 60 concurrent users when each request is for the first 1,000 record endpoints; this limit drops to 20 concurrent users when requests are for larger responses containing 10,000 record endpoints.

Response time also varies with the offset of the request. With 10 concurrent users, the limit is at around the 10-million offset. The current timeout for requests is 50 seconds, so the number of requests terminating with a timeout increases as you get close to this limit. It is recommended that you code with retry and back-off mechanisms to handle timeouts.
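The retry and back-off recommendation can be sketched roughly as follows. This is a minimal illustration, not iDigBio client code: `fetch_with_retries`, its parameters, and the zero-argument `fetch` callable are all hypothetical names, and the sketch assumes that `fetch` performs one API request and raises Python's built-in `TimeoutError` when that request times out.

```python
import random
import time


def fetch_with_retries(fetch, max_retries=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep, jitter=True):
    """Call fetch() and retry with exponential back-off on TimeoutError.

    `fetch` is a hypothetical zero-argument callable that performs one
    request and raises TimeoutError when the request times out.
    """
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except TimeoutError:
            if attempt == max_retries:
                raise  # out of retries: surface the timeout to the caller
            # The delay doubles on each attempt (base, 2*base, 4*base, ...)
            # and is capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            if jitter:
                # Random jitter spreads out retries from concurrent clients.
                delay = random.uniform(0, delay)
            sleep(delay)
```

If your HTTP library raises its own timeout exception type, one option is to catch it inside `fetch` and re-raise it as `TimeoutError`. The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.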

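Figure: RecordRetrieval_1000batch_VaryingParallelism.png (1,000-record batches, varying parallelism)
Figure: RecordRetrieval_10000batch_VaryingParallelism.png (10,000-record batches, varying parallelism)
Figure: RecordRetrieval_1000batch_10Parallelism_VaryingOffset.png (1,000-record batches, 10 parallel requests, varying offset)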