iDigBio API Performance

From iDigBio
Revision as of 10:42, 6 May 2014 by Dstoner (Talk | contribs)


Recommendations

Use the Download system available in the search portal (https://www.idigbio.org/portal) if you wish to download a large number of records.


Understanding the performance limits of fetching lists of records from the iDigBio API

While fetching a single record does not require computation, fetching a list of record IDs (endpoints) requires an amount of computation that depends on the number of records in the response and on the starting record offset, and this leads to certain performance limits. The graphs below give you an understanding of these limits, so that you can code in the most efficient manner for your needs. All graphs show the average response time of a request (blue lines; left axis) and the number of requests that terminate with a timeout exception (red lines; right axis).

Varying the number of parallel requests, we can observe that iDigBio can easily handle 60 concurrent users when each request is for the first 1,000 record endpoints; this limit drops to 20 concurrent users when each request is for a larger response containing 10,000 record endpoints. Response time also varies with the offset of the request: with 10 concurrent users, the limit is reached at an offset of around 10 million.

The current timeout for requests is 50 seconds, so the number of requests that terminate with a timeout increases as you approach these limits. It is recommended that you code with retries and back-off mechanisms to handle timeouts.

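[Figure: RecordRetrieval_1000batch_VaryingParallelism.png, batches of 1,000 records with varying parallelism]
[Figure: RecordRetrieval_10000batch_VaryingParallelism.png, batches of 10,000 records with varying parallelism]
[Figure: RecordRetrieval_1000batch_10Parallelism_VaryingOffset.png, batches of 1,000 records with 10 parallel requests and varying offset]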
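
The retry and back-off recommendation above could be sketched roughly as follows in Python. This is only an illustration, not iDigBio client code: the helper name fetch_with_retries, its parameters, and the default delays are all assumptions chosen for the example. The request itself is passed in as a callable, so no particular API URL or HTTP library is assumed.

```python
import random
import time


def fetch_with_retries(fetch, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       retry_on=(TimeoutError,), sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts calls have failed.

    fetch is any zero-argument callable that performs one request and
    raises one of the exception types in retry_on when it times out.
    All names and default values here are illustrative assumptions.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except retry_on:
            if attempt == max_attempts - 1:
                # Out of attempts: let the caller see the last timeout.
                raise
            # Exponential back-off: base_delay, 2x, 4x, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter (50% to 100% of the delay) so that parallel
            # clients do not all retry at the same moment.
            sleep(delay * random.uniform(0.5, 1.0))
```

To use it, wrap one request in a function and pass that function in, for example fetch_with_retries(lambda: get_page(offset)), where get_page is a hypothetical function of your own that issues a single list request. If your HTTP library signals timeouts with its own exception class, pass that class in retry_on. Waiting longer after each failure gives the service time to recover instead of adding load when it is already close to its limits.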