Increasing the efficiency of digitization workflows for herbarium specimens

Title: Increasing the efficiency of digitization workflows for herbarium specimens
Publication Type: Journal Article
Year of Publication: 2012
Authors: Tulig, Melissa
Secondary Authors: Tarnowsky, Nicole; Bevans, Michael; Kirchgessner, Anthony; Thiers, Barbara
Start Page: 103
Issue: (Special Issue)
Date Published: 07/2012
Keywords: digital imaging, field books, georeferencing, Herbarium specimen digitization, WorkFlows
Abstract: The New York Botanical Garden Herbarium has been databasing and imaging its estimated 7.3 million plant specimens for the past 17 years. Due to the size of the collection, we have been selectively digitizing fundable subsets of specimens, making successive passes through the herbarium with each new grant. With this strategy, the average rate for databasing complete records has been 10 specimens per hour. With 1.3 million specimens databased, this effort has taken about 130,000 hours of staff time. At this rate, to complete the herbarium and digitize the remaining 6 million specimens, another 600,000 hours would be needed. Given the current biodiversity and economic crises, there is neither the time nor money to complete the collection at this rate. Through a combination of grants over the last few years, The New York Botanical Garden has been testing new protocols and tactics for increasing the rate of digitization through combinations of data collaboration, field book digitization, partial data entry and imaging, and optical character recognition (OCR) of specimen images. With the launch of the National Science Foundation’s new Advancing Digitization of Biodiversity Collections program, we hope to move forward with larger, more efficient digitization projects, capturing data from larger portions of the herbarium at a fraction of the cost and time.
Refereed Designation: Refereed
The evolution of digitization workflow strategies at the New York Botanical Garden is outlined in this insightful, metrics-based paper. The authors highlight that their current semi-automated approach, which creates a skeleton (index) record for each specimen, provides much faster throughput (125 indexed records per hour) and makes data and images available to the public and to researchers almost immediately. They conclude that the limiting step is now staffing: the technology could go even faster, but more staff, more volunteers, and public participation are needed to speed up the current process.
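
The staff-time figures quoted in the abstract follow from dividing a specimen count by an hourly rate. The short Python sketch below reproduces that arithmetic; the function and constant names are illustrative and do not come from the paper. The last figure applies the 125 records per hour skeleton-record rate to the remaining 6 million specimens. That is an extrapolation made here, not a number reported by the authors, and skeleton records hold less data than complete records, so the two rates are not directly comparable.

```python
def staff_hours(specimens: int, rate_per_hour: float) -> float:
    """Return the staff hours needed to process `specimens` at a fixed hourly rate."""
    if rate_per_hour <= 0:
        raise ValueError("rate_per_hour must be positive")
    return specimens / rate_per_hour


# Figures taken from the abstract and the annotation above.
DATABASED = 1_300_000        # specimens already databased
REMAINING = 6_000_000        # specimens still to be digitized
COMPLETE_RECORD_RATE = 10    # complete records per hour
SKELETON_RECORD_RATE = 125   # skeleton (index) records per hour

hours_spent = staff_hours(DATABASED, COMPLETE_RECORD_RATE)
hours_needed = staff_hours(REMAINING, COMPLETE_RECORD_RATE)
# Assumption: the skeleton-record rate holds across the whole remaining collection.
hours_skeleton = staff_hours(REMAINING, SKELETON_RECORD_RATE)

print(f"Spent so far at 10/hour:     {hours_spent:,.0f} hours")
print(f"Remaining at 10/hour:        {hours_needed:,.0f} hours")
print(f"Remaining at 125/hour (est): {hours_skeleton:,.0f} hours")
```

Under that assumption, indexing the remaining specimens would take roughly one twelfth of the staff time needed for complete records, which is consistent with the authors' point that staffing, not technology, is now the constraint.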