Five task clusters that enable efficient and effective digitization of biological collections

Title: Five task clusters that enable efficient and effective digitization of biological collections
Publication Type: Journal Article
Year of Publication: 2012
Authors: Nelson, Gil; Paul, Deborah L.; Riccardi, Gregory; and Mast, Austin R.
Start Page: 19
Date Published: 07/2012
Other Numbers: EF-1115210
Keywords: ADBC, biodiversity informatics, Biological specimen collections, curation, digitization, EF-1115210, iDigBio, Imaging, paleontological specimen collections, task cluster, workflow
Abstract: This paper describes and illustrates five major clusters of related tasks (herein referred to as task clusters) that are common to efficient and effective practices in the digitization of biological specimen data and media. Examples of these clusters come from the observation of diverse digitization processes. The staff of iDigBio (The U.S. National Science Foundation’s National Resource for Advancing Digitization of Biodiversity Collections) visited active biological and paleontological collections digitization programs for the purpose of documenting and assessing current digitization practices and tools. These observations identified five task clusters that comprise the digitization process leading up to data publication: (1) predigitization curation and staging, (2) specimen image capture, (3) specimen image processing, (4) electronic data capture, and (5) georeferencing locality descriptions. While not all institutions are completing each of these task clusters for each specimen, these clusters describe a composite picture of digitization of biological and paleontological specimens across the programs that were observed. We describe these clusters and three workflow patterns that dominate their implementation, and offer a set of workflow recommendations for digitization programs.
Refereed Designation: Refereed
From our paper: "The collections community has recognized that digitization processes need to be made more efficient to meet pressing scientific and societal needs (a topic broadly reviewed by Chapman 2005a), a notion supported by such initiatives as GBIF, iDigBio and the Thematic Collections Networks funded by the National Science Foundation’s Advancing Digitization of Biodiversity Collections (ADBC) program, Atlas of Living Australia, ViBRANT, and VertNet. However, little has been published that characterizes modern existing and effective digitization workflows for a broad range of collections (e.g. plant, insect, vertebrate, fossil, microscope slides). We believe such characterizations are an early step in the process of building a common framework for sharing efficiencies across biological and paleontological research collections." Our paper strives to synthesize observations of digitization-in-progress from natural history collections housing a variety of specimen types while looking for the underlying common effective practices and seeking to illuminate where and how further progress might be made.