The Grand Challenge - Uniting the Nation’s Biodiversity Collections through Digitization

Tue, 2011-11-22 17:42 -- jgrabon

On any given evening, it is commonplace for the nightly news to refer to debt and spending amounts in the billions and trillions of dollars. The use of these massive numbers is so ubiquitous that many of us have become numb to the true magnitude of what one billion objects represent. Consider this: to count from one to one billion, one number per second, without breaks or sleep, would take you more than thirty-one years! Now consider that most estimates indicate that there are over one billion biological and paleontological specimens stored in natural history museums throughout the United States, with new specimens added every day. Over a period of hundreds of years, from locations across the world, humankind has accumulated a vast collection of insects, mammals, plants, fossils, fish, birds and other important life forms, many of which are now extinct. However, only 10% of these U.S. specimens are available online. An even smaller fraction of these specimens is on display in museums at any given time. The result is that objects of high historic or scientific value may remain locked away from public view and blocked from easy access by researchers for years, decades, or a lifetime. Some specimens may be completely unknown to the very people who would be most interested in them.

Imagine for a moment that you are a detective with a mystery to solve. The world is counting on you to gather the evidence, analyze the facts, and produce a logical and verifiable conclusion that solves the mystery. However, unlike the events that unfold on your favorite evening crime drama, there is no known crime scene from which you can harvest your evidence. Instead, your evidence is located in one, or perhaps dozens or even hundreds, of locations across the United States. To further complicate your challenge, you do not know which locations contain your evidence. There are several hundred possible locations to sort through, and each location houses thousands of shelves of evidence.

The detective scenario above is an analogy for the real-world challenges faced every day by scientists, researchers, environmentalists, legislators, agriculturalists, naturalists, and the general public. Evidence is required for research related to climate change, evolutionary science, loss of biodiversity, natural disasters, the tracking of disease vectors and agricultural pests, agricultural development, and even species identification for a butterfly spotted by a budding lepidopterist. A coordinated, large-scale effort to build a digital library that represents the specimens contained within the natural history museums of the United States, and that brings that information together into a national collections resource for discovery and sharing, is indeed a Grand Challenge.

The National Science Foundation has funded an initiative named Advancing Digitization of Biodiversity Collections (ADBC) that aims to begin tackling this challenge. Networks of institutions across the United States are being selected to begin extensive digitization of their collections. This includes digitally recording the label data associated with each specimen, along with images, field notes from the collector, audio recordings, DNA, and many other important data elements. The institutions participating in the ADBC program are aligned to serve a particular theme of research and data sharing, and as such are known as Thematic Collections Networks, or TCNs. The ADBC program also funds a coordinated HUB between the University of Florida in Gainesville, FL and Florida State University in Tallahassee, FL. The HUB, known as Integrated Digitized Biocollections (iDigBio), endeavors to engage both the collections community and consumers of digitized collection data, and to produce a unified search portal and supporting toolsets that enable awareness of, and access to, the massive amount of data contained within U.S. natural history collections.

Advancing Digitization of Biodiversity Collections cannot complete this mission alone. Since merely counting to one billion would take an individual more than 31 years, imagine the level of effort required to handle, photograph and record pertinent information for one billion biological specimens. Technical challenges also abound. How do you accurately photograph a fish suspended in a rounded glass jar that distorts the image? How do you rapidly transcribe a tiny label that was handwritten one hundred years ago and pinned beneath a butterfly that obscures the view of the label? How do you efficiently assign geographic coordinates representing the collection location of millions of specimens collected long before the invention of GPS?

These are but a few of the exciting and intriguing challenges that iDigBio, partnered with the ADBC program affiliates, digitization tool providers, and other collections partners throughout the world, will endeavor to overcome in the coming years. Through this blog, we look forward to keeping you abreast of our progress, the technical challenges we encounter, and the successes that we achieve, as well as interesting articles related to biodiversity and the exciting applications of digitized biodiversity data.

Now you know our mission, and you can begin to appreciate the challenges in front of us. But imagine that 10 years have passed, and every one of those one billion specimens has been cataloged and photographed. You now have access to all associated data through the iDigBio collections search portal. What would you do with these data? What questions would you answer? What mystery would you solve? We would love to hear from you.