Darwin Core: An Evolving Community-Developed Biodiversity Data Standard

Publication Type: Journal Article
Year of Publication: 2012
Authors: Wieczorek, John
Secondary Authors: Bloom, David; Guralnick, Robert; Blum, Stan; Döring, Markus; Giovanni, Renato; Robertson, Tim; Vieglais, David
Journal: PLoS One
Start Page: e29715
Date Published: 01/2012
Keywords: Darwin Core, Data Standards, metagenomics
Abstract: Biodiversity data derive from myriad sources stored in various formats on many distinct hardware and software platforms. An essential step towards understanding global patterns of biodiversity is to provide a standardized view of these heterogeneous data sources to improve interoperability. Fundamental to this advance are definitions of common terms. This paper describes the evolution and development of Darwin Core, a data standard for publishing and integrating biodiversity information. We focus on the categories of terms that define the standard, differences between simple and relational Darwin Core, how the standard has been implemented, and the community processes that are essential for maintenance and growth of the standard. We present case-study extensions of the Darwin Core into new research communities, including metagenomics and genetic resources. We close by showing how Darwin Core records are integrated to create new knowledge products documenting species distributions and changes due to environmental perturbations.
Refereed Designation: Refereed
This paper works well as an introduction for readers asking, "What is Darwin Core?" After an overview of how the Darwin Core Standard developed and current examples of Darwin Core data in use, the paper describes the goals and challenges of integrating biodiversity data and their implications for Darwin Core.

From the Introduction section of the paper: "Darwin Core [20] is a standard for sharing data about biodiversity – the occurrence of life on earth and its associations with the environment. Darwin Core first emerged around 1999 as a loosely defined set of terms, and progressed through several iterations by different groups resulting in many different variants [21]. A formal set of terms and processes to manage changes were necessary to ensure utility for data integration. These aspects were developed within the Darwin Core Task Group [22] of the Taxonomic Databases Working Group (TDWG; www.tdwg.org) and ratified as a standard in October 2009. The philosophy for Darwin Core development has been to keep the standard as simple and open as possible and to develop terms only when there is shared demand. Darwin Core has a relatively long history of community development and is deployed widely [23]–[24]. For example, the Global Biodiversity Information Facility currently indexes approximately 300 million Darwin Core-formatted records published by more than 340 organizations in 43 countries. Increasingly, Darwin Core is being incorporated in communities beyond that of natural history collections (Figure 1), in which the standard has its roots."