Citizen Science/Crowdsourcing Tools for Digitization

From iDigBio


The Public Participation in Digitization Working Group is developing this list of citizen science and/or crowdsourcing tools for the digitization of biological and paleontological scientific collections. These tools are not necessarily available for wide-scale use, and potential users should contact the individual projects below to discuss availability.


Several categories of tools are available to engage the public in various aspects of the digitization process. Descriptions have been taken directly from the website of the respective organization or publication.

  • Species- and Taxa-based Digitization Programs and Tools
  • Workflow Procedures, Data Management and Visualization
  • General Project Management and Social Media


Species- and Taxa-based Digitization Programs and Tools

  • BugGuide
    Collects “photographs of bugs from the United States and Canada for identification and research. We summarize our findings in guide pages for each order, family, genus, and species. More than just a clearinghouse for information, this site helps expand on the natural histories of subjects. By capturing the place and time that submitted images were taken, they are creating a virtual collection that helps define where and when things might be found.”
  • CitizenSort
    “Citizen Sort is a website that contains tools and games to classify various species of insects, animals, and plants. Our motivation for creating this website is based upon two goals. The first is to help biologists and ecologists with scientific classification tasks. The second is to help information scientists and human-computer interaction researchers evaluate the role of motivation in citizen science.”
  • CitSci.org
    “CitSci.org supports your research by providing tools and resources that allow you to customize your scientific procedure - all in one location on the internet. As your partner in research, CitSci.org provides tools for the entire research process including: creating new projects, managing project members, building custom data sheets, analyzing collected data, and gathering participant feedback.”
  • Curio
    “Curio is a crowdsourcing platform that connects interested citizens with researchers to help answer important questions in the sciences and humanities.”
  • DiscoverLife
    Provides “free on-line tools to identify species, teach and study nature's wonders, report findings, build maps, process images, and contribute to and learn from a growing, interactive encyclopedia of life with 1,267,805 species pages and 623,990 maps.”
  • Encyclopedia of Life
    Mission: “To increase awareness and understanding of living nature through an Encyclopedia of Life that gathers, generates, and shares knowledge in an open, freely accessible and trusted digital resource.”
  • FromThePage
    “FromThePage is free software that allows volunteers to transcribe handwritten documents on-line. It's easy to index and annotate subjects within a text using a simple, wiki-like mark-up. Users can discuss difficult writing or obscure words within a page to refine their transcription. The resulting text is hosted on the web, making documents easy to read and search.” (Example from herpetology field notes)
  • HelpingScience
    “This is a website for processing herbarium specimen sheets using citizen science. (Currently conducting a closed beta testing before official release).”
  • Lifemapper
    Lifemapper “uses all online geospatial species occurrence data to create distribution maps and, notably, goes one step further to predict where an individual species could exist based on where it is documented to live. Lifemapper does this by combining species occurrence data with global climate, terrain and land cover information, to identify environmental correlates of species ranges.”
  • Mechanical Turk (Amazon)
    “Mechanical Turk is a marketplace for work. We give businesses and developers access to an on-demand, scalable workforce. Workers select from thousands of tasks and work whenever it's convenient.”
  • Notes from Nature
    Notes from Nature is a digitization project allowing citizen scientists to transcribe museum records from one of several collections.
  • Specify
    “The Specify Software Project offers Specify 6 and allied applications for museum and herbarium research data processing. Specify 6 handles specimen information for computerizing collection holdings, for tracking specimen and tissue management transactions, and for mobilizing species occurrence data to the Internet.”
  • Symbiota
    “The Symbiota Software Project is working towards building a library of webtools to aid biologists in establishing specimen based virtual floras and faunas… The central premise of this open source software project is that through a partnership between software engines and the scientific community, higher quality and more publicly useful biodiversity portals can be built.”
  • USGS North American Bird Phenology Program
    “The North American Bird Phenology Program… exists now as a historic collection of six million migration card observations. Today, in an innovative project to curate the data and make them publically available, the records are being scanned and placed on the internet, where volunteers worldwide transcribe these records and add them into a database for analysis.”

Workflow Procedures, Data Management and Visualization

  • Apiary - open source web application for extraction of specimen data from herbarium sheet images.
    “The University of North Texas’ Texas Center for Digital Knowledge (TxCDK) and the Botanical Research Institute of Texas (BRIT) are conducting fundamental research with the goal of identifying how human intelligence can be combined with machine processes for effective and efficient transformation of textual museum specimen label information into high-quality machine-processible parsed data… The results of this research will yield a new workflow model for effective and efficient label data transformation, correction, and enhancement that can be replicated, adapted, and transferred to herbaria and other natural history collections.”
  • CartoDB
    “Visualize and analyze geospatial data”
  • DataONE Software Tools Catalog
    “The Software Tools database is the product of two NSF-funded Informatics Education Planning Workshops hosted by DataONE. The database provides a brief description of a wide range of tools that are recommended for use by scientists and students, as well as additional information and links to further resources. Users can access tools within the database by selecting keywords (under advanced search) or using free search.”
  • DataONE Data Management Best Practices
    “The DataONE Best Practices database provides individuals with recommendations on how to effectively work with their data through all stages of the data lifecycle. Users can access best practices within the database by either clicking on a stage of the lifecycle, selecting keywords (under advanced search) or using free search.”
  • EpiCollect
    “EpiCollect.net provides a web and mobile app for the generation of forms (questionnaires) and freely hosted project websites for data collection. Data are collected (including GPS and media) using multiple phones and all data can be viewed centrally (using Google Maps / tables / charts).”
  • FuzzyWuzzy
    “Fuzzy string matching in Python.”
  • National Geographic FieldScope
    “National Geographic FieldScope is a web-based platform to support our Community Geography Initiative. FieldScope is a mapping, analysis, and collaboration tool designed to support geographic investigations and engage citizen scientists in investigations of real-world issues—both in the classroom and in outdoor education settings.”
  • Open Data Kit
    “Open Data Kit (ODK) is a free and open-source set of tools which help organizations author, field, and manage mobile data collection solutions. ODK provides an out-of-the-box solution for users to:
         1. Build a data collection form or survey;
         2. Collect the data on a mobile device and send it to a server; and
         3. Aggregate the collected data on a server and extract it in useful formats.”
  • Taxamatch
    “Fuzzy matching algorithm for genus and species scientific names”
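
Two of the tools listed above (FuzzyWuzzy and Taxamatch) address fuzzy matching, which is useful when volunteer transcriptions of specimen labels contain small spelling errors. The following is a minimal sketch of the general idea only. It uses Python's standard-library difflib module rather than FuzzyWuzzy's or Taxamatch's own algorithms, and the reference list, the function names, and the 0.8 cutoff are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical reference list of accepted names (illustrative only;
# a real project would load these from a taxonomic authority file).
REFERENCE_NAMES = [
    "Quercus alba",
    "Quercus rubra",
    "Acer saccharum",
    "Pinus strobus",
]


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def best_match(transcribed: str, cutoff: float = 0.8):
    """Return (name, score) for the closest reference name, or None.

    The cutoff is an assumed threshold; names scoring below it are
    treated as unmatched and left for human review.
    """
    scored = [(name, similarity(transcribed, name)) for name in REFERENCE_NAMES]
    name, score = max(scored, key=lambda pair: pair[1])
    return (name, score) if score >= cutoff else None


if __name__ == "__main__":
    # A transcription with a one-letter error still resolves to a known name.
    print(best_match("Quercus albu"))
    # A name with no close counterpart in the list is reported as unmatched.
    print(best_match("Zea mays"))
```

In a digitization workflow, a lookup of this kind could flag likely misspellings for reviewers instead of silently correcting them, so that the original transcription is preserved alongside the suggested match.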

General Project Management and Social Media

  • Evernote (NLP/OCR API)
    “Evernote makes it easy to remember things big and small from your everyday life using your computer, phone, tablet and the web.”
  • Google Consumer Surveys
    “Get statistically significant, valid results at scale from real people, not biased panels”
  • Twitter
    “Twitter is a real-time information network that connects you to the latest stories, ideas, opinions and news about what you find interesting. Simply find the accounts you find most compelling and follow the conversations.”