SPNHC 2014: Progress in Digitization: Supporting Data Flows in the New England Vascular Plant Network with FilteredPush Technologies

Thu, 2014-05-29 14:14 -- ellwood
Title: SPNHC 2014: Progress in Digitization: Supporting Data Flows in the New England Vascular Plant Network with FilteredPush Technologies
Publication Type: Presentation
Year of Publication: 2014
Authors: Morris, Paul J.; Hanken, James; Kelly, Maureen; Lowery, David B.; Ludäscher, Bertram; Macklin, James A.; McCallum, Chuck; Morris, Robert A.; Song, Tianhong; Sweeney, Patrick
Keywords: DarwinCore, digitization, FilteredPush, quality control, Specify, SPNHC 2014, Symbiota
Abstract: FilteredPush is supporting digitization in two Thematic Collections Networks (TCNs): the Southwest Collections of Arthropods Network (SCAN) and the New England Vascular Plant project (NEVP). In NEVP, minimal data records, including current taxonomic identification, state and town of collection, and date collected, are created during imaging at high-throughput digitization stations at the digitization sites. The current identification is captured at the folder level in a pre-capture step and associated with the specimens upon imaging. FilteredPush transports data from the digitization sites by wrapping DarwinCore terms in Open Annotation ontology documents, which include metadata about the when, where, and who of digitization, are typed as New Occurrence records, and carry minimal AudubonCore metadata for the images. These annotations are ingested into the NEVP Symbiota portal. They are then available for ingest into the databases of record through new-occurrence ingest tools in the Specify 6 collections management system. Additional data will be transcribed in Symbiota from the specimen images and, in support of the science goals of the project, flowering and fruiting state will be coded for some taxa. These assertions will be wrapped in annotations, typed to reflect domain business operations, and transported to the relevant collections for ingest. Records harvested from network participants into datastores within FilteredPush will be subject to quality control by an Akka workflow that tests the taxon name, georeference, and date collected values of each record. Quality control issues (including proposed corrections) are reported in response to queries by researchers and transported through annotations to both Symbiota and the databases of record.
URL: https://www.idigbio.org/sites/default/files/workshop-presentations/spnhc2014/16_Morris_gwyth_ei_hidlo_NEVP_mini.pdf
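
Illustrative sketch: the abstract describes two steps, wrapping a minimal DarwinCore record in an annotation and checking field values for quality. The Python below shows roughly what those steps might look like. It is not FilteredPush code and does not reproduce its actual serialization or its Akka workflow. The function names (wrap_new_occurrence, check_event_date), the "new-occurrence" label, the station identifier, and the sample record are all hypothetical. The dictionary keys are only loosely modeled on Open Annotation and DarwinCore term names.

```python
from datetime import date, datetime, timezone


def wrap_new_occurrence(dwc_terms, annotator, when):
    """Wrap DarwinCore terms in an annotation-like dict.

    The structure is an assumption made for illustration; it is not
    the document format FilteredPush actually emits.
    """
    return {
        "@type": "oa:Annotation",
        "oa:motivatedBy": "new-occurrence",   # hypothetical type label
        "oa:annotatedBy": annotator,          # the "who" of digitization
        "oa:annotatedAt": when.isoformat(),   # the "when" of digitization
        "oa:hasBody": dict(dwc_terms),        # the minimal DarwinCore record
    }


def check_event_date(value):
    """Return a list of issues found in a date-collected value.

    A stand-in for one quality control test: the value must parse as
    an ISO 8601 calendar date and must not lie in the future.
    """
    try:
        parsed = date.fromisoformat(value)
    except (TypeError, ValueError):
        return ["date collected is not an ISO 8601 calendar date"]
    if parsed > date.today():
        return ["date collected is in the future"]
    return []


# Hypothetical minimal record of the kind captured during imaging.
record = {
    "dwc:scientificName": "Acer rubrum",
    "dwc:stateProvince": "Massachusetts",
    "dwc:municipality": "Concord",
    "dwc:eventDate": "1923-07-04",
}

annotation = wrap_new_occurrence(
    record,
    annotator="digitization-station-1",  # hypothetical identifier
    when=datetime(2014, 5, 29, 14, 14, tzinfo=timezone.utc),
)
issues = check_event_date(annotation["oa:hasBody"]["dwc:eventDate"])
```

In this sketch a failed check yields a list of issue strings. A fuller version could wrap those issues, together with a proposed correction, in a further annotation, as the abstract describes for quality control reports.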