Data Without Borders ICE 2016: Difference between revisions

(4 intermediate revisions by 2 users not shown)
Line 4: Line 4:
|Agenda=Agenda
|Biblio=https://vimeo.com/album/4168896 Recorded Presentations
|Report=Report
|[https://www.idigbio.org/content/data-without-borders-symposium-xxv-international-congress-entomologists-ice-2016 Report]
}}
}}


Line 29: Line 29:
|145
| [https://vimeo.com/album/4168896/video/184611447 Like blood from that stone we always hear about: a quest to extract meaningful data from historical grasshopper specimens]
::'''Introduction''': One of the most ancient ecosystems in the southeastern U.S.A. is scrub, often associated with ridge systems that were most likely used as refugia during Pleistocene sea level changes. Following sea level stabilization, these habitats effectively remained islands due to unique soil composition and a lack of plant diversity leading to a myriad of floral and faunal endemics. In particular, arthropod endemics abound as in the grasshopper genus Melanoplus (Orthoptera: Acrididae: Melanoplinae). Many genus members possess short wings incapable of flight and are unable to easily disperse over large distances, which makes such Melanoplus species ideal candidates for examining speciation hypotheses. To test such hypotheses, the Puer Group (PG), comprised of 24 species with related morphology, was chosen. The group spans four neighboring states (FL, GA, SC, and NC), contains many scrub endemics, and its males exhibit great genitalia variation. A good beginning for delving deeper into the group’s evolutionary history was determining current species ranges by georeferencing around 5,000 specimens, borrowed from various U.S. collections and gathered in the field during recent expeditions.
::One of the most ancient ecosystems in the southeastern U.S.A. is scrub, often associated with ridge systems that were most likely used as refugia during Pleistocene sea level changes. Following sea level stabilization, these habitats effectively remained islands due to unique soil composition and a lack of plant diversity leading to a myriad of floral and faunal endemics. In particular, arthropod endemics abound as in the grasshopper genus ''Melanoplus'' (Orthoptera: Acrididae: Melanoplinae). Many genus members possess short wings incapable of flight and are unable to easily disperse over large distances, which makes such ''Melanoplus'' species ideal candidates for examining speciation hypotheses. To test such hypotheses, the Puer Group, comprised of 24 species with related morphology, was chosen. The group spans five neighboring states (FL, AL, GA, SC, and NC), contains many scrub endemics, and its males exhibit great variation in genitalia. A good beginning for delving deeper into the group’s evolutionary history was determining current species ranges by georeferencing around 6,000 specimens, borrowed from various U.S. collections and gathered in the field during recent expeditions. Via the creation of maps, detailed field notes, and different types of anatomical imaging, the backbone of this project is to collate as much data as possible for the Puer Group. Then, it will be disseminated to a wide audience for a trio of purposes: 1) raise awareness of a fascinating system of study, 2) create a solid platform for future studies to build upon, and 3) demonstrate the utility of integrating multiple methods of investigation.
::'''Methods''': Via the creation of maps, detailed field notes, and different types of anatomical imaging, the backbone of this project is to collate as much data as possible for the PG.
::'''Results/Conclusion''': Then, it will be disseminated to a wide audience for a trio of purposes: 1) raise awareness of a fascinating system of study, 2) create a solid platform for future studies to build upon, and 3) demonstrate the utility of integrating multiple methods of investigation.
| '''Derek Woller''' (asilid@gmail.com) and Hojun Song, Texas A & M University, College Station, TX
|-
Line 45: Line 43:
|-
|230
| Harnessing specimen data to visualize and investigate the ecology of species
| [https://vimeo.com/album/4168896/video/184610236 Harnessing specimen data to visualize and investigate the ecology of species]
::The process of digitizing specimen data can be done via a collecting event approach in order to maximize efficiency and accuracy. The collecting event approach involves attaching specimen data to previously digitized collecting event information, rather than attaching data to specimens. Using source materials such as field notes allows for fewer transcription errors and increases the precision with which localities can be georeferenced. In addition, this approach distinguishes between true absences and collecting artifacts, allowing a researcher to investigate why specimens occur at certain sites and are absent at others. Several recent digitization projects using this method will be examined, including the North American Macroinvertebrate Database and CReAC. Specimen data can be digitized using source materials such as field notes in order to increase the accuracy of the data and the efficiency of the digitization workflow.
| '''Sarah Schmits''' (scschmits@ku.edu), Andrew Short, University of Kansas, Lawrence, KS
|-
|245
| The usefulness of DNA-barcoding databases for routine taxonomic research and identification of Lepidoptera
| [https://vimeo.com/album/4168896/video/184611440 The usefulness of DNA-barcoding databases for routine taxonomic research and identification of Lepidoptera]
::Due to our extensive knowledge about the taxonomy of Lepidoptera and the ease with which the appropriate tissue samples can be obtained from dry specimens in collections, butterflies and moths have served as a model group for developing DNA barcoding methodology. DNA barcoding has now become a routine tool in taxonomy, and many species of Lepidoptera have been barcoded at least once. For example, the Barcode of Life initiative has produced almost a million such sequences for 84,000 species of Lepidoptera (half of the world’s described fauna). However, most species are represented by a single or few barcodes. In my presentation, using my own taxonomic work as well as several examples from the work of my colleagues at the McGuire Center for Lepidoptera and Biodiversity, I would like to argue that many more resources need to be invested into mitochondrial DNA barcoding, generating not only barcodes for species and subspecies that have not been barcoded to date, but also representing as many populations and individuals as possible.  While genomic methods are increasingly popular for the purposes of phylogenetic reconstruction and evolutionary research, and currently dominate grant proposals, I argue that it is equally important to direct more resources towards DNA barcoding, which has proved to be the best taxonomic tool developed in the last 100 years for resolving current taxonomic conundrums, for revealing cryptic species and for describing biodiversity.
| '''Andrei Sourakov''' (asourakov@flmnh.ufl.edu), University of Florida, Gainesville, FL
Line 60: Line 58:
|-
|330
| Preventing Bugs in Data Analysis: Data Skills to Improve the Reliability and Effectiveness of Entomological Research
| [https://vimeo.com/album/4168896/video/185379654 Preventing Bugs in Data Analysis: Data Skills to Improve the Reliability and Effectiveness of Entomological Research]
::Our increasing capacity to collect data is changing science. This is particularly true as specimen data is being digitized and availability of data is no longer the bottleneck. There is great potential for discovery, but we are primarily failing to translate this sea of data into scientific advances, because researchers are not trained in the skills needed for effective management and analysis. The question then becomes, in addition to scaling data production and computation, how do we develop and deliver training to scale data literate researchers? Course curriculums are slow to change, need qualified instructors and are already full. Short courses are oversubscribed and reach a limited number of participants. To provide scalable and distributed training, Data Carpentry develops and teaches domain-specific hands-on workshops in data organization, management, and analysis. This is a grassroots training effort developed by practitioners for practitioners, who identify core skills and collaboratively develop lessons. All lessons are open source, and workshops are taught by volunteers trained by the Software Carpentry Foundation. With iDigBio, a focus has been on training in the biodiversity community. Workshops are designed for people with little to no prior computational experience and teach in two days how to organize and clean data, manage data in SQL and analyze and visualize data in R – the full data lifecycle. Workshops are in high demand, but this model allows for scaling of training and teaches the foundational skills to get biologists started managing and analyzing their data effectively.
| '''Tracy Teal''' (tkteal@datacarpentry.org), Michigan State University, East Lansing, MI
|-
|345
| Developing Best Practices for Data Management Across all Stages of the Data Life Cycle <br/>
| [https://vimeo.com/album/4168896/video/184610149 Developing Best Practices for Data Management Across all Stages of the Data Life Cycle] <br/>
::Best practices for empirical data collection (experimental design, laboratory techniques) are often well-covered in undergraduate and graduate training, yet there has been less emphasis on managing the resulting data effectively. This is an increasingly important skill set; many funding agencies require data management plans, and journals are requiring that data pertaining to published articles be accessible. Researchers with good data management skills will be able to maximize the productivity of their research program, effectively and efficiently share their data with the scientific community, and potentially benefit from the re-use of their data by others. In this talk, I will highlight some of the pitfalls to be avoided when working with data and introduce example best practices and tools that will improve your data management skills and research program.
| '''Amber Budden''' (aebudden@dataone.unm.edu), DataONE, Albuquerque, NM
|-
|400
| Data capture methodologies in digitisation of bee pollinators  
| [https://vimeo.com/album/4168896/video/184610177 Data capture methodologies in digitisation of bee pollinators]
::Digitisation is an activity that museums and academic institutions increasingly recognize, though many still do not embrace, as a means to boost the impact of collections for global research and society through improved access. And as such, many researchers still fail to realise the importance of data capture methodologies used in digitisation. New opportunities exist to design and implement processes through use of the available technology that will support data capture to enable a range of research on biodiversity of pollinators in order to make scientific collections increasingly relevant. While the usefulness of specimen digitisation is true for all taxa, immense additional benefits come from the digitisation of bees. This group of organisms is of prime importance as they provide most of the world’s pollination ecosystem services. Through international collaborative efforts, the wealth of data in natural history museums and collections about the diversity, distribution and biology of bees may be utilised for international biodiversity efforts.
| '''Nicole Fisher''' (Nicole.Fisher@csiro.au), Australian National Insect Collection (ANIC), Clayton, Australia
|-
|415
| The Current State of Arthropod Biodiversity Data In North America: Can We Address Impacts of Global Change?
| [https://vimeo.com/album/4168896/video/184610173 The Current State of Arthropod Biodiversity Data In North America: Can We Address Impacts of Global Change?]
::There are well over 500 million arthropod specimens housed in approximately 1,000 collections worldwide.  Although reliable estimates are not available, it is likely that less than 5% of these specimens have been digitized and the current rate of digitization is probably not even adequate to keep pace with the acquisition of new specimens.  If we hope to achieve the goal of digitizing all specimens by 2050 we need to develop global networks that can overcome many of the constraints we face today. We will review the current holdings of arthropods in collections across continents and digitization efforts from data providers to aggregators. More specifically, we will assess the type of collaborations needed and the technological and social network areas that are developing to obtain the goal of full digitization.
| '''Neil Cobb''', Northern Arizona University (NAU), Edward Gilbert (egbot@asu.edu), Nico Franz, and Katja C. Seltmann
|-
|430
| Database before you label – the key to a digitized collections future
| [https://vimeo.com/album/4168896/video/184611446 Database before you label – the key to a digitized collections future]
::Digitization of millions of historic entomology specimens remains an enormous challenge. Our community should not make this challenge worse by generating newly collected, undigitized specimens. Entomologists in North America currently generate many tens of thousands of new specimens annually that get added to our undigitized backlog. The University of Alaska Museum Insect Collection contains over 1 million specimens represented by ~230,000 database records, of which 82% have been collected since the year 2000. This talk will describe the rapid growth of our collection and database. Methods used are similar to those established by Costa Rica's INBio in the 1990s.
| '''Derek S. Sikes''' (dssikes@alaska.edu), University of Alaska, Fairbanks, AK
|-
|445
| Troubleshooting industrial insect digitisation
| [https://vimeo.com/album/4168896/video/185379632 Troubleshooting industrial insect digitisation]
::Natural history collections are one of the most important sources of biodiversity information and their digitization is essential for providing greater access to both researchers and the general public. Industrial approaches are needed in order to mobilise the vast numbers of specimens (up to 10 billion) accumulated by the natural history museums in the world. Following the experience of the Digital Collection Programme (DCP) in the Natural History Museum we explore several ways of optimising the digitization process of insect collections. Success is impossible without an organised approach to project management, staff buy-in and administrative support on all levels. Key elements of industrial digitisation are: detailed  yet flexible workflows which can accommodate different kinds of digitised material; automation through software and hardware; appropriate staff management; and community involvement.
| '''Vladimir Blagoderov''' (vlab@nhm.ac.uk) and Laurence Livermore, The Natural History Museum, Cromwell Road, London, England
|-
|500
| DAMmed If You Do or Don’t: Life Cycles of Digital Assets
| [https://vimeo.com/album/4168896/video/184610191 DAMmed If You Do or Don’t: Life Cycles of Digital Assets]
::Imaging of specimens is now regular curatorial practice in entomological collections, complementing longer-standing efforts to capture label data and related information.  Many different imaging approaches exist, but a common thread is that vast quantities of images are being amassed rapidly around the globe.  Managing, preserving, and safeguarding this proliferation of images is critical to the success of digitizing entomological collections.  This talk examines the life cycle of digital assets produced during imaging projects at the Yale Peabody Museum, with focus on student driven workflows in the Entomology Division and other curatorial departments.  Once acquired, Peabody’s digital assets flow through its collections management system into a Yale University-wide digital asset management system (DAM).  Peabody Entomology helped develop the Yale DAM, harmonize workflow and metadata from dissimilar campus units, and integrate several collections management systems with a single DAM endpoint.  Adopting this infrastructure has allowed Peabody to disseminate its images and specimen metadata more broadly into “foreign” contexts, such as the Yale Library’s Finding Aid system and a campus asset discovery portal, alongside more well-known biodiversity outlets for entomological collections such as GBIF and the National Science Foundation’s iDigBio initiative.
| '''Lawrence Gall''' (lawrence.gall@yale.edu), Yale University, New Haven, CT
|-
|515
| Involving undergraduates in the digital community: Leveraging collections preservation, research, and outreach through a network of natural history collections clubs
| [https://vimeo.com/album/4168896/video/184610204 Involving undergraduates in the digital community: Leveraging collections preservation, research, and outreach through a network of natural history collections clubs]
::In February of 2013 nine students at Arkansas State University came together to form the Natural History Collections Curation Club (NHC3). This club was an innovative approach to resolving many issues facing the natural history collections at A-State. The university houses collections in many disciplines. The collections were primarily built in the 1960s and 1970s and by 2013 several of the collections were in disrepair due to a lack of funding and support. The students of the club made it their goal to restore the collections by dedicating their time and helping to secure funding. These efforts have resulted in funding from the Dean of the College of Sciences and Mathematics for a part-time student worker in the collections, supplies for several projects including jars and ethanol for restoring the fish collections and materials to create two large specimen mounts, and trips to visit several natural history museums. The NHC3 has helped A-State become recognized in the collections field where it was previously unknown. The club has also helped other universities increase student interest and involvement in collections. To date, two other universities have active natural history collections clubs as a result of the A-State model. Beginning in the fall of 2015 these three clubs will form a network to outreach to other universities that may benefit from this model. Our goal is to use the Natural History Collections Club Network (NHCCN) as a platform to motivate students across the United States to become more involved in university specimen collections.
| '''Kari Harris''' (kari.panhorst@smail.astate.edu), Arkansas State University, Jonesboro, AR