Glossary of Tools

From iDigBio
Revision as of 15:30, 15 December 2011 by Kevinlove
ShortName LongName URL Definition
Amazon EC2 Amazon Elastic Compute Cloud http://aws.amazon.com/ec2/ Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers. Amazon EC2's simple web service interface allows you to obtain and configure capacity with minimal friction. It provides you with complete control of your computing resources and lets you run on Amazon's proven computing environment. Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change. Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use. Amazon EC2 provides developers the tools to build failure-resilient applications and isolate themselves from common failure scenarios.
Animal Diversity Web Animal Diversity Web http://animaldiversity.ummz.umich.edu/site/index.html Animal Diversity Web (ADW) is an online database of animal natural history, distribution, classification, and conservation biology at the University of Michigan. Data include descriptions, still images, video, and audio.
ArcGIS ArcGIS http://www.esri.com/products/index.html State-of-the-art, industry standard geographical information system software built by ESRI.
Audubon Core Audubon Core http://species-id.net/wiki/Audubon_Core The Audubon Core is a set of vocabularies designed to represent metadata for biodiversity multimedia resources and collections. These vocabularies aim to represent information that will help to determine whether a particular resource or collection will be fit for some particular biodiversity science application before acquiring the media. Among others, the vocabularies address such concerns as the management of the media and collections, descriptions of their content, their taxonomic, geographic, and temporal coverage, and the appropriate ways to retrieve, attribute and reproduce them.
Automontage Automontage http://www.syncroscopy.com/syncroscopy/products.asp A software package produced by Syncroscopy and designed to produce clearly focused digital images with extreme depth-of-field.
BioGeoMancer BioGeoMancer http://www.biogeomancer.org/index.html A worldwide collaboration of natural history and geospatial data experts. The primary goal of the project is to maximize the quality and quantity of biodiversity data that can be mapped in support of scientific research, planning, conservation, and management. The project promotes discussion, manages geospatial data and data standards, and develops software tools in support of this mission.
CombineZ CombineZ http://www.broadhurst-family.co.uk/lefteye/MainPages/combinez.htm Freeware stacking software designed to produce a single, clearly focused image with high depth-of-field. Supports batch processing.
DiGIR Distributed Generic Information Retrieval http://digir.net/ DiGIR is a client/server protocol for retrieving information from distributed resources. It uses HTTP as the transport mechanism and XML for encoding messages sent between client and server. It is an open source project, originally conceived as the replacement for the Z39.50 protocol used in the Species Analyst project, but it is intended to work with any type of information, not just natural history collections.
djatoka djatoka http://www.dlib.org/dlib/september08/chute/09chute.html Open source JPEG 2000 image server.
Dryad Dryad http://datadryad.org/ Dryad is an international repository of data underlying peer-reviewed articles in the basic and applied biosciences. Dryad enables scientists to validate published findings, explore new analysis methodologies, repurpose data for research questions unanticipated by the original authors, and perform synthetic studies. Dryad is governed by a consortium of journals that collaboratively promote data archiving and ensure the sustainability of the repository. As of Nov 18, 2011, Dryad contained 1090 data packages and 2583 data files, associated with articles in 94 journals.
DSLR Digital Single Lens Reflex Camera http://en.wikipedia.org/wiki/Digital_single-lens_reflex_camera A digital camera, usually with interchangeable lenses, that captures digital images in one or more of a variety of formats.
DwC Darwin Core http://www.tdwg.org/standards/450/ The Darwin Core is a body of standards. It includes a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to facilitate the sharing of information about biological diversity by providing reference definitions, examples, and commentaries. The Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, and samples, and related information. Included are documents describing how these terms are managed, how the set of terms can be extended for new purposes, and how the terms can be used.
EXIF Exchangeable Image File Format http://www.exif.org/ A standard for storing image metadata and interchange information in image files, especially those using JPEG compression. Most digital cameras now use the EXIF format. The format is part of the DCF standard created by JEITA to encourage interoperability between imaging devices.
FilteredPush FilteredPush http://etaxonomy.org/mw/FilteredPush FilteredPush is an NSF-funded project of the Harvard University Herbarium and the UMASS-Boston Biodiversity Informatics Lab to build a platform for distributed annotation of distributed data. It provides feedback to the data provider and any other parties interested in the annotations, using configurable filters that select notifications based on attributes of the annotation or the data. FilteredPush is designed to connect remote sites where annotations can be generated with the authoritative databases of the collections holding the vouchers to which those annotations apply. The name reflects the function: "Push" because annotations can be pushed from remote corners of the network back to authoritative data sets, and "Filtered" because the curators of these data sets can filter and reject annotations of their data.
GEOLocate GEOLocate http://www.museum.tulane.edu/geolocate/ Software designed to georeference natural history collections data by interpreting locality data and converting them into specific geographic coordinate pairs, including an estimate of precision based on the point-radius method. GEOLocate may be downloaded as a standalone desktop application, used via a web-based interface, or run as a web service against an existing online database.
GigaPan GigaPan http://gigapansystems.com/ A robotic camera harness that allows one of numerous DSLR cameras to take multiple images that can be stitched together to form gigapixel, high-resolution image files.
GNIS Geographic Names Information System http://geonames.usgs.gov/pls/gnispublic/f?p=154:1:2731965477055666 The Geographic Names Information System (GNIS) is the federal and national standard for geographic nomenclature in the U.S. The U.S. Geological Survey developed the GNIS in support of the U.S. Board on Geographic Names as the official repository of domestic geographic names data, the official vehicle for geographic names used by all departments of the Federal Government, and the source for applying geographic names to Federal electronic and printed products. Datasets that can be downloaded include latitude/longitude coordinates, and may be imported into a GIS to assist in georeferencing.
Google Analytics Google Analytics http://www.google.com/analytics/ Google Analytics is the enterprise-class web analytics solution that gives you rich insights into your website traffic and marketing effectiveness. Powerful, flexible, and easy-to-use features now let you see and analyze your traffic data in an entirely new way.
Hadoop Hadoop http://hadoop.apache.org/ The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.
HERBIS Erudite Recorded Botanical Information Synthesizer http://www.herbis.org/ This project offers proof of concept and an initial implementation of 'one-button' specimen imaging and data capture by which clicking the shutter on a digital camera would initiate a sequence that culminates with the population of label data and a specimen image into a structured collection database. The ultimate goal is to reduce the total cost of digital collection data capture by significantly reducing human labor required and total project duration. Significant gains can be achieved by developing appropriate protocols and methodologies, then packaging them as web services. Much of this can be accomplished by applying existing technology to data acquisition bottlenecks. The HERBIS webservice is now located at: http://www3.isrl.illinois.edu/~TeleNature/Herbis/src/web/htdocs/
HUBzero HUBzero http://hubzero.org/ HUBzero is a platform created by Purdue University in collaboration with several other institutions and is used to create dynamic web sites for scientific research and educational activities. HUBzero allows users to easily publish research software and related educational materials on the web. It is specifically designed for scientific applications. InvertNet.org is built on this platform.
ICR Intelligent Character Recognition http://en.wikipedia.org/wiki/Intelligent_character_recognition An advanced form of optical character recognition (OCR), or more specifically a handwriting recognition system, that allows fonts and different styles of handwriting to be learned by a computer during processing to improve accuracy and recognition levels.
ICZN International Commission on Zoological Nomenclature http://iczn.org/ The International Commission on Zoological Nomenclature (ICZN) acts as adviser and arbiter for the zoological community by generating and disseminating information on the correct use of the scientific names of animals. The ICZN is responsible for producing the International Code of Zoological Nomenclature, a set of rules for the naming of animals and the resolution of nomenclatural problems.
ImageMagick ImageMagick http://www.imagemagick.org/script/index.php ImageMagick is a software suite to create, edit, compose, or convert bitmap images. It can read and write images in a variety of formats (over 100) including DPX, EXR, GIF, JPEG, JPEG-2000, PDF, PhotoCD, PNG, Postscript, SVG, and TIFF. ImageMagick can be used to resize, flip, mirror, rotate, distort, shear and transform images, adjust image colors, apply various special effects, or draw text, lines, polygons, ellipses and Bezier curves.
IPNI International Plant Names Index http://ipni.org/ The International Plant Names Index (IPNI) is a database of the names and associated basic bibliographical details of seed plants, ferns, and lycophytes. Its goal is to eliminate the need for repeated reference to primary sources for basic bibliographic information about plant names. The data are freely available and are gradually being standardized and checked. IPNI is the product of a collaboration between The Royal Botanic Gardens, Kew, The Harvard University Herbaria, and the Australian National Herbarium.
IPT GBIF Integrated Publishing Toolkit http://ipt.gbif.org/ The IPT is an open source software server written in Java that allows users to publish and share biodiversity datasets through the Global Biodiversity Information Facility (GBIF) network. Designed for interoperability, it enables the publishing of content in databases or text files using open standards, namely the Darwin Core and the Ecological Metadata Language.
ITIS Integrated Taxonomic Information System http://www.itis.gov/ Authoritative, searchable taxonomic information on plants, animals, fungi, and microbes of North America and the world, including synonymous and accepted names.
JSON JavaScript Object Notation http://www.json.org/ JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language.
KE Emu Electronic Museum Management System http://www.kesoftware.com/emu This is a commercial software product from KE Software that is used by several biological collections. According to the vendor's website: KE Software's Electronic Museum management system, EMu, is a collections management system for all museums, from the small to the very large. Engineered to manage all types of collections, EMu is suited to: Cultural collections, Anthropology, Archaeology, Science and Technology; Paintings, Drawings, Prints, Sculpture and 3-dimensional objects, Decorative Art, Performing Art, Photography, Textiles and Digital Objects; Natural History collections, including Zoology, Earth Sciences, Palaeobiology, Botany, Horticulture and Physical Anthropology; and Special collections, Digital Assets, Historical Societies and Archives.
KML Keyhole Markup Language http://code.google.com/apis/kml/ Keyhole Markup Language (KML) is an XML notation for expressing geographic annotation and visualization within Internet-based, two-dimensional maps and three-dimensional Earth browsers. KML was developed for use with Google Earth, which was originally named Keyhole Earth Viewer. It was created by Keyhole, Inc, which was acquired by Google in 2004. KML is an international standard of the Open Geospatial Consortium. Google Earth was the first program able to view and graphically edit KML files. Other projects such as Marble have also started to develop KML support.
MaNIS Mammal Networked Information System http://www.manisnet.org/ With support from the National Science Foundation, seventeen North American institutions and their collaborators developed the Mammal Networked Information System. The original objectives of MaNIS were to 1) facilitate open access to combined specimen data from a web browser, 2) enhance the value of specimen collections, 3) conserve curatorial resources, and 4) use a design paradigm that can be easily adopted by other disciplines with similar needs. As an NSF-funded initiative, MaNIS achieves these objectives while avoiding the need for long-term, external maintenance of the network and centralized data management.
MANTIS MantisBT http://www.mantisbt.org/ MantisBT is a free, popular web-based bug tracking system. It is written in the PHP scripting language and works with MySQL, MS SQL, and PostgreSQL databases and a webserver. MantisBT has been installed on Windows, Linux, Mac OS, OS/2, and others. Almost any web browser should be able to function as a client. It is released under the terms of the GNU General Public License (GPL).
MapReduce MapReduce http://hadoop.apache.org/mapreduce/ Hadoop MapReduce is a programming model and software framework for writing applications that rapidly process vast amounts of data in parallel on large clusters of compute nodes.
Morphbank Morphbank :: Biological Imaging http://www.morphbank.net/index.php Morphbank :: Biological Imaging is a continuously growing database of images that scientists use for international collaboration, research and education. Images deposited in Morphbank :: Biological Imaging document a wide variety of research including: specimen-based research in comparative anatomy, morphological phylogenetics, taxonomy and related fields focused on increasing our knowledge about biodiversity. The project receives its main funding from the Biological Databases and Informatics program of the National Science Foundation (Grant DBI-0446224). Morphbank :: Biological Imaging was established in 1998 by a Swedish-Spanish-American group of entomologists and is currently housed at the School of Computational Science (SCS) at Florida State University. The project has grown immensely since its beginnings and presently includes a team of 15 biologists, computer scientists, and information scientists who are working on developing the software. Morphbank :: Biological Imaging is dedicated to using open-source software and is a Fair Use Web Site. The software used in the current Morphbank :: Biological Imaging system includes PHP, ImageMagick, MySQL, Apache, Java, and JavaScript.
NCSA The National Center for Supercomputing Applications http://www.ncsa.illinois.edu/ The National Center for Supercomputing Applications (NCSA), located at the University of Illinois at Urbana-Champaign, provides powerful computers and expert support that help thousands of scientists and engineers across the country improve our world. With the computing power available at NCSA, researchers simulate how galaxies collide and merge, how proteins fold and how molecules move through the wall of a cell, how tornadoes and hurricanes form, and other complex natural and engineered phenomena.
NIMBUS Nimbus Platform and Infrastructure http://www.nimbusproject.org/ An integrated set of tools that deliver the power and versatility of infrastructure clouds to scientific users. Nimbus Platform allows you to combine Nimbus, OpenStack, Amazon, and other clouds. Nimbus Infrastructure is an open source EC2/S3-compatible Infrastructure-as-a-Service implementation specifically targeting features of interest to the scientific community such as support for proxy credentials, batch schedulers, best-effort allocations, and others.
ODBC Open Database Connectivity http://en.wikipedia.org/wiki/ODBC ODBC (Open Database Connectivity) is a standard C interface for accessing database management systems (DBMS). The designers of ODBC aimed to make it independent of database systems and operating systems. An application can use ODBC to query data from a DBMS, regardless of the operating system or DBMS it uses. ODBC accomplishes DBMS independence by using an ODBC driver as a translation layer between the application and the DBMS. The application uses ODBC functions, and the driver passes the query to the DBMS.
ORNIS ORNIS http://www.ornisnet.org/home Over 5 million bird specimens are housed in North American collections, documenting the composition, distribution, ecology, and systematics of the world's estimated 10,000-16,000 bird species. Millions of additional observational records are held in diverse data sets. ORNIS addresses the urgent call for increased access to these data in an open and collaborative manner, and involves development of a suite of online software tools for data analysis and error-checking. This project, funded by the National Science Foundation, expands on existing infrastructure developed for distributed mammal (MaNIS), amphibian and reptile (HerpNet), and fish (FishNet) databases. Improved access to avian data sets will allow predictive uses to reveal patterns and processes of evolutionary and ecological phenomena that have not been apparent heretofore. Along with similar infrastructures for other vertebrate groups, it also will enable detailed and synthetic knowledge of the earth's biodiversity for tracking climate change, emerging diseases (e.g., West Nile Virus), and other conservation challenges for species in the 21st century.
PaleoPortal The Paleontology Portal http://www.paleoportal.org/ This site is a resource for anyone interested in paleontology, from the professional in the lab to the interested amateur scouting for fossils to the student in any classroom. Many resources are gathered into this single entry "portal" to paleontological information on the Internet.
RDF Resource Description Framework http://www.w3.org/RDF/ RDF is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed. RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link (this is usually referred to as a triple). Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications. This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. This graph view is the easiest possible mental model for RDF and is often used in easy-to-understand visual explanations.
REST REpresentational State Transfer http://en.wikipedia.org/wiki/Representational_state_transfer Representational state transfer (REST) is a style of software architecture for distributed hypermedia systems such as the World Wide Web. The REST architectural style was developed in parallel with HTTP/1.1, based on the existing design of HTTP/1.0. The largest implementation of a system conforming to the REST architectural style is the World Wide Web. REST exemplifies how the Web's architecture emerged by characterizing and constraining the macro-interactions of the four components of the Web, namely origin servers, gateways, proxies, and clients, without imposing limitations on the individual participants. As such, REST essentially governs the proper behavior of participants.
Scratchpads Scratchpads http://scratchpads.eu/ Scratchpads is a social networking tool to build, share, and publish information on the diversity of life. It is part of the EU funded ViBRANT project, and the NERC funded e-Monocot project. As part of these initiatives a new version of the Scratchpads will be released in January 2012.
SOAP Simple Object Access Protocol http://en.wikipedia.org/wiki/SOAP SOAP, originally defined as Simple Object Access Protocol, is a protocol specification for exchanging structured information in the implementation of Web Services in computer networks. It relies on Extensible Markup Language (XML) for its message format, and usually relies on other Application Layer protocols, most notably Hypertext Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP), for message negotiation and transmission. SOAP can form the foundation layer of a web services protocol stack, providing a basic messaging framework upon which web services can be built.
Specify Specify http://specifysoftware.org/content/welcome-specify-6 The NSF-funded Specify Software Project offers Specify 6 and allied applications for museum and herbarium research data processing. Specify 6 handles specimen information for computerizing collection holdings, for tracking specimen and tissue management transactions, and for mobilizing species occurrence data to the Internet. Specify runs on Windows, Mac OS X, and Linux computers; it is free and open source licensed. Specify 6.0 was released on 10 April 2009.
SQL Structured Query Language http://en.wikipedia.org/wiki/SQL SQL, often referred to as Structured Query Language, is a programming language designed for managing data in relational database management systems (RDBMS). Originally based upon relational algebra and tuple relational calculus, its scope includes data insert, query, update and delete, schema creation and modification, and data access control. SQL became a standard of the American National Standards Institute (ANSI) in 1986, and of the International Organization for Standardization (ISO) in 1987. Since then the standard has been enhanced several times with added features. However, issues of SQL code portability between major RDBMS products still exist due to lack of full compliance with, or different interpretations of, the standard.
TACC Texas Advanced Computing Center http://www.tacc.utexas.edu/ The Texas Advanced Computing Center (TACC) at The University of Texas at Austin is one of the leading centers of computational excellence in the United States. The center's mission is to enable discoveries that advance science and society through the application of advanced computing technologies.
TAPIR TDWG Access Protocol for Information Retrieval http://www.tdwg.org/activities/tapir/ TAPIR specifies a standardized, stateless, HTTP transmittable, XML-based request and response protocol for accessing structured data that may be stored on any number of distributed databases of varied physical and logical structure. TAPIR combines and extends features of the BioCASe and DiGIR protocols to create a new and more generic means of communication between client applications and data providers using the Internet.
TDWG Biodiversity Information Standards (TDWG) http://www.tdwg.org/ Also known as the Taxonomic Databases Working Group, TDWG is a not-for-profit scientific and educational association that is affiliated with the International Union of Biological Sciences. TDWG was formed to establish international collaboration among biological database projects. TDWG promoted the wider and more effective dissemination of information about the world's heritage of biological organisms for the benefit of the world at large. Biodiversity Information Standards (TDWG) now focuses on the development of standards for the exchange of biological/biodiversity data.
TOLKIN The Tree of Life Knowledge and Information Network http://www.tolkin.org/ TOLKIN is an information management and analytical web application to provide informatics support for phylodiversity and biodiversity research projects. As a web-based application, TOLKIN is able to support collaborative projects by providing shared access to a variety of data on voucher specimens, taxonomy, bibliography, morphology, DNA samples, and sequences.
TROPICOS TROPICOS http://www.tropicos.org/ Tropicos was originally created for internal research but has since been made available to the world's scientific community. All of the nomenclatural, bibliographic, and specimen data accumulated in MBG's electronic databases during the past 25 years are publicly available here. This system has over 1.2 million scientific names and 3.9 million specimen records.
VMWare VMWare http://www.vmware.com/ A company providing virtualization software, founded in 1998 and based in Palo Alto, California, USA. The company was acquired by EMC Corporation in 2004 and operates as a separate software subsidiary. The company is most notable for its hypervisors, also called virtual machine managers (VMMs), which implement one of many hardware virtualization techniques that allow multiple operating systems, termed guests, to run concurrently on a host computer.
Windows Azure Windows Azure http://www.microsoft.com/windowsazure/ Windows Azure and SQL Azure enable users to build, host, and scale applications in Microsoft datacenters. They require no up-front expenses, no long term commitment, and enable users to pay only for the resources used.
WSDL Web Services Description Language http://www.w3.org/TR/wsdl WSDL is an XML format for describing network services as a set of endpoints operating on messages containing either document-oriented or procedure-oriented information. The operations and messages are described abstractly, and then bound to a concrete network protocol and message format to define an endpoint. Related concrete endpoints are combined into abstract endpoints (services). WSDL is extensible to allow description of endpoints and their messages regardless of what message formats or network protocols are used to communicate.
Xen Xen hypervisor http://xen.org/ The Xen hypervisor, the powerful open source industry standard for virtualization, offers a powerful, efficient, and secure feature set for virtualization of x86, x86_64, IA64, ARM, and other CPU architectures. It supports a wide range of guest operating systems including Windows, Linux, Solaris, and various versions of the BSD operating systems. Xen powers most public cloud services and many hosting services, such as Amazon Web Services, Rackspace Hosting, and Linode.
XML Extensible Markup Language http://www.w3.org/XML/ Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML (ISO 8879). Originally designed to meet the challenges of large-scale electronic publishing, XML is also playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere.
XSEDE Extreme Science and Engineering Discovery Environment https://www.xsede.org/home The Extreme Science and Engineering Discovery Environment (XSEDE) is the most advanced, powerful, and robust collection of integrated advanced digital resources and services in the world. It is a single virtual system that scientists can use to interactively share computing resources, data, and expertise. Scientists and engineers around the world use these resources and services (things like supercomputers, collections of data, and new tools). XSEDE, and the experts who lead the program, will make these resources easier to use and help more people use them. The five-year, $121-million project is supported by the National Science Foundation. It replaces and expands on the NSF TeraGrid project. More than 10,000 scientists used the TeraGrid to complete thousands of research projects, at no cost to the scientists. That same sort of work (only in more detail, generating more new knowledge and improving our world in an even broader range of fields) continues with XSEDE.
Zoomify Zoomify http://www.zoomify.com/ A popular web-based image viewer that allows easy panning and zooming.
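
The RDF entry above describes data as triples that form a directed, labeled graph and that can be merged even when the sources differ. A minimal sketch of that idea in plain Python may help; it is not an RDF library, and the `http://example.org/` URIs, the predicate names, and the helper functions `merge` and `objects` are all hypothetical, chosen only for illustration.

```python
# Each triple is (subject, predicate, object): two graph nodes joined by a
# named edge. All URIs below are made-up examples, not real vocabularies.
EX = "http://example.org/"

# Two independently produced data sets.
specimens = {
    (EX + "specimen/1", EX + "collectedBy", EX + "person/42"),
    (EX + "specimen/1", EX + "scientificName", "Quercus alba"),
}
people = {
    (EX + "person/42", EX + "name", "A. Collector"),
}


def merge(*graphs):
    """Merge graphs by taking the set union of their triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def objects(graph, subject, predicate):
    """Follow the edge labeled `predicate` out of the node `subject`."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


graph = merge(specimens, people)

# Walk two edges: specimen -> collector -> collector's name.
collector = objects(graph, EX + "specimen/1", EX + "collectedBy")[0]
collector_name = objects(graph, collector, EX + "name")
print(collector_name)
```

Because both data sets name the collector with the same URI, the merged graph links the specimen to the collector's name without either source having to know the other's schema.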