IT

Exploring unique values in iDigBio using Apache Spark

Exploring large datasets is always challenging. Often you must choose between subsetting the dataset (randomly or on some facet), making slow progress while waiting for results only to find that something needs to be fixed, or optimizing code for performance before you even know whether the result will be interesting. A high-performance system capable of ad hoc investigation has always been difficult, expensive, or both.

iDigBio IT Standards Workshop

iDigBio Informatics and Cyberinfrastructure Workshop

Building 105, University of Florida

(Noon, Wed. 28 March – 5pm, Fri. 30 March)

iDigBio is sponsoring a focused workshop to discuss, define, and distill the standards and cyberinfrastructure requirements for scientific collections and the biodiversity sciences. The working assumption is that the biodiversity informatics community has already made significant headway on certain tools, practices, and standards, and that the crucial next step is to identify critical gaps in infrastructure and training that can become a further focus of community activities.
