Resources for using OpenRefine: Difference between revisions

From iDigBio

Revision as of 17:47, 20 May 2022

Why use OpenRefine?

OpenRefine is an open-source tool for manipulating small or large datasets in numerous formats (CSV, JSON, XML, etc.). Because it has a low barrier to entry and requires no prior programming knowledge, OpenRefine is an excellent tool for improving and maintaining data integrity, in line with best practices in collections management. Data transformations are reversible and repeatable, and original data are preserved locally. The learning curve for OpenRefine is moderate, and a large community of users and a shared knowledge base are available for help. You can use the resources on this wiki page as a starting point!

When to use OpenRefine

  • For quality control, e.g. to clean recent data entry prior to (or after) database ingestion, or to clean legacy data.
  • For combining and manipulating existing datasets, e.g. to transform or integrate your data with external resources like those in a taxonomic authority or Wikidata.

When not to use OpenRefine

  • For adding new records individually to an existing dataset, e.g. when transcribing specimen labels.
  • For text-heavy one-off data entry, e.g. when typing a sentence in a notes field associated with each row.
  • For projects with multiple users on separate computers.

Getting started

Download OpenRefine from https://openrefine.org.

Handy expressions

value+"yourtexthere"

value.toDate().toString('yyyy-MM-dd')

value.replace(/\s+/,' ')
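
To check what the whitespace expression above does outside OpenRefine, here is a rough Python equivalent. It is only a sketch, not OpenRefine's implementation: the function name collapse_whitespace is made up for illustration, and it assumes that replace with a regular expression substitutes every match in the cell.

```python
import re

def collapse_whitespace(value: str) -> str:
    # Replace each run of whitespace (spaces, tabs, newlines) with one space.
    return re.sub(r"\s+", " ", value)

print(collapse_whitespace("Quercus   alba\t L."))
```

Leading and trailing whitespace is reduced to a single space rather than removed, so a separate trim step may still be needed.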

cells["COLUMN-1"].value[0] == cells["COLUMN-2"].value[0]

cell.cross("TABLE-2", "COLUMN-TO-MATCH-ON")[0].cells["COLUMN-TO-GET-VALUE-FROM"].value
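
The cross expression above looks up the current cell's value in another project and pulls a value from the first matching row. The idea can be sketched in Python as follows. The table contents, the column names, and the function cross_lookup are all hypothetical, and returning None when nothing matches is a choice made for this sketch rather than OpenRefine's behavior.

```python
# Hypothetical stand-in for the rows of a second OpenRefine project.
table_2 = [
    {"catalogNumber": "A-001", "collector": "M. Smith"},
    {"catalogNumber": "A-002", "collector": "J. Doe"},
]

def cross_lookup(value, other_rows, match_column, value_column):
    """Return value_column from the first row whose match_column equals value."""
    for row in other_rows:
        if row[match_column] == value:
            return row[value_column]
    return None  # no matching row in the other table

print(cross_lookup("A-002", table_2, "catalogNumber", "collector"))
```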

forEach(row.record.cells['COLUMN'].value,v,v).uniques().length()
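
The expression above counts how many distinct values one column holds within a record. A minimal Python sketch of the same count follows. The function name count_unique is made up for illustration, and skipping blank cells is an assumption of the sketch.

```python
def count_unique(values):
    # Drop blank cells, then count the distinct values that remain.
    return len({v for v in values if v not in (None, "")})

# Hypothetical values of one column across the rows of a single record.
print(count_unique(["Smith", "Doe", "Smith", None, ""]))
```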

forEach(value.parseJson().results[0].TARGET,x,[x.types[0], x.TARGET].join("::")).join("|")
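
The expression above parses a JSON response, walks a list inside the first result, and joins each item's first type and a chosen field with "::", then joins the items with "|". The Python sketch below follows the same steps. The field names "components" and "name" are hypothetical stand-ins for the two TARGET placeholders, and the sample response is invented for illustration.

```python
import json

def summarize(response_text, list_key, name_key):
    """Join '<first type>::<name>' for each item in results[0][list_key] with '|'."""
    first_result = json.loads(response_text)["results"][0]
    return "|".join(
        "::".join([item["types"][0], item[name_key]])
        for item in first_result[list_key]
    )

# Invented sample response; real responses will use different field names.
sample = json.dumps({"results": [{"components": [
    {"types": ["country", "political"], "name": "Canada"},
    {"types": ["locality"], "name": "Ottawa"},
]}]})

print(summarize(sample, "components", "name"))
```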

"https://maps.googleapis.com/maps/api/geocode/json?latlng="+value+"&key=KEY"
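
The expression above builds a request URL from a cell that holds a "latitude,longitude" pair. A short Python sketch of the same string-building step follows. The function name geocode_url is hypothetical, KEY remains a placeholder for your own API key, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode

BASE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

def geocode_url(latlng: str, key: str) -> str:
    # safe="," keeps the comma in "lat,lng" unescaped in the query string.
    return BASE_URL + "?" + urlencode({"latlng": latlng, "key": key}, safe=",")

print(geocode_url("40.71,-74.00", "KEY"))
```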

Join the community

There are many audiences for OpenRefine, and the best community to join is one that aligns with your usage context and skill level. The OpenRefine Google Group is maintained by the OpenRefine project, and most messages posted there are fairly technical.

Tutorials

Data Carpentry: OpenRefine for Natural History Collection Data

Griffith University Library: Data Wrangling Introduction

Library Carpentry

YouTube & Vimeo