Introduction

Overview

Teaching: 10 min
Exercises: 0 min
Questions
  • What is OpenRefine useful for?

Objectives
  • Describe OpenRefine’s uses and applications.

  • Introduce some of OpenRefine’s features.

  • Locate helpful resources to learn more about OpenRefine.

Lesson

What is data cleaning exactly?

Motivations for the OpenRefine Lesson

Features

Before we get started

Note: OpenRefine is a Java program that runs on your machine (not in the cloud). You work with it inside your browser, but no web connection is needed.

Follow the Setup instructions to install OpenRefine.

If OpenRefine does not open automatically after you install and start it, point your browser at http://127.0.0.1:3333/ or http://localhost:3333 to open the program.
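If the page does not load, it can help to check whether anything is listening on that port at all. The following is a minimal sketch in Python, not part of OpenRefine itself; the function name is_listening is a hypothetical helper, and the host and port are simply the defaults mentioned above. It only tests that a TCP connection can be made, so it cannot confirm that the program answering is actually OpenRefine.

```python
import socket


def is_listening(host="127.0.0.1", port=3333, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    is_listening is a hypothetical helper for illustration; the defaults
    are the address and port given in the lesson text.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_listening():
        print("Port 3333 is open; try http://127.0.0.1:3333/ in your browser.")
    else:
        print("Nothing is listening on port 3333; OpenRefine may not be running.")
```

If the check reports that nothing is listening, start OpenRefine again and wait a few seconds before reloading the page.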

Getting help for OpenRefine

You can find out a lot more about OpenRefine at http://openrefine.org, where you can also check out some great introductory videos. These and other videos on OpenRefine can also be found on YouTube by searching for ‘OpenRefine’. There is a Google Group where many beginner questions and problems are answered. As with other programs of this type, libraries of OpenRefine scripts are available too: you can find a script you need and copy it into your OpenRefine instance to run it on your dataset.

What should I know when working with OpenRefine?

Key Points

  • OpenRefine is a powerful, free and open source tool that can be used for data cleaning.

  • OpenRefine automatically tracks every step you take, allowing you to backtrack as needed and providing a record of all work done.