Key Points
- Introduction
- Working with OpenRefine
- Filtering and Sorting with OpenRefine
- Examining Outliers in OpenRefine
- Using Scripts
- Exporting and Saving Data from OpenRefine
- Other Resources in OpenRefine
Glossary
including tab separated (tsv), comma separated (csv), Excel (xls, xlsx), JSON, XML, RDF as XML, Google Spreadsheets
- csv
- A file extension indicating that a text file has values separated by commas (comma-separated values).
- Clustering
- A method for finding groups of different values that may actually represent the same thing.
- Faceting
- A method for exploring the values in a variable. In this episode it is used to explore the values in order to identify errors in data entry.
- Filter
- To select a subset of data from a dataframe.
- JSON
- A file extension indicating that the values in a text file are structured using JavaScript Object Notation (JSON).
- RDF
- A file extension indicating that the values in a file are structured using Resource Description Framework (RDF).
- Regular expressions (regex)
- A text string that describes a search pattern. Regular expressions usually incorporate wildcards to match letters, numbers, punctuation, spacing, or some combination of these.
- tsv
- A file extension indicating that a text file has values separated by tabs (tab-separated values).
- xls
- A file extension indicating that a file is a spreadsheet created by Microsoft Excel.
- xlsx
- A file extension indicating that a file is a spreadsheet created by Microsoft Excel using XML.
- XML
- A file extension indicating that the values in a file are structured using Extensible Markup Language (XML).
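The glossary terms faceting, clustering, filter, and regular expressions can be illustrated outside OpenRefine with a minimal Python sketch. It is not OpenRefine's own code: the function names (`facet`, `fingerprint`, `cluster`, `filter_rows`) and the sample values are hypothetical, and the keying step is a simplified version of a key-collision "fingerprint" approach (trim, lowercase, strip punctuation, sort unique tokens).

```python
import re
import string
from collections import Counter, defaultdict

# Hypothetical column of village names with inconsistent data entry.
villages = ["Chirodzo", "chirodzo ", "Ruaca", "Ruaca.", "Ruaca", "God"]


def facet(values):
    """Faceting: count how often each distinct value occurs."""
    return Counter(values)


def fingerprint(value):
    """Simplified key: trim, lowercase, drop punctuation, sort unique tokens."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Clustering: group distinct values that share the same fingerprint."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    # Only groups with more than one spelling are candidate clusters.
    return [sorted(group) for group in groups.values() if len(group) > 1]


def filter_rows(values, pattern):
    """Filter: keep only the values matching a regular expression."""
    regex = re.compile(pattern)
    return [value for value in values if regex.search(value)]


if __name__ == "__main__":
    print(facet(villages))
    print(cluster(villages))
    print(filter_rows(villages, r"^[Rr]uaca"))
```

The facet counts expose the variant spellings, the clusters suggest which variants might be merged, and the filter selects a subset using a regular expression. In each case a person still decides whether the grouped values really mean the same thing.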