Friday, December 3, 2010

Interesting New Tool from Google for Cleaning Data

Announcing Google Refine 2.0, a power tool for data wranglers


Google acquired Metaweb last July and has now released a version of Metaweb's Freebase Gridworks, an open source software project for cleaning and enhancing entire data sets, under the name Google Refine.


Google Refine is a tool for working with messy data sets: cleaning up inconsistencies, transforming the data from one format into another, and extending it with new data from external web services or other databases.


The product is used, among other places, in the data journalism and open government data communities (see, for example, Chicago Tribune, ProPublica and data.gov.uk). To learn more about what you can do with Google Refine 2.0, watch the following screencasts:







