March 24, 2017


Status of the Project as of March 24th

I am currently in the process of exploring three tools:

  • OpenRefine, specifically name reconciliation and the RDF plugin
  • Karma
  • Gephi

My current challenges and end goals:

  • OpenRefine
    • I have been using OpenRefine to clean up ingest data.
    • I am also using it for name entity reconciliation, and have successfully reconciled names to Wikidata, VIAF, and LCSH.
    • My end goal, besides the normalization, would be to figure out how to reconcile names in the Whitney’s collection to other art museums with SPARQL endpoints, such as the Smithsonian American Art Museum.
  • Karma
    • I have been exploring Karma as a tool for mapping the Whitney’s Data to the CIDOC CRM.
    • OpenRefine can be used for mapping as well, and is somewhat easier to use than Karma, but it has limited options for RDF export (RDF/XML and Turtle only).
    • Karma supports the export of JSON-LD. My end goal would be to export the Whitney’s data mapped to CIDOC in this format.
    • Having tried to work with CSV data in Karma, I’m currently creating a MySQL database from my Whitney spreadsheets to see if this is a more effective intake format.
    • If successful, this could serve as a model for ingesting the Whitney’s data directly from TMS into Karma in the future.
  • Gephi
    • Having discovered that sections of the Getty Provenance Index and the Carnegie Museum of Art’s collection provenance are both available on GitHub as CSV files, I am starting to explore Gephi. I hope to use it to visualize the overlap between these provenance sources and the Whitney Founding collection, using Molly Reese-Lerner & Hannah Sistrunk’s project (http://pfch.nyc/linked_jazz_meets_carnegie_hall/index.html) from the Programming for Cultural Heritage class as a model.
    • I have been experimenting with CSV data in Gephi, but have found that I would need to create RDF files first to connect the datasets.
  • Tableau
    • As Gephi is rather technical, I started working with the Whitney, Getty, and Carnegie datasets in Tableau.
    • Tableau may be a good visualization tool, but I’ve realized I need to do more normalization work before using it.
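
The Gephi notes above mention needing RDF files to connect the datasets. As a rough illustration of what that step could look like outside of OpenRefine or Karma, here is a minimal Python sketch that turns CSV rows into N-Triples lines. The `http://example.org/` namespace, the `ownedBy` predicate, and the `title` and `owner` column names are all placeholders I made up for the example; none of them come from the Whitney, Getty, or Carnegie files, and a real mapping would use CIDOC CRM terms instead.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace and predicate, chosen only for illustration.
BASE = "http://example.org/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWNED_BY = BASE + "ownedBy"


def slug_uri(kind, name):
    """Mint a URI from a name so the same name in two files gets the same URI."""
    normalized = " ".join(name.split()).lower()
    return f"{BASE}{kind}/{quote(normalized, safe='')}"


def escape_literal(text):
    """Escape the characters that N-Triples requires escaping in a literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def rows_to_ntriples(csv_text, title_col="title", owner_col="owner"):
    """Convert CSV text to a list of N-Triples lines (column names are assumed)."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        title = row[title_col].strip()
        owner = row[owner_col].strip()
        if not title or not owner:
            continue  # skip incomplete rows rather than guessing
        work = slug_uri("work", title)
        person = slug_uri("person", owner)
        lines.append(f'<{work}> <{RDFS_LABEL}> "{escape_literal(title)}" .')
        lines.append(f'<{person}> <{RDFS_LABEL}> "{escape_literal(owner)}" .')
        lines.append(f'<{work}> <{OWNED_BY}> <{person}> .')
    return lines
```

Minting URIs from normalized names is a crude stand-in for real reconciliation: it only connects two datasets when the names match exactly after lowercasing and whitespace cleanup, which is one reason the normalization work has to come first.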

OpenRefine

I’ve decided to try cleaning up the Carnegie and Getty data in OpenRefine and to try reconciling some of the names in it.

I’m starting with the Carnegie’s Baring art sales data (https://github.com/arttracks/baring_art_sales), which is kind of a mess. The seller and the artwork for each sale are both in one column, for instance, which is not very helpful for provenance. The names of the actual sellers are also often pretty vague (“Sir Addington”, etc.).
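
For the combined seller/artwork column, the same kind of split that OpenRefine offers can be sketched in a few lines of Python. This assumes, purely for illustration, that the two parts are separated by a consistent delimiter; I have not confirmed that this holds for the Baring file, and the function name and default delimiter are my own.

```python
def split_seller_and_work(cell, delimiter=","):
    """Split one combined cell into (seller, artwork).

    Assumes the seller comes first and the delimiter appears between the
    two parts. Cells without the delimiter are returned as (cell, None)
    so they can be reviewed by hand instead of being guessed at.
    """
    seller, found, work = cell.partition(delimiter)
    if not found:
        return cell.strip(), None
    return seller.strip(), work.strip()
```

Splitting only on the first delimiter keeps any later commas inside the artwork description, and returning `None` for unsplittable cells makes the messy rows easy to filter out for manual cleanup.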

In reconciling the Getty provenance data, I learned that Wikidata links to a bunch of institutional URIs for artists, including the Smithsonian and British Museum!

https://www.wikidata.org/wiki/Q704868
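
To see those institutional identifiers without clicking through each item page, one option is to ask the public Wikidata Query Service for every external-identifier statement on an item. The sketch below is a minimal version using only the Python standard library; the function names and the User-Agent string are placeholders of mine, and the list of results will depend on what Wikidata holds for the item at the time.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_external_id_query(qid):
    """Build a SPARQL query listing an item's external-identifier statements."""
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a Wikidata item id: {qid!r}")
    return f"""
SELECT ?propertyLabel ?value WHERE {{
  wd:{qid} ?claim ?value .
  ?property wikibase:directClaim ?claim ;
            wikibase:propertyType wikibase:ExternalId .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY ?propertyLabel
""".strip()


def fetch_external_ids(qid, user_agent="provenance-notes-example/0.1"):
    """Run the query and return (property label, identifier value) pairs.

    The User-Agent value is a placeholder; use one that identifies you.
    """
    params = urlencode({"query": build_external_id_query(qid), "format": "json"})
    request = Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={"User-Agent": user_agent,
                 "Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=30) as response:
        data = json.load(response)
    return [(b["propertyLabel"]["value"], b["value"]["value"])
            for b in data["results"]["bindings"]]
```

Calling `fetch_external_ids("Q704868")` should return the identifiers attached to the item linked above. The same pattern (build a query, send it to an endpoint, read the JSON bindings) would apply to another museum's SPARQL endpoint, though the query itself would have to follow that endpoint's data model.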

Karma

After some difficulty, I’m finally getting the hang of mapping with Karma.
