Whitney Linked Open Data Fellowship Journal

August 19, 2016

Possible Directions for the Project
After reading Joshua’s documentation of his work this past year, I’m thinking I might like to focus on enriching data on collection objects or creating relationships between them, unless you think it […]

August 26, 2016

Exploration of Past Project Work and Possible Entities
Having started Matt Miller’s Program for Cultural Heritage class, I looked at a few of the past museum- and art-focused projects done by students in the class to get some ideas. One interesting project was Carlos Acevedo’s […]

September 2, 2016

After my second week of classes and listening to Professor Pattuelli lecture on Linked Open Data, I have a somewhat better sense of how the RDF triples used in LOD work, and how to use them to express relationships. Our discussion of LIDO and […]

September 9, 2016

I’m curious to what extent object metadata can be pulled from TMS. I found these applications, though given that the Whitney already has an online collection, they may be redundant:
http://binder.readthedocs.io/en/latest/user-manual/overview/intro.html
https://github.com/smoore4moma/tms-api
I’m interested in generating CSV file(s) from TMS with provenance info for objects […]

September 16, 2016

I’ve started a spreadsheet to record acquisition-related constituents for purchased objects in the Founding Collection. I’m noting each constituent’s name, name authority record, the objects they are constituents of, their relation to the object (artist, dealer, etc.), links to sites with more information, and possible […]

September 23, 2016

DBpedia and ULAN both seem to have richer sets of relationship properties than Wikidata, so they might be more useful resources to query. Robert Henri’s DBpedia page, for instance, has “movement”, “training”, “influenced”, “influenced by”, and “seeAlso” properties. Again, however, pulling data from DBpedia may […]

September 30, 2016

I’m kind of curious how the tables that make up the backend of the TMS database are organized. I’m taking Database Design this semester, and am eventually going to start working with MySQL in more depth. I’m wondering whether it would be worthwhile to make a […]

October 7, 2016

GraphDB
I received some overly aggressive sales emails after downloading a free version of the GraphDB desktop app, which have put me off using it. Basically, when I downloaded the free version, someone in the sales department at GraphDB […]

October 14, 2016

Triple Stores
Given that MySQL is not well suited as a triple store, I think it would make sense to migrate and append Joshua’s data in a non-relational, triple-store format. The Linked Jazz Project at Pratt uses Apache Marmotta for their triples, and I’m thinking […]

October 20, 2016

Discussion w/Profs Pattuelli and Miller
- What ontologies to use for triples? Research appropriate ontologies using Linked Open Vocabularies (https://lov.okfn.org/)
- Prof. Pattuelli is partial to CIDOC because it is the industry standard.
- Prof. Miller argues that the chosen ontology is not that important if the end […]

October 21, 2016

More on the Database/Triple Store Issue
- objectJSONLD.py & artistJSONLD.py – scripts used to create JSON-LD files from CSV (using the Python library PyLD)
- Is it better to just focus on the CSV files rather than the MySQL database? Either CSV files or SQL could be used […]

October 25, 2016

As of version 5.7.8, MySQL can be used to generate JSON. If that JSON could be put into a triple store, the generation of PHP files in addition to JSON files may not be necessary: http://dev.mysql.com/doc/refman/5.7/en/json-functions.html
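
To make the idea concrete for myself, here’s a quick Python stand-in for what a `JSON_OBJECT()`-style query would return per row. The table, columns, and sample values are hypothetical, not the Whitney’s actual schema:

```python
import json

# Hypothetical rows standing in for a MySQL result set from a query like:
#   SELECT JSON_OBJECT('id', object_id, 'title', title, 'year', date_made)
#   FROM objects;
# (table and column names are illustrative only)
rows = [
    (101, "Untitled", 1928),
    (102, "Study for a Mural", 1931),
]

def row_to_json(row):
    """Mimic the one-JSON-document-per-row output of JSON_OBJECT()."""
    object_id, title, year = row
    return json.dumps({"id": object_id, "title": title, "year": year})

documents = [row_to_json(r) for r in rows]
print(documents[0])
```

Each of those documents could then be handed to a JSON-LD/triple-store pipeline without an intermediate PHP step.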

October 28, 2016

I started a fork of the Whitney’s opendata GitHub repository for my project files. Not sure if forking or branching is the better approach for keeping my contributions to the project, but I can always change the structure later: https://github.com/MollieEcheverria/opendata I’m currently trying to re-model […]

November 4, 2016

Project Overview for Rest of Semester and Spring
I am aiming to have a data model for the Whitney ready and approved by the first week of December. Once the data model is established, I will query for name authority records for provenance-related constituents. As […]

November 10, 2016

I created a preliminary data model for the Whitney’s data using the Smithsonian’s and British Museum’s URIs and the Carnegie Museum’s provenance model as references. I also looked at a mapping of CIDOC to LIDO done by researchers at FORTH-ICS in 2010 (http://www.cidoc-crm.org/Resources/the-lido-model), and compared […]

November 17, 2016

Issues With CIDOC Mapping – CIDOC’s Lack of Namespace URIs
One issue I’ve noticed in trying to map data to CIDOC is that CIDOC doesn’t seem to provide individual namespace URIs for its different classes and properties. Instead, these terms are all stored in a […]

November 18, 2016

Linked Data in a Relational Database
Prof. Miller gave my Program for Cultural Heritage class read-only access to the NYPL’s archives database. Looking at the way this data is structured, particularly access terms for constituents, is helpful in thinking about tabular structure for the Whitney’s […]

November 22, 2016

https://neo4j.com/
https://app.graphenedb.com/dbs
http://www.linkeddatatools.com/introducing-rdf
http://blog.datagraph.org/2010/04/rdf-nosql-diff
https://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSIndex
http://wifo5-03.informatik.uni-mannheim.de/bizer/d2r-server/publishing/
https://sourceforge.net/projects/trdf/files/?source=navbar

December 2, 2016

Berenson Drawings of the Florentine Painters Project
Hearing Alex Provo speak in our Art Documentation class on the Drawings of the Florentine Painters linked data project has given me a lot of inspiration both for the handling of the Whitney ontology and for how to […]

December 12, 2016

Drawings of the Florentine Painters Project and Whitney Model
After reading more on Drawings of the Florentine Painters, I think the project offers a very concise model for how to proceed with the Whitney’s linked data in the spring. While I was originally thinking to […]

December 16, 2016

- Access server – access granted – Linux 3M – tricky; maybe not necessary
- Alex – can come in to look at project
- Look at what is on the server – is there stuff on there Josh added that is not mirrored elsewhere?
- Look at the […]

December 30, 2016

Setting Up Server Access
I was unfortunately unable to set up access to Joshua’s Whitney database server before the break. I have an appointment to troubleshoot the issue with Alison on Thursday, January 5th at 3.
VIAF
My first attempt at gathering name authorities for […]

January 5, 2017

Incorporating External URIs Into the Whitney’s Linked Data
After my success with querying VIAF last week, I’d like to explore incorporating person/institution URIs from other institutions into the Whitney’s constituent data as well. The Florentine Renaissance Drawings project plans to incorporate linked data and images from […]

January 12, 2017

Wikidata and SNAC
At the most recent Linked Jazz meeting, Karen noted that Wikidata now has Social Networks and Archival Context (SNAC) IDs for some entities (https://www.wikidata.org/wiki/Q188969). Not sure how many Whitney constituents would have Wikidata URIs/SNAC IDs, but this might present some interesting enrichment opportunities. […]

January 13, 2017

TMS Update Issues
The Whitney has just activated the 2016 version of TMS (upgraded from 2012). Due to my lack of admin privileges, I cannot uninstall the old version of TMS and replace it with the new one. I don’t use TMS that frequently, but this […]

January 20, 2017

Meeting to Touch Base on the Project
At Cristina’s suggestion, I used Doodle to try to schedule a meeting with everyone. She also suggested everyone could Skype rather than meeting in person. Given everyone’s conflicting schedules, and that Prof. Pattuelli will be at the ARLIS […]

January 23, 2017

Meeting With Everyone
I’m still waiting on a response from Matt regarding a meeting with everyone to discuss the Fellowship. I will ask him about it at the next Linked Jazz meeting on Thursday.
Tabular Data Cleaning
I made some good progress on Friday in […]

January 31, 2017

Data Refinement
I am continuing to work on cleaning the Whitney’s constituent data in OpenRefine and Google Sheets. I started looking into Google Fusion Tables, which seems like a good alternative to Tableau as a geocoding application. Possible source for data cleaning/reconciliation info: http://freeyourmetadata.org/ As […]

February 1, 2017

End Project Deliverable
- Interoperability w/ SAAM – what resources does SAAM have that the Whitney doesn’t?
- SAAM URIs for artists tend to have photos – do something with photos?
- Actors: distinguish between Person and Corporate Body

February 7, 2017

Data Modeling and Normalization
I’m continuing the work I started last week, focused on normalizing my and Joshua’s combined data by splitting each CIDOC entity class into its own sheet (which will eventually be the basis of a MySQL table). As I work on the […]

February 14, 2017

CIDOC Namespaces
Fortunately, CIDOC seems to have finally added back pages for its individual entity classes and properties: http://www.cidoc-crm.org/Version/version-6.2 Unfortunately, I don’t know if these pages can be considered persistent URIs, since each entity and property page is bookended by “/version-6.2”, and pages will not […]

February 16, 2017

Omeka as Publishing Platform
As per this upcoming Museums and the Web workshop (http://mw17.mwconf.org/proposal/innovative-applications-and-data-sharing-with-linked-open-data-in-museums-exploring-principles-and-examples/), Omeka apparently supports RDF exporting in some form. I’ve never used Omeka, but will be using it in one of my courses later this semester. I’m not sure how viable it […]

February 22, 2017

Getty Provenance Info
The Getty has the sales records of Knoedler Gallery available as a CSV file on GitHub. They also have a ton of other provenance tools:
- Overview of provenance-related datasets and search tools at the Getty: http://www.getty.edu/research/tools/provenance/search.html
- The Getty’s GitHub repository, where they […]

February 24, 2017

Getty’s Knoedler Dataset
I’m thinking it might be interesting to compare the Getty’s Knoedler dataset with the Whitney’s. While none of the Whitney Founding Collection objects came from Knoedler, it was an influential 19th- to early-20th-century gallery, and sold to Gertrude Vanderbilt Whitney’s father, Cornelius […]

February 26, 2017

Getty Provenance Index as LOD
To watch: a video discussing the Getty’s linked data initiative: https://www.youtube.com/watch?v=1HRbP4zjqPM

February 28, 2017

Fast Forward: Painting from the 1980s
I had the opportunity to attend a curator-led staff tour of one of the Whitney’s current exhibitions before lunch. Not directly related to project work, but interesting to get some context on the work in the show and trends […]

March 7, 2017

Installing Elysa
I spent much of last week attempting unsuccessfully to install the Carnegie’s Elysa tool. Elysa has not been updated in almost two years, and I’m having issues installing Foreman, the process manager it runs on. It has been a learning process in general […]

March 17, 2017

Data Work
I’m still at it with the tabular data work. I spent the morning working alternately with MySQL Workbench and OpenRefine, trying to refresh my knowledge of SQL and regular expressions in the process. The amount of time I’ve been spending on this […]

March 21, 2017

Project Status
I’m planning to meet with Farris on Friday to touch base on the project.
Gephi
- Nodes vs. edges: http://www.touchgraph.com/assets/navigator/help2/module_7_1.html
- Importing CSV data: https://github.com/gephi/gephi/wiki/Import-CSV-Data
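
For my own reference: Gephi’s CSV importer reads an edge list with Source and Target columns, one edge per row. A quick sketch of writing one (the constituent names are placeholder examples, not actual project data):

```python
import csv
import io

# Hypothetical relationships between constituents; in practice these
# would come from the cleaned TMS/provenance tables.
edges = [
    ("Robert Henri", "Example Gallery"),
    ("John Sloan", "Example Gallery"),
]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Source", "Target"])  # Gephi reads these as edge endpoints
writer.writerows(edges)

print(buffer.getvalue())
```

Saving that buffer to a .csv file should be enough for a first graph import.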

March 23, 2017

OpenRefine Name Entity Reconciliation
Karen Hwang from METRO/Linked Jazz just published an extremely helpful article on her own name reconciliation work.
Karen’s article: http://www.mnylc.org/fellows/2017/03/17/using-openrefine-to-reconcile-name-entities/
Karen’s scripts: https://github.com/kllhwang/Named-Entity-Reconciliation-with-OpenRefine

March 24, 2017

Status of the Project as of March 24th
I am currently in the process of exploring three tools:
- OpenRefine, specifically name reconciliation and the RDF plugin
- Karma
- Gephi
My current challenges and end goals:
OpenRefine
I have been using OpenRefine to clean up ingest data. […]

March 28, 2017

Notes from Meeting With Farris
- OpenRefine – maybe just go with that if easier
- Farris’ notes from provenance conference
- What other data sets?
- Goals: combine as many datasets as possible
- How to visualize – data in action
- Yearly accession data
- Timeline
- External resources
- Incorporating external […]

March 29, 2017

Relevant notes from a Metadata: Description and Access guest lecture by Corey Harper of Elsevier
- Unlike the Carnegie Art Tracks team, not skeptical of linked data as a tool in implementing machine learning
- “Ease of integration across data sources – merging graphs”
- ETL – Extract, Transform, Load
- Formal […]

April 4, 2017

The Drawings of the Florentine Painters
Alex Provo’s Florentine Renaissance Painters site just went live. The interface is very user-friendly, and definitely seems like it could be a useful art historical research tool even for those without a background working with linked data. The graphs […]

April 5, 2017

External URIs
I’ve started collecting URIs for purchase-related constituents.
Props added:
- VIAF ID – https://www.wikidata.org/wiki/Property:P214 ✔
- LCNAF ID – https://www.wikidata.org/wiki/Property:P244 ✔
- ULAN ID – https://www.wikidata.org/wiki/Property:P245 ✔
- British Museum person-institution – https://www.wikidata.org/wiki/Property:P1711 ✔
- Smithsonian American Art Museum person/institution thesaurus ID – https://www.wikidata.org/wiki/Property:P1795 ✔
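
A sketch of how all five of these properties could be pulled for a single constituent with one Wikidata SPARQL query. This only builds the query string rather than hitting the live query.wikidata.org endpoint, and the QID passed in is a placeholder, not an actual Whitney constituent:

```python
# Property IDs are the ones listed above; each is wrapped in OPTIONAL so a
# constituent missing one authority ID still returns the others.
PROPERTIES = {
    "P214": "VIAF ID",
    "P244": "LCNAF ID",
    "P245": "ULAN ID",
    "P1711": "British Museum person-institution",
    "P1795": "SAAM person/institution thesaurus ID",
}

def build_query(qid):
    """Build a SPARQL query fetching all authority IDs for one entity."""
    lines = [f"OPTIONAL {{ wd:{qid} wdt:{pid} ?{pid} . }}" for pid in PROPERTIES]
    body = "\n  ".join(lines)
    vars_ = " ".join(f"?{pid}" for pid in PROPERTIES)
    return f"SELECT {vars_} WHERE {{\n  {body}\n}}"

print(build_query("Q12345"))  # placeholder QID
```

The resulting string could then be posted to the Wikidata SPARQL endpoint, one query per reconciled constituent.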

April 6, 2017

Notes from Cristina
- URI best practices – look at Semantic Web for the Working Ontologist; very clear-cut manual
- LOD View
- Matt in Cleveland; Cristina can only meet Thursday 4/13

April 9, 2017

Dataset Work
I did some at-home work on Sunday, using the OpenRefine RDF plugin to create an RDF/XML dataset. I used the RDF plugin with the bloody-byte.net CIDOC-core schema (http://bloody-byte.net/rdf/cidoc-crm/core_5.0.1#) to map the purchase_constituents csv file generated from TMS to CIDOC. I had some difficulty […]
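
To keep the mapping straight in my head, here’s a stdlib-only Python sketch of the same CSV-to-triples step. The CSV columns, the opendata.whitney.org subject URI pattern, and the use of rdfs:label in place of a proper CIDOC appellation node are simplifications of my own, not what the plugin actually emits:

```python
import csv
import io

CRM = "http://bloody-byte.net/rdf/cidoc-crm/core_5.0.1#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Two hypothetical rows standing in for purchase_constituents.csv.
sample = io.StringIO("constituent_id,name\n42,Jane Example\n43,Example Gallery\n")

triples = []
for row in csv.DictReader(sample):
    subj = f"<http://opendata.whitney.org/constituent/{row['constituent_id']}>"
    # Type each constituent as a CIDOC E39 Actor and attach a name label.
    triples.append(f"{subj} <{RDF_TYPE}> <{CRM}E39_Actor> .")
    triples.append(f'{subj} <{RDFS_LABEL}> "{row["name"]}" .')

print("\n".join(triples))
```

A fuller mapping would model each name as an E82 Actor Appellation node rather than a plain literal, which is what the CIDOC documentation actually prescribes.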

April 11, 2017

Dataset Work
I’m continuing my work from Sunday, attempting to create RDF/XML using Joshua’s finalObject.csv file from last semester. His script for generating JSON-LD (2015-16_JDull/Code/objectsCode/objectJSONLD.py) could be helpful for format conversion, though http://rdf-translator.appspot.com/ could also work, depending on how accurate it is. Tableau […]
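
The JSON-LD step itself is small enough to sketch with the standard library. The @context, column names, and sample row below are placeholders of my own, not the actual finalObject.csv layout:

```python
import csv
import io
import json

# One hypothetical row standing in for finalObject.csv.
sample = io.StringIO("object_id,title\n3270,Untitled\n")

# Minimal @context mapping the title field to Dublin Core, as an example.
context = {
    "title": "http://purl.org/dc/terms/title",
}

docs = []
for row in csv.DictReader(sample):
    docs.append({
        "@context": context,
        "@id": f"http://collection.whitney.org/object/{row['object_id']}",
        "title": row["title"],
    })

print(json.dumps(docs[0], indent=2))
```

Something like this, generalized over all columns, is essentially what objectJSONLD.py does with PyLD handling the heavier JSON-LD processing.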

April 17, 2017

Defining the Whitney’s URIs
https://www.lri.fr/~hamdi/datalift/tuto_inspire_2012/Suggestedreadings/egovld.pdf
https://www.w3.org/TR/2008/NOTE-cooluris-20080331/
https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/60975/designing-URI-sets-uk-public-sector.pdf

April 18, 2017

Defining the Whitney’s URIs Cont’d
I have been compiling this doc: https://docs.google.com/document/d/1l1TRkof1G13eee78wwtBhGd1v1N2zGWwvrcUq009MmI/edit?usp=sharing
Using these resources:
- http://edan.si.edu/saam/id/person-institution/2297/birth/date
- http://edan.si.edu/saam/id/person-institution/6801
- http://edan.si.edu/saam/id/person-institution/2297
- http://collection.britishart.yale.edu/id/page/person/institution/1281
- http://lov.okfn.org/dataset/lov/terms
- https://www.w3.org/ns/person#citizenship
- https://www.wikidata.org/wiki/Q535334
- http://www.ontotext.com/proton/protontop.html#lastName
- http://schema.org/ExhibitionEvent
- http://semanticscience.org/resource/SIO_000181.rdf
- http://dublincore.org/documents/dcmi-terms/#ISO15836
- http://collection.whitney.org/object/3270
- http://www.cidoc-crm.org/Property/p70-documents/version-6.2
TMS Screenshots
Fields used by the Whitney in TMS:

April 19, 2017

Whitney URI Guide
Maybe define each individual property used? http://mwdl.org/docs/MWDL_DC_Profile_Version_2.0.pdf
Also look at Whitney Content Standard Element Sets

April 22, 2017

Working at Home
Trying to use regular expressions to separate provenance text columns in OpenRefine:
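
The exact expressions are still in flux, but the general shape is splitting a semicolon-delimited provenance statement into owner segments and pulling out trailing years. The sample string here is invented for illustration, not an actual Whitney provenance text:

```python
import re

# Hypothetical provenance string in the semicolon-delimited style common
# in museum records.
provenance = (
    "The artist; [Example Gallery, New York]; "
    "purchased by the Whitney Museum of American Art, 1931"
)

# Split on semicolons, then look for a trailing 4-digit year in each segment.
segments = [s.strip() for s in provenance.split(";")]
parsed = []
for seg in segments:
    match = re.search(r"\b(\d{4})\b\s*$", seg)
    year = match.group(1) if match else None
    parsed.append((seg, year))

for seg, year in parsed:
    print(seg, "->", year)
```

The same regexes translate fairly directly into OpenRefine’s value.match()/split() expressions once they behave correctly in Python.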

April 24, 2017

Working at Home Continued
I have been working on my guidelines for Whitney URI creation over the past few days, and hope to have them done by tomorrow. The CIDOC CRM website has been offline since yesterday afternoon, which is troubling. I think the general […]

April 27, 2017

Notes from Farris
- Plan to have PDF version ready for IT by Tuesday
- Instead of collection.whitney.org, resolve to opendata.whitney.org
- Define what RDF, RDFS, Schema, etc., are
- Conceptual map as large PDF
- Add purchase dates to Tableau viz; create a timeline
- Add a section explaining OpenRefine […]

April 28, 2017

Working on URI Proposal
I’ve been continuing to work on my Whitney Linked Data proposal doc. I discovered that CIDOC’s namespace IRIs finally resolve correctly, meaning I no longer have to use Erlangen as a vocabulary for my proposal.
Meeting with Cristina and Farris
I […]

May 1, 2017

Working at Home
I’ve been continuing to work on my Whitney URI guide since Friday. I added more details on using OpenRefine for name reconciliation/RDF generation. I SPARQL-queried Wikidata to find all of its art-related properties:
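
The query looked something like this; the label filter below is a crude stand-in for whatever really counts as “art-related,” and this sketch just builds the string rather than hitting the live query.wikidata.org endpoint:

```python
# Find Wikidata properties whose English label mentions "art". A real run
# would refine the filter and use the label service, but this captures the
# overall shape of the query.
QUERY = """
SELECT ?property ?propertyLabel WHERE {
  ?property a wikibase:Property .
  ?property rdfs:label ?propertyLabel .
  FILTER(LANG(?propertyLabel) = "en")
  FILTER(CONTAINS(LCASE(?propertyLabel), "art"))
}
"""

print(QUERY.strip())
```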

May 2, 2017

Presenting to Research Resources Department
I presented the guidebook I created to the Whitney’s Research Resources Department. My slides are here: https://docs.google.com/presentation/d/1AAmDZXERmDbLjVWJ0bjeC4Iu4ebmxGt4cZ7kvLjPJtE/edit?usp=sharing
Next week, I will be presenting to representatives from the Whitney’s IT department, along with a representative from the Curatorial department.