Adventures in Linked Data: Building a Connected Research Environment
Lisa Goddard, Memorial University Libraries & INKE Research Group
Access 2012, Montreal QC
October 19th, 2012

Canadian Writing Research Collaboratory (CWRC)
CFI-funded initiative to establish an online infrastructure for literary research in and about Canada.
www.cwrc.ca
“To enable unprecedented avenues for studying the words that most move people in and about Canada.”

Authoring
Dissemination
Discovery
Collaboration

“We need to stop talking around the issue of the single-author monograph as the benchmark for excellence.” (DH Manifesto, 2004)

Senior Scholars
Librarians
Programmers
Emergent Scholars
Project Managers

Why Linked Data for the Humanities?

Improved Search
Find all references to food in Joyce’s Ulysses.

New Discovery Tools
Instead of looking for a needle in a haystack, an effective text mining tool will show you the shape of the haystack and tell you what needles are there that you'd want to stick in your foot. (Rockwell, 2011)

Interoperability
Disparate data sources and incompatible data structures are among the biggest obstacles for 21st century humanities researchers. (RIN, 2011)

Collaboration
Linked data doesn’t just accommodate collaboration, it enforces collaboration.

Big (Text) Data
Scale is a new horizon of intellectual inquiry. What kinds of humanistic phenomena appear only at scale? (Liu, 2012)

Heterogeneity
Text data is messy.

Getting Started

Dev Platform
The Orlando Project

Identify Top Level Entities
http://domain.org/doc.html
Annotation

Mint URIs for Entities
http://this.ca/event
http://this.ca/place
http://this.ca/person
http://this.ca/doc
http://this.ca/org
http://this.ca/annotation
http://this.ca/film
http://this.ca/book

Cool URIs Don’t Change
Do you really feel that the old URIs cannot be kept running? If so, you chose them very badly.
- Sir Tim

Sir Tim Berners-Lee is judging you.

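The entity URIs listed above can be produced by a small minting routine. The following is a minimal sketch, not the CWRC implementation: it assumes the placeholder domain http://this.ca from the slides, and the function name mint_uri, the 15-character length, and the use of a random alphanumeric identifier are illustrative choices. The point is that the URI carries no script names, file extensions, or query parameters that would change with the implementation.

```python
import secrets
import string

# Placeholder domain and entity types taken from the slides.
BASE = "http://this.ca"
ENTITY_TYPES = {"event", "place", "person", "doc",
                "org", "annotation", "film", "book"}

# Letters and digits only, so the identifier is safe in a URI path.
_ALPHABET = string.ascii_letters + string.digits


def mint_uri(entity_type, local_id=None, length=15):
    """Return a URI of the form BASE/<entity_type>/<local_id>.

    If no local_id is supplied, a random opaque identifier is generated
    (a hypothetical scheme, chosen only for illustration).
    """
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {entity_type!r}")
    if local_id is None:
        local_id = "".join(secrets.choice(_ALPHABET) for _ in range(length))
    return f"{BASE}/{entity_type}/{local_id}"
```

Calling mint_uri("book", "abc123") returns http://this.ca/book/abc123, while mint_uri("person") returns a URI ending in a fresh random identifier. A real system would also need to record each minted identifier so that it is never reissued.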
Minting URIs
Abstract away from implementation details.
http://tiger.cwrc.ca/person.php?id=virginia-woolf&format=rdf
http://cwrc.ca/person/virginia-woolf

Canonical URIs
http://dbpedia.org/resource/Virginia_woolf
(303 redirects and content negotiation)
http://dbpedia.org/page/Virginia_Woolf
http://dbpedia.org/data/Virginia_Woolf.rdf

URI Patterns
http://dbpedia.org/resource/Mary_Shelley
http://id.loc.gov/authorities/names/n85300519
http://viaf.org/viaf/95216565/
http://cwrc.ca/person/6qMXIR5UNQSiupA

Define Relationships
[Diagram: the entity URIs http://this.ca/event, http://this.ca/place, http://this.ca/person, http://this.ca/doc, http://this.ca/org, http://this.ca/annotation, http://this.ca/film and http://this.ca/book, with relationship labels employs and adaptedFrom]

RDF Statement
subject: http://viaf.org/viaf/39385478/
predicate: http://purl.org/dc/elements/1.1/creator (labelled "wrote")
object: http://dbpedia.org/resource/Mrs_Dalloway

Machine Readable Definitions
[Diagram: the same entity URIs, with relationship labels ?, employs and adaptedFrom]

Ontologies
The semantic web is basically an accessibility initiative for machines.

Role of Ontologies
• Define entities (classes)
• Define relationships (properties)
• Impose rules to support machine reasoning.
• Machine-readable (RDF)

Reuse or Build?
An ontology is for life.

Ontology Dowsing
foaf:Person (Class)
frbr:Work (Class)

Assigning Entities to Classes
cwrc:Woolf rdf:type foaf:Person
http://cwrc.ca/person/456789
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://xmlns.com/foaf/0.1/Person

cwrc:MrsDalloway rdf:type frbr:Work
http://cwrc.ca/person/456789
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://purl.org/vocab/frbr/core#Work

Ontology: Predicates
Person to Person, e.g. Relationship Ontology (rel:EnemyOf)
Document to Document, e.g. Bibliographic Ontology (bibo:reviewOf)
Annotation to Document, e.g.
Open Annotation Ontology (oa:hasTarget)

Ontology: Predicates
LifePartnerOf
Domain: Person
Range: Person

Classes and predicates work together to support reasoning.
Woolf creator Mrs. Dalloway
Woolf livedIn Mrs. Dalloway
Woolf lifePartnerOf Mrs. Dalloway

CWRC RDF Ontologies
Entity         Relevant Ontologies
Person         FOAF, MADS, EAC-CPF
Work           FRBR, MODS, DC
Place          WGS84 Geo Positioning, Geonames
Organization   FOAF, EAC-CPF
Event          OWL Time, Event
Annotation     Open Annotation, OAI-ORE

So Far
• defined our major entities and relationships
• minted URIs to represent those things within the CWRC data store
• selected ontologies that will help computers to reason about our entities and relationships

Data Model

Integrating the Data Model
Creates

CWRC-Writer

Tagging Toolbar

Results from Authority File.

API Lookup to Web Services.

VIAF OpenSearch
http://viaf.org/viaf/search?query=local.names+all+%22margaret%20atwood%22+&maximumRecords=100&startRecord=1&sortKeys=holdingscount&httpAccept=application/rss%2Bxml

DBPedia RESTful API
curl -H "Content-Type: text/xml" -d @post.xml http://dbpedia.org/fct/service > result.xml

Mary Shelley

Add custom identifier.

Entity in XML
[XML example; surviving values: person; Bertrand Russell; Bertrand Russell, 1872-1970; 36924137; 11923140; 118604287; n79056054; definite]

List of named entities.

Add Relationship.

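When a relationship is added between two named entities, the classes assigned to those entities and the domain and range declared for the predicate give a machine enough information to check the statement. The sketch below illustrates this with the Woolf and Mrs. Dalloway examples from the slides. It is a simplification under stated assumptions: the domain and range values are invented for illustration and are not taken from any published ontology, and the names TYPES, PREDICATES, and check_statement are hypothetical.

```python
# Class assignments for each entity (the role played by rdf:type).
TYPES = {
    "Woolf": "Person",
    "Mrs. Dalloway": "Work",
}

# predicate -> (domain, range). These declarations are assumptions made
# for this example; the direction of "creator" follows the slide.
PREDICATES = {
    "creator": ("Person", "Work"),
    "livedIn": ("Person", "Place"),
    "lifePartnerOf": ("Person", "Person"),
}


def check_statement(subject, predicate, obj):
    """Return True if the subject's class matches the predicate's domain
    and the object's class matches the predicate's range."""
    domain, range_ = PREDICATES[predicate]
    return TYPES.get(subject) == domain and TYPES.get(obj) == range_


for predicate in ("creator", "livedIn", "lifePartnerOf"):
    ok = check_statement("Woolf", predicate, "Mrs. Dalloway")
    print(f"Woolf {predicate} Mrs. Dalloway: {'consistent' if ok else 'inconsistent'}")
```

Only the creator statement passes, because Mrs. Dalloway is a Work rather than a Place or a Person. Note that an RDFS reasoner uses domain and range to infer additional types rather than to reject statements, so this check is closer to a validation step than to standard RDFS inference.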
Relations Tab

Relationship in RDF

Entity in RDF
[RDF example; surviving values: ent_1798; 117; 16; true; person; Bertrand Russell; Bertrand Russell, 1872-1970; 36924137; 11923140; 118604287; n79056054; definite]

oa:Annotation
http://cwrc.ca/annotation/12345
oa:hasSemanticTag
http://cwrc.ca/work/12345
http://cwrc.ca/person/12345
“Patricia Spence was nanny to Russell’s children before becoming his third wife.”

CWRC Entity Management System
[Diagram labels: Enables, Populates]
Triple Store
Entity storage
Discovery Tools
Entity lookup
Entity Management Interface (add, edit, merge, split entities)
CWRC Writer (document authoring interface)

Credits
CWRC: Susan Brown, Mariana Parades-Olea, Jeffery Antoniuk, Denilson Barbosa, Ruth Knetchel, Omar Rodriguez-Arenas
INKE: Ray Siemens, Stan Ruecker, Harvey Quamen, John Simpson, Jentery Sayers

Related Links
CWRC Writer Beta: http://cwrctc.artsrn.ualberta.ca/
CWRC Writer Overview: http://www.cwrc.ca/projects/infrastructureprojects/technical-projects/cwrc-writer/
About CWRC: http://www.cwrc.ca/about/