Motivation

How do you encode the piece of knowledge:

"The theory of relativity was discovered by Albert Einstein." 

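[Several alternative XML encodings of this sentence were shown here, separated by "or"; the examples are not preserved.]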
There is no unique way (in XML) to represent knowledge.
Information represented in such ways is not easy to integrate. (Why?)
RDF helps to solve this problem.

Goals

  • Understand the RDF data model, including
    • URI and IRI concepts
    • Triples
    • Resources
    • Literals
    • Blank nodes
    • Lists

Prerequisites

  • Basic understanding of Web technologies, data types

RDF Overview

  • RDF = Resource Description Framework
  • W3C Recommendation since 1999 (RDF 1.1 since 2014)
  • RDF is a data model
    • Originally used for metadata for web resources, then generalized
    • Encodes structured information
    • Universal, machine-readable exchange format
  • Data structured in graphs
    • Vertices, edges

Parts of the RDF graph

  • URIs
    • Used to reference resources unambiguously
  • Literals
    • Describe data values that have no identity of their own, such as "100 km/h"
  • Blank nodes
    • State that an individual with certain properties exists, without naming it (existential quantification)

Example of an RDF graph

 

RDF Triple

 Components of an RDF triple: subject, predicate, object

  • Modeled on linguistic categories (though not always applied consistently)
  • Allowed assignments:
    • Subject: URI or blank node
    • Predicate: URI (a.k.a. property)
    • Object: URI, blank node or literal
  • Node and edge labels should be unambiguous, so that the original graph can be reconstructed from the list of triples
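
The allowed assignments above can be sketched as a small validity check. The class names `URI`, `BlankNode`, `Literal` and the function `is_valid_triple` are assumptions made for this illustration; they do not come from any particular RDF library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


def is_valid_triple(subject, predicate, obj):
    """Check the position rules: subject is a URI or blank node,
    predicate is a URI, object is a URI, blank node, or literal."""
    return (
        isinstance(subject, (URI, BlankNode))
        and isinstance(predicate, URI)
        and isinstance(obj, (URI, BlankNode, Literal))
    )


leipzig = URI("http://dbpedia.org/resource/Leipzig")
label = URI("http://www.w3.org/2000/01/rdf-schema#label")

print(is_valid_triple(leipzig, label, Literal("Leipzig")))   # literal as object: allowed
print(is_valid_triple(Literal("Leipzig"), label, leipzig))   # literal as subject: rejected
```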

URI

  • URI = Uniform Resource Identifier
  • Used to create globally unique names for resources
  • Every object with a clear identity can be a resource
    • Books, places, organizations ...
  • In the book domain, the ISBN serves the same purpose

URI Syntax

  • Extension of the URL concept
  • Not every URI denotes a web document, but the URL of a web document is often used as its URI
  • Starts with the URI scheme, which is separated from the rest by ":"
    • examples: http, ftp, mailto, file
  • Typically hierarchical structure
    • [scheme:][//authority][path][?query][#fragment]
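
The hierarchical structure can be inspected with Python's standard `urllib.parse.urlsplit`. The following is only a sketch; the example URIs under `example.org` and the helper name `components` are made up for illustration.

```python
from urllib.parse import urlsplit


def components(uri):
    """Split a URI into scheme, authority, path, query, and fragment."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


print(components("http://www.example.org/path/doc?lang=de#section"))
print(components("mailto:someone@example.org"))
```

The `mailto` case shows that not every scheme uses the `//authority` part: everything after the colon ends up in the path.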

Self-defined URIs

  • Necessary if a resource has no URI yet or its URI is not known
  • Use HTTP URIs under your own website's domain to avoid naming collisions
  • This also makes it easy to publish documentation for the URI at that location
  • Example: http://jens-lehmann.org/foaf.rdf#i


  • Use separate URIs for …
    • a resource (a real-world thing)
    • and its documentation (e.g. an HTML page)
    … with the help of URI references (fragments attached with "#") or content negotiation
  • Example: URI for Shakespeare's "Othello":
    • bad (why?): http://de.wikipedia.org/wiki/Othello
    • good: http://de.wikipedia.org/wiki/Othello#URI
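
A minimal sketch of why the fragment helps: an HTTP client requests only the part before "#", so the fragment URI can name the play while the fragment-free URL still names the page that documents it. `urllib.parse.urldefrag` is a standard-library function; the helper name `document_url` is an assumption for this example.

```python
from urllib.parse import urldefrag


def document_url(uri):
    """Return the URL of the document that would be fetched for `uri`."""
    url, _fragment = urldefrag(uri)
    return url


thing = "http://de.wikipedia.org/wiki/Othello#URI"   # identifies the play
page = "http://de.wikipedia.org/wiki/Othello"        # identifies the article

print(document_url(thing))
print(thing != page)  # two distinct identifiers, one retrievable document
```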

IRIs

  • IRI = Internationalized Resource Identifier
  • Generalization of the URI concept
  • IRIs can contain Unicode characters
  • Examples:
    • http://www.example.org/Wüste
    • http://www.example.org/사막
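
An IRI can be mapped to a plain URI by percent-encoding the UTF-8 bytes of its non-ASCII characters. The sketch below does this for the path only, using the standard `urllib.parse` module; it is a simplification that assumes an ASCII host name and a path that is not already percent-encoded. The function name `iri_to_uri` is hypothetical.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Percent-encode non-ASCII characters in the path of `iri` (UTF-8)."""
    parts = urlsplit(iri)
    encoded_path = quote(parts.path, safe="/")
    return urlunsplit(parts._replace(path=encoded_path))


uri = iri_to_uri("http://www.example.org/Wüste")
print(uri)
print(unquote(uri))  # decoding restores the original IRI
```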


Literals

  • Used to model data values
  • Represented as strings
  • Interpreted according to their datatype
  • Literals without a datatype are treated as strings
  • A literal may never be the start node of an edge in an RDF graph (i.e. never the subject of a triple)
  • Edges may never be labeled with literals

Turtle Syntax

  • A language for serializing RDF triples as text
  • Turtle = Terse RDF Triple Language
  • URIs in angle brackets: <http://dbpedia.org/resource/Leipzig>
  • Literals in quotes
    • "Leipzig"@de 
    • "51.333332"^^xsd:float
  • Triples are subject-predicate-object sentences terminated with a dot.
    <http://dbpedia.org/resource/Leipzig> <http://www.w3.org/2000/01/rdf-schema#label> "Leipzig"@de.
    
  • Whitespace and line breaks between tokens are ignored (they remain significant inside literals)
  • Status: W3C Recommendation, http://www.w3.org/TR/turtle/
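
The syntax rules above can be sketched as three small formatting helpers. The function names are assumptions for this illustration, and the escaping is deliberately minimal: it covers only backslashes and double quotes, not line breaks inside literals.

```python
def uri(value):
    """Write a URI in angle brackets."""
    return f"<{value}>"


def literal(text, lang=None, datatype=None):
    """Quote a literal and append either a language tag or a datatype URI."""
    if lang is not None and datatype is not None:
        raise ValueError("use a language tag or a datatype, not both")
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    if lang is not None:
        return f'"{escaped}"@{lang}'
    if datatype is not None:
        return f'"{escaped}"^^<{datatype}>'
    return f'"{escaped}"'


def triple(subject, predicate, obj):
    """Join the three terms and terminate the sentence with a dot."""
    return f"{subject} {predicate} {obj} ."


line = triple(
    uri("http://dbpedia.org/resource/Leipzig"),
    uri("http://www.w3.org/2000/01/rdf-schema#label"),
    literal("Leipzig", lang="de"),
)
print(line)
```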

Turtle Abbreviations (1/2)

In Turtle, long URIs can be abbreviated with prefixes

  • Syntax: @prefix abbr ':'  <URI> .
  • E.g. @prefix dbr:  <http://dbpedia.org/resource/> .

One can transform

<http://dbpedia.org/resource/Leipzig> <http://www.w3.org/2000/01/rdf-schema#label> "Leipzig"@de.

into

@prefix dbr: <http://dbpedia.org/resource/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
dbr:Leipzig rdfs:label "Leipzig"@de .
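
Reading the abbreviation in the other direction, a parser expands each prefixed name by concatenating the declared namespace with the local part. A minimal sketch, assuming a hypothetical `expand` helper and a prefix table that mirrors the two declarations above:

```python
PREFIXES = {
    "dbr": "http://dbpedia.org/resource/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(name, prefixes=PREFIXES):
    """Turn a prefixed name such as 'dbr:Leipzig' into a full URI in angle brackets."""
    prefix, _, local = name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return f"<{prefixes[prefix]}{local}>"


print(expand("dbr:Leipzig"))
print(expand("rdfs:label"))
```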

Turtle Abbreviations (2/2)

  • Triples with the same subject can be grouped together
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    ...
    @prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
    
    dbr:Leipzig dbp:hasMayor dbr:Burkhard_Jung ;
                rdfs:label   "Leipzig"@de ;
                geo:lat      "51.333332"^^xsd:float ;
                geo:long     "12.383333"^^xsd:float .   
    
  • Even triples with the same subject and predicate can be grouped together
    @prefix dbr: <http://dbpedia.org/resource/> .
    @prefix dbp: <http://dbpedia.org/property/> .
    dbr:Leipzig dbp:locatedIn dbr:Saxony, dbr:Germany;
                dbp:hasMayor  dbr:Burkhard_Jung .
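
Both groupings can be sketched as a small serializer: triples are grouped first by subject and then by predicate, with "," joining objects and ";" joining predicates. The function names `group` and `to_turtle` are assumptions for this illustration, and the terms are taken to be already-abbreviated strings.

```python
def group(triples):
    """Group triples by subject, then by predicate, keeping first-seen order."""
    grouped = {}
    for s, p, o in triples:
        grouped.setdefault(s, {}).setdefault(p, []).append(o)
    return grouped


def to_turtle(triples):
    """Serialize triples using the ';' and ',' abbreviations."""
    blocks = []
    for s, predicates in group(triples).items():
        # "," separates objects that share subject and predicate
        lines = [f"{p} {', '.join(objs)}" for p, objs in predicates.items()]
        # ";" separates predicate groups that share a subject
        blocks.append(f"{s} " + " ;\n    ".join(lines) + " .")
    return "\n".join(blocks)


triples = [
    ("dbr:Leipzig", "dbp:locatedIn", "dbr:Saxony"),
    ("dbr:Leipzig", "dbp:locatedIn", "dbr:Germany"),
    ("dbr:Leipzig", "dbp:hasMayor", "dbr:Burkhard_Jung"),
]
print(to_turtle(triples))
```

Three triples collapse into a single sentence with one subject, two predicates, and three objects.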
    

Literals II – Datatypes

  •  Example: xsd:decimal

Datatypes in RDF

  • So far, literals have been untyped and treated as strings, so they compare lexicographically: "02" < "100" < "11" < "2"
  • Typing allows a better, i.e. semantic, interpretation of values
  • Datatypes are identified by URIs and can be chosen freely
  • XML Schema datatypes (XSD) are typically used
  • Syntax: "data value"^^<datatype-URI>
  • rdf:HTML and rdf:XMLLiteral are the only predefined datatypes in RDF
    • Used for HTML and XML fragments
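
The difference between untyped and typed interpretation can be reproduced in a few lines. This sketch uses Python's `decimal.Decimal` as a stand-in for an xsd:decimal value space; that mapping is an assumption made for illustration, not part of RDF itself.

```python
from decimal import Decimal

values = ["2", "11", "100", "02"]

# Untyped: compared character by character, as plain strings.
as_strings = sorted(values)

# Typed: each lexical form is first mapped to a numeric value.
as_decimals = sorted(values, key=Decimal)

print(as_strings)
print(as_decimals)
```

Under the numeric interpretation "2" and "02" denote the same value, which the string comparison cannot express.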

Example

 Graph:



Turtle:


@prefix dbr: <http://dbpedia.org/resource/> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
dbr:Leipzig geo:lat  "51.333332"^^xsd:float ;
            geo:long "12.383333"^^xsd:float .

Language declaration

  • Applies only to untyped literals: in Turtle, a literal carries either a language tag or a datatype, not both
  • Example:  
  • In RDF 1.0 the following literals were all different, but implementations typically treated them the same.