Semantic Web - Data

  • URIs are used to identify resources of any kind, not just things that exist on the Web (e.g. the person Tim Berners-Lee)
  • RDF is used to make statements about resources in the form of (subject, predicate, object) triples
  • With RDFS, resources can belong to classes (my Mercedes belongs to the class of cars) and classes can be subclasses or superclasses of other classes (vehicles are a superclass of cars, cabriolets are a subclass of cars)
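
The class and subclass statements above can be sketched in a few lines of Python. This is a minimal illustration, not an RDF library: triples are plain tuples, the ex: names are made up for the example, and only the rdfs:subClassOf inference is modelled.

```python
# Triples are (subject, predicate, object) tuples; the ex: names are hypothetical.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("ex:myMercedes", RDF_TYPE, "ex:Car"),
    ("ex:Car", SUBCLASS_OF, "ex:Vehicle"),
    ("ex:Cabriolet", SUBCLASS_OF, "ex:Car"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following rdfs:subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def types_of(resource, triples):
    """Return the stated classes of a resource plus all their superclasses."""
    direct = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


print(sorted(types_of("ex:myMercedes", triples)))  # ['ex:Car', 'ex:Vehicle']
```

The Mercedes is only stated to be a car; that it is also a vehicle follows from the subclass statement.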

Dereferenceable URI

  • Disco is essentially a convenient way to present RDF metadata so that people can browse it, i.e. a representation mechanism for RDF triples. All the triples with the same subject are grouped on one page, and the predicates and objects form a table that can be browsed. When you click on an object, that object becomes the subject of the view and all of its predicates and objects become visible.
  • The Dereferenceable URI animation just means that the URI you provide must be dereferenceable or, in less buzzword terms, that a description of the resource identified by the URI must be retrievable from that URI
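
The browsing model described above (one page per subject, a predicate/object table, click-through on objects) might look roughly like this. It is a sketch under assumptions, not Disco's actual code: the triples and the ex: identifiers are invented, and no HTTP dereferencing is performed.

```python
from collections import defaultdict

# Hypothetical triples; a real browser would fetch them by dereferencing URIs.
triples = [
    ("ex:TimBL", "foaf:name", "Tim Berners-Lee"),
    ("ex:TimBL", "ex:invented", "ex:WWW"),
    ("ex:WWW", "rdfs:label", "World Wide Web"),
]


def build_pages(triples):
    """Group triples by subject: each page is a list of (predicate, object) rows."""
    pages = defaultdict(list)
    for s, p, o in triples:
        pages[s].append((p, o))
    return dict(pages)


def follow(pages, obj):
    """Simulate clicking an object: it becomes the subject of the next view.

    Literals and resources without any known triples yield an empty table.
    """
    return pages.get(obj, [])


pages = build_pages(triples)
for predicate, obj in pages["ex:TimBL"]:
    print(predicate, "->", obj)
print(follow(pages, "ex:WWW"))
```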

Faceted DBLP

  • The search interface lets users search the computer science publications in the collection starting from a keyword, and shows the result set along with a set of facets, e.g. publication year, author, or conference. The animation shows that RDF metadata underlies the whole system: the different RDF predicates form the different facets that the user can use to narrow down the result set. Note that the seminal paper on the WSMT comes first in a DBLP search for Dieter Fensel ;)
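
A rough sketch of how predicates can act as facets is shown below. It is not how Faceted DBLP is implemented: the records are invented, each dictionary key stands in for an RDF predicate, and the keyword match is a plain substring test on the title.

```python
from collections import Counter

# Hypothetical records; each key stands in for an RDF predicate.
publications = [
    {"title": "Ontology Tools", "author": "A. Author", "year": 2005, "venue": "Conf X"},
    {"title": "Ontology Mapping", "author": "B. Author", "year": 2006, "venue": "Conf X"},
    {"title": "Ontology Editors", "author": "A. Author", "year": 2006, "venue": "Conf Y"},
    {"title": "Query Planning", "author": "B. Author", "year": 2006, "venue": "Conf Y"},
]


def search(records, keyword):
    """Keyword search: keep records whose title contains the keyword."""
    return [r for r in records if keyword.lower() in r["title"].lower()]


def facet_counts(records, predicate):
    """Count how many results share each value of one predicate (one facet)."""
    return Counter(r[predicate] for r in records)


def narrow(records, predicate, value):
    """Narrow the result set by selecting one facet value."""
    return [r for r in records if r[predicate] == value]


results = search(publications, "ontology")
print(facet_counts(results, "year"))
print([r["title"] for r in narrow(results, "year", 2006)])
```

Each further facet selection is just another call to narrow on the already reduced result set.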

Semantic MediaWiki

  • Semantic MediaWiki combines a Web 2.0 technology, namely wikis, with the Semantic Web. Users can add tags to the wiki data, from which RDF data is generated automatically. Information in the wiki can also be filled in by queries; in the example, the section on Knows is filled by the query <ask>[[ affiliation::DERI Innsbruck]]</ask>
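
To make the tagging idea concrete, the sketch below extracts [[property::value]] annotations from wiki text and answers a query like the one above. It is only an illustration and not Semantic MediaWiki's implementation: the page names and texts are invented, and the regular expression handles just the simple annotation form.

```python
import re

# Matches simple annotations of the form [[property::value]].
ANNOTATION = re.compile(r"\[\[\s*([^\[\]:|]+?)\s*::\s*([^\[\]|]+?)\s*\]\]")


def extract_triples(page_title, wikitext):
    """Turn each annotation on a page into a (page, property, value) triple."""
    return [(page_title, prop, value) for prop, value in ANNOTATION.findall(wikitext)]


def ask(triples, prop, value):
    """Return the pages annotated with the given property and value."""
    return sorted({s for s, p, o in triples if p == prop and o == value})


# Hypothetical wiki pages.
wiki = {
    "Alice": "Alice works at [[affiliation::DERI Innsbruck]]. See [[Main Page]].",
    "Bob": "Bob is with [[ affiliation :: DERI Innsbruck ]].",
    "Carol": "Carol works at [[affiliation::Some Other Lab]].",
}

triples = []
for title, text in wiki.items():
    triples.extend(extract_triples(title, text))

print(ask(triples, "affiliation", "DERI Innsbruck"))  # ['Alice', 'Bob']
```

Plain links such as [[Main Page]] contain no "::" and therefore produce no triple.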

Legacy Systems

Legacy Systems (cont'd)


KIM platform

  • The KIM platform provides a novel infrastructure and services for:
    • automatic semantic annotation, 
    • indexing, 
    • retrieval of unstructured and semi-structured content.

KIM Constituents

  • The KIM Platform includes:
    • Ontologies (PROTON + KIMSO + KIMLO) and KIM World KB
    • KIM Server – with a set of APIs for remote access and integration
    • Front-ends: Web-UI and plug-in for Internet Explorer.

KIM Ontology (KIMO)

  • light-weight upper-level ontology
  • 250 NE classes
  • 100 relations and attributes
  • covers mostly NE classes, and ignores general concepts
  • includes classes representing lexical resources

KIM KB

  • KIM KB consists of more than 80,000 entities (50,000 locations, 8,400 organization instances, etc.)
  • Each location has geographic coordinates and several aliases (usually including English, French, Spanish, and sometimes the local transcription of the location name) as well as co-positioning relations (e.g. subRegionOf).
  • The organizations have locatedIn relations to the corresponding Country instances. The additionally imported information about the companies consists of a short description, URL, reference to an industry sector, reported sales, net income, and number of employees.

KIM is Based On…

  • KIM is based on the following open-source platforms:
    • GATE – the most popular NLP and IE platform in the world, developed at the University of Sheffield. Ontotext is its biggest co-developer. www.gate.ac.uk and www.ontotext.com/gate
    • OWLIM – OWL repository, compliant with the Sesame RDF database from Aduna B.V. www.ontotext.com/owlim
    • Lucene – an open-source IR engine by Apache. jakarta.apache.org/lucene/

KIM Platform – Semantic Annotation

KIM platform – Semantic Annotation (cont'd)

  • The automatic semantic annotation is seen as a named-entity recognition (NER) and annotation process.
  • The traditional flat NE type sets consist of several general types (such as Organization, Person, Date, Location, Percent, Money). In KIM the NE type is specified by reference to an ontology.
  • The semantic descriptions of entities and the relations between them are kept in a knowledge base (KB) encoded in the KIM ontology and residing in the same semantic repository. Thus, for each entity reference in the text, KIM provides (i) a link (URI) to the most specific class in the ontology and (ii) a link to the specific instance in the KB. Each extracted NE is linked to its specific type information (thus Arabian Sea would be identified as a Sea rather than as the traditional, more general Location).
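
The two links per entity reference can be illustrated with a toy annotator. This is a deliberately naive sketch and not KIM's pipeline (which builds on GATE): it uses exact string lookup against a tiny hand-made KB, and all example.org URIs and KB entries are hypothetical.

```python
# Hypothetical mini knowledge base: alias -> instance URI and most specific class URI.
KB = {
    "Arabian Sea": {
        "instance": "http://example.org/kb#ArabianSea",
        "cls": "http://example.org/onto#Sea",
    },
    "Sheffield": {
        "instance": "http://example.org/kb#Sheffield",
        "cls": "http://example.org/onto#City",
    },
}


def annotate(text, kb):
    """Find every KB alias in the text and attach class and instance links."""
    annotations = []
    for alias, entry in kb.items():
        start = text.find(alias)
        while start != -1:
            annotations.append({
                "start": start,
                "end": start + len(alias),
                "text": alias,
                "class": entry["cls"],
                "instance": entry["instance"],
            })
            start = text.find(alias, start + len(alias))
    return sorted(annotations, key=lambda a: a["start"])


text = "The ship crossed the Arabian Sea before the crew flew home to Sheffield."
for a in annotate(text, KB):
    print(a["text"], a["class"], a["instance"])
```

The point of the sketch is the shape of the result: each mention carries a reference to an ontology class (Sea, not just Location) and to a concrete KB instance.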

KIM platform – Information Extraction

  • KIM performs IE based on an ontology and a massive knowledge base.


KIM platform - Browser Plug-in