
Intelligent Systems

    Semantic Web and Services



Agenda

  • Semantic Web - Data
    • Motivation
      • Development of the Web
        • Internet
        • Web 1.0
        • Web 2.0
      • Limitations of the current Web
    • Technical Solution: URI, RDF, RDFS, OWL, SPARQL
    • Illustration by Larger Examples: KIM Browser Plugin, Disco Hyperdata Browser
    • Extensions: Linked Open Data
  • Semantic Web – Processes
    • Motivation
    • Technical Solution: Semantic Web Services, WSMO, WSML, SEE, WSMX
    • Illustration by Larger Examples: SWS Challenge, Virtual Travel Agency, WSMX at work
    • Extensions: Mobile Services, Intelligent Cars, Intelligent Electricity Meters
  • Summary
  • References


    SEMANTIC WEB - DATA



    MOTIVATION



    DEVELOPMENT OF THE WEB



Development of the Web

  1. Internet
  2. Web 1.0
  3. Web 2.0


    INTERNET



Internet

  • “The Internet is a global system of interconnected computer networks that use the standard Internet Protocol Suite (TCP/IP) to serve billions of users worldwide. It is a network of networks that consists of millions of private and public, academic, business, and government networks of local to global scope that are linked by a broad array of electronic and optical networking technologies.”

 

    http://en.wikipedia.org/wiki/Internet


A brief summary of Internet evolution



    WEB 1.0



Web 1.0

  • “The World Wide Web ("WWW" or simply the "Web") is a system of interlinked, hypertext documents that runs over the Internet. With a Web browser, a user views Web pages that may contain text, images, and other multimedia and navigates between them using hyperlinks.”

 

    http://en.wikipedia.org/wiki/World_Wide_Web


Web 1.0

  • Netscape
    • Netscape is associated with the breakthrough of the Web.
    • Netscape rapidly gained a large user community, which made it attractive for others to present their information on the Web.
  • Google
    • Google is the incarnation of the massive growth of Web 1.0.
    • By 2008, Google had already counted more than 1 trillion unique URLs on the Web [*]
    • Google and similar search engines showed that a piece of information can often be found again faster on the Web than in one's own bookmark list.
    [*] http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html


Web 1.0 principles

  • The success of Web 1.0 is based on three simple principles:
    1. A simple and uniform addressing schema to identify information chunks, i.e. Uniform Resource Identifiers (URIs)
    2. A simple and uniform representation formalism to structure information chunks so that browsers can render them, i.e. Hyper Text Markup Language (HTML)
    3. A simple and uniform protocol to access information chunks, i.e. Hyper Text Transfer Protocol (HTTP)


1. Uniform Resource Identifiers (URIs)

  • Uniform Resource Identifiers (URIs) are used to name/identify resources on the Web
  • URIs are pointers to resources to which request methods can be applied to generate potentially different responses
  • Resources can reside anywhere on the Internet
  • The most popular form of a URI is the Uniform Resource Locator (URL)
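
As a minimal sketch of how a URL (the most common kind of URI) breaks down into parts, Python's standard `urllib.parse` module can be used. The address is one of the sources cited on these slides; it is only parsed here, never requested.

```python
from urllib.parse import urlparse

# A URL cited elsewhere in these slides; it is only parsed, never fetched.
uri = "http://en.wikipedia.org/wiki/World_Wide_Web"

parts = urlparse(uri)
scheme = parts.scheme   # the access mechanism, here "http"
host = parts.netloc     # the host on which the resource resides
path = parts.path       # which resource on that host is meant

print(scheme, host, path)
```

The scheme and host say where and how to ask; the path names the resource itself.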


2. Hyper-Text Markup Language (HTML)

  • Hyper-Text Markup Language:
    • An application of the Standard Generalized Markup Language (SGML)
    • Facilitates a hyper-media environment
  • Documents use elements to “mark up” or identify sections of text for different purposes or display characteristics
  • HTML markup consists of several types of entities, including: elements, attributes, data types and character references
  • Markup elements are not seen by the user when the page is displayed
  • Documents are rendered by browsers
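
The split between markup and visible text can be illustrated with a short sketch based on Python's standard `html.parser` module. The page fragment and the `MarkupLister` class are made up for illustration; a real browser does far more than this.

```python
from html.parser import HTMLParser

class MarkupLister(HTMLParser):
    """Collects element names, attributes and visible text separately."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self.attributes = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Markup: seen by the browser, not shown to the user.
        self.elements.append(tag)
        self.attributes.extend(attrs)

    def handle_data(self, data):
        # Character data: what the user actually reads.
        if data.strip():
            self.text.append(data.strip())

# Hypothetical page fragment, written for this example.
page = '<p>See the <a href="http://www.w3.org/">W3C</a> site.</p>'

lister = MarkupLister()
lister.feed(page)
lister.close()
print(lister.elements, lister.attributes, lister.text)
```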


3. Hyper-Text Transfer Protocol (HTTP)

  • Protocol for client/server communication
    • The heart of the Web
    • Very simple request/response protocol
      • Client sends request message, server replies with response message
    • Provides a way to publish and retrieve HTML pages
    • Stateless
    • Relies on URI naming mechanism
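
The request/response exchange can be sketched without any network access by building a request message as text and reading the status line of a canned response. The helper names `build_request` and `parse_status_line` and the response text are assumptions made for this example.

```python
from typing import Tuple

def build_request(method: str, path: str, host: str) -> str:
    """Return the text of a minimal HTTP/1.1 request message."""
    return f"{method} {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"

def parse_status_line(response: str) -> Tuple[str, int, str]:
    """Split the first line of a response into version, status code and reason."""
    status_line = response.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason

request = build_request("GET", "/wiki/Internet", "en.wikipedia.org")

# A hand-written response for illustration; nothing is sent over the network.
response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
status = parse_status_line(response)
print(status)
```

Because the protocol is stateless, each such request carries everything the server needs to answer it.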


    WEB 2.0



Web 2.0

  • “The term "Web 2.0" (2004–present) is commonly associated with web applications that facilitate interactive information sharing, interoperability, user-centered design, and collaboration on the World Wide Web”

 

    http://en.wikipedia.org/wiki/Web_2.0


Web 2.0

  • Web 2.0 is a vaguely defined phrase referring to various topics such as social networking sites, wikis, communication tools, and folksonomies.
  • Tim Berners-Lee is right that all these ideas already underlie his original vision of the Web; however, there are differences in emphasis that may cause a qualitative change.
  • With Web 1.0 technology a significant amount of software skills and investment in software was necessary to publish information.
  • Web 2.0 technology changed this dramatically.


Web 2.0 major breakthroughs

  • The four major breakthroughs of Web 2.0 are:
    1. Blurring the distinction between content consumers and content providers.
    2. Moving from media for individuals towards media for communities.
    3. Blurring the distinction between service consumers and service providers
    4. Integrating human and machine computing in a new and innovative way


1. Blurring the distinction between content consumers and content providers

    Wikis, blogs, and Twitter turned the publication of text into a mass phenomenon, as Flickr and YouTube did for multimedia


2. Moving from media for individuals towards media for communities

    Social web sites such as del.icio.us, Facebook, FOAF, LinkedIn, MySpace and Xing allow communities of users to smoothly interweave their information and activities


3. Blurring the distinction between service consumers and service providers

    Mashups allow web users to easily integrate services implemented by third parties into their own web sites


4. Integrating human and machine computing in a new way

    Amazon Mechanical Turk allows human services to be accessed through a web service interface, blurring the distinction between manually and automatically provided services


    LIMITATIONS OF THE CURRENT WEB



Limitations of the current Web

  • The current Web has its limitations when it comes to:
    1. finding relevant information
    2. extracting relevant information
    3. combining and reusing information


Limitations of the current Web - Finding relevant information

  • Finding information on the current Web is based on keyword search
  • Keyword search has a limited recall and precision due to:
    • Synonyms:
      • e.g. Searching information about “Cars” will ignore Web pages that contain the word “Automobiles” even though the information on these pages could be relevant
    • Homonyms:
      • e.g. Searching information about “Jaguar” will bring up pages containing information about both “Jaguar” (the car brand) and “Jaguar” (the animal) even though the user is interested only in one of them
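
Both effects can be reproduced with a naive keyword matcher. This is a minimal sketch; the four "pages" and the `keyword_search` helper are invented for illustration and do not model a real search engine.

```python
# Hypothetical mini "Web": page id -> text. All content is made up.
pages = {
    "p1": "Cars for sale in Innsbruck",
    "p2": "Used automobiles at low prices",
    "p3": "Jaguar unveils a new sports car",
    "p4": "The jaguar is a big cat of the Americas",
}

def keyword_search(query: str) -> list:
    """Return ids of pages that contain the query as a whole word (case-insensitive)."""
    q = query.lower()
    return sorted(pid for pid, text in pages.items() if q in text.lower().split())

# Synonym problem: p2 is relevant but not found (limited recall).
cars_hits = keyword_search("cars")

# Homonym problem: car brand and animal are both returned (limited precision).
jaguar_hits = keyword_search("jaguar")

print(cars_hits, jaguar_hits)
```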


Limitations of the current Web - Finding relevant information

  • Keyword search also has limited recall and precision due to:
    • Spelling variants:
      • e.g. “organize” in American English vs. “organise” in British English
    • Spelling mistakes
    • Multiple languages
      • i.e. information about the same topics is published on the Web in different languages (English, German, Italian, …)
  • Current search engines provide no means to specify the relation between a resource and a term
    • e.g. sell / buy


Limitations of the current Web - Extracting relevant information

  • A one-size-fits-all automatic solution for extracting information from Web pages is not possible because pages use different formats and syntaxes
  • Even from a single Web page, it is difficult to extract the relevant information


Limitations of the current Web - Extracting relevant information

  • Extracting information from current web sites can be done using wrappers


Limitations of the current Web - Extracting relevant information

  • The actual extraction of information from web sites is specified using standards such as XSL Transformation (XSLT) [1]
  • Extracted information can be stored as structured data in XML format or databases.
  • However, wrappers do not really scale, because the actual extraction of information depends on the format and layout of each web site

 

    [1] http://www.w3.org/TR/xslt
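
Why wrappers are tied to one layout can be seen in a small sketch. It uses a Python regular expression instead of XSLT purely to stay short; the pattern, the two page fragments and the `extract_price` helper are hypothetical.

```python
import re

# Hypothetical wrapper for one shop's layout: the price sits in <span class="price">.
PRICE_PATTERN = re.compile(r'<span class="price">([^<]+)</span>')

def extract_price(html: str):
    """Return the price text if the page follows the expected layout, else None."""
    match = PRICE_PATTERN.search(html)
    return match.group(1) if match else None

# Two made-up pages that show the same price with different markup.
shop_a = '<div><span class="price">19.90 EUR</span></div>'
shop_b = '<td id="cost"><b>19.90 EUR</b></td>'

price_a = extract_price(shop_a)   # layout matches the wrapper
price_b = extract_price(shop_b)   # same information, but the wrapper finds nothing
print(price_a, price_b)
```

Supporting the second shop requires a second wrapper, and every redesign of either site can break its wrapper again.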


Limitations of the current Web - Combining and reusing information

  • Tasks often require combining data on the Web
    1. Searching for the same information in different digital libraries
    2. Information may come from different web sites and needs to be combined


Limitations of the current Web - Combining and reusing information

  1. Searches for the same information in different digital libraries


Limitations of the current Web - Combining and reusing information

  2. Information may come from different web sites and needs to be combined


How to improve the current Web?

  • Increasing automatic linking among data
  • Increasing recall and precision in search
  • Increasing automation in data integration
  • Increasing automation in the service life cycle

  • Adding semantics to data and services is the solution!


    TECHNICAL SOLUTIONS



Uniform Resource Identifier

  • Uniform Resource Identifiers (URIs) are used to identify resources of any kind, not just things that exist on the Web, e.g. Dieter Fensel, University of Innsbruck


Resource Description Framework (RDF)

  • The Resource Description Framework (RDF) provides a domain-independent data model
  • Resources (identified by URIs)
    • Correspond to nodes in a graph
    • E.g.:
      • http://www.w3.org/
        http://example.org/#john
        http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
  • Properties (identified by URIs)
    • Correspond to labels of edges in a graph
    • Binary relation between two resources
    • E.g.:
      • http://www.example.org/#hasName
        http://www.w3.org/1999/02/22-rdf-syntax-ns#type
  • Literals
    • Concrete data values
    • E.g.:
      • "John Smith", "1", "2006-03-07"


Resource Description Framework (RDF) – Triple Data Model

  • Triple data model:
        • < subject, predicate, object >
    • Subject: Resource or blank node
    • Predicate: Property
    • Object: Resource, literal or blank node
  • Example:
        • < ex:john, ex:father-of, ex:bill >
  • Statement (or triple) as a logical formula P(x, y), where the binary predicate P relates the object x to the object y.
  • RDF offers only binary predicates (properties)
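
A minimal sketch of the triple model in Python: each statement is a three-field tuple, and checking a statement corresponds to asking whether P(x, y) is asserted. The `Triple` type, the `holds` helper and the `ex:hasName` statement are illustrative assumptions.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# The slide's example plus one made-up statement with a literal as object.
triples = [
    Triple("ex:john", "ex:father-of", "ex:bill"),
    Triple("ex:john", "ex:hasName", "John Smith"),  # illustrative only
]

def holds(predicate: str, x: str, y: str) -> bool:
    """True if the statement predicate(x, y) is asserted in the triple list."""
    return Triple(x, predicate, y) in triples

print(holds("ex:father-of", "ex:john", "ex:bill"))
print(holds("ex:father-of", "ex:bill", "ex:john"))
```

The second check fails because a triple is directed: subject and object cannot be swapped.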


Resource Description Framework (RDF) – Graph Model

  • The triple data model can be represented as a graph
  • In the Artificial Intelligence community, such a graph is called a semantic net
  • Labeled, directed graphs
    • Nodes: resources, literals
    • Labels: properties
    • Edges: statements
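
The graph view can be sketched by turning statements into an adjacency structure and following edges. The second statement and the `reachable` helper are assumptions added for illustration.

```python
from collections import defaultdict

# Statements as (subject, property, object); the second one is made up.
statements = [
    ("ex:john", "ex:father-of", "ex:bill"),
    ("ex:bill", "ex:father-of", "ex:tom"),
]

# Adjacency view: node -> list of (edge label, target node).
graph = defaultdict(list)
for s, p, o in statements:
    graph[s].append((p, o))

def reachable(start: str) -> set:
    """All nodes that can be reached from start by following directed edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _label, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

print(reachable("ex:john"))
```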


RDF Schema (RDFS)

  • RDF Schema (RDFS) is a language for capturing the semantics of a domain, for example:
    • In RDF:
      • < #john, rdf:type, #Student >
    • What is a “#Student”?
  • RDFS is a language for defining RDF types:
    • Define classes:
      • “#Student is a class”
    • Relationships between classes:
      • #Student is a sub-class of #Person
    • Properties of classes:
      • #Person has a property hasName


RDF Schema (RDFS)

  • Classes:
    • <#Student, rdf:type, rdfs:Class>
  • Class hierarchies:
    • <#Student, rdfs:subClassOf, #Person>
  • Properties:
    • <#hasName, rdf:type, rdf:Property>
  • Property hierarchies:
    • <#hasMother, rdfs:subPropertyOf, #hasParent>
  • Associating properties with classes (a):
    • “The property #hasName only applies to #Person”
    • <#hasName, rdfs:domain, #Person>
  • Associating properties with classes (b):
    • “The range of the property #hasName is xsd:string”
    • <#hasName, rdfs:range, xsd:string>
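
What such schema statements buy can be sketched with two standard RDFS entailment rules (type inheritance along rdfs:subClassOf and typing via rdfs:domain), applied until nothing new is derived. The `#mary` statement and the `rdfs_closure` helper are assumptions; a real RDFS reasoner implements more rules than these two.

```python
# Schema and data triples from the slide examples, plus one made-up statement.
triples = {
    ("#Student", "rdfs:subClassOf", "#Person"),
    ("#hasName", "rdfs:domain", "#Person"),
    ("#john", "rdf:type", "#Student"),
    ("#mary", "#hasName", "Mary Smith"),  # illustrative instance data
}

def rdfs_closure(facts: set) -> set:
    """Apply two RDFS entailment rules until no new triple appears."""
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            if p == "rdf:type":
                # Instances of a class are also instances of its superclasses.
                new |= {(s, "rdf:type", sup) for c, q, sup in facts
                        if q == "rdfs:subClassOf" and c == o}
            # Using a property implies that the subject has the property's domain type.
            new |= {(s, "rdf:type", d) for prop, q, d in facts
                    if q == "rdfs:domain" and prop == p}
        if new <= facts:
            return facts
        facts |= new

inferred = rdfs_closure(triples)
print(sorted(inferred - triples))
```

Neither `#john` nor `#mary` was stated to be a `#Person`; both facts follow from the schema.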


RDF Schema (RDFS) - Example



Web Ontology Language (OWL)

  • RDFS has a number of limitations. It supports only binary relations and cannot express:
    • Characteristics of properties, e.g. inverse, transitive, symmetric
    • Local range restrictions, e.g. for class Person, the property hasName has range xsd:string
    • Complex concept descriptions, e.g. Person is defined as the union of Man and Woman
    • Cardinality restrictions, e.g. a Person may have at most 1 name
    • Disjointness axioms, e.g. nobody can be both a Man and a Woman
  • The Web Ontology Language (OWL) is an ontology language, i.e. a more expressive vocabulary definition language for use with RDF, that supports reasoning about:
    • Class membership
    • Equivalence of classes
    • Consistency
    • Classification
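
Two of these features, a transitive property and a disjointness axiom, can be sketched in plain Python. The individuals, the `ancestor_of` property and both helper functions are invented for illustration; an OWL reasoner handles such axioms in a far more general way.

```python
# Made-up assertions for illustration.
ancestor_of = {("anna", "ben"), ("ben", "carl")}   # property declared transitive
types = {"alex": {"Man", "Woman"}, "ben": {"Man"}}
disjoint = ("Man", "Woman")                        # nobody can be both

def transitive_closure(pairs: set) -> set:
    """Add (a, c) whenever (a, b) and (b, c) are present, until stable."""
    closure = set(pairs)
    while True:
        extra = {(a, d) for a, b in closure for c, d in closure if b == c}
        if extra <= closure:
            return closure
        closure |= extra

def inconsistent_individuals(types: dict, disjoint: tuple) -> list:
    """Individuals asserted to belong to both of the disjoint classes."""
    return sorted(i for i, classes in types.items() if set(disjoint) <= classes)

closed = transitive_closure(ancestor_of)
clashes = inconsistent_individuals(types, disjoint)
print(closed, clashes)
```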


OWL

  • OWL is layered into languages of different expressiveness
    • OWL Lite: Classification Hierarchies, Simple Constraints
    • OWL DL: Maximal expressiveness while maintaining decidability
    • OWL Full: Very high expressiveness, loses decidability, all syntactic freedom of RDF
  • More expressive means harder to reason with
  • Different Syntaxes:
    • RDF/XML (Recommended for Serialization)
    • N3 (Recommended for Human readable Fragments)
    • Abstract Syntax (Clear Human Readable Syntax)


OWL – Example: The Wine Ontology

  • An ontology describing the wine domain
  • One of the most widely used examples for OWL and referenced by W3C.
  • A wine agent associated with this ontology performs OWL queries by combining a logical reasoner with the OWL ontology.
  • The agent's operation can be described in three parts: consulting the ontology, performing queries and outputting results.
  • Available here: http://www.w3.org/TR/owl-guide/


OWL – Example: The Wine Ontology Schema

    [http://mysite.verizon.net/jflynn12/VisioOWL/VisioOWL.htm]


SPARQL – Querying RDF

  • SPARQL
    • RDF Query language
    • Based on RDQL
    • Uses SQL-like syntax
  • Example:
    • PREFIX uni: <http://example.org/uni/>
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

      SELECT ?name
      FROM <http://example.org/personal>
      WHERE { ?s uni:name ?name. ?s rdf:type uni:lecturer }


SPARQL Queries

    PREFIX uni: <http://example.org/uni/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT ?name
    FROM <http://example.org/personal>
    WHERE { ?s uni:name ?name. ?s rdf:type uni:lecturer }
  • PREFIX
    • Prefix mechanism for abbreviating URIs
  • SELECT
    • Identifies the variables to be returned in the query answer
    • SELECT DISTINCT
    • SELECT REDUCED
  • FROM
    • Name of the graph to be queried
    • FROM NAMED
  • WHERE
    • Query pattern as a list of triple patterns
  • LIMIT
  • OFFSET
  • ORDER BY


SPARQL Example Query 1

    “Return the full names of all people in the graph”

    PREFIX vCard: <http://www.w3.org/2001/vcard-rdf/3.0#>
    SELECT ?fullName
    WHERE {?x vCard:FN ?fullName}
   
 
result:

fullName
=================
"John Smith"
"Mary Smith"


SPARQL Example Query 2

    “Return the relation between John and Mary”

    PREFIX ex: <http://example.org/#>
    SELECT ?p
    WHERE {ex:john ?p ex:mary}
  
 
result:

p
=================
<http://example.org/#marriedTo>


SPARQL Example Query 3

    “Return the spouse of a person by the name of John Smith”

    PREFIX vCard: <http://www.w3.org/2001/vcard-rdf/3.0#>
    PREFIX ex: <http://example.org/#>
    SELECT ?y
    WHERE {?x vCard:FN "John Smith". ?x ex:marriedTo ?y}
  
 
result:

y
=================
<http://example.org/#mary>
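
How the WHERE clauses of the three example queries are evaluated can be sketched with a tiny triple-pattern matcher. The data set is an assumption chosen so that it reproduces the results shown above, prefixed names stand in for full URIs, and `match` and `select` are illustrative helpers, not a SPARQL implementation.

```python
# Made-up data chosen to reproduce the results of the three example queries.
data = [
    ("ex:john", "vCard:FN", "John Smith"),
    ("ex:mary", "vCard:FN", "Mary Smith"),
    ("ex:john", "ex:marriedTo", "ex:mary"),
]

def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None on a mismatch."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding

def select(var, patterns):
    """Evaluate a list of triple patterns and return the values of one variable."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in data
                    if (b2 := match(pattern, t, b)) is not None]
    return [b[var] for b in bindings]

q1 = select("?fullName", [("?x", "vCard:FN", "?fullName")])
q2 = select("?p", [("ex:john", "?p", "ex:mary")])
q3 = select("?y", [("?x", "vCard:FN", "John Smith"), ("?x", "ex:marriedTo", "?y")])
print(q1, q2, q3)
```

In the third query the shared variable `?x` joins the two patterns: only bindings that satisfy both survive.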


    ILLUSTRATION BY LARGER EXAMPLES



Illustration 1 – KIM Browser Plugin



Illustration 2 – Disco Hyperdata Browser



    EXTENSIONS



Extensions: Linked Open Data

  • Linked Data is a method for exposing and sharing connected data via dereferenceable URIs on the Web
    • Use URIs to identify things that you expose to the Web as resources
    • Use HTTP URIs so that people can locate and look up (dereference) these things
    • Provide useful information about the resource when its URI is dereferenced
    • Include links to other, related URIs in the exposed data as a means of improving information discovery on the Web
  • Linked Open Data is an initiative to interlink open data sources
    • Open: Publicly available data sets that are accessible to everyone
    • Interlinked: Datasets have references to one another allowing them to be used together


Extensions: Linked Open Data



Extensions: Linked Open Data - FOAF

  • Friend Of A Friend (FOAF) provides a way to create machine-readable pages about:
    • People
    • The links between them
    • The things they do and create
  • Anyone can publish a FOAF file on the web about themselves and this data becomes part of the Web of Data
<foaf:Person>
 
    <foaf:name>Dieter Fensel</foaf:name>
    <foaf:homepage rdf:resource="http://www.fensel.com"/>
</foaf:Person>
  • FOAF is connected to many other data sets, including
    • Data sets describing music and musicians (Audio Scrobbler, MusicBrainz)
    • Data sets describing photographs and who took them (Flickr)
    • Data sets describing places and their relationship (GeoNames)


Extensions: Linked Open Data - GeoNames

  • The GeoNames Ontology makes it possible to add geospatial semantic information to the Web of Data
  • We can use a GeoNames location within the FOAF profile
<foaf:Person>
    <foaf:name>Dieter Fensel</foaf:name>
    <foaf:homepage rdf:resource="http://www.fensel.com"/>
    <foaf:based_near rdf:resource="http://ws.geonames.org/rdf?geonameId=2775220"/>
</foaf:Person>
  • GeoNames is also linked to more datasets
    • US Census Data
    • Movie Database (Linked MDB)
    • Extracted data from Wikipedia (DBpedia)


Extensions: Linked Open Data - DBpedia

  • DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web
  • As our FOAF profile has been linked to GeoNames, and GeoNames is linked to DBpedia, we can ask some interesting queries over the Web of Data
    • What is the population of the city in which Dieter Fensel lives?
      • => 117916 people
    • At which elevation does Dieter Fensel live?
      • => 574m
    • Who is the mayor of the city in which Dieter Fensel lives?
      • => Hilde Zach


    SEMANTIC WEB - PROCESSES



    MOTIVATION



Motivation

    http://www.sti-innsbruck.at/dip-movie


Motivation

  • The Web is moving from static data to dynamic functionality
    • Web services: pieces of software available over the Internet that use standardized XML messaging
    • Mashups: combinations of two or more pieces of web functionality that create powerful web applications


Motivation



Limitations of the current Web Processes

  • Web services and mashups are limited by their syntactic nature
  • As the number of services on the Web increases, it will become harder to find Web services in order to use them in mashups
  • The current amount of human effort required to build applications is not sustainable at a Web scale


What is needed?

  • Formal, machine-processable descriptions of processes on the Web that allow easy integration, configuration and reuse
  • Semantic support for finding, composing and executing these processes and all the other related tasks
      Solution: Combine semantics with Web processes/services to enable the automation of many of the currently human-intensive tasks around Web processes/services


    TECHNICAL SOLUTIONS



Semantic Web Services

  • Bring the benefits of semantics to the executable part of the Web
    • Ontologies as data model
    • Unambiguous definition of service functionality and external interface
  • Reduce human effort in integrating services in SOA
    • Many tasks in the process of using Web services can be automated
  • Improve dynamism
    • New services available for use as they appear
    • Service producers and consumers don’t need to know of each other’s existence
  • Improve stability
    • Service interfaces are not tightly integrated, so changes have even less impact
    • Services can be easily replaced if they are no longer available
    • Failover possibilities are limited only by the number of available services


Semantic Web Services

  • Semantic Web Services are a layer on top of existing Web service technologies and do not aim to replace them
  • Provide a formal description of services, while still being compliant with existing and emerging technologies
  • Distinguish between a Web service (computational entity) and a service (value provided by invocation)
  • Make Web services easier to:
    • Find
    • Compare
    • Compose
    • Invoke
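
How a formal description can make services easier to find is sketched below: a requested goal is matched against advertised service concepts through a small class hierarchy rather than through keywords. The ontology, the service names and both helpers are made up for illustration and do not use WSMO or WSML syntax.

```python
# Made-up ontology (child -> parent) and service advertisements.
subclass_of = {"Flight": "Transport", "TrainRide": "Transport", "Hotel": "Accommodation"}
services = {"AirBooker": "Flight", "RailBooker": "TrainRide", "BedFinder": "Hotel"}

def is_a(concept, ancestor: str) -> bool:
    """True if concept equals ancestor or is a direct or indirect subclass of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = subclass_of.get(concept)
    return False

def discover(goal: str) -> list:
    """Names of services whose advertised concept satisfies the requested goal."""
    return sorted(name for name, offers in services.items() if is_a(offers, goal))

matches = discover("Transport")
print(matches)
```

A keyword search for "Transport" would find none of these services, since none of them mentions the word.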


Technical Overview



WSMO – Design Principles



WSMO – Conceptual Model



WSML – Language Family



Semantic Execution Environment



Semantic Execution Environment - WSMX



    ILLUSTRATION BY LARGER EXAMPLES



Illustration 1: SWS Challenge

  • Blue company has discovered Moon company on the Web
  • Blue company wishes to communicate with Moon company
  • Broker required to resolve data and process interoperability issues


Illustration 2: Virtual Travel Agency



Illustration 3: WSMX At Work


