
 


Agenda

  • Motivation
  • Semantic Web
  • Web Services
  • Semantic Web Services
  • Summary
  • References



Motivation



The traditional Web



Finding relevant information

  • Finding information on the current Web is based on keyword search
  • Keyword search has a limited recall and precision due to:
    • Synonyms:
      • e.g. Searching for information about “Cars” will ignore Web pages that contain the word “Automobiles” even though the information on these pages could be relevant
    • Homonyms:
      • e.g. Searching for information about “Jaguar” will bring up pages about both “Jaguar” (the car brand) and “Jaguar” (the animal) even though the user is interested in only one of them
  • Keyword search has a limited recall and precision due also to:
    • Spelling variants:
      • e.g. “organize” in American English vs. “organise” in British English
    • Spelling mistakes
    • Multiple languages
      • i.e. information about the same topics is published on the Web in different languages (English, German, Italian, …)
  • Current search engines provide no means to specify the relation between a resource and a term
    • e.g. sell / buy  
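
The synonym and homonym problems can be reproduced with a few lines of code. The following is only a minimal sketch: the four page texts and the helper `keyword_search` are invented for illustration, and real search engines add indexing and ranking that are left out here.

```python
# Toy corpus: page id -> text (invented for illustration).
PAGES = {
    "p1": "Used cars for sale in Innsbruck",
    "p2": "Automobiles: buying guide and prices",
    "p3": "The jaguar is a big cat native to the Americas",
    "p4": "Jaguar unveils a new sports car",
}

def keyword_search(query: str) -> list[str]:
    """Return ids of pages that contain the query as a whole word (case-insensitive)."""
    needle = query.lower()
    hits = []
    for page_id, text in PAGES.items():
        words = text.lower().replace(":", " ").split()
        if needle in words:
            hits.append(page_id)
    return hits

# Synonym problem (limited recall): the page about "Automobiles" is missed.
cars_hits = keyword_search("cars")

# Homonym problem (limited precision): the animal and the car brand both match.
jaguar_hits = keyword_search("jaguar")
```

Here `cars_hits` contains only `p1`, while `jaguar_hits` mixes the page about the animal with the page about the car brand.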


Extracting relevant information

  • A one-size-fits-all automatic solution for extracting information from Web pages is not possible because pages use different formats and syntaxes
  • Even from a single Web page it is difficult to extract the relevant information
  • Extracting information from current web sites can be done using wrappers



Extracting relevant information

  • The actual extraction of information from web sites is specified using standards such as XSL Transformation (XSLT)
  • Extracted information can be stored as structured data in XML format or databases.
  • However, wrappers do not really scale because the actual extraction of information again depends on the web site format and layout
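
Why wrappers depend on layout can be seen in a small sketch. The slides name XSLT as the extraction standard; because Python's standard library has no XSLT processor, a regular expression is used here as a stand-in wrapper. The two HTML snippets and the name `extract_price` are hypothetical.

```python
import re

# Hypothetical wrapper: a pattern tailored to the markup of one particular site.
PRICE_WRAPPER = re.compile(r'<span class="price">([^<]+)</span>')

def extract_price(html: str):
    """Return the price string found by the wrapper, or None if the layout differs."""
    match = PRICE_WRAPPER.search(html)
    return match.group(1).strip() if match else None

# Two invented pages that present the same information in different layouts.
site_a = '<div><h1>Book</h1><span class="price">EUR 19.90</span></div>'
site_b = '<table><tr><td>Book</td><td id="cost">EUR 19.90</td></tr></table>'

price_a = extract_price(site_a)  # the wrapper matches the site it was written for
price_b = extract_price(site_b)  # same data, different layout: the wrapper finds nothing
```

Every new site, and every redesign of a known site, needs its own wrapper, which is the scaling problem named above.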


Combining and reusing information

  • Tasks often require combining data from the Web
    • Searching for the same information in different digital libraries 

    • Information may come from different web sites and needs to be combined



How to improve the current Web?

  • Increasing automatic linking among data
  • Increasing recall and precision in search
  • Increasing automation in data integration
  • Increasing automation in the service life cycle
  • Adding semantics to data and services is the solution!


Semantic Web



Semantic Web (contd')

  • “An extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”

    Tim Berners-Lee et al., Scientific American, 2001: tinyurl.com/i59p
  • “…allowing the Web to reach its full potential…” with far-reaching consequences
  • “The next generation of the Web”
  • Information has machine-processable and machine-understandable semantics 
  • Not a separate Web but an augmentation of the current one
  • Ontologies as basic building block
  • Web Data Annotation
    • connecting (syntactic) Web objects, like text chunks, images, … to their semantic notion (e.g., this image is about Innsbruck, Dieter Fensel is a professor)
  • Data Linking on the Web (Web of Data)
    • global networking of knowledge through URI, RDF, and SPARQL (e.g., connecting my calendar with my rss feeds, my pictures, ...)
  • Data Integration over the Web
    • Seamless integration of data based on different conceptual models (e.g., integrating data coming from my two favorite book sellers)
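
As a rough illustration of annotation and querying, the sketch below stores statements as (subject, predicate, object) triples, the data model of RDF, and answers a pattern query in the spirit of SPARQL. It uses plain Python rather than an RDF library; all URIs under `http://example.org/` and the function `match` are assumptions made for this example.

```python
# Each statement is a (subject, predicate, object) triple, as in RDF.
# All URIs below are invented example names.
EX = "http://example.org/"
TRIPLES = {
    (EX + "image42", EX + "depicts", EX + "Innsbruck"),
    (EX + "DieterFensel", EX + "type", EX + "Professor"),
    (EX + "Innsbruck", EX + "locatedIn", EX + "Austria"),
}

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None is a wildcard, like a SPARQL variable."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# "What is image42 about?"
# Roughly: SELECT ?x WHERE { ex:image42 ex:depicts ?x }
about = [o for (_, _, o) in match(s=EX + "image42", p=EX + "depicts")]
```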


Semantic Web - Ontologies

 




Components of Ontology



Ontologies

  • To make the Semantic Web work we need:

    • Ontology Languages:
      • expressivity 
      • reasoning support 
      • web compliance
    • Ontology Reasoning
      • large scale knowledge handling 
      • fault-tolerant 
      • stable & scalable inference machines
    • Ontology Management Techniques
      • editing and browsing 
      • storage and retrieval 
      • versioning and evolution support
    • Ontology Integration Techniques
      • ontology mapping, alignment, merging 
      • semantic interoperability determination 
    • and … Applications


Types of ontologies

 



“Semantic Web Language Layer Cake”



Introduction



Definition

  • “Loosely coupled, reusable software components that encapsulate discrete functionality and are distributed and programmatically accessible over standard Internet protocols”, The Stencil Group
  • Web service applications are encapsulated, loosely coupled Web “components” that can bind dynamically to each other, F. Curbera
  • “Web Services are a new breed of application. They are self-contained, self-describing, modular applications that can be published, located, and invoked across the Web. Web Services perform functions, which can be anything from simple requests to complicated business processes”, The IBM Web Services tutorial 
  • Common to all definitions:
    • Components providing functionality
    • Distributed
    • Accessible over the Web


Definition (contd')

 



Definition (contd')

Software Architecture

  • Web Services connect computers and devices with each other using the Internet to exchange data and combine data in new ways.
  • The key to Web Services is on-the-fly software creation through the use of loosely coupled, reusable software components.
  • Software can be delivered and paid for as fluid streams of services as opposed to packaged products.

Web Services as a new concept for eWork and eCommerce

  • Business services can be completely decentralized and distributed over the Internet and accessed by a wide variety of communication devices.
  • The Internet will become a global common platform where organizations and individuals communicate with each other to carry out various commercial activities and to provide value-added services.
  • The dynamic enterprise and dynamic value chains become achievable and may even be mandatory for competitive advantage.

Web Services as a programming technology

  • Web Services are Remote Procedure Calls (RPC) over HTTP




Web Service vs. Service

  • Service
    • A provision of value in some domain (not necessarily monetary, independent of how service provider and requestor interact)
  • Web Service
    • Computational entity accessible over the Internet (using Web Service Standards & Protocols), provides access to (concrete) services for the clients.


Introduction



WSDL

  • Web Services Description Language (WSDL) describes the interface for consuming a Web Service:
    • operations (in- & output) 
    • Access (protocol binding)
    • Endpoint (location of service)


SOAP

  • Simple Object Access Protocol 
  • W3C Recommendation 
  • XML data transport:
    • sender / receiver
    • protocol binding
    • communication aspects
    • content 
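
To make the message structure concrete, here is a minimal sketch that builds and reads a SOAP 1.2 envelope with Python's standard XML library. The envelope namespace is the one defined by SOAP 1.2; the service namespace `http://example.org/stock` and the `GetPrice` operation are hypothetical, and protocol binding (for example, sending the message over HTTP) is left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
APP_NS = "http://example.org/stock"                   # hypothetical service namespace

def build_request(symbol: str) -> bytes:
    """Build a SOAP envelope whose body calls a hypothetical GetPrice operation."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")       # communication aspects go here
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")  # content (the payload)
    call = ET.SubElement(body, f"{{{APP_NS}}}GetPrice")
    ET.SubElement(call, f"{{{APP_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")

def read_symbol(message: bytes) -> str:
    """Receiver side: parse the envelope and pull the parameter out of the body."""
    root = ET.fromstring(message)
    path = f"{{{SOAP_NS}}}Body/{{{APP_NS}}}GetPrice/{{{APP_NS}}}Symbol"
    return root.find(path).text

message = build_request("ACME")
```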



UDDI

  • Universal Description, Discovery, and Integration Protocol 
  • OASIS driven standardization effort
  • Registry for Web Services:
    • provider 
    • service information
    • technical access


RESTful Services

  • Another way of realizing services, other than the SOAP/WSDL/UDDI approach
  • Follows the Web principles (REST principles)
  • Services expose their data and functionality through resources identified by URIs
  • Services are Web pages that are meant to be consumed by an autonomous program
  • Uniform interfaces for interaction: GET, PUT, DELETE, POST
  • HTTP as the application protocol
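
The uniform interface can be sketched without any network code. In the example below, resources live in a dictionary keyed by URI path, and one function applies the four methods listed above. The paths, the stored data and the function `handle` are made up for illustration; a real service would sit behind an HTTP server.

```python
import itertools

# In-memory resources keyed by URI path; all paths and data are invented.
resources = {"/books/1": {"title": "Implementing Semantic Web Services"}}
_next_id = itertools.count(2)  # ids handed out for resources created via POST

def handle(method: str, path: str, body=None):
    """Apply one of the uniform HTTP methods to a resource; return (status, payload)."""
    if method == "GET":
        return (200, resources[path]) if path in resources else (404, None)
    if method == "PUT":      # create or replace the resource at a known URI
        created = path not in resources
        resources[path] = body
        return (201 if created else 200, body)
    if method == "DELETE":
        return (204, None) if resources.pop(path, None) is not None else (404, None)
    if method == "POST":     # the server chooses the URI of the new resource
        new_path = f"{path}/{next(_next_id)}"
        resources[new_path] = body
        return (201, new_path)
    return (405, None)       # method not allowed
```

The same four verbs work for every resource, so a client needs no service-specific operation names, only URIs.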


Google – Unified Cloud Computing

  • An attempt to create an open and standardized cloud interface for the unification of various cloud APIs
  • A key driver of the unified cloud interface is to create an API about other APIs
  • Use of the resource description framework (RDF) to describe a semantic cloud data model (taxonomy & ontology)


Amazon - Mechanical Turk

“People as a service”

  • Amazon Mechanical Turk
    • An API to Human Processing Power
    • The Computer Calls People
    • An Internet Scale Workforce
    • Game-Changing Economics


Amazon – S3 & EC2

“Infrastructure as a service”

 
  • Amazon Simple Storage Service (S3)
    • Write and read objects up to 5GB
    • 15 cents per GB per month to store
    • 20 cents per GB to transfer
  • Amazon Elastic Compute Cloud (EC2)
    • allows customers to rent computers on which to run their own computer applications
    • virtual server technology
    • 10 cents / hour
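
A quick worked example shows what pay-per-use pricing means in practice. The sketch uses the prices quoted on this slide and assumes they are in US dollars, that the transfer price applies per GB transferred, and that one server runs for a 30-day month; the function name `monthly_cost` is ours, and current prices differ.

```python
# Prices as quoted on the slide, assumed to be US dollars.
S3_STORAGE_PER_GB_MONTH = 0.15
S3_TRANSFER_PER_GB = 0.20   # assumption: charged per GB transferred
EC2_PER_HOUR = 0.10

def monthly_cost(stored_gb: float, transferred_gb: float, server_hours: float) -> float:
    """Total monthly bill: storage plus transfer plus rented compute time."""
    storage = stored_gb * S3_STORAGE_PER_GB_MONTH
    transfer = transferred_gb * S3_TRANSFER_PER_GB
    compute = server_hours * EC2_PER_HOUR
    return round(storage + transfer + compute, 2)

# 100 GB stored, 50 GB transferred, one server running 24 h x 30 days = 720 h:
# 15.00 + 10.00 + 72.00
example = monthly_cost(stored_gb=100, transferred_gb=50, server_hours=720)
```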

 


Introduction



Deficiencies of WS Technology



Deficiencies of WS Technology (contd')

  • current technologies allow usage of Web Services
  • but:
    • only syntactical information descriptions 
    • syntactic support for discovery, composition and execution
    • => Web Service usability, usage, and integration need to be inspected manually 
    • no semantically marked up content / services
    • no support for the Semantic Web 
  • current Web Service Technology Stack failed to realize the promise of Web Services


So what is needed?

  • Mechanized support is needed for
    • Annotating/designing services and the data they use
    • Finding and comparing service providers
    • Negotiating and contracting services
    • Composing, enacting, and monitoring services
    • Dealing with numerous and heterogeneous data formats, protocols and processes, i.e. mediation
  • => Conceptual Models, Formal Languages, Execution Environments


Definition

Semantic Web Technology

  • allows machine-supported data interpretation
  • ontologies as data model

+

Web Service Technology

  • automated discovery, selection, composition, and web-based execution of services

=> Semantic Web Services as integrated solution for realizing the vision of the next generation of the Web 

  • define exhaustive description frameworks for describing Web Services and related aspects (Web Service Description Ontologies) 
  • support ontologies as underlying data model to allow machine supported data interpretation (Semantic Web aspect)


Definition (contd')

  • define semantically driven technologies for automation of the Web Service usage process (Web Service aspect)
  • Tasks to be automated:
     


Definition (contd')

  • Semantic Web Services are a layer on top of existing Web service technologies and do not aim to replace them
  • Provide a formal description of services, while still being compliant with existing and emerging technologies
  • Distinguish between a Web service (computational entity) and a service (value provided by invocation)
  • Make Web services easier to: 
    • Find
    • Compare
    • Compose 
    • Invoke


Semantic Web Services benefits

  • Brings the benefits of Semantics to the executable part of the Web 
    • Ontologies as data model 
    • Unambiguous definition of service functionality and external interface
  • Reduce human effort in integrating services in SOA 
    • Many tasks in the process of using Web services can be automated
  • Improve dynamism 
    • New services available for use as they appear
    • Service Producers and Consumers don’t need to know of each other’s existence 
  • Improve stability
    • Service interfaces are not tightly integrated, so changes have even less impact 
    • Services can be easily replaced if they are no longer available
    • Failover possibilities are limited only by the number of available services


Service Oriented Architecture



Semantically Enabled SOA (SESA)



SESA Architecture




SESA functionality

  • Middleware for Semantic Web Services
    • Allows service providers to focus on their business,
  • Environment for goal based discovery and invocation
    • Run-time binding of service requesters and providers,
  • Provide a flexible Service Oriented Architecture
    • Add, update, remove components at run-time as needed,
  • Keep open-source to encourage participation
    • Developers are free to use in their own code, and
  • Define formal execution semantics
    • Unambiguous model of system behavior.


Realizing Semantic Web Services Vision

  • Take the WSDL/SOAP web service stack as a starting point and add semantic annotations.


Realizing Semantic Web Services Vision (contd')

  • Alternative way to realize Semantic Web Services vision is to focus on further developing the Semantic Web.


Motivation

  • Are WSDL/SOAP web services really web services? - No!
  • Web services require tight coupling of the applications they integrate. 
    • Applications communicate via message exchange requiring strong coupling in terms of reference and time. 
  • The Web is strongly based on the opposite principles. Information is published in a persistent and widely accessible manner. 
    • Any other application can access this information at any point in time without having to request the publishing process to directly refer to it as a receiver of its information. 
  • Web services can use the Web as a transport medium; however, that is all they have in common with the Web.
  • Distributed systems dominated by messaging
    • Web services / SOAP
    • CORBA / RPC / RMI / MOM
    • Agents
  • Web architecture different
    • Persistent publication as the main principle
    • Uniform interface
    • Uniform addressing
  • Web clearly scales to a large size


Space-based Communication



Semantic Spaces

  • Persistent publication of semantic data
  • Retrieval by semantic matching
  • Mediation of data between heterogeneous services
  • Semantics-aware distribution of data
  • Coordination of concurrent access situations
  • Appropriate security and trust mechanisms
  • Use of Web service protocol stack and Semantic Web technologies
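
The difference between messaging and persistent publication can be illustrated with a very small space. This is only a sketch under strong simplifications: the class `TripleSpace`, its methods and the flight data are hypothetical, and retrieval is by plain template matching rather than the semantic matching, mediation, distribution and security listed above.

```python
class TripleSpace:
    """Minimal space: writers publish triples, readers retrieve them by template."""

    def __init__(self):
        self._triples = []  # published data persists until it is removed

    def write(self, subject, predicate, obj):
        """Publish one triple into the space."""
        self._triples.append((subject, predicate, obj))

    def read(self, subject=None, predicate=None, obj=None):
        """Non-destructive read: all triples matching the template (None = any value)."""
        template = (subject, predicate, obj)
        return [
            t for t in self._triples
            if all(want is None or want == got for want, got in zip(template, t))
        ]

space = TripleSpace()
# The provider publishes without knowing who will read the data, or when.
space.write("flight:LH123", "departsFrom", "Innsbruck")
space.write("flight:LH123", "price", 120)
# A requester that appears later retrieves by template, without contacting the provider.
offers = space.read(predicate="departsFrom", obj="Innsbruck")
```

Writer and reader never address each other directly, which removes the coupling in reference and time that message exchange requires.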


LOD Cloud March 2009

  • Linked Data 


Data Linking on the Web

  • Linked Open Data statistics:
    • data sets: 121
    • total number of triples: 13.112.409.691
    • total number of links between data sets: 142.605.717


Data linking on the Web principles

  • Use URIs as names for things
    • anything, not just documents
    • you are not your homepage
    • information resources and non-information resources
  • Use HTTP URIs
    • globally unique names, distributed ownership
    • allows people to look up those names
  • Provide useful information in RDF
    • when someone looks up a URI
  • Include RDF links to other URIs
    • to enable discovery of related information
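
The four principles can be played through offline. In this sketch a dictionary stands in for the Web: looking up an HTTP URI returns the triples published about it, and following the links inside those triples discovers related data in other datasets. The three datasets, their URIs and the helper names are invented; a real client would issue HTTP requests and parse RDF.

```python
# Simulated Web: looking up an HTTP URI returns RDF-like triples about it.
# Dataset contents and URIs are invented for illustration.
WEB = {
    "http://books.example/item/1": [
        ("http://books.example/item/1", "title", "Semantic Web Services"),
        ("http://books.example/item/1", "author", "http://people.example/alice"),
    ],
    "http://people.example/alice": [
        ("http://people.example/alice", "name", "Alice"),
        ("http://people.example/alice", "basedNear", "http://places.example/innsbruck"),
    ],
    "http://places.example/innsbruck": [
        ("http://places.example/innsbruck", "name", "Innsbruck"),
    ],
}

def look_up(uri):
    """Stand-in for an HTTP GET on the URI; returns the triples published there."""
    return WEB.get(uri, [])

def discover(start_uri):
    """Follow RDF links from start_uri and collect every URI that can be reached."""
    seen, todo = [], [start_uri]
    while todo:
        uri = todo.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        for _, _, obj in look_up(uri):
            if isinstance(obj, str) and obj.startswith("http://"):
                todo.append(obj)  # the object is itself a URI: a link to follow
    return seen

reachable = discover("http://books.example/item/1")
```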


DBpedia

  • DBpedia is a community effort to:
    • Extract structured information from Wikipedia
    • Make the information available on the Web under an open license
    • Interlink the DBpedia dataset with other open datasets on the Web
  • DBpedia is one of the central interlinking-hubs of the emerging Web of Data



The DBpedia Dataset

  • 91 languages
  • Data about 2.9 million “things”. Includes for example:
    • 282.000 persons
    • 339.000 places
    • 119.000 organizations
    • 130.000 species
    • 88.000 music albums
    • 44.000 films
    • 19.000 books
  • Altogether 479 million pieces of information (RDF triples)
    • 807.000 links to images
    • 3.840.000 links to external web pages
    • 4.878.100 data links into external RDF datasets


LinkedCT

  • LinkedCT is the Linked Data version of ClinicalTrials.gov containing data about clinical trials.
  • Total number of triples: 
    • 6,998,851
  • Number of Trials:
    • 61,920
  • RDF links to other data sources:
    • 177,975
  • Links to other datasets:
    • DBpedia and YAGO (from intervention and conditions) 
    • GeoNames (from locations) 
    • Bio2RDF.org's PubMed (from references)


Summary

  • Why Semantic Web Services?
    • To overcome limitations of traditional Web-Services Technology by integrating it with Semantic Technology;
    • To enable automatic and personalized service discovery;
    • To enable automatic service invocation and execution monitoring;
    • To enable automatic service integration;
    • To enable semantic mediation of Web-Services.
  • Two new sciences are currently emerging: Web science and Service Science. 
  • Core pillars of these sciences are:
  • Semantic Web
    • the next generation of the Web in which information has machine-processable and machine-understandable semantics.
  • Semantic Web Services
    • overcome limitations of traditional Web-Services Technology using Semantic Technology to enable automatic service discovery, ranking, selection, composition, etc.


References

  • D. Fensel, M. Kerrigan, and M. Zaremba (eds.). Implementing Semantic Web Services - The SESA Framework, Springer, 2008. ISBN: 978-3-540-77019-0
  • D. Fensel, C. Bussler. The Web Service Modeling Framework WSMF, Electronic Commerce Research and Applications, 1(2): 113-137, 2002 
  • D. Fensel: Triple-space computing: Semantic Web Services based on persistent publication of information. In Proc. of the IFIP Int'l Conf. on Intelligence in Communication Systems (INTELLCOMM 2004), Bangkok, Thailand, November 23-26, 2004.
  • L. Richardson and S. Ruby. RESTful Web Services: Web services for the real world, O’Reilly, 2007. ISBN 10: 0-596-52926-0
  • SOAP: http://w3.org/TR/soap12
  • WSDL: http://w3.org/TR/wsdl20
  • UDDI: http://uddi.xml.org/
  • http://dbpedia.org/About
  • http://en.wikipedia.org/wiki/Semantic_Web_Services
  • http://en.wikipedia.org/wiki/Service_(systems_architecture)
  • http://en.wikipedia.org/wiki/Webservice
  • http://en.wikipedia.org/wiki/Service-oriented_architecture
  • http://en.wikipedia.org/wiki/Web_Services_Description_Language
  • http://en.wikipedia.org/wiki/SOAP
  • http://en.wikipedia.org/wiki/Universal_Description_Discovery_and_Integration
  • http://en.wikipedia.org/wiki/Cloud_computing
  • http://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud
  • http://en.wikipedia.org/wiki/Amazon_Mechanical_Turk
 


The Web today




Agenda

  • Motivation
  • Web Science
  • Web Evolution
    • Web 1.0 - Traditional Web
    • Web 2.0
      • Major breakthroughs of Web 2.0
    • Web 3.0 - Semantic Web
  • What Web Science could be
    • The computer science of 21st century
  • Summary
  • References



Motivation

  • “[…] As the Web has grown in complexity and the number and types of interactions that take place have ballooned, it remains the case that we know more about some complex natural phenomena (the obvious example is the human genome) than we do about this particular engineered one.”
  • A new science that studies the complex phenomenon called the Web is needed!


Web Science

  • A new science that focuses on how huge decentralized Web systems work. 
  • “The Web isn’t about what you can do with computers. It’s people and, yes, they are connected by computers. But computer science, as the study of what happens in a computer, doesn’t tell you about what happens on the Web.” 
    Tim Berners-Lee
  • “A new field of science that involves a multi-disciplinary study and inquiry for the understanding of the Web and its relationships to us”
    Bebo White, SLAC, Stanford University
  • Shift from how a single computer works to how huge decentralized Web systems work



Endorsements for Web Science

  • “Web science represents a pretty big next step in the evolution of information. This kind of research is likely to have a lot of influence on the next generation of researchers, scientists and, most importantly, the next generation of entrepreneurs who will build new companies from this.” 
    Eric E. Schmidt, CEO Google
  • “Web science research is a prerequisite to designing and building the kinds of complex, human-oriented systems that we are after in services science.” 
    Irving Wladawsky-Berger, IBM


Web Science – multi-disciplinary approach



The goals of Web Science

  • To understand what the Web is
  • To engineer the Web’s future and provide infrastructure
  • To ensure the Web’s social benefit


Scientific method

  • Natural Sciences such as physics, chemistry, etc. are analytic disciplines that aim to find laws that generate or explain observed phenomena
  • Computer Science on the other hand is synthetic. It is about creating formalisms and algorithms in order to support particular desired behaviour.
  • The scientific method of Web science has to be a combination of these two paradigms


What Could Scientific Theories for the Web Look Like?

  • Some simple examples:
    • Every page on the Web can be reached by following fewer than 10 links
    • The average number of words per search query is greater than 3
    • Web page download times follow a lognormal distribution function (Huberman)
    • The Web is a “scale-free” graph
  • Can these statements be easily validated? Are they good theories? What constitutes good theories about the Web?
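
One way to think about validating such statements is to treat the Web as a directed graph and measure it. The sketch below does this for a tiny invented graph: breadth-first search gives the minimum number of links between pages, and in-degree counts are the raw data one would inspect for a scale-free distribution. The graph `LINKS` and the function names are assumptions; measuring the real Web needs crawling and sampling, which is exactly where validation becomes hard.

```python
from collections import deque

# Toy directed Web graph: page -> pages it links to (invented data).
LINKS = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["a", "e"],
    "e": [],
}

def clicks_from(start):
    """Breadth-first search: minimum number of links from start to each reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in LINKS[page]:
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist

def max_clicks():
    """Largest shortest-path length over all ordered pairs for which a path exists."""
    return max(max(clicks_from(page).values()) for page in LINKS)

# Number of incoming links per page; a scale-free graph has a few very large values.
in_degree = {p: sum(p in targets for targets in LINKS.values()) for p in LINKS}
```

Note that page `e` links nowhere, so nothing can be reached from it; a claim such as "every page can be reached" has to say from where.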


Food for thought

           

  • What are the analogies for Web Science and Design? Is our understanding of the Web like that of electricity in 1800?



Evolution of the Web



Introduction

  • Web evolution
    • Web 1.0 - Traditional Web
    • Web 2.0 
    • Web 3.0 - Semantic Web
  • Future steps to realize Web science
    • Large scale reasoning
    • Rethinking Computer Science for the 21st century


Web 1.0

  • The World Wide Web ("WWW" or simply the "Web") is a system of interlinked hypertext documents that runs over the Internet. With a Web browser, a user views Web pages that may contain text, images, and other multimedia and navigates between them using hyperlinks.
  • The Web was created around 1990 by Tim Berners-Lee working at CERN in Geneva, Switzerland. 
  • A distributed document delivery system implemented using application-level protocols on the Internet
  • A tool for collaborative writing and community building
  • A framework of protocols that support e-commerce
  • A network of co-operating computers interoperating using HTTP and related protocols to form a ‘subnet’ of the Internet
  • A large, cyclical, directed graph made up of Web pages and links



The breakthrough



WWW components

  • Structural Components
    • Clients/browsers – two dominant implementations
    • Servers – run on sophisticated hardware
    • Caches – many interesting implementations
    • Internet – the global infrastructure which facilitates data transfer
  • Language and Protocol Components
    • Uniform Resource Identifiers (URIs)
    • Hypertext Transfer Protocol (HTTP)
    • Hypertext Markup Language (HTML)


Uniform Resource Identifiers (URIs)

  • Uniform Resource Identifiers (URIs) are used to name/identify resources on the Web
  • URIs are pointers to resources to which request methods can be applied to generate potentially different responses
  • Resource can reside anywhere on the Internet
  • Most popular form of a URI is the Uniform Resource Locator (URL)
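
The parts of a URL can be inspected with Python's standard library. The URL below is a made-up example on the reserved documentation domain `example.org`.

```python
from urllib.parse import urlsplit

# An example URL; example.org is a reserved documentation domain.
parts = urlsplit("http://example.org:8080/people/alice?format=rdf#contact")

scheme = parts.scheme      # how to access the resource
host = parts.hostname      # where the resource lives
port = parts.port
path = parts.path          # which resource on that host
query = parts.query        # parameters passed along with the request
fragment = parts.fragment  # a location inside the retrieved representation
```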


Hypertext Transfer Protocol (HTTP)

  • Protocol for client/server communication
    • The heart of the Web
    • Very simple request/response protocol
    • Client sends request message, server replies with response message
    • Provide a way to publish and retrieve HTML pages
    • Stateless
    • Relies on URI naming mechanism
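
The request/response exchange is plain text and easy to reproduce. The following sketch formats a minimal HTTP/1.1 request and parses a canned response; no network connection is made, and the host, path and response text are invented for illustration.

```python
def build_request(method: str, host: str, path: str) -> str:
    """Format a minimal HTTP/1.1 request message without a body."""
    return f"{method} {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"

def parse_response(raw: str):
    """Split a response message into (status code, headers dict, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status = int(status_line.split(" ")[1])  # "HTTP/1.1 200 OK" -> 200
    headers = dict(line.split(": ", 1) for line in header_lines)
    return status, headers, body

request = build_request("GET", "example.org", "/index.html")

# A canned response, standing in for what a server might send back.
canned = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>Hello</html>"
status, headers, body = parse_response(canned)
```

Because the protocol is stateless, every request must carry everything the server needs; nothing from an earlier exchange is remembered.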


HTTP Request Messages

  • GET – retrieve document specified by URL
  • PUT – store specified document under given URL
  • HEAD – retrieve info. about document specified by URL
  • OPTIONS – retrieve information about available options
  • POST – give information (e.g. annotation) to the server
  • DELETE – remove document specified by URL
  • TRACE – loopback request message
  • CONNECT – for use with a proxy that can switch to being a tunnel


HTML

  • Hyper-Text Markup Language
    • An application of the Standard Generalized Markup Language (SGML)
    • Facilitates a hyper-media environment
  • Documents use elements to “mark up” or identify sections of text for different purposes or display characteristics
  • Mark up elements are not seen by the user when page is displayed
  • Documents are rendered by browsers
  • HTML markup consists of several types of entities, including: elements, attributes, data types and character references 
    • DTD (Document Type Definition)
    • Element (such as document (…), head elements () 
    • Attribute: HTML
    • Data type: CDATA, URIs, Dates, Link types, language code, color, text string, etc. 
    • Character references: for referring to rarely used characters: 
      • "&#x6C34;" (in hexadecimal) represents the Chinese character for water 
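
Character references can be resolved with Python's standard `html` module. The short sketch below decodes the reference for the water character in hexadecimal and in decimal form.

```python
import html

water = html.unescape("&#x6C34;")  # hexadecimal character reference
same = html.unescape("&#27700;")   # the same code point written in decimal
code_point = ord(water)            # numeric value of the decoded character
```

Both references decode to the same character, since hexadecimal 6C34 equals decimal 27700.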


Web 2.0

  • “Web 2.0 is a term for a range of interactive and collaborative systems on the Internet”
  • Web 2.0 is a vaguely defined phrase referring to various topics such as social networking sites, wikis, communication tools, and folksonomies.
  • Tim O'Reilly provided a definition of Web 2.0 in 2006: "Web 2.0 is the business revolution in the computer industry caused by the move to the internet as platform, and an attempt to understand the rules for success on that new platform. Chief among those rules is this: Build applications that harness network effects to get better the more people use them.”


Elements of the Web's next generation

  • People, Services, Technologies



“Definition” by O’Reilly

Web 1.0                      Web 2.0                 Improvement
DoubleClick                  Google AdSense          personalized
Ofoto                        Flickr                  tagging, community
Britannica Online content    Wikipedia               community, free
Web sites                    blogging                dialogue
publishing                   participation
CMS                          wikis                   flexibility, freedom
directories (taxonomy)       tagging (folksonomy)    community



    Characteristics of Web 2.0 applications

    • Typical characteristics of Web 2.0 applications
      • Users can produce and consume data on a Web 2.0 site
      • Web is used as a participation platform
      • Users can run software applications entirely through a Web browser
      • Data and services can be easily combined to create mashups


    Examples

    • Gmail
    • Google Notebooks (Collaborative Notepad in the Web)
    • Wikis
    • Wikipedia
      • World's biggest encyclopedia, top-30 web site, 100 languages
    • Del.icio.us (Social Tagging for Bookmarks)
    • Flickr (Photo Sharing and Tagging) 
    • Blogs, RSS, Blogger.com
    • Programmableweb.com: 150 web-APIs


    Blogs

    • Easy-to-use user interfaces for updating content
    • Easy organization of content
    • Easy usage of content
    • Easy publishing of comments
    • Social: collaborative (single users but strongly connected)


    Introduction


    • Wiki was invented by Ward Cunningham
    • Collection of HTML pages: read and edit
    • Most famous and biggest Wiki: Wikipedia (MediaWiki)
      • But: also often used in intranets (e.g. in our group)
    • Problems solved socially instead of technically
    • Flexible structure
    • Background algorithms + human intelligence
    • No new technologies
    • social: collaborative (nobody owns contents)



    Wikis: Design Principles

    • Open
      • Should a page be found to be incomplete or poorly organized, any reader can edit it as they see fit. 
    • Incremental
      • Pages can cite other pages, including pages that have not been written yet. 
    • Organic
      • The structure and text content of the site are open to editing and evolution. 
    • Mundane
      • A small number of (irregular) text conventions will provide access to the most useful page markup. 
    • Universal
      • The mechanisms of editing and organizing are the same as those of writing so that any writer is automatically an editor and organizer. 
    • Overt
      • The formatted (and printed) output will suggest the input required to reproduce it. 
    • Unified
      • Page names will be drawn from a flat space so that no additional context is required to interpret them. 
    • Precise
      • Pages will be titled with sufficient precision to avoid most name clashes, typically by forming noun phrases.


    Wikis: Design Principles

    • Tolerant
      • Interpretable (even if undesirable) behavior is preferred to error messages. 
    • Observable
      • Activity within the site can be watched and reviewed by any other visitor to the site.
    • Convergent
      • Duplication can be discouraged or removed by finding and citing similar or related content. 


    Social tagging

    • Idea: Enrich content with user-chosen keywords
    • Replace folder-based structure with an organisation using tags
    • New: Simple user interfaces for tagging and tag based search
    • First steps to Semantic Web?
    • Technically: user interfaces
    • Social: collaborative (own contents, shared tags)


    Collaborative Tagging



    Collaborative Tagging: Delicious

    • Browser plug-ins available from http://del.icio.us
    • Allows the tagging of bookmarks
    • Community aspect: 
      • Suggestion of tags that were used by other users
      • Availability of tag clouds for bookmarks of the whole community
      • Possibility to browse related bookmarks based on tags
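
The three community features can be sketched over a small list of tagged bookmarks. This is not how Delicious is implemented; the sample posts, the URLs on `example.org` and the function names are assumptions chosen only to show the idea.

```python
from collections import Counter, defaultdict

# (user, bookmark URL, tags): invented sample data.
POSTS = [
    ("ann", "http://example.org/rdf-primer", {"semanticweb", "rdf"}),
    ("bob", "http://example.org/rdf-primer", {"rdf", "tutorial"}),
    ("bob", "http://example.org/rest-intro", {"webservices", "tutorial"}),
]

def tag_cloud():
    """Count how often each tag was used across the whole community."""
    return Counter(tag for _, _, tags in POSTS for tag in tags)

def suggest_tags(url):
    """Tags that users already applied to this bookmark, most used first."""
    counts = Counter(tag for _, u, tags in POSTS if u == url for tag in tags)
    return [tag for tag, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

def related(url):
    """Other bookmarks that share at least one tag with the given bookmark."""
    by_url = defaultdict(set)
    for _, u, tags in POSTS:
        by_url[u] |= tags
    mine = by_url.get(url, set())
    return sorted(u for u, tags in by_url.items() if u != url and tags & mine)
```

For instance, `related("http://example.org/rdf-primer")` finds the REST introduction because both bookmarks carry the tag `tutorial`.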