IBM 1620 data processing machine, 1962

Who is this?

The Web
The Web accompanies the transition from an industrial to an information society and provides the infrastructure for a new quality of information handling, in both acquisition and provisioning:
- high availability
- high relevance
- low cost
The Web penetrates society
- Social contacts (social networking platforms, blogging, ...)
- Economics (buying, selling, advertising, ...)
- Administration (eGovernment)
- Work life (information gathering and sharing)
- Recreation (games, role play, creativity, ...)
- Education (eLearning, Web as information system, ...)
The current Web
Immensely successful.
- Huge amounts of information and data.
- Syntax standards for transfer of structured data.
- Machine-processable, human-readable documents.
- Content/knowledge cannot be accessed by machines.
- Meaning (semantics) of transferred data is not accessible.
Limitations of the Web
Too much information with too little structure and made for human consumption
- Content search is very simplistic
- → future requires better methods
- in terms of content
- in terms of structure
- in terms of character encoding
- → future requires intelligent information integration
- → requires automated reasoning techniques
What Google does not find
There are many information needs that current search engines cannot satisfy:
- Apartments for rent close to well-rated Thai restaurants
- Bilingual English-German child care in Berlin reachable in 15 minutes from my place of work
- Kid-friendly holiday destinations with culture and sports activities
- Researchers working in south-east Asia on information retrieval topics
- ERP service providers with offices in Vienna and Berlin
- ...
We have subconsciously learned not to ask search engines such questions.
In principle, all the required knowledge is on the Web – most of it even in machine-readable form. However, without automated data integration, processing (and reasoning) we cannot obtain a useful answer.
What's the problem with the Web
- inability to integrate and fuse information from different sources
- lack of comprehensive background knowledge to interpret information found on the Web
- current Web search is restricted to text in a certain language; many “smaller” languages have much less information available than English
Basic ingredients for the Semantic Web
- Open Standards for describing information on the Web
- Methods for obtaining further information from such descriptions
Data Models, Access & Integration
Data Integration | Enterprise Information Integration (sets of heterogeneous data sources appear as a single, homogeneous data source) | Data Warehousing | Research | Data Web |
Data Access | Object-relational mappings (ORM) | Procedural APIs | Query Languages | Linked Data |
Data Models | RDBMS |
LOD Cloud May 2007

LOD Cloud October 2007

LOD Cloud February 2008

LOD Cloud September 2008

LOD Cloud March 2009

LOD Cloud September 2010

LOD Cloud September 2011

LOD Cloud August 2014

LOD Cloud February 2017

The Web of Data
- >70 billion facts
- covering many different domains (life sciences, geo, user-generated content, government, bibliographic, ...)

Map to the Semantic Web

The Semantic Data Web Stack
… also known as “layer cake”

URIs and Unicode
- URI = Uniform Resource Identifier
- Used to create globally unique names for resources
- Every object with clear identity can be a resource
- Books, places, organizations ...
- In the books domain the ISBN serves the same purpose
- IRIs: Unicode-aware extension of URIs (I = Internationalized)
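The relationship between IRIs and URIs can be illustrated with a small sketch. It maps an IRI to a plain-ASCII URI by percent-encoding non-ASCII characters as UTF-8. The function name `iri_to_uri` and the `example.org` address are assumptions made for illustration, and the host name is assumed to be ASCII already (internationalized host names need separate handling).

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched; "%" is kept so existing escapes are not re-encoded.
SAFE = "/:@!$&'()*+,;=-._~%"

def iri_to_uri(iri: str) -> str:
    """Simplified IRI-to-URI mapping (hypothetical helper, ASCII host assumed)."""
    parts = urlsplit(iri)
    return urlunsplit((
        parts.scheme,
        parts.netloc,                          # assumed to be ASCII already
        quote(parts.path, safe=SAFE),          # non-ASCII -> UTF-8 percent-escapes
        quote(parts.query, safe=SAFE + "?"),
        quote(parts.fragment, safe=SAFE + "?"),
    ))

iri = "http://example.org/resource/Gödel"      # hypothetical resource name
uri = iri_to_uri(iri)
print(uri)  # http://example.org/resource/G%C3%B6del
```

Both strings name the same resource; the IRI is simply the form that allows Unicode characters directly.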
Resource Description Framework – RDF
Information is represented in RDF in triples (also called statements, facts):

- Modeled on linguistic categories, but not always consistent
- Allowed assignments:
- Subject: URI or blank node
- Predicate: URI (a.k.a. property)
- Object: URI, blank node or literal
- Node and edge labels should be unambiguous, so that the original graph is reconstructable from a list of triples
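The allowed assignments above can be sketched as a small data structure. This is a minimal illustration in plain Python, not an RDF library: the class names, the `http://example.org/` namespace, and the property names are assumptions chosen for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class URI:
    value: str

@dataclass(frozen=True)
class BlankNode:
    label: str

@dataclass(frozen=True)
class Literal:
    value: str

def is_valid_triple(s, p, o) -> bool:
    """Check the allowed term kinds for each position of a triple."""
    return (isinstance(s, (URI, BlankNode))
            and isinstance(p, URI)
            and isinstance(o, (URI, BlankNode, Literal)))

def build_graph(triples):
    """Rebuild the graph (subject -> outgoing edges) from a list of triples."""
    graph = {}
    for s, p, o in triples:
        if not is_valid_triple(s, p, o):
            raise ValueError(f"not a valid RDF triple: {(s, p, o)}")
        graph.setdefault(s, []).append((p, o))
    return graph

EX = "http://example.org/"  # hypothetical namespace
book = URI(EX + "book1")
triples = [
    (book, URI(EX + "title"), Literal("Foundations of Semantic Web Technologies")),
    (book, URI(EX + "publishedBy"), URI(EX + "CRCPress")),
]
graph = build_graph(triples)
print(len(graph[book]))  # 2
```

Because every node and edge carries an unambiguous label, the list of triples is enough to reconstruct the graph.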
RDF Schema
Not all triples make sense:
Cinema AlbertEinstein 2012
How can we constrain the use of RDF?
RDFS (S = “Schema”) allows defining classes and properties and restricting their use.
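As a rough sketch of how schema statements are used, the following applies three RDFS-style rules (domain, range, and subclass) to a tiny triple set until nothing new can be derived. Note that under RDFS semantics, domain and range statements lead to inferred types rather than to rejected triples. All `ex:` names are hypothetical, and terms are plain strings for brevity.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

def rdfs_closure(triples):
    """Apply domain, range and subclass rules until a fixed point is reached."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if p2 == DOMAIN and s2 == p:        # subject gets the domain class
                    new.add((s, RDF_TYPE, o2))
                if p2 == RANGE and s2 == p:         # object gets the range class
                    new.add((o, RDF_TYPE, o2))
                if p2 == SUBCLASS and o == s2:
                    if p == SUBCLASS:               # subclass transitivity
                        new.add((s, SUBCLASS, o2))
                    if p == RDF_TYPE:               # instances inherit superclasses
                        new.add((s, RDF_TYPE, o2))
        if new <= inferred:
            return inferred
        inferred |= new

# Hypothetical schema and data
schema = {
    ("ex:showsFilm", DOMAIN, "ex:Cinema"),
    ("ex:showsFilm", RANGE, "ex:Film"),
    ("ex:Cinema", SUBCLASS, "ex:Venue"),
}
data = {("ex:Odeon", "ex:showsFilm", "ex:Metropolis")}

closure = rdfs_closure(schema | data)
print(("ex:Odeon", RDF_TYPE, "ex:Venue") in closure)  # True
```

The nested loop is quadratic per pass and only meant to make the rules readable; real reasoners index triples by predicate.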
SPARQL – Query Language for RDF

SELECT * WHERE { jwebsp:John foaf:knows ?friend }
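To make the idea behind such a query concrete, here is a minimal sketch of matching a single triple pattern against a list of triples in plain Python. It is not a SPARQL engine; the function `match_pattern` and the sample data about John's acquaintances are assumptions made for illustration.

```python
def match_pattern(pattern, triples):
    """Return one variable binding per triple that matches the pattern.

    Terms starting with "?" are variables; all other terms must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for pat_term, term in zip(pattern, triple):
            if pat_term.startswith("?"):
                # A variable used twice must bind to the same term both times.
                if binding.get(pat_term, term) != term:
                    break
                binding[pat_term] = term
            elif pat_term != term:
                break
        else:
            results.append(binding)
    return results

# Hypothetical data
triples = [
    ("jwebsp:John", "foaf:knows", "jwebsp:Mary"),
    ("jwebsp:John", "foaf:knows", "jwebsp:Peter"),
    ("jwebsp:Mary", "foaf:name", "Mary"),
]

friends = match_pattern(("jwebsp:John", "foaf:knows", "?friend"), triples)
print(friends)  # [{'?friend': 'jwebsp:Mary'}, {'?friend': 'jwebsp:Peter'}]
```

Each result row corresponds to one solution of the query: a mapping from the variable `?friend` to a resource that John knows.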
Web Ontology Language – OWL
- OWL: acronym for Web Ontology Language, more easily pronounced than WOL
- family of languages for authoring ontologies
- W3C Recommendation since 2004, OWL 2 since 2009
- Semantically a fragment of first-order logic (FOL)
Features
- Instantiation of classes by individuals
- Concept hierarchies (taxonomies, inheritance): classes, terms
- Binary relations between individuals: Properties, Roles
- Properties of relations (e.g., range, transitive)
- Data types (e.g. Numbers): concrete domains
- Logical means of expression
- Clear semantics!
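One of the listed features, a transitive property, can be sketched as follows. The example computes what a reasoner could derive if a hypothetical property `locatedIn` were declared transitive; the place facts are illustrative sample data, and the function is a plain fixed-point computation rather than an OWL reasoner.

```python
def transitive_closure(pairs):
    """Derive all (a, c) with a chain a -> ... -> c for a transitive property."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new

# Hypothetical assertions for a property declared transitive
located_in = {
    ("Leipzig", "Saxony"),
    ("Saxony", "Germany"),
    ("Germany", "Europe"),
}

entailed = transitive_closure(located_in)
print(("Leipzig", "Europe") in entailed)  # True
```

Clear semantics is what makes this derivation well defined: every conforming reasoner must arrive at the same set of entailed facts.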
RDFa Content Editor – RDFaCE
supports the automatic semantic annotation of texts

Literature
- Pascal Hitzler, Markus Krötzsch, Sebastian Rudolph: Foundations of Semantic Web Technologies, Chapman & Hall/CRC, 2009, 455 pages, hardcover, ISBN: 9781420090505, http://www.semantic-web-book.org
- Amit Sheth, Krishnaprasad Thirunarayan: Semantics Empowered Web 3.0: Managing Enterprise, Social, Sensor, and Cloud-based Data and Services for Advanced Applications (Synthesis Lectures on Data Management), Morgan & Claypool Publishers (December 19, 2012), ISBN: 1608457168
- Tom Heath, Christian Bizer: Linked Data (Synthesis Lectures on the Semantic Web: Theory and Technology), Morgan & Claypool Publishers; 1 edition (February 20, 2011), ISBN: 1608454304. http://linkeddatabook.com

Questions
All the corresponding questions for the Introductions are covered in the Questions part of the Deck.