Uni Göttingen
Institute for Informatics
Databases and Information Systems

Semantic Web
WS 2022/23

Prof. Dr. Wolfgang May,
Lars Runge, M.Sc., Sebastian Schrage, M.Sc.

  • Date and Time: Tue 14-16, Wed 10-12
  • Room: IFI 2.101 (North Campus)
  • This year, DBIS will mainly use non-live teaching with pre-recorded videos. There will be some live online meetings with BigBlueButton provided by GWDG; the rooms/meetings can be entered via StudIP.
  • Materials for self-studying (in English) will be linked below week by week:
    • revised videos taken from summer term 2020 (as the "original" dates in the filenames indicate),
    • PDF slides
  • Please also read the general and technical information about DBIS virtual teaching.

Lecture and Exercises mixed (see announcements on this page). There will be non-mandatory exercise sheets whose solutions will be discussed as parts of the lecture.
All materials and announcements can be found HERE on the "blue DBIS pages".

Technical Data: Module CSM.Inf.1142, 4 SWS, 6 ECTS credits (Studies in Applied Informatics).
The module's home is the MSc studies in Applied CS. It can also be credited in the BSc studies in Applied CS (as "Vertiefung Datenbanken"),
and in several other studies:
BSc/MSc Wirtschaftsinformatik, Mathematics (BSc/MSc), Digital Humanities, Teaching/2-Fach-Bachelor, PhD GAUSS, ...

Prerequisites

  • Basic knowledge of databases (conceptual modeling, relational model, SQL), background notions of "data model" in general, the idea of declarative set-oriented query languages.
  • Knowledge of First-Order Logic as taught in "Formale Systeme" is recommended; you should at least have some idea of it and not be scared of formalisms. Semantic Web is an example of applied First-Order Logic and model theory.
    There is also the "sibling" lecture Deductive Databases that is also applied First-Order Logic, but with a slightly different model theory.
  • XML: RDF/XML uses XML as representation, but requires only a little bit of knowledge about XML. A short introduction to XML from that point of view will be given in the lecture.
    XML with DTD, XPath, XQuery, XSLT and XML Schema is the topic of the lecture Semistructured Data and XML (prospectively taking place again in Summer Term 2022).

Course Description

The course starts with the requirements for Web-wide data management and querying in the early 2000s: Web-wide data, and intelligent data integration aspects. In this context, the notions of metadata and ontologies are discussed. The central topic of the course is then the RDF data model, a graph data model with the corresponding SPARQL query language. On the path towards knowledge bases, RDFS (RDF Schema) and straightforward ontology portions of OWL are introduced. On the technological side, Linked Open Data (LOD) is presented. Finally, OWL-DL knowledge bases and reasoning that can solve problems that are far beyond "simple" data management are investigated. As practice-of-theory, this part of the course illustrates the problems of open world negation (and universal quantification).

  • Short Review: Basic Notions of First-Order Logic
  • RDF: N3 and RDF/XML format, semantics
  • SPARQL: the query language for RDF data
  • Linked Open Data (LOD): Web of Data and distributed querying.
  • The Mondial database is used as an example for RDF data.
    Mondial LOD entry point.
  • RDFS, OWL: having RDF data with additional reasoning
  • Description Logics: the logic underlying OWL
  • Practical experiments with RDF, Jena, Reasoners etc.
  • An experimental Web interface for RDF+OWL and SPARQL can be found here
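
To give a first impression of what "a graph data model with a pattern-based query language" means, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how Jena or any SPARQL engine works internally: the `ex:` names are invented (they are not Mondial URIs), and the helpers `match_pattern` and `query` are hypothetical. The sketch stores an RDF graph as a set of triples and evaluates a conjunction of triple patterns, which is roughly what a SPARQL basic graph pattern does.

```python
# A toy RDF graph: a set of (subject, predicate, object) triples.
# The "ex:" names are made up for illustration; they are not Mondial URIs.
TRIPLES = {
    ("ex:Germany", "rdf:type", "ex:Country"),
    ("ex:France", "rdf:type", "ex:Country"),
    ("ex:Germany", "ex:capital", "ex:Berlin"),
    ("ex:France", "ex:capital", "ex:Paris"),
    ("ex:Berlin", "rdf:type", "ex:City"),
}


def is_var(term):
    """Variables are written as in SPARQL, with a leading '?'."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`; None on failure."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            if result.setdefault(p_term, t_term) != t_term:
                return None  # variable is already bound to a different term
        elif p_term != t_term:
            return None  # constant in the pattern does not match
    return result


def query(patterns, triples=TRIPLES):
    """Evaluate a conjunction of triple patterns against the graph."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


# Roughly: SELECT ?c ?cap WHERE { ?c rdf:type ex:Country . ?c ex:capital ?cap }
answers = query([("?c", "rdf:type", "ex:Country"), ("?c", "ex:capital", "?cap")])
for row in sorted(answers, key=lambda b: b["?c"]):
    print(row["?c"], row["?cap"])
```

The shared variable `?c` in both patterns acts as a join condition, which is the set-oriented, declarative style mentioned in the prerequisites.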

Dates & Topics

  • First Meeting: Tue 25.10. 14-16, live online in BBB via StudIP:
    Administrativa, Overview.
  • Material for self-studying:
    • An overview of data model concepts and buzzwords (data models XML and RDF, and mechanisms for "intelligent databases") related to the DBIS lectures.
      Database concepts and buzzwords recording
    • Reasoning Motivation: the Einstein/Fish Puzzle will be solved declaratively (in a totally different way than in the "Deductive Databases" lecture).
      Solve it by human reasoning and keep a record of your solution steps. If you attended "Formal Systems": what reasoning calculus would fit it (and sudokus, which are of a similar type)?

The following is a "virtual" schedule for structuring self-studying:

The way towards the Semantic Web: Earlier Web Data Architectures, Formal Ontologies

  • 27.10. (Recording from 4.11.2020): Web Architectures
  • 1.11. (Recording from 10.11.2021): Ontologies
    accompanying notes (this document will be incremental, collecting all notes from the lecture taken with draw.io)
  • 2.11. (Recording from 11.11.): Ontologies (cont'd)
    (FOL formalism for ontologies here as an "applied 'formal system'")
    accompanying notes
  • 8.11. (Recording from 17.11.2020): Ontologies (cont'd)
    (FOL formalism for ontologies here as an "applied 'formal system'")
    accompanying notes
  • 9.11. (Recording from 18.11.2020): Ontologies (cont'd)
    notes from 17+18.11.
  • 15.11. (Recording from 24.11.2020): Review of first-order logic and basic notions of model theory and reasoning, Inference Systems
    Note: on the recording, I started in German for the first 1:40 minutes, then switched to English.
    Slides: Introduction to Logics.
  • "Unofficial" Exercise Sheet 0 (Tableaux)
    These (simple) exercises just review the FOL tableau calculus. For reasons of time, they will not be discussed at length in the lecture. Here is the [solution].
  • 16.11. (Recording from 25.11.2020): Tableau calculus, Reasoning, Inference Systems
    Further slides (from the Deductive Databases lecture) about first-order logic and the relational calculus and about reasoning and the FOL tableau calculus.

Data Model and Query Language for Data on the Web: RDF, RDFS and SPARQL

  • 22.11. 14:15 (Optional) online live meeting. Questions? Outlook ...
  • 22.11. (Recording from 1.12.2020): RDF
    (The BBB session on 1.12.2020 broke down after 6 minutes, and the lecture was then continued in the same room. Thus, there are two separate recordings: the one with timestamp 14:03 (6 minutes, listed second) and the one with timestamp 14:25 (80 minutes, listed first).)
    Slides: RDF
    Note: all example files are accessible in the RDF subdirectory of this Web page
    [ Download RDF.zip ]
    [ SPARQL Web Service interface ]
  • 23.11. (Recording from 2.12.2020): RDF, SPARQL
    Exercise Sheet 1: SPARQL
  • 29.11. (Recording from 8.12.2020): SPARQL
    Solution to Exercise Sheet 1
  • 30.11. (Recording from 9.12.2020): SPARQL 1.1, SPARQL Formal Semantics
  • 6.12. (Recording from 15.12.2020): SPARQL Formal Semantics
    Exercise Sheet 2: SPARQL Formal Semantics
  • 7.12. (Recording from 16.12.2020): RDF: Blank Nodes, Tree vs. Graph Data Models & Logic,
    RDFS
    Slides: RDFS
  • 13.12. (Recording from 22.12.2020): RDFS Model Theory
  • 14.12.2022 (Recording from 23.12.2020): RDFS Reasoning, querying etc (this chapter until the end)
  • Holiday work (Recording from 3.1.2021): Solutions of Exercise Sheet 2
    [pdf solution]

Data on the Web/Linked Open Data

This section gives an overview of practical usage issues.

  • 20.12. (Recording from 4.1.2021):
    RDF/XML: the "back link" to the XML world which was the driving force to create the "Semantic Web": give meaning to XML element names and attribute names.
    RDF/XML is nothing conceptually new, but (if one knows XML well) mainly craft, like using URIs, expansion of element names to URIs with namespaces, and xml:base.
    Its design illustrates the compromise between (existing) IT technology (concrete XML as tree-structured document-data, URI/URI expansion), and the abstract data model of an RDF graph, and the (common to both worlds) aspect of namespaces. The result is a lot of syntax that finally results in a feasible IT concept.
    Slides: RDF/XML
  • 21.12.: Playing section: Linked Open Data (Recording from 7.1.2021).
    Data on the Web, Web/HTTP Technology, accessing it as data triples and with SPARQL.
    Slides: LOD
    The experimental Mondial LOD Service can be found at http://www.semwebtech.org/mondial/10
  • 21.12.: Reification and Modeling/Wikidata (Recording from 9.1.2021).
    (Slides "LOD" from above)
    Wikidata is an example of a Knowledge Graph for LOD data that uses flat RDF, but not RDFS and OWL, since that would constrain the expressiveness too much (basically, the problem is that penguin is at the same time (i) a class that contains individuals, and (ii) an individual of the class Species that has user-defined properties of its own).
    HTML Germany page in Wikidata
    Wikidata SPARQL query interface
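
The "craft" part of RDF/XML mentioned above (expansion of element names to URIs with namespaces, and xml:base) can be tried out with a few lines of Python. The following is only a sketch under stated assumptions: the `http://example.org/...` URIs and the helper names `expand` and `triples` are invented, and the code covers just a tiny subset of RDF/XML (node elements with `rdf:about`, property elements with a text literal or `rdf:resource`). It is not a replacement for a real parser such as Jena or rapper.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# A tiny RDF/XML document; the "http://example.org/..." URIs are invented.
DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#"
         xml:base="http://example.org/data/">
  <ex:Country rdf:about="Germany">
    <ex:name>Germany</ex:name>
    <ex:capital rdf:resource="Berlin"/>
  </ex:Country>
</rdf:RDF>
"""


def expand(qname):
    """Turn ElementTree's '{namespace}local' form into the concatenated URI."""
    namespace, local = qname[1:].split("}", 1)
    return namespace + local


def triples(doc):
    """Yield (subject, predicate, object) for the small subset described above."""
    root = ET.fromstring(doc)
    base = root.get(XML_BASE, "")
    for node in root:
        # Assumption of this sketch: every node element carries rdf:about.
        subject = urljoin(base, node.get("{" + RDF_NS + "}about"))
        if node.tag != "{" + RDF_NS + "}Description":
            # A typed node element such as <ex:Country> implies an rdf:type triple.
            yield (subject, RDF_NS + "type", expand(node.tag))
        for prop in node:
            resource = prop.get("{" + RDF_NS + "}resource")
            if resource is not None:
                # Relative references are resolved against xml:base.
                yield (subject, expand(prop.tag), urljoin(base, resource))
            else:
                # Literals are shown in double quotes to tell them apart from URIs.
                yield (subject, expand(prop.tag), '"' + (prop.text or "") + '"')


for triple in triples(DOC):
    print(triple)
```

Note how the element name `ex:capital` becomes the property URI `http://example.org/terms#capital` simply by concatenating the namespace URI and the local name, while `rdf:about="Germany"` is resolved against xml:base like any relative Web reference.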

Semantic Web with Reasoning: Description Logics and OWL

  • 10.1.2023, 14:15 (Optional) online live meeting. Questions? Outlook ...
  • 10./11.1.2023 (Recordings from 12./13.1.2021): Description Logics
    Slides: DL
  • 17./18.1. (Recording from 19./20.1.2021): OWL
    Slides: OWL
  • 24./25.1. (Recordings from 26./27.1.2021): OWL ...
    Exercise Sheet 3: OWL
  • 31.1./1.2. (Recordings from 2./3.2.2021): OWL, OWL 2.0
    Slides: OWL 2.0
  • ... and now you can also try to solve the Einstein/Fish Puzzle.
  • 7.2.2023, 14:15 (Optional) online live meeting. Questions and Answers
  • 7.2. (Recording from 9.2.2021): OWL 2.0 (cont'd)
  • 8.2. (Recording from 10.2.2021): Discussion of Exercise Sheet 3, OWL 2.0
    Solutions to Exercise Sheet 3 (OWL) [solution]
  • ... to be extended ...
  • End of lecture period: 10.2.2023

The complete slide set can be found here (including the Semantic Web Lab Course Slides). The slides of the SSD&XML lecture can also be found there. Knowledge of XML is only required insofar as RDF/XML is (in addition to the N3 format) a possible representation of RDF data. One should be able to "understand" an XML document. XPath/XQuery and XSLT are not required.

Exams

Background Literature

P. Hitzler, M. Krötzsch, S. Rudolph, Y. Sure: "Semantic Web - Grundlagen" (in German). Springer eXamen.press, 2008; ISBN 978-3-540-33994-6.
The (German-language) book covers nearly exactly the contents of the lecture and also contains an introduction to first-order logic in the appendix.

P. Hitzler, M. Krötzsch, S. Rudolph, Y. Sure: "Foundations of Semantic Web Technologies" (in English). Chapman & Hall/CRC, 2009; ISBN: 9781420090505
The (English-language) book covers nearly exactly the contents of the lecture and also contains an introduction to first-order logic in the appendix.

For the part on (first-order) logic, any textbook on foundations of logic from the library (e.g. "Logik für Informatiker" (in German) by Uwe Schöning) or the manuscript "Formale Systeme" by Peter H. Schmitt (Uni Karlsruhe) (Chapters 1-5) can be used.

Some Links


Semantic Web Tools and Links

If you experience any problems (forgotten chmod, wrong paths, forgotten updates etc.), please notify us.

Web-wide Services

Professional Tools

Jena: RDF and SPARQL

Apache JENA ( https://jena.apache.org/ ) is a free and open source Java framework for building Semantic Web and Linked Data applications.
The course uses a lightweight in-house shell interface to Jena for querying:

  • Download most recent version (incl. OWL2, partially SPARQL 1.1, based on Jena 2.10 and compatible Openllet, migrated to log4j-2.16), Dec. 2021.
  • in the CIP Pool located at /afs/informatik.uni-goettingen.de/course/semweb-lecture/JENA-API/semweb.jar
  • Experimental Web interface

  • set alias (bashrc etc.)
     alias jena='java -jar /afs/informatik.uni-goettingen.de/course/semweb-lecture/JENA-API/semweb.jar'
  • query: (if=input-files, qf=query-file, e.g. in SPARQL)
    jena -q -if inputfiles -qf queryfile
  • general options:
    -il: input language (allows RDF/XML RDF/XML-ABBREV N-TRIPLE N3 TURTLE; N3 is default)
    -if: input files
  • query options:
    -q: query
    -il, -if: as above
    -qf: query-file
  • transform options:
    -t: transform
    -ol: output format (allows RDF/XML RDF/XML-ABBREV N-TRIPLE N3-PLAIN N3-PP N3-TRIPLE N3 TURTLE; N3 is default)
  • export class tree; options:
    -e: export class tree (gives some insight for debugging an ontology ...)
    -il, -if: as above
  • reasoner options (for -q and -e):
    -inf: activate reasoning with the internal reasoner ("inf" for "inference")
    -pellet: activate reasoning with the Pellet class that comes with semweb.jar
     jena -q -inf -qf query-filename
     jena -q -pellet -qf query-filename
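
If you want to call the shell interface from a script instead of typing the alias, the options listed above can be assembled programmatically. The following Python sketch is only an illustration: the function `jena_query_command` is hypothetical (it is not part of semweb.jar), it uses only the options documented above (-q, -il, -if, -qf, -inf, -pellet), it handles a single input file only, and the file name `mondial.n3` in the usage example is an assumption. The sketch only builds and prints the command line; it does not run it.

```python
import shlex

# Path of the jar in the CIP Pool, as given above.
JAR = "/afs/informatik.uni-goettingen.de/course/semweb-lecture/JENA-API/semweb.jar"


def jena_query_command(query_file, input_file=None, input_language=None, reasoner=None):
    """Build the argument list for a query run (-q) of the course's Jena wrapper.

    reasoner: None (no reasoning), "inf" (internal reasoner) or "pellet".
    This sketch passes at most one input file after -if.
    """
    if reasoner not in (None, "inf", "pellet"):
        raise ValueError("reasoner must be None, 'inf' or 'pellet'")
    command = ["java", "-jar", JAR, "-q"]
    if reasoner is not None:
        command.append("-" + reasoner)
    if input_language is not None:
        command += ["-il", input_language]
    if input_file is not None:
        command += ["-if", input_file]
    command += ["-qf", query_file]
    return command


# Usage example; "mondial.n3" is a hypothetical local file name.
command = jena_query_command(
    "mondial-query.sparql", input_file="mondial.n3", input_language="N3", reasoner="pellet"
)
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command)` on a machine where Java and the jar file are available.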

Pellet - the OWL Reasoner

  • Openllet Homepage (Open Source tool based on the last freely-available version of Pellet)
  • Pellet itself has been turned into a commercial product at Stardog.

LOD: Accessing RDF Data in the Web

  • rapper: a tool that accesses a Web page in RDF-reading mode to get RDF triples: e.g.
    rapper http://sws.geonames.org/3017382/

Mondial in RDF

The Mondial database in RDF format can be found at http://www.dbis.informatik.uni-goettingen.de/Mondial/#RDF.

Call e.g.

 jena -q -qf mondial-query.sparql
or
 jena -pellet -q -qf mondial-meta-query.sparql

Usage in the CIP Pool

From the CIP Pool computers at the IFI (on the ground floor, or via remote login), the software and resources are directly accessible:

  • log in from remote to login.stud.informatik.uni-goettingen.de (Linux: ssh, Windows: PuTTY)
  • log on from there to one of the individual computers (e.g. ssh c032)
  • set the alias in your .bashrc file:
     alias jena='java -jar /afs/informatik.uni-goettingen.de/course/semweb-lecture/JENA-API/semweb.jar'
  • The lecture's RDF directory with the n3 files can be found at
     /afs/informatik.uni-goettingen.de/user/d/dbisuser/public_html/teaching/SemWeb/RDF
  • The Mondial files can be found at
     /afs/informatik.uni-goettingen.de/user/d/dbisuser/public_html/Mondial
     /afs/informatik.uni-goettingen.de/user/d/dbisuser/public_html/Mondial/Mondial-RDF