Institute for Informatics
Georg-August-Universität Göttingen

Databases and Information Systems

Semantic Web
WS 2018/19

Prof. Dr. Wolfgang May,
Lars Runge, M.Sc., Sebastian Schrage, M.Sc.

Date and Time: Tue 14-16, Wed 10-12
Room: IFI 2.101 (North Campus)
Exercises (Übung): integrated into the lecture (see announcements on this page). There will be non-mandatory exercise sheets whose solutions will be discussed as part of the lecture.
If (and as long as) non-German-speaking participants attend, the course will be given in English.

Technical Data: 6 ECTS credits (Studies in Applied Informatics).

Prerequisites

  • Knowledge of first-order logic as taught in "Formale Systeme" is sufficient. Prospective participants are nevertheless recommended to have attended the lecture Deductive Databases.
  • XML: RDF/XML uses XML as representation, but requires only a little bit of knowledge about XML. A short introduction to XML from that point of view will be given in the lecture.
    XML with DTD, XPath, XQuery, XSLT and XML Schema is the topic of the lecture Semistructured Data and XML (expected to take place again in Summer Term 2019).

Note: the module is by default credited as "Core Informatics". It can also be credited as "Applied Informatics". (Decision by the Dean of Studies on 25.10.2006/2.2.2010). In this case, please prepare a personal plan of studies (that e.g. connects it with your application area etc.) and ask for approval by the DoS.

Course Description

  • Short Review: Basic Notions of First-Order Logic
  • RDF: N3 and RDF/XML format, semantics
  • SPARQL: the query language for RDF data
  • RDFS, OWL: having RDF data with additional reasoning
  • Description Logics: the logic underlying OWL
  • Practical experiments with RDF, Jena, Reasoners etc.
  • An experimental Web interface for RDF+OWL and SPARQL can be found here
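
To give a first impression of what the SPARQL item above is about: at its core, a SPARQL query is a set of triple patterns that is matched against the RDF graph. The following is a minimal Python sketch of that idea only, not the formal semantics treated in the lecture. The toy graph, the ex: names and both helper functions are invented for illustration.

```python
# Toy RDF graph: a set of (subject, predicate, object) triples.
# All names below are invented for illustration.
GRAPH = {
    ("ex:goettingen", "rdf:type", "ex:City"),
    ("ex:goettingen", "ex:locatedIn", "ex:germany"),
    ("ex:berlin", "rdf:type", "ex:City"),
    ("ex:berlin", "ex:locatedIn", "ex:germany"),
    ("ex:germany", "rdf:type", "ex:Country"),
}


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; None if impossible."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable: bind it, or check consistency with an earlier binding.
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            # A constant that does not match the triple.
            return None
    return extended


def evaluate(patterns, graph):
    """Evaluate a list of triple patterns as a join over the graph."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Roughly: SELECT ?c WHERE { ?c rdf:type ex:City . ?c ex:locatedIn ex:germany }
QUERY = [("?c", "rdf:type", "ex:City"), ("?c", "ex:locatedIn", "ex:germany")]
cities = sorted(solution["?c"] for solution in evaluate(QUERY, GRAPH))
print(cities)
```

Each solution is a mapping from variables to graph terms; the shared variable ?c is what joins the two patterns.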

Dates & Topics

  • First Meeting: Tue 16.10. 14-16: Administrativa, Overview.
    Slides: Introduction and Ontologies
  • Reasoning Motivation: the Einstein/Fish Puzzle ... will (again) be solved declaratively, but in a totally different way than in the "Deductive Databases" lecture.
  • Wed 17.10. Introduction, Web architectures
    Smartboard Notes
  • Tue, 23.10.: Web Architectures, Ontologies
    Smartboard Notes
  • Wed, 24.10.: Ontologies (Cont'd); review of first-order logic and basic notions of model theory.
    Slides: Introduction to Logics.
    Smartboard Notes
    A slide set on first-order logic (from the Deductive Databases lecture) can be found here.
  • Tue 30.10.: Logical Formalization of Ontologies
    Smartboard Notes
  • Wed, 31.10.: Holiday - No Lecture
  • 6.11. Ontologies, Reasoning
    A slide set on reasoning and the FOL tableau calculus (from the DB theory lecture) can be found here.
    Smartboard Notes (pdf export facility failed a bit ... before it crashed completely, recall the strange behavior during the lecture).
  • "Unofficial" Exercise Sheet 1 (Tableaux)
    These (simple) exercises just review the FOL tableau calculus. For reasons of time, they will not be discussed at length in the lecture. Here is the [solution].
  • 7.11. Inference Systems
    Smartboard Notes (this time, Smart Notebook crashed only after saving ...).
  • 13.11. Inference Systems, RDF
    Slides: RDF
    Smartboard Notes
    Note: all example files are accessible in the RDF subdirectory of this Web page [ Download RDF.zip] .
  • 14.11. Lecture: RDF
    [no relevant SmartBoard notes]
  • 20.11. Lecture: RDF, SPARQL
    Smartboard Notes
    Exercise Sheet: SPARQL
  • 21.11. Lecture: SPARQL
    Smartboard Notes
  • 27.11: Discussion of Exercise Sheet 1 (SPARQL) [Solution],
    Lecture: SPARQL - Formal Semantics
    [Exercise Sheet: SPARQL Formal Semantics],
    Smartboard Notes
  • 28.11.: SPARQL - Formal Semantics, SPARQL 1.1,
    Smartboard Notes
  • 4.12.: Discussion of Exercises 1+2 from Sheet 2 (SPARQL Details) [Solution],
    RDF
    Slides: RDFS
    Smartboard Notes
  • 5.12.: RDF: Blank Node, Tree vs. Graph Data Models & Logic
    Smartboard Notes
  • 11.12.: This time the lecture takes place in SR 0.101, to try to find out whether the fault might lie in the SmartBoard hardware.
    Discussion of rest of Exercise Sheet 2 (SPARQL Details); Lecture: RDFS
    Slides: RDFS
    No Smartboard Notes today.
  • 12.12.: Lecture: RDFS
    Smartboard Notes
  • 18.12.: RDF/XML - the main ideas in brief. RDF/XML is nothing conceptually new, but (if one knows XML well) mainly craft, like using URIs, expanding element names to URIs with namespaces, and xml:base.
    Slides: RDF/XML
    The experimental Mondial LOD Service can be found at http://www.semwebtech.org/mondial/10
    No Smartboard Notes today.
  • 19.12.2018: Description Logics
    Slides: DL
    Smartboard Notes
  • 8.1.2019 DL, OWL
    Slides: OWL
    Smartboard Notes
  • 9.1. OWL ...
    Smartboard Notes
    The RDF/OWL/SPARQL Web interface can be found here
  • 15.1.: OWL
    Smartboard Notes
  • 16.1.: OWL
    Smartboard Notes
  • 22.1.: OWL Exercise Sheet 3 (OWL)
    Smartboard Notes
  • The Mondial database in RDF format can be found here.
    This is an example of a real-world ontology. Still, it is more elaborate than most real ontologies (interface-style classes like Area, Line, SmallArea; several union classes to restrict the domains as exactly as possible).
  • 23.1.: OWL 2.0 Slides: OWL 2.0
    Smartboard Notes
  • ... and now you can also try to solve the Einstein/Fish Puzzle.
  • Announcement: Semantic Web Lab Course SS 2019
    We introduce a new practical course on Semantic Web Technologies which builds upon the material of the lecture.
  • Furthermore, in SS19, there are the Lecture Semistructured Data and XML SS2019 (Advanced BSc/MSc) about XML and related things, and the Lecture Deductive Databases SS2019 (Advanced BSc/MSc) on Datalog and "intelligent" relational databases under closed-world semantics.
  • 29.1.: OWL 2.0, Discussion of (parts of) Exercise Sheet 3 (solution)
    Smartboard Notes
  • Comments on the discussion from today: "Can SelfRestriction be used to describe cycles in a win-move game, and to find out DrawNodes?"
    No. Firstly, Pellet does not accept SelfRestriction+Transitivity (which is necessary to detect paths at all) (cf. new Slide 468, new slide numbers, see below).
    Secondly: "if a node is in a cycle such that there is no LoseNode-Exit from the cycle" cannot be tested: There is no way for the reasoner to "collect" the nodes on a cycle in a data structure.
    Thirdly: "if the node is in a cycle that does not have a LoseNode-Exit" is not sufficient, but the criterion must also consider overlapping cycles, and all (long) cyclic paths induced by them.
    This conflicts with the DL-typical "locality principle" (every statement considers at most two nodes connected by an edge), which guarantees decidability.
    In general, proving DrawNodes positively (consider Exercise 3.1 from today where it did not work for the even-numbered cycle) is a chicken-and-egg problem. Reasoners usually cannot detect chicken-and-egg problems because these involve a kind of higher-level perspective which can reason about the reasoner's behavior (e.g. SPARQL over the reasoner that fails to prove something, or considering all individual stable models to draw an overall conclusion).
  • Updated slides (slide and chapter numbers changed): added slides about LOD/Wikidata after the RDF/XML section
  • 30.1.: OWL 2.0
    Smartboard Notes
  • ... finally, there is also the solution to the Fish Puzzle in OWL. There are two modelings. At the beginning (10 years ago, with a very old version of Pellet) the short one took about 5 minutes, and the long one took 40 hours. Today, both run in a few seconds, which mainly demonstrates the progress in the reasoners' internal strategies:
    • fishpuzzleLong.n3 is a very detailed and intuitive specification:
      Part 1: Specification of a row of 5 houses.
      Part 2: Specification of the properties.
      Part 3: Specification of the constraints.
    • fishpuzzleLong.sparql is the corresponding query
    • fishpuzzleShort.n3 is a shorter encoding:
      Part 1: Specification of a row of 5 houses as above.
      Part 2: tricky encoding of the properties. Instead of assigning to each house a color, a person, a brand of cigarettes, a drink, and a pet via explicit RDF edges, they are declared to be sets that are identified/mapped to each other.
      Part 3: specification of the constraints. Simpler than before, since only the equivalence classes have to be considered.
      The reasoner then just has to compute the matching (actually, this task is very similar to what StableModels does in the same situation based on clauses/disjunctions).
    • fishpuzzleShort.sparql is the corresponding query.
  • End of lectures: 1.2.2019
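
As a contrast to the comments on DrawNodes above (29.1.): outside the reasoner, the three kinds of nodes are easy to compute procedurally. The following Python sketch assumes the usual win-move rules, which are not spelled out on this page: a player who cannot move loses, a node is won if some move leads to a lost node, and lost if all moves lead to won nodes. The function name and the example graphs are invented for illustration.

```python
def classify(moves):
    """Classify the nodes of a win-move game as won, lost or drawn.

    `moves` maps each node to the list of nodes reachable in one move.
    Assumed rules: whoever cannot move loses.
    """
    nodes = set(moves) | {t for targets in moves.values() for t in targets}
    won, lost = set(), set()
    changed = True
    while changed:  # iterate until no further node can be decided
        changed = False
        for node in nodes:
            if node in won or node in lost:
                continue
            successors = moves.get(node, [])
            if any(s in lost for s in successors):
                # Some move hands the opponent a lost node.
                won.add(node)
                changed = True
            elif all(s in won for s in successors):
                # Every move (possibly none) hands the opponent a won node.
                lost.add(node)
                changed = True
    # Whatever could not be proven won or lost is drawn.
    drawn = nodes - won - lost
    return won, lost, drawn


# a and b form a cycle; b can also leave it to c, from where d is reachable.
won, lost, drawn = classify({"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": []})
print(sorted(won), sorted(lost), sorted(drawn))
```

Note that the drawn nodes are obtained only in the last step, as the complement of what could be proven. This is the "higher-level perspective" mentioned above: it looks at the outcome of the derivation from outside.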

The SmartBoard Notes are collected here (only relevant ones, so for some dates there are no notes).

The complete slide set can be found here. Please do not print it yet (subject to change); the slides of the SSD&XML lecture can also be found there. Knowledge of XML is required only insofar as RDF/XML is (in addition to the N3 format) a possible representation of RDF data. One should be able to "understand" an XML document. XPath/XQuery and XSLT are not required.
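
To illustrate how little XML is involved: in RDF/XML, element names are expanded to URIs via their namespaces, and rdf:ID values are resolved against xml:base. The Python sketch below shows this for one tiny document using only the standard library. The example.org URIs and the document itself are invented, and the code handles just this simple shape (typed nodes with rdf:ID and literal-valued properties), not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Invented example document.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#"
         xml:base="http://example.org/data">
  <ex:City rdf:ID="goettingen">
    <ex:name>Goettingen</ex:name>
  </ex:City>
</rdf:RDF>"""


def expand(tag):
    """Turn ElementTree's '{namespace}local' form into namespace + local."""
    namespace, local = tag[1:].split("}", 1)
    return namespace + local


def triples(xml_text):
    """Read the triples from a document of the simple shape shown above."""
    root = ET.fromstring(xml_text)
    base = root.get("{%s}base" % XML_NS, "")
    result = []
    for node in root:
        # rdf:ID="x" denotes the URI <base#x>.
        subject = base + "#" + node.get("{%s}ID" % RDF_NS)
        # The element name of the node gives its rdf:type.
        result.append((subject, RDF_NS + "type", expand(node.tag)))
        for prop in node:
            # Each child element is a property with a literal value.
            result.append((subject, expand(prop.tag), prop.text))
    return result


for triple in triples(DOC):
    print(triple)
```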

Exams

  • Oral exams; several slots to choose from between February 4th and April 2019.
  • Exam procedure: about 30-40 minutes. Candidates start by talking about a topic of their choice from the lecture (5-10 minutes); then questions and answers, including sketches on paper, develop dynamically. The 5-10 minute talk at the beginning should give me, as the examiner, a good impression of your knowledge, and a good starting point to assess your knowledge with further questions (usually starting with the chosen topic, and then also going to other topics from the lecture).
  • There are several slots to choose from; choose one of them. Each slot has a fixed registration/deregistration deadline.
    (the main reason to have different slots is that registration and deregistration in FlexNever is only possible up to one week before the first day of exams - so having different slots makes it possible to decide later)
    • Exams between Feb. 4-March 8 (first weeks after the lectures), registration/deregistration until Jan. 28
    • Exams between April 8-April 26 (summer term lectures start on April 15, Easter weekend is April 19-22), registration/deregistration until April 1
    • Contact me by mail for the individual exam appointment in the slot of your choice ... at the latest by the end of registration.

Background Literature

P. Hitzler, M. Krötzsch, S. Rudolph, Y. Sure: "Semantic Web - Grundlagen" (in German). Springer eXamen.press, 2008; ISBN 978-3-540-33994-6.
The (German-language) book covers nearly exactly the contents of the lecture and also contains an introduction to first-order logic in the appendix.

P. Hitzler, M. Krötzsch, S. Rudolph, Y. Sure: "Foundations of Semantic Web Technologies" (in English). Chapman & Hall/CRC, 2009; ISBN: 9781420090505
The (English-language) book covers nearly exactly the contents of the lecture and also contains an introduction to first-order logic in the appendix.

For the part on (first-order) logic, any textbook on the foundations of logic from the library (e.g. "Logik für Informatiker" (in German) by Uwe Schöning) or the manuscript "Formale Systeme" by Peter H. Schmitt (Uni Karlsruhe) (Chapters 1-5) can be used.

Some Links


Semantic Web Tools and Links

If you experience any problems (forgotten chmod, wrong paths, forgotten updates etc.), please notify us.

Web-wide Services

Professional Tools

Pellet - the OWL Reasoner

  • Pellet Homepage (with Download)
  • Command line usage:
    • set alias (bashrc etc.)
             alias pellet='~dbis/SemWeb-Tools/pellet/pellet.sh'
    • query: pellet query -query-file queryfile inputfile
    • Multiple input files can be used via the JENA-based tool described below.
  • Usage as Web Service (see Slides)
    • For use in the CIP Pool, a Pellet instance running on ap34 can be used at http://ap34.ifi.informatik.uni-goettingen.de/pellet/.
    • If you have your own Pellet installation on your computer, start it with ./pellet-dig.sh.
      Pellet usually runs at port 8081. The URL will then be http://localhost:8081.

Jena: RDF and SPARQL

Apache JENA ( https://jena.apache.org/ ) is a free and open source Java framework for building Semantic Web and Linked Data applications.
The course uses a lightweight in-house shell interface to Jena for querying:

  • Download the most recent version (July 2018; incl. OWL2, partially SPARQL 1.1; still based on Jena 2.10 and a compatible Pellet).
  • in the CIP Pool located at /afs/informatik.uni-goettingen.de/course/semweb-lecture/JENA-API/semweb.jar
  • Experimental Web interface

  • requires java 1.5
  • set alias (bashrc etc.)
     alias jena='java -jar /afs/informatik.uni-goettingen.de/course/semweb-lecture/JENA-API/semweb.jar'
  • query: (if=input-files, qf=query-file, e.g. in SPARQL)
    jena -q -if inputfiles -qf queryfile
  • general options:
    -il: input language (allows RDF/XML RDF/XML-ABBREV N-TRIPLE N3 TURTLE; N3 is default)
    -if: input files
  • query options:
    -q: query
    -il, -if: as above
    -qf: query-file
  • transform options:
    -t: transform
    -ol: output format (allows RDF/XML RDF/XML-ABBREV N-TRIPLE N3-PLAIN N3-PP N3-TRIPLE N3 TURTLE; N3 is default)
  • export class tree; options:
    -e: export class tree (gives some insight for debugging an ontology ...)
    -il, -if: as above
  • reasoner options (for -q and -e):
    -inf: activate reasoning with the internal reasoner (the default reasoner; "inf" stands for "inference")
    -pellet: activate reasoning with the Pellet class that comes with semweb.jar instead:
     jena -q -inf -qf query-filename
     jena -q -pellet -qf query-filename
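
The options above can be combined mechanically. As a rough sketch, the following hypothetical Python helper (not part of semweb.jar) assembles the argument list for the documented options only. It assumes that several input files may follow -if as separate arguments and that the order of the options does not matter; neither is confirmed on this page.

```python
# Path of the jar in the CIP Pool, as given above.
JENA_JAR = "/afs/informatik.uni-goettingen.de/course/semweb-lecture/JENA-API/semweb.jar"


def jena_command(mode, input_files=(), query_file=None,
                 input_lang=None, output_lang=None, reasoner=None):
    """Build the argument list for one call of the Jena shell interface.

    mode:     "-q" (query), "-t" (transform) or "-e" (export class tree)
    reasoner: None, "-inf" (internal reasoner) or "-pellet"
    """
    if mode not in ("-q", "-t", "-e"):
        raise ValueError("mode must be -q, -t or -e")
    if reasoner not in (None, "-inf", "-pellet"):
        raise ValueError("reasoner must be -inf or -pellet")
    command = ["java", "-jar", JENA_JAR, mode]
    if reasoner:
        command.append(reasoner)
    if input_lang:
        command += ["-il", input_lang]
    if input_files:
        # Assumption: several files are passed as separate arguments.
        command += ["-if", *input_files]
    if query_file:
        command += ["-qf", query_file]
    if output_lang:
        command += ["-ol", output_lang]
    return command


# Corresponds to: jena -q -pellet -qf mondial-meta-query.sparql
query = jena_command("-q", query_file="mondial-meta-query.sparql", reasoner="-pellet")
print(" ".join(query))
```

The resulting list can be passed to subprocess.run on a machine where the jar is accessible.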

LOD: Accessing RDF Data in the Web

  • rapper: a tool that accesses a Web page in RDF-reading mode to get RDF triples: e.g.
    rapper http://sws.geonames.org/3017382/

Mondial in RDF

The Mondial database in RDF format can be found at http://www.dbis.informatik.uni-goettingen.de/Mondial/#RDF.

Call e.g.

 jena -q -qf mondial-query.sparql
or
 jena -pellet -q -qf mondial-meta-query.sparql

Usage in the CIP Pool

From the CIP Pool computers at the IFI (on the ground floor, or via remote login), the software and resources are directly accessible:

  • log in remotely to login.stud.informatik.uni-goettingen.de (Linux: ssh, Windows: PuTTY)
  • from there, log on to one of the individual computers (e.g. ssh c032)
  • set the alias in your .bashrc file:
     alias jena='java -jar /afs/informatik.uni-goettingen.de/course/semweb-lecture/JENA-API/semweb.jar'
  • The lecture's RDF directory with the n3 files can be found at
     /afs/informatik.uni-goettingen.de/user/d/dbisuser/public_html/teaching/SemWeb/RDF
  • The Mondial files can be found at
     /afs/informatik.uni-goettingen.de/user/d/dbisuser/public_html/Mondial
     /afs/informatik.uni-goettingen.de/user/d/dbisuser/public_html/Mondial/Mondial-RDF