Institute for Informatics
Georg-August-Universität Göttingen

Databases and Information Systems


Practical Training XML
Winter Term 2020/21

Prof. Dr. Wolfgang May
Lars Runge, M.Sc., Sebastian Schrage, M.Sc.

  • Date and Time: Mon 14-18
  • Room: IFI 2.101 (North Campus)
  • Virtual Meetings: We will use BigBlueButton provided by GWDG; the rooms/meetings can be entered via StudIP. The recordings can also be found there (they cannot be exported or edited).
    Please also read the general and technical information about DBIS virtual teaching.
  • The course yields 6 ECTS-credits; it is graded (=benotet)


  • Successful participation in the module "Semistructured Data and XML" or comparably good knowledge of XML
  • Successful participation in the "General Programming Lab/Allgemeines Programmierpraktikum" (or an equivalent course).

Course Description

The practical training builds upon the lecture Semistructured Data and XML. The training uses the concepts of the XML world: DTD, XPath, XQuery, XSLT, XLink, XML Schema, SQL/XML, XML APIs for Java (SAX, DOM, JAXB), and Web Service infrastructure (Apache Tomcat).

  • Most of the course uses the geographical sample database "Mondial" in its XML version.
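As a small illustration of how two of the technologies named above fit together (XPath evaluated over a DOM tree in Java), the following is a minimal sketch. It assumes a simplified, made-up fragment in the style of Mondial rather than the real data set; the class name MondialXPathSketch and the sample content are hypothetical, and only standard JAXP classes from the JDK are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MondialXPathSketch {

    // Assumption: a tiny hand-written fragment, NOT the real Mondial file.
    static final String SAMPLE =
        "<mondial>"
      + "<country car_code=\"D\"><name>Germany</name></country>"
      + "<country car_code=\"F\"><name>France</name></country>"
      + "</mondial>";

    /** Parses the XML into a DOM tree and returns the text of all country names. */
    static List<String> countryNames(String xml) {
        try {
            // Build the DOM tree from the string.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            // Evaluate an XPath expression against the DOM tree.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "/mondial/country/name", doc, XPathConstants.NODESET);

            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            // Checked parser/XPath exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countryNames(SAMPLE));
    }
}
```

To run the same query against a file instead of the inline sample, the string-based InputSource would be replaced by a file passed to the DocumentBuilder; the XPath expression would stay the same as long as the document has the assumed structure.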

Documentation: use the slides from the SSD/XML lecture (the full slide set can be found here; Sections 1-8 were the material of the lecture, the rest is intended for the practical course) and the W3C documentation linked below. For practical exercises, the XML software is installed in the IFI CIP Pool. Short descriptions of the software to be used can be found here.

The course takes place in groups of 3-4 persons. There are probably 5 units: XPath/XQuery, XSLT (both based on the material of the lecture), XML and Java I (DOM/SAX/StAX), XML and Java II (JAXB and Digester), and Web Services. For every unit, there is an exercise sheet which will be discussed with the supervisors.

Prospective Time Schedule

Part I: Review of basics, concepts and languages around XML that should be known from the XML lecture.

Part II:

  • 30.11.: Course Meeting
    XML and Java I: DOM, SAX, StAX
    Slides: XML and Java I
    Code fragments from the slides: to download
    • Exercise Sheet 3: DOM/SAX/StAX [ catdata.xml ]
    • The exercise sheet contains 8 exercises:
    • Exercises 1 and 2 are smaller "warm-up" exercises for DOM and SAX/StAX; they should be done in groups so that everybody gets into the new material. The discussion of these exercises will be rather short.
    • Amongst Exercises 3-8 (6 exercises, 6 participants), each participant should choose one to solve and present; it may nevertheless be useful to work together and to communicate. For questions etc., we will then use a common RocketChat channel "xml-p".
      The presentation of these bigger exercises should not just be "yes, I did it", but should present the solution ideas and strategy, an informative overview of the code and its interesting aspects, and the lessons learnt by doing it.
    • Presentations/Discussion preferably in the weeks from 6.-21.12.2020 (for this, we will probably need two appointments where all of us are available).
  • between Dec 1st and Dec 10th ... discussion of Exercise Sheet 2 (XSLT) groupwise
  • 14.12.: Course Meeting/doing the recording
    XML and Java II: JAXB, Digester
    Slides: XML and Java II.
    JAXB uses XML Schema: Slides XML Schema
    Code fragments from the slides: to download (including the readme with the shell commands for JAXB)
    The Digester jar files can be found in /afs/ (CIP Pool) or here.
    Note: the Digester examples on the slides work in the IFI-Pool only when /afs/ is not in the classpath (situation in 2019).
  • 21.12. Discussion of Exercise Sheet 3
  • 11.1.2021: Course Meeting - Web-Server-Technology, HTTP, Servlets.
    Slides: Web Services
    Installation instructions for tomcat
    XQuery Demo Servlet as .war-file
    Servlet Demo Source Code
    For using other people's Web Services, you need HTTP internet access to their Apaches. This is usually not possible "behind" an internet provider (dynamic IP addresses). For this, you can use the institute's shell servers (with ssh forwarding) and install tomcats there (if you are on the same shellX, then bind your individual tomcats to 8080, 8081, 8082 ...):
       * -> shell1.cip.loc  ... shell8.cip.loc
  • 18.1.: Discussion of Exercise Sheet 4
  • Exercise Sheet 5: Web Services, Project "ILIAS"
    • Ex. Sheet 5
    • DBIS Git with sources for XQuery (and SQL) Web Services
    • Sources are located in the Ilias directory here:
      • Ilias dumps:
        • SSD-Exam from SS20: Exam data in "Question and Test Interoperability specification (QTI)": Ex_SSD_qti.xml, Ex_SSD_results.xml
        • Training/Test Exam for DB20/21: Exam data in "Question and Test Interoperability specification (QTI)": DB_qti.xml, DB_results.xml, arxb.png (results incomplete since uploaded files cannot yet be exported)
        • The qti.xml files are relatively cleanly modeled and can be seen as an "XML document base"; the results.xml files are dirty dumps.
    • Exam Questions created by LaTeX.
      • call latex ssdklausur.tex and latex dbklausur.tex, which need the following files:
      • klausurmanagement.sty, klausur-pre.sty, modulklausurschein.sty, goe-ifi.sty
      • dbklausur.tex, db-ws1011-klausur-ilias.tex (in German), db-ws1011-faketeiln.tex
      • ssdklausur.tex, ssd-ss20-klausur.tex, ssd-ss20-fake-teiln.tex
      • First try calling htlatex ssdklausur.tex (creates HTML, not XHTML; closing tags are sometimes missing)
      • The tex4ht (TeX for Hypertext, successor of latex2html) package:
      • Call mk4ht xhlatex ssdklausur.tex (XHTML, math as png) or mk4ht mzlatex ssdklausur.tex (XHTML, math as MathML), which creates ssdklausur.(x)html (if not yet installed, install tex4ht, but it usually comes with the LaTeX installation)
      • Task: write an XSL stylesheet that translates the created XHTML into an XML format that contains the list of exercises in cleaner XML (index number, title, and points are to be extracted from the XHTML). Then write a small Java program that reads this XML and feeds the exercises "from outside" via HTTP calls into the Fake-Ilias Web Service (preferred: have at least an HTML method to read single exercises, so that this process can be done incrementally).
  • 1.2.: Discussion of Exercise Sheet 5, "Fake Ilias" Project Meeting
  • 8.2.: "Fake Ilias" Project Meeting
  • FlexNow Registration: ... should be opened in the next few days [19.1.] ... until Feb 12th
  • ... to be extended
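
For the Exercise Sheet 5 task above (reading the cleaned exercise XML and feeding it into the Fake-Ilias Web Service via HTTP), the Java side might look roughly like the following sketch. Everything specific in it is an assumption made for illustration: the element and attribute names of the cleaned format (exercises, exercise, no, points, title), the form parameter names, and the service URL passed on the command line are hypothetical, since the actual interface of the Web Service is defined by the project, not here. The sketch assumes Java 11 or later for java.net.http.

```java
import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ExerciseFeedSketch {

    // Assumption: a made-up example of the "cleaner XML" produced by the stylesheet.
    static final String SAMPLE =
        "<exercises>"
      + "<exercise no=\"1\" points=\"10\"><title>XPath &amp; XQuery</title></exercise>"
      + "<exercise no=\"2\" points=\"15\"><title>XSLT</title></exercise>"
      + "</exercises>";

    /** Reads the exercise list and returns one form-encoded request body per exercise. */
    static List<String> formBodies(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("exercise");
            List<String> bodies = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                String title = e.getElementsByTagName("title").item(0).getTextContent();
                // Hypothetical parameter names: no, title, points.
                bodies.add("no=" + enc(e.getAttribute("no"))
                    + "&title=" + enc(title)
                    + "&points=" + enc(e.getAttribute("points")));
            }
            return bodies;
        } catch (Exception ex) {
            throw new IllegalStateException("could not read exercise list", ex);
        }
    }

    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    /** Sends one POST request per exercise, so the import can be done incrementally. */
    static void feed(String serviceUrl, List<String> bodies)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        for (String body : bodies) {
            HttpRequest request = HttpRequest.newBuilder(URI.create(serviceUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
            HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " for " + body);
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> bodies = formBodies(SAMPLE);
        bodies.forEach(System.out::println);   // dry run: only print the bodies
        if (args.length == 1) {
            feed(args[0], bodies);             // args[0]: URL of the (hypothetical) service endpoint
        }
    }
}
```

Sending one request per exercise, rather than the whole list at once, matches the preference stated in the task for being able to process single exercises; whether the service expects form parameters or an XML body is a design decision left to the project.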