Web Data Integration and Data Management
- Advanced Bachelor or Master/Diploma in
Applied Computer Science or Information Systems (Wirtschaftsinformatik)
- Prerequisites: basic knowledge of, e.g., XML and/or RDF
- 6 ECTS
- Number of participants: max. 10
- Language: German and English are both allowed. Reading of English text/documentation
A great deal of data is available on the Web and in the Semantic Web. Web data is usually
provided in human-readable form as Web pages (including forms, the so-called Deep Web),
but users cannot process it in a database-style way. Data extraction,
e.g. from the CIA World Factbook or from Wikipedia, is thus a never-ending "hot topic".
Apart from pattern-based approaches, Natural Language Processing approaches are also
The Semantic Web (cf. the lecture Semantic Web) attempts
to provide, extend and/or annotate Web data in a machine-readable way.
For this, the RDF data format is used, together with the OWL ontology language for
Form of the Seminar
The intention of the seminar is to get an overview of the state of the art in data integration
from the Web and background data management.
For each topic, the following has to be done:
- a written tutorial-style paper that gives an overview of an
- evaluate some tools and write a report (installation, functionality,
usability, ...) [optionally in German or English]
- prepare an illustrative medium-size case study using one or more tools
- a presentation giving the tutorial and showing a demo of how to
use the tools (about 90 minutes incl. discussion; optionally in German or English).
- first meeting at the beginning of the semester:
Monday 18.4. 14h c.t. SR 2.101, IFI: First Meeting
Assignment of topics and papers.
- May/June: preparation of case studies and presentations, individual meetings
Registration/Deregistration in FlexNever is open until ??.??.
- July: presentations.
- Only one topic has been worked out:
- 18.7. 14-16, SR 2.101:
Semantic Annotations in Today's Web
(i.e., data formats that are included inside HTML Web pages) by Alexander Trautsch.