Institute for Informatics
Georg-August-Universität Göttingen

Databases and Information Systems

Information Integration from the Web

The Web provides access to large data sources that are not explicitly organized as databases. Instead, the information is presented as semistructured data. In contrast to integrating classical distributed databases, integrating such data raises several new problems, such as schema discovery, wrapping and reorganizing the data sources, and coping with changes in autonomous sources.

Information integration has been investigated by members of the group since 1997 at Freiburg University, using the FLORID system for extraction and integration of semistructured data from the Web.

The experiences with F-Logic and FLORID have been incorporated into the LoPiX (Logic Programming in XML) project, which dealt with the integration of XML data (2000-2003).

Specific aspects of information integration in XML were investigated from 2003 to 2008 in the LinXIS project.

Since 2008, information integration from the Web by Query Workflows has also been investigated in the MARS project.