International Workshop on Internet Data Management (IDM'99),
Firenze, Italy, September 2, 1999. Proc. DEXA 99 Workshop, IEEE Computer Society Press, pp. 721-725.

Modeling and Querying Structure and Contents of the Web

Wolfgang May


For accessing and processing the information provided on the Web, there is a need for extraction, restructuring, and integration of semistructured data from autonomous, heterogeneous sources. In this paper, we regard the Web and its contents as a unit, represented in an object-oriented data model: the Web structure (inter-document level), given by its hyperlinks, the parse-trees of Web pages (intra-document level), and their contents. The model is complemented by a rule-based object-oriented language which is extended by Web access capabilities and allows for querying and navigation in the unified model. We demonstrate the practicability of our approach using the FLORID system.

