This is a general library for implementing a web crawler.

The crawler retrieves an HTML page and uses XSLT stylesheets to transform the
HTML (content plus presentation) into content only. Because links in the
converted content follow a convention, a generic interface for navigating the
content can be built on top of the retrieved pages.
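
As a rough illustration of the link convention, the sketch below assumes the
converted content is XML in which every navigable link appears as a `link`
element with an `href` attribute and a text label. The element and attribute
names, the sample document, and the `extract_links` helper are hypothetical and
are not taken from the library. The point is only that one generic routine can
list the navigation targets of any converted page.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical converted content: the element names are an assumed convention,
# not the library's actual output format.
CONVERTED = """\
<page>
  <title>News</title>
  <text>Today's headlines.</text>
  <link href="/articles/1">First article</link>
  <link href="archive.html">Archive</link>
</page>
"""


def extract_links(converted_xml, base_url):
    """Return (label, absolute_url) pairs for every <link> in the content."""
    root = ET.fromstring(converted_xml)
    links = []
    for element in root.iter("link"):
        href = element.get("href")
        if href is None:
            continue  # skip links that do not follow the assumed convention
        label = (element.text or "").strip()
        # Resolve relative targets against the URL the page was fetched from.
        links.append((label, urljoin(base_url, href)))
    return links


if __name__ == "__main__":
    for label, url in extract_links(CONVERTED, "http://example.com/news/"):
        print(label, "->", url)
```

Because the routine looks only at the agreed link element, it works unchanged
for any site whose stylesheet emits links in that form.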

A configuration file determines how a given page is retrieved and transformed.
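
One way to picture such a configuration is a list of rules, each pairing a URL
pattern with the stylesheet to apply. The sketch below is an assumption made for
illustration only: the JSON layout, the key names `pattern` and `stylesheet`,
the file names, and the `select_rule` helper are all hypothetical, since the
library's actual configuration format is not described here.

```python
import fnmatch
import json

# Hypothetical configuration: rules are checked in order, so the more specific
# pattern is listed before the catch-all one.
CONFIG_TEXT = """
{
  "rules": [
    {"pattern": "http://example.com/news/*", "stylesheet": "news.xsl"},
    {"pattern": "http://example.com/*", "stylesheet": "default.xsl"}
  ]
}
"""


def select_rule(config, url):
    """Return the first rule whose pattern matches the URL, or None."""
    for rule in config["rules"]:
        # fnmatchcase does shell-style wildcard matching, case-sensitively.
        if fnmatch.fnmatchcase(url, rule["pattern"]):
            return rule
    return None


if __name__ == "__main__":
    config = json.loads(CONFIG_TEXT)
    rule = select_rule(config, "http://example.com/news/today.html")
    print(rule["stylesheet"])
```

Taking the first matching rule keeps the lookup predictable: a site-specific
stylesheet can override a general one simply by appearing earlier in the list.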