From a7bac435672c2a49220443684aa7860fb8699de1 Mon Sep 17 00:00:00 2001
From: erik 
This is where the KiSS crawler comes in. This is a simple crawler which @@ -46,27 +46,232 @@ programme information from there. Then based on that it automatically records programs for you or sends notifications about interesting ones.
++ In its current version, the crawler can only be used as a standalone program, + and the preferred way to run it is as a scheduled task. +
+ At this moment, no formal releases have been made and only the latest + version can be downloaded. +
++ The easiest way to start is with the + binary version. +
+
+        The latest source can be obtained from subversion with the 
+        URL https://wamblee.org/svn/public/utils. The subversion 
+        repository allows read-only access to anyone. 
+      
+        The crawler comes with two configuration files, namely 
+        crawler.xml and programs.xml. 
+      
crawler.xml
+          First of all, copy the config.xml.example file 
+          to config.xml. After that, edit the first entry of 
+          that file and replace user and passwd
+          with your personal user id and password for the KiSS Electronic
+          Programme Guide. 
+        
programs.xml
+          The programs.xml file contains the following 
+          configuration items: 
+        
+            Notification is configured in the (surprise, surprise!) 
+            notification element. This element configures, in order, 
+            the sender mail address (which is also the reply address), 
+            the recipient address, the subject of the email, the SMTP 
+            server host and port, and an optional username and password. 
+            It also contains the names of the stylesheets used to 
+            generate the HTML and text reports. These stylesheets 
+            should not be changed. 
+          
+            Interesting TV shows are described using program
+            elements. Each program element contains 
+            one or more match elements that describe 
+            a condition that the interesting program must match. 
+          
+ Matching can be done on the following properties of a program: +
+| Field name | Description |
|---|---|
| name | Program name |
| description | Program description |
| channel | Channel name |
| keywords | Keywords/classification of the program |
+            The field to match is specified using the field
+            attribute of the match element. If no field name 
+            is specified then the program name is matched. Matching is done
+            by converting the field value to lowercase and then matching it
+            against the provided Perl-like regular expression. As a
+            result, the content of the match element should be written in 
+            lower case; a pattern containing literal uppercase letters will never match.
+            If multiple match elements are specified for a 
+            given program element, then all matches must 
+            apply for a program to be interesting. 
+          
+ Example patterns: +
+| Pattern | Example of matching field values |
|---|---|
| the.*x.*files | "The X files", "The X-Files: the making of" |
| star trek | "Star Trek Voyager", "Star Trek: The next generation" |
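
The matching rule described above can be illustrated with a minimal Python sketch. This is not the crawler's own code: the function names and the dictionary layout are hypothetical, and it assumes the pattern may match anywhere in the field value (which is what the example patterns suggest), using Python's `re` module as a stand-in for the Perl-like engine.

```python
import re

def field_matches(pattern: str, value: str) -> bool:
    """Lowercase the field value, then search for the pattern anywhere in it.

    Assumption: a partial ("search") match is enough, as the example
    patterns in the table suggest.
    """
    return re.search(pattern, value.lower()) is not None

def is_interesting(program: dict, matches: list) -> bool:
    """All (field, pattern) pairs must match; a missing field name means 'name'."""
    return all(
        field_matches(pattern, program.get(field or "name", ""))
        for field, pattern in matches
    )

# Hypothetical program record; the channel name is made up for illustration.
program = {"name": "The X-Files: the making of", "channel": "Example Channel"}

print(is_interesting(program, [(None, "the.*x.*files")]))  # True
print(is_interesting(program, [(None, "The.*X.*Files")]))  # False: value is lowercased first
```

The second call shows why patterns should be written in lower case: the field value no longer contains any uppercase letters by the time the pattern is applied.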
+            Different programs sometimes cannot all be recorded
+            because they overlap. To deal with such conflicts, it is possible
+            to specify a priority using the priority element. 
+            Higher values mean a higher priority. 
+            If two programs have the same priority, then it is (more or less)
+            unspecified which of the two will be recorded, but at least
+            one of them will be. If no priority is specified, then the
+            priority is 1 (one).
+          
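
As a rough illustration of how priorities could resolve overlaps, here is a small Python sketch. The crawler's actual algorithm is not described here, so the greedy strategy, the `Candidate` class, and the time representation (minutes since midnight) are all assumptions made for this example; only the default priority of 1 and the "higher value wins" rule come from the text.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    start: int          # minutes since midnight (assumed representation)
    end: int
    priority: int = 1   # default priority when none is specified

def overlaps(a: Candidate, b: Candidate) -> bool:
    """Two programs conflict when their time ranges intersect."""
    return a.start < b.end and b.start < a.end

def choose_recordings(candidates: list) -> list:
    """Greedy sketch: take programs from highest to lowest priority,
    skipping any that overlap a program already chosen.
    Ties keep their input order because sorted() is stable."""
    chosen = []
    for cand in sorted(candidates, key=lambda p: -p.priority):
        if not any(overlaps(cand, other) for other in chosen):
            chosen.append(cand)
    return chosen

voyager = Candidate("Star Trek Voyager", 1200, 1260, priority=5)
xfiles = Candidate("The X files", 1230, 1290)   # default priority 1
later = Candidate("Later show", 1290, 1320)     # hypothetical, no overlap

print([p.name for p in choose_recordings([xfiles, voyager, later])])
# ['Star Trek Voyager', 'Later show']
```

With equal priorities this sketch simply keeps whichever program came first in the input, which is one way of ending up with the "more or less unspecified" but non-empty result described above.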
+            Since it is not always desirable to try to record every
+            program that matches the criteria, it is also possible to 
+            generate notifications for interesting programs only without
+            recording them. This is done by specifying the 
+            action element with the content notify.
+            By default, the action is record. 
+            To make the mail reports more readable, it is also possible to 
+            assign a category to a program for grouping interesting
+            programs. This can be done using the category
+            element. Note that if the action is 
+            notify, then the priority element
+            is not used. 
+          
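
To make the defaults concrete, the following Python sketch parses a small, hypothetical programs.xml fragment and applies them. The element names `program`, `match`, `priority`, `action`, and `category` and the `field` attribute come from the description above; the `programs` root element, the exact nesting, and the category values are assumptions, so treat the programs.xml shipped with the distribution as the authoritative example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the root element name and the category values are assumed.
SAMPLE = """
<programs>
  <program>
    <match>star trek</match>
    <priority>5</priority>
    <category>sci-fi</category>
  </program>
  <program>
    <match field="keywords">documentary</match>
    <action>notify</action>
    <category>documentaries</category>
  </program>
</programs>
"""

def load_programs(xml_text: str) -> list:
    """Read program elements and fill in the documented defaults."""
    result = []
    for prog in ET.fromstring(xml_text.strip()).iter("program"):
        action = prog.findtext("action", default="record").strip()
        result.append({
            # A match without a field attribute applies to the program name.
            "matches": [(m.get("field", "name"), (m.text or "").strip())
                        for m in prog.findall("match")],
            "action": action,
            # Priority defaults to 1 and is not used when only notifying.
            "priority": int(prog.findtext("priority", default="1"))
                        if action == "record" else None,
            "category": prog.findtext("category"),
        })
    return result

for entry in load_programs(SAMPLE):
    print(entry["action"], entry["priority"], entry["matches"])
```

The first entry ends up with the default action record and priority 5; the second is notify-only, so its priority is left unset.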
+          In the binary distribution, execute the 
+          run script for your operating system
+          (run.bat for Windows, and 
+          run.sh for Unix). 
+        
+          With the source code, build everything with 
+          ant dist-lite, then locate the binary
+          distribution in lib/wamblee/crawler/kiss/kiss-crawler-bin.zip.
+          Then proceed as for the binary distribution. 
+        
+ The crawler, as it is now, is a standalone program which is + intended to be run from the command line. When it runs, it + retrieves the programs for today. As a result, it is advisable + to run the program early in the day as a scheduled + task (e.g. cron on Unix). +
++ Modifying the program to retrieve tomorrow's + programs instead would be easy, but this is not yet implemented. +
+
+        The best example is in the distribution itself. It is my personal
+        programs.xml file. 
+      
+ You are always welcome to contribute. If you find a problem, just + tell me about it, and if you have ideas, I am always interested to + hear about them. +
++ If you are a programmer and have a fix for a bug, just send me a + patch, and if you are fanatical enough and have ideas, I can also + give you write access to the repository. +