X-Git-Url: http://wamblee.org/gitweb/?a=blobdiff_plain;f=crawler%2Fkiss%2Fdocs%2Fcontent%2Fxdocs%2Findex.xml;h=03ed9eeed566776117a3086ff949eef340c714e8;hb=839a58a6cb0305e522a4f01a5d9ef82ca1a855b9;hp=3dc5b2572b9e54718f429ee63765dc5dd68b6fb4;hpb=a7bac435672c2a49220443684aa7860fb8699de1;p=utils

diff --git a/crawler/kiss/docs/content/xdocs/index.xml b/crawler/kiss/docs/content/xdocs/index.xml
index 3dc5b257..03ed9eee 100644
--- a/crawler/kiss/docs/content/xdocs/index.xml
+++ b/crawler/kiss/docs/content/xdocs/index.xml
@@ -20,7 +20,44 @@
Automatic Recording for KiSS Hard Disk Recorders
-
+
+      KiSS makes regular updates to their site that sometimes require adaptations
+      to the crawler. If it stops working, check out the most recent version here.
+
+      Changelog
+
+        24 August 2006
+
+        13-20 August 2006
+
+          There were several changes to the login procedure, requiring
+          modifications to the crawler.
+
      Overview
@@ -47,9 +84,14 @@
        records programs for you or sends notifications about interesting ones.

-        In its current version, the crawler can be used a standalone program
-        only and the preferred way to run it is as a scheduled task.
+        In its current version, the crawler can be used in two ways:

+
@@ -61,22 +103,42 @@

        The easy way to start is the
-        binary version.
+        standalone program binary version
+        or using the web
+        application.

The latest source can be obtained from subversion with the URL https://wamblee.org/svn/public/utils. The subversion repository allows read-only access to anyone.

+

+        The application was developed and tested on SuSE Linux 9.1 with the
+        JBoss 4.0.2 application server (only required for the web application).
+        A Java Virtual Machine 1.5 or greater is required to run it.
+

Configuring the crawler

-        The crawler comes with two configuration files, namely
-        crawler.xml and programs.xml.
+        The crawler comes with three configuration files:

+
+

+        For the standalone program, all configuration files are in the conf
+        directory. For the web application, the properties file is located in
+        the WEB-INF/classes directory of the web application, and
+        crawler.xml and programs.xml are located outside of the web
+        application at a location configured in the properties file.
+

+
      Crawler configuration <code>crawler.xml</code>
@@ -89,33 +151,7 @@
        Programme Guide.

-
-
-      Program configuration: <code>programs.xml</code>
-
-

-        The programs.xml file contains the following
-        configuration items:
-

-
-
-      Notification configuration
-

-        Notification is configured in the (surprise, surprise!)
-        notification element. This notification element
-        is used to configure respectively sender mail address (= reply
-        address), recipient address, subject of the email, smtp server
-        host and port and optional username and password.
-        In addition it contains the names of the stylesheets to
-        generate the HTML and Text reports. These stylesheets
-        should not be changed.

-
-
+
Program configuration

@@ -178,7 +214,7 @@

-        It is possible that different programs cannot be recorded at
+        It is possible that different programs cannot be recorded since
         they overlap. To deal with such conflicts, it is possible to
         specify a priority using the priority element. Higher values
         of the priority value mean a higher priority.
@@ -204,16 +240,24 @@
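As an illustration of the priority mechanism described above, a hypothetical programs.xml fragment might look like the sketch below. Only the priority element is named by this document; the surrounding element names and values are assumptions for illustration, not taken from the actual schema (the distribution's own programs.xml is the authoritative example).

```xml
<!-- Hypothetical sketch: only <priority> is documented in this text;
     the other element names are assumed for illustration. -->
<program>
  <!-- assumed: a pattern matching the programme name -->
  <match>star trek</match>
  <!-- higher values win when two matched programmes overlap -->
  <priority>10</priority>
</program>
```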

-
-
+
+
+      Notification configuration
+

+        Edit the configuration file org.wamblee.crawler.properties.
+        The properties file is self-explanatory.
+
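Given that the notification settings described earlier (sender and recipient addresses, subject, SMTP host and port, optional username and password) now live in this properties file, a sketch of it might look as follows. All key names and values here are assumptions; check the org.wamblee.crawler.properties file shipped with the distribution for the real ones.

```properties
# Hypothetical sketch -- key names are assumed, not taken from the
# actual org.wamblee.crawler.properties in the distribution.
notification.smtp.host=smtp.example.org
notification.smtp.port=25
notification.smtp.username=user
notification.smtp.password=secret
notification.from=crawler@example.org
notification.to=me@example.org
notification.subject=KiSS crawler report
```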

+
+
+
Installing and running the crawler
-      Binary distribution
+      Standalone application

        In the binary distribution, execute the run script for your
        operating system
@@ -222,6 +266,21 @@

+
+      Web application
+

+        After deploying the web application, navigate to the
+        application in your browser (e.g.
+        http://localhost:8080/wamblee-crawler-kissweb).
+        The screen should show an overview of the last time it ran (if
+        it ran before) as well as a button to run the crawler immediately.
+        Also, the result of the last run can be viewed.
+        The crawler will run automatically every morning at 5 AM local time,
+        and will retry at 1 hour intervals in case of failure to retrieve
+        programme information.

+
+
Source distribution

@@ -235,12 +294,18 @@

General usage

-        The crawler, as it is now, is s standalone program which is
-        intended to be run from a command-line. When it runs, it
-        retrieves the programs for today. As a result, it is advisable
+        When the crawler runs, it
+        retrieves the programs for tomorrow. As a result, it is advisable
         to run the program at an early point of the day as a scheduled
-        task (e.g. cron on unix).
+        task (e.g. cron on unix). For the web application this is
+        preconfigured at 5AM.

+
+        If you deploy the web application today, it will run automatically
+        on the next (!) day. This even holds if you deploy the application
+        before the normal scheduled time.
+
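For the standalone program, the scheduled-task setup suggested above could be a crontab entry along these lines, mirroring the 5 AM schedule of the web application. The installation path and run-script name are assumptions, not taken from the distribution; substitute the run script mentioned in the binary distribution instructions.

```
# Hypothetical crontab entry: run the crawler every day at 05:00 local time.
# /opt/kiss-crawler and run.sh are assumed names, not from the distribution.
0 5 * * * /opt/kiss-crawler/run.sh >> /var/log/kiss-crawler.log 2>&1
```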

-        Modifying the program to allow it to investigate tomorrow's
-        programs instead is easy as well but not yet implemented.

        The best example is in the distribution itself. It is my personal
-        programs.xml file.
+        programs.xml file.