<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
<document>
<header>
- <title>Automatic recording for KiSS hard disk recorders</title>
+ <title>Automatic Recording for KiSS Hard Disk Recorders</title>
</header>
<body>
<section id="overview">
patterns. Often you are looking for the same programs and for certain
types of programs. So, wouldn't it be nice to have a program
do this work for you and automatically record programs and notify you
- of possibly interesting ones.
+ of possibly interesting ones?
</p>
<p>
This is where the KiSS crawler comes in. This is a simple crawler which
programme information from there. Then based on that it automatically
records programs for you or sends notifications about interesting ones.
</p>
+ <p>
+ In its current version, the crawler can be used as a standalone program
+ only, and the preferred way to run it is as a scheduled task.
+ </p>
</section>
<section>
<title>Downloading</title>
+
+ <p>
+ At this moment, no formal releases have been made and only the latest
+ version can be downloaded.
+ </p>
+ <p>
+ The easy way to start is the
+ <a href="installs/crawler/kiss/kiss-crawler-bin.zip">binary version</a>.
+ </p>
+ <p>
+ The latest source can be obtained from Subversion with the
+ URL <code>https://wamblee.org/svn/public/utils</code>. The Subversion
+ repository allows read-only access to anyone.
+ </p>
</section>
<section>
<title>Configuring the crawler</title>
+
+ <p>
+ The crawler comes with two configuration files, namely
+ <code>crawler.xml</code> and <code>programs.xml</code>.
+ </p>
+
+ <section>
+ <title>Crawler configuration: <code>crawler.xml</code></title>
+
+ <p>
+ First of all, copy the <code>config.xml.example</code> file
+ to <code>config.xml</code>. After that, edit the first entry of
+ that file and replace <code>user</code> and <code>passwd</code>
+ with your personal user id and password for the KiSS Electronic
+ Programme Guide.
+ </p>
+ </section>
+
+ <section>
+ <title>Program configuration: <code>programs.xml</code></title>
+
+ <p>
+ The <code>programs.xml</code> file contains the following
+ configuration items:
+ </p>
+ <ul>
+ <li>Notification configuration: describes how the results of
+ crawling the site are reported.</li>
+ <li>Zero or more configurations of interesting programs. </li>
+ </ul>
+ <section>
+ <title>Notification configuration</title>
+ <p>
+ Notification is configured in the (surprise, surprise!)
+ <code>notification</code> element. This element specifies, in
+ this order, the sender mail address (which is also the reply
+ address), the recipient address, the subject of the email, the
+ SMTP server host and port, and an optional username and password.
+ In addition, it contains the names of the stylesheets used to
+ generate the HTML and text reports. These stylesheets
+ should not be changed.
+ </p>
+ </section>
+
+ <section>
+ <title>Program configuration</title>
+ <p>
+ Interesting TV shows are described using <code>program</code>
+ elements. Each <code>program</code> element contains
+ one or more <code>match</code> elements that describe
+ a condition that the interesting program must match.
+ </p>
+ <p>
+ Matching can be done on the following properties of a program:
+ </p>
+ <table>
+ <tr><th>Field name</th>
+ <th>Description</th></tr>
+ <tr>
+ <td>name</td>
+ <td>Program name</td>
+ </tr>
+ <tr>
+ <td>description</td>
+ <td>Program description</td>
+ </tr>
+ <tr>
+ <td>channel</td>
+ <td>Channel name</td>
+ </tr>
+ <tr>
+ <td>keywords</td>
+ <td>Keywords/classification of the program.</td>
+ </tr>
+ </table>
+ <p>
+ The field to match is specified using the <code>field</code>
+ attribute of the <code>match</code> element. If no field name
+ is specified, then the program name is matched. Matching is done
+ by converting the field value to lower case and then doing a
+ Perl-like regular expression match of the provided value. As a
+ result, the content of the match element should be specified in
+ lower case; a pattern containing upper-case letters will never match.
+ If multiple <code>match</code> elements are specified for a
+ given <code>program</code> element, then all matches must
+ apply for a program to be interesting.
+ </p>
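+ <p>
+ As an illustration, a <code>program</code> entry combining two
+ conditions might look like the sketch below. The exact nesting
+ and the channel value are assumptions based on the description
+ above; the <code>programs.xml</code> file in the distribution
+ remains the authoritative reference.
+ </p>
+ <source><![CDATA[
+ <!-- Hypothetical entry: both conditions must hold. -->
+ <program>
+   <!-- No field attribute, so the program name is matched. -->
+   <match>star trek</match>
+   <!-- Assumed channel name, written in lower case. -->
+   <match field="channel">net5</match>
+ </program>
+ ]]></source>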
+ <p>
+ Example patterns:
+ </p>
+ <table>
+ <tr>
+ <th>Pattern</th>
+ <th>Example of matching field values</th>
+ </tr>
+ <tr>
+ <td>the.*x.*files</td>
+ <td>"The X files", "The X-Files: the making of"</td>
+ </tr>
+ <tr>
+ <td>star trek</td>
+ <td>"Star Trek Voyager", "Star Trek: The next generation"</td>
+ </tr>
+ </table>
+
+ <p>
+ Sometimes different programs cannot all be recorded
+ because they overlap. To deal with such conflicts, it is possible
+ to specify a priority using the <code>priority</code> element.
+ A higher value means a higher priority.
+ If two programs have the same priority, then it is (more or less)
+ unspecified which of the two will be recorded, but at least
+ one of them will be recorded. If no priority is specified, then the
+ priority is 1 (one).
+ </p>
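+ <p>
+ A minimal sketch of how priorities could settle an overlap is
+ shown below. It assumes that <code>priority</code> is a child of
+ <code>program</code> and that the two shows happen to air at the
+ same time; check the <code>programs.xml</code> in the
+ distribution for the exact layout.
+ </p>
+ <source><![CDATA[
+ <!-- Assumed layout: priority nested inside program. -->
+ <program>
+   <match>the.*x.*files</match>
+   <!-- Higher value: preferred when recordings overlap. -->
+   <priority>10</priority>
+ </program>
+ <program>
+   <match>star trek</match>
+   <!-- No priority element, so the default of 1 applies. -->
+ </program>
+ ]]></source>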
+
+ <p>
+ Since it is not always desirable to try to record every
+ program that matches the criteria, it is also possible to
+ generate notifications for interesting programs only without
+ recording them. This is done by specifying the
+ <code>action</code> element with the content <code>notify</code>.
+ By default, the <code>action</code> is <code>record</code>.
+ To make the mail reports more readable it is possible to
+ also assign a category to a program for grouping interesting
+ programs. This can be done using the <code>category</code>
+ element. Note that if the <code>action</code> is
+ <code>notify</code>, then the <code>priority</code> element
+ is not used.
+ </p>
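+ <p>
+ For example, a notify-only entry with a category might be
+ written roughly as follows. The nesting of <code>action</code>
+ and <code>category</code> inside <code>program</code>, the
+ keyword pattern and the category name are all assumptions made
+ for illustration.
+ </p>
+ <source><![CDATA[
+ <!-- Hypothetical entry: report matching programs, do not record. -->
+ <program>
+   <match field="keywords">documentary</match>
+   <!-- Overrides the default action, which is record. -->
+   <action>notify</action>
+   <!-- Used to group programs in the mail report. -->
+   <category>documentaries</category>
+ </program>
+ ]]></source>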
+
+ </section>
+
+
+ </section>
</section>
<section>
<title>Installing and running the crawler</title>
+
+ <section>
+ <title>Binary distribution</title>
+ <p>
+ In the binary distribution, execute the
+ <code>run</code> script for your operating system
+ (<code>run.bat</code> for Windows, and
+ <code>run.sh</code> for Unix).
+ </p>
+ </section>
+
+ <section>
+ <title>Source distribution</title>
+ <p>
+ With the source code, build everything with
+ <code>ant dist-lite</code>, then locate the binary
+ distribution in <code>lib/wamblee/crawler/kiss/kiss-crawler-bin.zip</code>.
+ Then proceed as for the binary distribution.
+ </p>
+ </section>
+
+ <section>
+ <title>General usage</title>
+ <p>
+ The crawler, as it is now, is a standalone program which is
+ intended to be run from the command line. When it runs, it
+ retrieves the programs for today. As a result, it is advisable
+ to run the program early in the day as a scheduled
+ task (e.g. cron on Unix).
+ </p>
+ <p>
+ Modifying the program to look at tomorrow's programs
+ instead would be easy, but this is not yet implemented.
+ </p>
+ </section>
+
+
</section>
<section id="examples">
<title>Examples</title>
+ <p>
+ The best example is in the distribution itself. It is my personal
+ <code>programs.xml</code> file.
+ </p>
</section>
<section>
<title>Contributing</title>
+
+ <p>
+ You are always welcome to contribute. If you find a problem, just
+ tell me about it, and if you have ideas, I am always interested to
+ hear about them.
+ </p>
+ <p>
+ If you are a programmer and have a fix for a bug, just send me a
+ patch. If you are fanatic enough and have ideas, I can also
+ give you write access to the repository.
+ </p>
</section>