<?xml version="1.0" encoding="UTF-8"?>
Copyright 2002-2004 The Apache Software Foundation or its licensors,
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
<title>Automatic Recording for KiSS Hard Disk Recorders</title>
<section id="overview">
<title>Overview</title>
In 2005, <a href="site:links/kiss">KiSS</a> introduced the ability
to schedule recordings on KiSS hard disk recorders (such as the
DP-558) through a web site on the internet. When a new recording is
scheduled through the web site, the KiSS recorder finds out about
it by polling a server on the internet.
This is a really cool feature since it allows you to program
the recorder while away from home.
After using this feature for some time, I started noticing regular
patterns: often you are looking for the same programs and for certain
types of programs. So, wouldn't it be nice to have a program
do this work for you, automatically recording programs and notifying you
of possibly interesting ones?
This is where the KiSS crawler comes in. It is a simple crawler that
logs on to the KiSS electronic programme guide web site and retrieves
programme information from there. Based on that information, it automatically
records programs for you or sends notifications about interesting ones.
In its current version, the crawler can be used as a standalone program
only, and the preferred way to run it is as a scheduled task.
<title>Downloading</title>
At this moment, no formal releases have been made and only the latest
version can be downloaded.
The easy way to start is the
<a href="installs/crawler/kiss/kiss-crawler-bin.zip">binary version</a>.
The latest source can be obtained from Subversion with the
URL <code>https://wamblee.org/svn/public/utils</code>. The Subversion
repository allows read-only access to anyone.
<title>Configuring the crawler</title>
The crawler comes with two configuration files, namely
<code>crawler.xml</code> and <code>programs.xml</code>.
<title>Crawler configuration: <code>crawler.xml</code></title>
First of all, copy the <code>config.xml.example</code> file
to <code>config.xml</code>. After that, edit the first entry of
that file and replace <code>user</code> and <code>passwd</code>
with your personal user ID and password for the KiSS Electronic
<title>Program configuration: <code>programs.xml</code></title>
The <code>programs.xml</code> file contains the following
<li>Notification configuration: describes how to
send notifications about the results of crawling the site.</li>
<li>Zero or more configurations of interesting programs.</li>
<title>Notification configuration</title>
Notification is configured in the (surprise, surprise!)
<code>notification</code> element. This element configures,
in order, the sender mail address (which is also the reply
address), the recipient address, the subject of the email, the SMTP
server host and port, and an optional username and password.
In addition, it contains the names of the stylesheets used to
generate the HTML and text reports. These stylesheets
should not be changed.
<title>Program configuration</title>
Interesting TV shows are described using <code>program</code>
elements. Each <code>program</code> element contains
one or more <code>match</code> elements, each of which describes
a condition that the interesting program must match.
Matching can be done on the following properties of a program:
<tr><th>Field name</th>
<th>Description</th></tr>
<td>Program name</td>
<td>Program description</td>
<td>Channel name</td>
<td>Keywords/classification of the program.</td>
The field to match is specified using the <code>field</code>
attribute of the <code>match</code> element. If no field name
is specified, then the program name is matched. Matching is done
by converting the field value to lowercase and then doing a
Perl-like regular expression match against the provided value. As a
result, the content of the <code>match</code> element should be
specified in lowercase; otherwise the pattern will never match.
If multiple <code>match</code> elements are specified for a
given <code>program</code> element, then all of them must
match for a program to be considered interesting.
<th>Example of matching field values</th>
<td>the.*x.*files</td>
<td>"The X files", "The X-Files: the making of"</td>
<td>"Star Trek Voyager", "Star Trek: The next generation"</td>
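<p>
As an illustration, the first pattern from the table above might be written in
<code>programs.xml</code> roughly as follows. This is only a sketch: it assumes
that <code>match</code> elements are direct children of <code>program</code>,
and the patterns <code>star trek</code> and <code>borg</code> as well as the
field name <code>description</code> are placeholders that are not taken from
the distribution. The <code>programs.xml</code> file shipped with the crawler
remains the authoritative reference for the exact names.
</p>
<source><![CDATA[
<!-- No field attribute, so the pattern is matched against the program name. -->
<program>
  <match>the.*x.*files</match>
</program>

<!-- Two match elements: both must match for the program to be interesting.
     "description" is an assumed field name used for illustration only. -->
<program>
  <match>star trek</match>
  <match field="description">borg</match>
</program>
]]></source>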
It is possible that different programs cannot all be recorded
since they overlap. To deal with such conflicts, it is possible
to specify a priority using the <code>priority</code> element.
Higher values mean a higher priority.
If two programs have the same priority, then it is (more or less)
unspecified which of the two will be recorded, but at least
one of them will be recorded. If no priority is specified, then the
Since it is not always desirable to try to record every
program that matches the criteria, it is also possible to
generate notifications for interesting programs without
recording them. This is done by specifying the
<code>action</code> element with the content <code>notify</code>.
By default, the <code>action</code> is <code>record</code>.
To make the mail reports more readable, it is also possible to
assign a category to a program in order to group interesting
programs. This can be done using the <code>category</code>
element. Note that if the <code>action</code> is
<code>notify</code>, then the <code>priority</code> element
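<p>
Putting these elements together, two entries could look roughly like the
sketch below. The element names come from the description above, but their
placement directly inside <code>program</code>, the numeric form of the
priority, the category texts, and the <code>star trek</code> pattern are
assumptions made for illustration. Please compare with the
<code>programs.xml</code> file in the distribution before relying on it.
</p>
<source><![CDATA[
<!-- Recorded by default (no action element); wins conflicts against
     overlapping programs that have a lower priority value. -->
<program>
  <match>star trek</match>
  <priority>10</priority>
  <category>science fiction</category>
</program>

<!-- Only reported in the mail, never recorded. -->
<program>
  <match>the.*x.*files</match>
  <action>notify</action>
  <category>series</category>
</program>
]]></source>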
<title>Installing and running the crawler</title>
<title>Binary distribution</title>
In the binary distribution, execute the
<code>run</code> script for your operating system
(<code>run.bat</code> for Windows, and
<code>run.sh</code> for Unix).
<title>Source distribution</title>
With the source code, build everything with
<code>ant dist-lite</code>, then locate the binary
distribution in <code>lib/wamblee/crawler/kiss/kiss-crawler-bin.zip</code>.
Then proceed as for the binary distribution.
<title>General usage</title>
The crawler, as it is now, is a standalone program which is
intended to be run from the command line. When it runs, it
retrieves the programs for today. As a result, it is advisable
to run the program early in the day as a scheduled
task (e.g. cron on Unix).
Modifying the program to investigate tomorrow's
programs instead would be easy, but this is not yet implemented.
<section id="examples">
<title>Examples</title>
The best example is in the distribution itself. It is my personal
<code>programs.xml</code> file.
<title>Contributing</title>
You are always welcome to contribute. If you find a problem, just
tell me about it, and if you have ideas, I am always interested to
If you are a programmer and have a fix for a bug, just send me a
patch, and if you are fanatic enough and have ideas, I can also
give you write access to the repository.