X-Git-Url: http://wamblee.org/gitweb/?a=blobdiff_plain;f=crawler%2Fkiss%2Fdocs%2Fcontent%2Fxdocs%2Findex.xml;h=50397ef2a3dbc51fb0242d5626c669aa8cef9919;hb=17424bb52a491016f76953acd3f14454d71c7749;hp=02d1a4668fe6609e68ac1f56835925650c8876ef;hpb=78c80d39126d6774f26bce978bf2e421f5c3a8ab;p=utils
diff --git a/crawler/kiss/docs/content/xdocs/index.xml b/crawler/kiss/docs/content/xdocs/index.xml
index 02d1a466..50397ef2 100644
--- a/crawler/kiss/docs/content/xdocs/index.xml
+++ b/crawler/kiss/docs/content/xdocs/index.xml
@@ -20,7 +20,51 @@
Automatic Recording for KiSS Hard Disk Recorders
-
+
+
+ KiSS makes regular updates to their site that sometimes require adaptations
+ to the crawler. If it stops working, check out the most recent version here.
+
+
+ Changelog
+
+
+ 31 August 2006
+
+ - Added a Windows batch file for running the crawler under Windows.
+ Very ad hoc; it will be generalized.
+
+
+
+ 24 August 2006
+
+ - The crawler now uses desktop login for crawling. It is also much more efficient, since
+ it no longer needs to crawl the individual programs: the channel page
+ includes descriptions of programs in JavaScript popups, which the crawler can use.
+ The result is a significant reduction of the load on the KiSS EPG site. In addition, the delay
+ between requests has been increased to reduce that load further.
+ -
+ The crawler now crawls programs for tomorrow instead of for today.
+
+ -
+ The web-based crawler is configured to run only between 7pm and 12pm. It used to run at
+ 5am.
+
+
+
+
+
+ 13-20 August 2006
+
+ There were several changes to the login procedure, requiring modifications to the crawler.
+
+
+ - The crawler now uses the 'Referer' header field correctly at login.
+ - KiSS now uses hidden form fields in their login process, which the crawler now also
+ handles correctly.
+
+
+
Overview
@@ -238,7 +282,9 @@
The screen should show an overview of the last time it ran (if
it ran before) as well as a button to run the crawler immediately.
Also, the result of the last run can be viewed.
- The crawler will run automatically every morning at 5 AM local time.
+ The crawler will run automatically every morning at 5 AM local time,
+ and will retry at 1-hour intervals if it fails to retrieve
+ programme information.
@@ -256,11 +302,17 @@
General usage
When the crawler runs, it
- retrieves the programs for today. As a result, it is advisable
+ retrieves the programs for tomorrow. As a result, it is advisable
to run the program at an early point of the day as a scheduled
task (e.g. cron on unix). For the web application this is
preconfigured at 5AM.
+
+ If you deploy the web application today, it will run automatically
+ on the next (!) day. This holds even if you deploy the application
+ before the normal scheduled time.
+
+
Modifying the program to allow it to investigate tomorrow's
programs instead is easy as well but not yet implemented.