SERV_QUEUE_TOP
Retrieve target website and store within Virtuoso
WS.WS.SERV_QUEUE_TOP
(in target varchar,
in WebDAV_collection varchar,
in update integer,
in debug integer,
in function_hook varchar,
in data any);
Description
Web Robot site retrieval is performed with the WS.WS.SERV_QUEUE_TOP PL function,
which is built into the Virtuoso server.
To run several robots at once, start each one from a separate ODBC/SQL
connection; the robots then walk the site together without overlapping.
From a VSP interface, after calling the retrieval function you may
call http_flush to keep the task running
in the server while allowing the user agent to continue with other tasks.
Parameters
target –
URI to target site.
WebDAV_collection –
Local WebDAV collection to copy the content to.
update –
Update flag: 1 to enable updating, 0 to disable it.
debug –
Debug flag; must be set to 0.
function_hook –
Fully qualified name of the PL hook function. If not supplied or NULL,
the default function is used.
data –
Application-dependent data, usually an array, passed to the PL hook function
so that it can extract the next queue entry. The example below passes an array
of the names of unwanted sites.
Examples
Retrieve External Sites
WS.WS.SERV_QUEUE_TOP (
'www.foo.com', 'sites/www_foo_com', 0, 0, 'DB.DBA.my_hook',
vector ('www.skip.me','www.bar.com')
);
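Retrieve a Site with the Default Hook
The parameter descriptions above state that the default function is used when function_hook is NULL. A minimal sketch of such a call follows. It assumes that NULL is also acceptable for data when no custom hook is supplied, and the update value of 1 is chosen only for illustration; the target and collection names are reused from the previous example.

```sql
-- Sketch only: the six arguments follow the documented order.
-- Assumption: with no custom hook, data may be passed as NULL.
WS.WS.SERV_QUEUE_TOP (
'www.foo.com',        -- target: URI of the site to retrieve
'sites/www_foo_com',  -- WebDAV_collection: local collection receiving the copy
1,                    -- update: 1 = on (illustrative choice)
0,                    -- debug: must be 0
NULL,                 -- function_hook: NULL selects the default function
NULL                  -- data: assumed unused by the default function
);
```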