3. Non-CGI Perl programs and shell scripts

3.1 bin/addsl.pl - generate HTML subject listings from ROADS templates

NAME

bin/addsl.pl - generate HTML subject listings from ROADS templates

SYNOPSIS

bin/addsl.pl [-ANacdhi] [-f config_dir] [-l view]
  [-m filename] [-n database_name] [-o override_file]
  [-p pattern] [-s source_dir] [-t target_dir] [-u name]
  [-w waylay_url] [handle1, handle2 ... handleN]

DESCRIPTION

The addsl.pl program generates a set of subject listing files for the templates with the specified handles. These listing files are also converted into static HTML documents which can be placed on the WWW. The program can also generate HTML lists in numerical and alphabetical order based on the contents of a subject descriptor mapping file.

The addsl.pl program can generate a number of different subject listings. This allows, for example, a subject listing of UK based resources in addition to a subject listing of all resources. The views also allow easy selection of which subject listing a template should be added to in the admin-cgi/mktemp.pl editor.

USAGE

You can arrange for the ROADS software to generate listings of some or all of your templates broken down by subject area. Note that each template which you would like to appear in a subject listing should contain at least one URI attribute and at least one Subject-Descriptor cluster.
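The requirement above can be checked before running addsl.pl. The following is a minimal Python sketch, not part of ROADS. It assumes that templates are plain "Attribute: value" lines and that attribute names may carry a variant suffix such as -v1; both the suffix convention and the function name listable are assumptions made for illustration.

```python
import re

def listable(template_text):
    """Return True if the template has a URI and a Subject-Descriptor attribute."""
    names = []
    for line in template_text.splitlines():
        # Attribute lines are assumed to look like "Name: value".
        match = re.match(r"([A-Za-z0-9-]+):", line)
        if match:
            names.append(match.group(1).lower())
    # An optional "-v<number>" variant suffix is assumed, e.g. URI-v1.
    has_uri = any(re.fullmatch(r"uri(-v\d+)?", n) for n in names)
    has_subject = any(re.fullmatch(r"subject-descriptor(-v\d+)?", n) for n in names)
    return has_uri and has_subject
```

A template for which listable returns False would, on this reading, not be expected to appear in any subject listing.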

You may have as many different views of your templates as you like. Each view is normally a collection of statically generated HTML documents created by addsl.pl, though in version 2 of ROADS you can also browse dynamically through your database using "canned" queries. The subject listings may be customized in a number of ways, notably via HTML outline files, which specify the overall format of each HTML document generated by the ROADS software. These have some extra pseudo-HTML tags which allow you to indicate where in the resulting documents you would like the subject listing information to appear.

It is also possible to specify a pattern which the URIs in the resource description templates will have to match in order to be included in a subject listing. This can be used to generate, for example, lists of resources which are found in the UK academic community, resources which are generated dynamically by scripts, all resources of a particular type (e.g. MPEG movies), and so on.

addsl.pl will also generate customizable lists of the available subject categories in both alphabetical and numerical order (assuming the Subject-Descriptor classification is numeric).

A default set of subject categories based on the different programme areas in the UK Electronic Libraries Programme (to match our sample database) is distributed with the ROADS software as config/classmap, under the top level ROADS installation directory. You will probably want to change this to reflect your installation.

The file format of the subject listing views is explained in detail below. Essentially, it should contain pointers to the location of each of the following:

A typical view specification would look like this:

HTML-Directory:         subject-listing
WWW-Directory:          subject-listing
Listing-Directory:      subject-listing
Mapping-File:           class-map
Subject-Scheme:         DDC
AlphaList-File:         alphalist.html
NumList-File:           numlist.html

The meanings of these path names are explained below. It is worth noting that they can be either relative (to the various directories involved in generating the subject listings, such as the ROADS config, guts and htdocs directories), or absolute - e.g. /usr/local/roads/guts/subject-listing/Default. You may prefer to refer to them by the full path name to avoid confusion, but be aware that this may cause you problems if you move the ROADS installation to another directory tree.
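To illustrate the relative versus absolute distinction, here is a small Python sketch, not taken from the ROADS code. It reads "Attribute: value" lines into a dictionary and resolves a path against a base directory. Which base directory applies to which attribute is decided by addsl.pl itself; the pairing used in the example is only an assumption, and the names parse_view and resolve are hypothetical.

```python
import posixpath

def parse_view(text):
    """Parse the 'Attribute: value' lines of a view file into a dict."""
    view = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split(":", 1)
        view[name.strip()] = value.strip()
    return view

def resolve(path, base_dir):
    """Leave absolute paths alone; join relative ones onto base_dir."""
    if posixpath.isabs(path):
        return path
    return posixpath.join(base_dir, path)

sample = """\
HTML-Directory:         subject-listing
Listing-Directory:      subject-listing
Mapping-File:           class-map
"""
view = parse_view(sample)
# Assumption: Listing-Directory is taken relative to the guts directory.
print(resolve(view["Listing-Directory"], "/usr/local/roads/guts"))
```

Because relative entries are joined onto a base directory at run time, a view file written this way keeps working if the installation is moved, which is the trade-off described above.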

Note that the ROADS software comes shipped with defaults for the Default, DefaultAlpha and DefaultNumber outlines. The outline HTML used to generate the actual subject listings lives by default under config/multilingual/*/subject-listing-views. In version 2 of ROADS we switched to using our generic HTML rendering code, away from the old hard-coded HTML rendering embedded in the older versions of this code.

If your Subject-Descriptor-Scheme is UDC (the default), you should be able to generate subject listings for all your templates using the default view by running addsl.pl with the -a argument:

% addsl.pl -a

You will not need to do this if you are creating templates from scratch using the WWW based forms editor - this gives you the option of entering new templates into the subject listings automatically. In fact, it runs addsl.pl behind the scenes. If you only want to add a subset of your templates (such as those which have changed recently), addsl.pl should be called without the -a argument, and with the handles of the templates as arguments, e.g.

% addsl.pl 0123 0124 0125

If you would like to create more than one view of your resource description templates, e.g. to have a separate AllUK listing of resources which pertain to the UK higher education community (Internet domain - ac.uk), you will need to make another view file and run addsl.pl with the -l option naming it. The view file for AllUK might look something like this:

Outline-File: subject-listing/Default
HTML-Directory: subject-listing/AllUK/
Listing-Directory: subject-listing/AllUK/
Mapping-File: subject-listing/classmap
Alpha-Outline: subject-listing/DefaultAlpha
Number-Outline: subject-listing/DefaultNumber

Whilst in this example the same HTML outline documents have been used for both views, this is entirely under the control of the ROADS server administrator. To create the AllUK view, you would need to run addsl.pl with both the -l and -p arguments, e.g.

% addsl.pl -a -p '\.ac\.uk' -l AllUK
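The effect of a pattern such as the one above can be previewed outside ROADS. The sketch below uses Python regular expressions rather than Perl ones (the two agree for a simple pattern like this) and assumes that the pattern may match anywhere in the URI. The example URIs and the function name matching_uris are invented for illustration.

```python
import re

def matching_uris(uris, pattern):
    """Keep the URIs in which the pattern is found anywhere."""
    regex = re.compile(pattern)
    return [uri for uri in uris if regex.search(uri)]

uris = [
    "http://www.lut.ac.uk/",
    "http://www.example.com/",
]
print(matching_uris(uris, r"\.ac\.uk"))  # ['http://www.lut.ac.uk/']
```

Escaping the dots matters: an unescaped pattern such as .ac.uk would also match any character in place of each dot.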

The resulting subject listing files will be generated in the directory specified in the view file as HTML-Directory, e.g. /usr/local/www/ROADS/subject-listing/AllUK. The following files will be generated:

Should you ever need to completely re-generate your subject listings, it will be necessary to remove the files in the directory specified by the Listing-Directory entry in the view file, e.g. /usr/local/roads/guts/subject-listing/AllUK. You may also choose to remove the HTML documents generated by addsl.pl in the HTML-Directory. Alternatively, cullsl.pl, the subject list culling tool, may be adequate for your needs - see its manual page for more information.

Note that if they do not exist already, you will need to create parent directories for the directories referred to in a subject listing view configuration file.

OPTIONS

A number of options are available for the addsl.pl program to control which files are used for generating the subject listings and where configuration options are located. Note that most of these can also be supplied in the addsl.pl view config file (see below), and that settings which appear in this will usually override command line arguments.

These options are then followed by zero or more templates' handles (note - not filenames). If the -a option is given, no handles need be given on the command line; all templates in the database will be added to the subject listings.

FILES

config/class-map - where to get default mappings from Subject-Descriptor-Scheme attributes in templates to filenames used for generating HTML.

config/subject-listing/* - view files, each of which describes a particular way of rendering the templates into HTML.

config/multilingual/*/subject-listing-views/* - HTML rendering rules for addsl.pl subject listing views, with a separate directory per view. The actual rendering rules are as per search results.

guts/subject-listing/*.lst - default location of the internal files used to maintain state between runs of subject listing tools.

htdocs/subject-listing - default location of the HTML generated by addsl.pl

FILE FORMATS

The various attributes currently defined in the view file are:

SEE ALSO

the manual page for bin/addwn.pl, the manual page for bin/cullsl.pl, the manual page for bin/cullwn.pl, the manual page for bin/mkinv.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>

3.2 bin/addwn.pl - add what's new entries for specified templates

NAME

bin/addwn.pl - add what's new entries for specified templates

SYNOPSIS

bin/addwn.pl [-acdh] [-f directory] [-l number]
  [-n name] [-p pattern] [-r] [-s directory]
  [-w name] [-z date] [handle1 handle2 ... handleN]

DESCRIPTION

The addwn.pl program adds templates with the specified handles to a What's New listing file. This listing file is then converted into a static HTML document which can be placed on the WWW. The What's New file is intended to show end users what resources have just been catalogued by a subject service and/or when some aspect of a catalogued resource's template has changed.

USAGE

The ROADS software can generate lists of resource descriptions which have been entered recently or changed recently. The configuration of this is very similar to that of the resource listings. Essentially, each What's New view is specified by an HTML outline file, a file to add the new resource information to, and an internal file. The default What's New view can be found in the file config/multilingual/*/whats-new-views/Default under the top level ROADS installation directory.

The default What's New view installed by the ROADS software will be configured to create a listing file called whats-new.html in the ROADS directory on your WWW server, and use sub-directories of the ROADS installation for its outline and internal files, e.g.

Outline-File:   whats-new/outlines/Default
HTML-File:      whats-new.html
Listing-File:   whats-new/Default.lst

If you create your resource description templates using the WWW based template editor, you will be given the option of entering them into a What's New list - addwn.pl will be called to do this. Alternatively, if you wish to generate these listings manually, you can run addwn.pl yourself. Use the -a option to add all your templates, e.g.

% addwn.pl -a

If you only want to include a subset of the resource description templates in your database, addwn.pl takes a similar set of options to addsl.pl - e.g. the -p option can be used to restrict the templates which are included based on the contents of their URIs, and individual templates to include can be specified on the command line.

Note that your templates must include at least one URI attribute.

OPTIONS

A number of options are available for the addwn.pl program to control which files are used for generating the What's New listings and where configuration options are located:

These options are then followed by zero or more template handles (note - not filenames). If the -a option is given, no handles need be given on the command line; all templates in the database will be added to the What's New listing.

FILES

config/whats-new/* - "What's New" view specifications

config/multilingual/*/whats-new-views/* - rendering rules for the various "What's New" views

htdocs/whats-new.html - default location of listing.

FILE FORMAT

The addwn.pl program can generate a number of different What's New listings, known as views. This allows, for example, a What's New listing of UK based resources in addition to a What's New listing of all resources. The views also allow easy selection of which listing a template should be added to in the mktemp.pl editor.

The view is specified by a view file. A sample file is:

HTML-File:      /WWW/htdocs/ROADS/whats-new.html
Listing-File:   /usr/local/ROADS/guts/whats-new/Default.lst

The various attributes currently defined in the view file are:

SEE ALSO

the manual page for bin/addwn.pl, the manual page for bin/cullsl.pl, the manual page for bin/mkinv.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Jon Knight <jon@net.lut.ac.uk>

3.3 bin/bg_exterminate.pl - background reindexing to remove stale templates

NAME

bin/bg_exterminate.pl - background reindexing to remove stale templates

SYNOPSIS

bin/bg_exterminate.pl

DESCRIPTION

This Perl program launches a process to remove stale templates from a ROADS server database. On completion it sends email to the server's system admin and database admin contacts.

It is intended for invocation from a World-Wide Web CGI program, a cron job, or an at job.

OPTIONS

None.

OUTPUT

Mail to server maintainers.

SEE ALSO

the manual page for bin/exterminate.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.4 bin/bg_lc.pl - background link checking run for WWW or cron

NAME

bin/bg_lc.pl - background link checking run for WWW or cron

SYNOPSIS

bin/bg_lc.pl

DESCRIPTION

This Perl program launches a process to check the validity of the links (URLs) in a ROADS server database. On completion it sends email to the server's system admin and database admin contacts.

It is intended for invocation from a World-Wide Web CGI program, a cron job, or an at job.

OPTIONS

None.

OUTPUT

Mail to server maintainers. Link check log file left in logs/lc.

SEE ALSO

the manual page for bin/lc.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.5 bin/bogus.pl - flag possible errors in ROADS installation

NAME

bin/bogus.pl - flag possible errors in ROADS installation

SYNOPSIS

 bin/bogus.pl [-h]

DESCRIPTION

This Perl program tests the following aspects of the ROADS installation:

  1. expected global variables are present and correct
  2. directories which are needed are present
  3. external programs which are needed can be found
  4. directories which should be writeable actually are

Note that the tests may generate different results depending on the Unix user and group which the program is run under. If in doubt, it should be tested with the identity of any admin users who will be running components of the ROADS package from the command line, and with the identities used to run any WWW servers which will have access to the ROADS server and database.
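
The directory and program checks listed above (items 2 to 4) can be approximated as follows. This is an illustrative Python sketch rather than the logic bogus.pl actually uses; the function name check_installation and the wording of the messages are assumptions, and the check on global variables is not covered.

```python
import os
import shutil

def check_installation(needed_dirs, writable_dirs, programs):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    for directory in needed_dirs:
        if not os.path.isdir(directory):
            problems.append("missing directory: " + directory)
    for directory in writable_dirs:
        # os.access tests permissions for the user running this process.
        if not os.access(directory, os.W_OK):
            problems.append("not writeable: " + directory)
    for program in programs:
        # shutil.which searches the PATH of the current environment.
        if shutil.which(program) is None:
            problems.append("program not found: " + program)
    return problems
```

Since both os.access and shutil.which depend on the identity and environment of the calling process, the same call can give different answers for different users, which mirrors the note above.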

OPTIONS

OUTPUT

List of phases, and problem information if any problems found.

SEE ALSO

the manual page for admin-cgi/bogus.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.6 bin/countattr.pl - count the attributes used in a template

NAME

bin/countattr.pl - count the attributes used in a template

SYNOPSIS

 bin/countattr.pl [-adh] [-s sourcedir] [file1 file2 ... fileN]

DESCRIPTION

This Perl program runs through a set of IAFA (or IAFA style) templates and generates a report of which fields have been used and how many times.
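
The kind of report described above can be sketched in a few lines of Python. The sketch assumes each template is a block of "Attribute: value" lines and that lines beginning with white space are continuations of the previous value; neither assumption is confirmed by this manual page, and count_attributes is a hypothetical name.

```python
from collections import Counter

def count_attributes(templates):
    """Count how many times each attribute name occurs across the templates."""
    counts = Counter()
    for text in templates:
        for line in text.splitlines():
            # Assumed: indented lines continue the previous attribute's value.
            if line[:1].isspace() or ":" not in line:
                continue
            name = line.split(":", 1)[0].strip()
            counts[name] += 1
    return counts
```

Calling counts.most_common() on the result lists the attributes from most to least used.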

OPTIONS

OUTPUT

Mail to server maintainers.

SEE ALSO

the manual page for admin-cgi/countattr.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>

3.7 bin/cullsl.pl - cull entries from subject listings

NAME

bin/cullsl.pl - cull entries from subject listings

SYNOPSIS

bin/cullsl.pl [-ANacdh] [-f directory] [-l view]
  [-m filename] [-n name] [-p pattern]
  [-s directory] [-t directory] [-u name]
  [handle1 handle2 ... handleN]

DESCRIPTION

The cullsl.pl program removes one or more templates from a set of subject listing files. These changed listing files are also converted into static HTML documents which can be placed on the WWW. The program also generates HTML lists in numerical and alphabetical order based on the contents of a subject descriptor mapping file. This program shares many of its configuration files with addsl.pl.

USAGE

The cullsl.pl program lets you remove selected templates' details from the subject listings generated by addsl.pl. It uses the same mechanism as addsl.pl, and simply takes the handles of the templates you wish to remove as its arguments when run, e.g.

% cullsl.pl 814010256-14355

OPTIONS

A number of options are available for the cullsl.pl program to control which files are used for generating the subject listings and where configuration options are located:

These options are then followed by zero or more template handles (note - not filenames). If the -a option is given, no handles need be given on the command line; all templates in the database will be removed from the subject listings.

FILES

config/class-map - default mappings from Subject-Descriptor-Scheme attributes in templates to filenames used for generating HTML.

config/subject-listing/* - view files, each of which describes a particular way of rendering the templates into HTML.

config/multilingual/*/subject-listing-views/* - HTML rendering rules for subject listing views.

guts/subject-listing/*.lst - default location of the internal files used to maintain state between runs of subject listing tools.

htdocs/subject-listing - default location of the HTML generated by cullsl.pl

FILE FORMATS

The various attributes currently defined in the view file are:

SEE ALSO

the manual page for bin/addsl.pl, the manual page for bin/addwn.pl, the manual page for bin/cullwn.pl, the manual page for bin/mkinv.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>

3.8 bin/cullwn.pl - cull stale entries from what's new listings

NAME

bin/cullwn.pl - cull stale entries from what's new listings

SYNOPSIS

bin/cullwn.pl [-cdh] [-f directory] [-n name]
  [-w name] hhmmssDDMMYYYY

DESCRIPTION

The cullwn.pl program removes entries from a What's New listing file that were added before a certain date. The new listing file is then converted into a static HTML document which can be placed on the WWW. The What's New file is intended to show end users what resources have just been catalogued by the ROADS service and/or when some aspect of a catalogued resource's template has changed.

USAGE

It is anticipated that you will want to remove What's New listing entries which are past their use-by date, and the ROADS software provides a tool to help you do this. cullwn.pl will remove any What's New entries which are older than a given date - or the current date if no date is specified. At the moment you have to run this from the command line, but in a future version of the software we will be providing a World-Wide Web front end.

The cullwn.pl tool uses the same view configuration information as the addwn.pl tool - see the section on this for more information. It can be run either with or without a date from which to begin culling, e.g.

(start culling from now...)

% cullwn.pl

(start culling from the 15th of January 1997...)

% cullwn.pl 00000015011997

OPTIONS

A number of options are available for the cullwn.pl program to control which files are used for generating the What's New listings and where configuration options are located:

These options are then followed by a 14 character time and date string in the following format:

hhmmssDDMMYYYY

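where (in order) hh is the hour, mm the minute, ss the second, DD the day of the month, MM the month and YYYY the four digit year.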

This time and date string specifies the culling time; all entries in the What's New list generated before that date are removed. Thus the string 10452312021995 tells cullwn.pl to remove any entries added to the What's New list before 10:45:23am on 12th February 1995.
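
The 14 character string can be decoded mechanically, which is a convenient way to check a value before passing it to cullwn.pl. The Python sketch below is illustrative only: the function names are hypothetical, and representing the What's New entries as (time added, handle) pairs is an assumption, since the listing file format is not described here.

```python
from datetime import datetime

def parse_cull_time(stamp):
    """Decode a 14 character hhmmssDDMMYYYY string into a datetime."""
    if len(stamp) != 14 or not stamp.isdigit():
        raise ValueError("expected 14 digits, got %r" % stamp)
    return datetime.strptime(stamp, "%H%M%S%d%m%Y")

def cull(entries, stamp):
    """Drop the (added, handle) pairs that were added before the cull time."""
    cutoff = parse_cull_time(stamp)
    return [(added, handle) for added, handle in entries if added >= cutoff]

print(parse_cull_time("00000015011997"))  # 1997-01-15 00:00:00
```

Note that the day comes before the month in this format, so the ninth and tenth digits are the day of the month and the eleventh and twelfth are the month.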

FILES

config/whats-new/* - "What's New" view specifications

config/multilingual/*/whats-new-views/* - rendering rules for the various "What's New" views

htdocs/whats-new.html - default location of listing.

FILE FORMAT

The cullwn.pl program can operate on a number of different What's New listings, known as views. This allows, for example, a listing of UK based resources to be maintained in addition to a listing of all resources. The views also allow easy selection of which listing a template should be added to in the mktemp.pl editor.

The view is specified by a view file. An example file is:

HTML-File:      /WWW/htdocs/ROADS/whats-new.html
Listing-File:   /usr/local/ROADS/guts/whats-new/Default.lst

The various attributes currently defined in the view file are:

SEE ALSO

the manual page for bin/addsl.pl, the manual page for bin/addwn.pl, the manual page for bin/cullsl.pl, the manual page for bin/mkinv.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>

3.9 bin/deindex.pl - remove templates from index

NAME

bin/deindex.pl - remove templates from index

SYNOPSIS

bin/deindex.pl [-c ci_path] [-dh] [-i index_dir]
  [-s source_dir] [-t tmp_dir] handle1 handle2 ... handleN

DESCRIPTION

The deindex.pl script removes one or more templates from a filesystem based inverted index of IAFA templates created by mkinv.pl. The inverted index allows the search.pl and admin.pl programs to rapidly match keywords and boolean expressions in a large number of IAFA templates. The deindex.pl program removes all keywords from the inverted index associated with the specified template(s).
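
The idea of removing a template from an inverted index can be shown with an in-memory model. This Python sketch is not the on-disk format written by mkinv.pl, which is not described here; it assumes the index is simply a mapping from keyword to the set of handles containing it, and the function name deindex is reused only for readability.

```python
def deindex(index, handles):
    """Remove every posting for the given handles from an in-memory index."""
    doomed = set(handles)
    for term in list(index):  # copy the keys, since entries may be deleted
        index[term] -= doomed
        if not index[term]:
            del index[term]  # no remaining template uses this keyword
    return index

index = {
    "research": {"0123", "0124"},
    "mailing": {"0124"},
}
deindex(index, ["0124"])
print(index)  # {'research': {'0123'}}
```

Keywords that no longer point at any template are dropped entirely, so a later search for them finds nothing.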

OPTIONS

A number of options are available for the deindex.pl program to control where it looks for its files:

The options are then followed by one or more template handles to be deindexed. The deindex.pl script removes all traces of these templates from the selected inverted index. The script also archives a copy of the template in a .archive subdirectory of the IAFA template source directory. This archiving uses the GNU Revision Control System (RCS) if available, allowing multiple copies of a template's change history to be recorded.

FILES

config/guts - default location of index data

config/source - default location of template database

config/source/.archive - location of archived templates

SEE ALSO

the manual page for bin/admin.pl, the manual page for bin/mkinv.pl, the manual page for bin/search.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Jon Knight <jon@net.lut.ac.uk>

3.10 bin/dodgy.pl - find persistently stale templates

NAME

bin/dodgy.pl - find persistently stale templates

SYNOPSIS

 bin/dodgy.pl [-l basename] [-n grace]

DESCRIPTION

This Perl program analyses the results of the last three runs of the ROADS link checking tool, and returns a list of the templates which have been unreachable at least a given number of times.
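
The analysis described above amounts to counting, per template, how many of the last three runs reported it as unreachable. The following Python sketch shows that counting step under stated assumptions: each run is represented as a collection of unreachable template names (the real link checker log format is not covered here), and the threshold parameter is a stand-in whose exact relationship to the -n option is not specified in this page.

```python
from collections import Counter

def persistently_unreachable(runs, threshold):
    """Return templates unreachable in at least `threshold` of the last 3 runs."""
    failures = Counter()
    for run in runs[-3:]:
        # A template counts at most once per run, however often it is listed.
        failures.update(set(run))
    return sorted(name for name, count in failures.items() if count >= threshold)
```

With a threshold of 3, only templates that failed in every one of the last three runs are returned.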

It is intended for invocation from the likes of a World-Wide Web CGI program, a cron job, or an at job. Another tool has been written to take the results of this program and modify the actual templates so as to remove them from the portion of the ROADS server's database which is visible to the end user.

OPTIONS

OUTPUT

List of filenames for the templates which have been persistently unreachable.

SEE ALSO

the manual page for admin-cgi/dodgy.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.11 bin/dup_urls.pl - check for duplicate URLs in a collection of IAFA templates

NAME

bin/dup_urls.pl - check for duplicate URLs in a collection of IAFA templates

SYNOPSIS

bin/dup_urls.pl [-ad] [-s sourcedir] [file1 file2 ... fileN]

DESCRIPTION

This program looks for duplicate URLs in IAFA templates, such as may be found on a ROADS server.

dup_urls.pl produces a report listing any duplicate URLs it comes across, and the handle names of the templates in which they are found.
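
A report of this shape can be built by inverting a handle-to-URL mapping. The Python sketch below is an illustration, not the dup_urls.pl algorithm; it assumes the URLs have already been extracted from each template, and it treats a URL as duplicated only when it appears under more than one handle.

```python
from collections import defaultdict

def duplicate_urls(urls_by_handle):
    """Map each URL found under more than one handle to those handles."""
    handles_for = defaultdict(set)
    for handle, urls in urls_by_handle.items():
        for url in urls:
            handles_for[url].add(handle)
    return {
        url: sorted(handles)
        for url, handles in handles_for.items()
        if len(handles) > 1
    }
```

Comparison here is by exact string, so two URLs differing only in, say, a trailing slash are not reported as duplicates.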

OPTIONS

dup_urls.pl takes the following arguments:

SEE ALSO

the manual page for admin-cgi/dup_urls.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.12 bin/exterminate.pl - remove templates with persistently unreachable URLs

NAME

bin/exterminate.pl - remove templates with persistently unreachable URLs

SYNOPSIS

 bin/exterminate.pl

DESCRIPTION

This Perl program runs another tool in order to discover which templates have been persistently unreachable. Each of the resulting templates is modified so that any existing Status attribute is stripped out, and a new one introduced:

Status: stale

Finally, the ROADS server resource description database is reindexed.

The program is intended for invocation from a World-Wide Web CGI program, a cron job, or an at job.
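
The template rewrite described above (strip any existing Status attribute, then add Status: stale) can be sketched as follows in Python. This is a simplified illustration: it assumes one template per string, single-line Status values, a case-insensitive attribute name and that the new attribute is appended at the end, none of which is confirmed by this page.

```python
def mark_stale(template_text):
    """Drop any existing Status line and append 'Status: stale'."""
    kept = [
        line
        for line in template_text.splitlines()
        if not line.lower().startswith("status:")
    ]
    kept.append("Status: stale")
    return "\n".join(kept) + "\n"
```

Applying the function twice gives the same result as applying it once, so a template that is already marked stale is left with a single Status line.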

OPTIONS

None.

BUGS

It is assumed that there is only one template per file.

SEE ALSO

the manual page for admin-cgi/exterminate.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.13 bin/freq.pl - term frequency counter for IAFA style templates

NAME

bin/freq.pl - term frequency counter for IAFA style templates

SYNOPSIS

freq.pl [-ad] [-f maxhits] [-m min-count] [-s sourcedir]
  [-t tmpdir] [-A attrib1|attrib2|...|attribN]

DESCRIPTION

This Perl program will look at all the IAFA style templates in a given directory, and count the number of times each term found in the templates occurs. This has a number of uses - notably in determining an appropriate stop-list of words which should not be indexed, and in helping the user to devise an effective query.

Frequently appearing terms such as "a" and "the" will likely cause large numbers of spurious hits when people search your database. To reduce the likelihood of this, we have added a "stoplist" feature to the ROADS search back end. This lets you arrange for certain search terms to be automatically removed, and we ship a sample stop list with the ROADS distribution.

The default behaviour is to sort the frequency count into order, and return the top fifty terms. This can be overridden by a set of command-line options.
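
The default behaviour can be pictured with the short Python sketch below. It is not the freq.pl implementation: the way terms are split out of attribute values (runs of letters and digits, case preserved) is an assumption, and term_frequencies and max_hits are hypothetical names.

```python
import re
from collections import Counter

def term_frequencies(templates, max_hits=50):
    """Return the max_hits most frequent terms as (term, count) pairs."""
    counts = Counter()
    for text in templates:
        for line in text.splitlines():
            # Count only the value part of each "Attribute: value" line.
            value = line.split(":", 1)[-1]
            counts.update(re.findall(r"[A-Za-z0-9]+", value))
    return counts.most_common(max_hits)
```

Terms that dominate the top of such a list are natural candidates for a stop list.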

OPTIONS

OUTPUT FORMAT

The output of freq.pl consists of the frequency count for a term, followed by a single space character, followed by the term itself, e.g.

310 research
283 mailing
270 available
268 University
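
Output in this format is easy to post-process. As a minimal Python sketch (the function name parse_freq_output is hypothetical), each line is split at the first space into a count and a term:

```python
def parse_freq_output(text):
    """Turn 'count term' lines into a list of (count, term) pairs."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        count, term = line.split(" ", 1)
        pairs.append((int(count), term))
    return pairs

sample = """\
310 research
283 mailing
270 available
268 University
"""
print(parse_freq_output(sample)[0])  # (310, 'research')
```

Splitting only at the first space keeps the term intact even if it were to contain a space itself.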

DEPENDENCIES

An external program called "sort" is used to sort the frequency count into descending order. This is a standard feature of most (all?) implementations of Unix, but the command line options it takes may differ from version to version. Let us know if you find a version which does not understand -r, -n or -T!

TODO

Nothing ? :-)

SEE ALSO

the manual page for admin-cgi/freq.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.14 bin/harvest_centroid.pl - extract centroid from SOIF or Harvest Broker/Gatherer

NAME

bin/harvest_centroid.pl - extract centroid from SOIF or Harvest Broker/Gatherer

SYNOPSIS

bin/harvest_centroid.pl [-d] [-h host] [-p port] [-s serverhandle]

DESCRIPTION

This program tries to extract a WHOIS++ compatible centroid from a Harvest Gatherer, a Harvest Broker, or a collection of SOIF templates.

If invoked with a host name or IP address to contact, this program will try to establish whether it is talking to a Harvest Gatherer or Broker, and send the appropriate command to fetch a dump of the entire contents of the Gatherer or Broker's database.

With no -h argument, this program will expect to receive a collection of SOIF templates on STDIN, such as you could get by

gzip -dc /usr/local/harvest/gatherers/*/All-Templates.gz

or

gdbmutil dump /usr/local/harvest/gatherers/*/PRODUCTION.gdbm

Note that when generating a centroid from a flat file collection of SOIF templates, the -s argument should be used to specify a serverhandle for the resulting centroid.

OPTIONS

BUGS

We should let people specify the starting time for the poll, and pass this on to the Broker/Gatherer, so that it's possible to do a relative "poll" of the Harvest server.

We don't do anything special about character sets/encodings.

Not up to date with current CIP specifications - this is really intended for use with a WHOIS++ server which speaks the old RFC 1913 indexing protocol.

Should be integrated with wpp_shim.pl, so that WHOIS++ servers which cannot load a centroid from a flat file can think they're polling a WHOIS++ server - when in fact the shim would simply be returning a centroid which had been calculated already.

SEE ALSO

the manual page for bin/harvest_shim.pl, RFC 1913

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.15 bin/harvest_shim.pl - search gateway between WHOIS++ and Harvest Broker

NAME

bin/harvest_shim.pl - search gateway between WHOIS++ and Harvest Broker

SYNOPSIS

bin/harvest_shim.pl [-h host] [-p port]

DESCRIPTION

This program relays WHOIS++ search requests to a Harvest Broker, and returns the results in WHOIS++ result format.

Before passing the WHOIS++ query on to the Harvest Broker, it is munged to remove WHOIS++ search syntax which would confuse the Broker. The search results, if any, are massaged into WHOIS++ templates using the template type FILE.

OPTIONS

BUGS

Should be rewritten to allow for stand-alone operation.

SEE ALSO

the manual page for bin/harvest_centroid.pl, RFC 1913

COPYRIGHT

Copyright (c) 1988, Peter Valkenburg <valkenburg@terena.nl>, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Peter Valkenburg <valkenburg@terena.nl>, Martin Hamilton <martinh@gnu.org>, Jon Knight <jon@net.lut.ac.uk>.

3.16 bin/iafa2digger.pl - convert IAFA templates to Digger v2 input format

NAME

bin/iafa2digger.pl - convert IAFA templates to Digger v2 input format

SYNOPSIS

bin/iafa2digger.pl [-ad] [-o outlinedir] [-s sourcedir]
  [file1 file2 ... fileN]

DESCRIPTION

This Perl program converts IAFA templates such as those generated by the ROADS template editor into the format accepted by version 2 of Bunyip's "Digger" WHOIS++ server. This is necessary because Digger takes its input in the WHOIS++ on-the-wire format, which is slightly different to the IAFA templates used internally within the ROADS software.

OPTIONS

OUTPUT

A single file containing the Digger formatted templates.

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.17 bin/iafa_lint.pl - perform sanity check on a collection of IAFA templates

NAME

bin/iafa_lint.pl - perform sanity check on a collection of IAFA templates

SYNOPSIS

bin/iafa_lint.pl [-ad] [-o outlinedir] [-s sourcedir]
  [file1 file2 ... fileN]

DESCRIPTION

This program performs some basic checks on the contents of a collection of IAFA templates, such as may be found on a ROADS server.

The contents of each template are checked against an outline version of that template type. Outline templates are used within the ROADS software to indicate the fields which a template may contain, and provide some of the configuration information used by the WWW based template editor.

iafa_lint.pl produces a report listing any problems which it finds with the IAFA templates it processes. The following checks are performed:

OPTIONS

iafa_lint.pl takes the following arguments:

OUTLINE FILE FORMAT

You may need to either modify an existing outline file or create a new one, depending on whether you have invented a new template type or changed the attributes in an existing one. A set of default template outlines is distributed with the ROADS software, and can be found in the directory "$ROADS::Config" on your installation.

It is necessary to have outline files for each template type which you will be checking using iafa_lint.pl.

Each outline file must feature the Template-Type and Handle attributes. Attributes which only occur once should be written as they appear in the template, e.g. Title. Attributes which may occur multiple times should be written as variants, e.g. URI-v*. Finally, it is possible to refer to clusters of attributes drawn from another type of template by writing its name in brackets after a disambiguating prefix, e.g. Admin-(USER*).

A sample outline specification for a very short SERVICE template would look like this:

Template-Type: SERVICE
Handle:
Title:
URI-v*:
Admin-(USER*):

Note that other information may appear after the ":" character. This is not used by iafa_lint.pl.
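
The three naming conventions described above can be told apart mechanically. A minimal sketch, assuming one attribute per line with the attribute name before the first ":"; classify_outline is a hypothetical helper and not something iafa_lint.pl provides:

```shell
# Label each outline attribute as single, variant or cluster, based on the
# naming conventions described above. Anything after the ":" is ignored.
classify_outline() {
    awk -F: 'NF >= 2 {
        name = $1
        if (name ~ /\(.*\*\)$/)  kind = "cluster"   # e.g. Admin-(USER*)
        else if (name ~ /-v\*$/) kind = "variant"   # e.g. URI-v*
        else                     kind = "single"    # e.g. Title
        print kind, name
    }'
}

# Classify the sample SERVICE outline shown above.
printf '%s\n' 'Template-Type: SERVICE' 'Handle:' 'Title:' 'URI-v*:' 'Admin-(USER*):' |
    classify_outline
```

For the sample SERVICE outline this reports Template-Type, Handle and Title as single, URI-v* as variant, and Admin-(USER*) as cluster.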

SEE ALSO

admin-cgi/iafa_lint.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.18 bin/info.pl - display information about the ROADS server installation

NAME

bin/info.pl - display information about the ROADS server installation

SYNOPSIS

bin/info.pl

DESCRIPTION

This Perl program scans the ROADS installation for the following information:

  1. Settings made during the software installation/configuration.
  2. Operating system and hardware architecture information for the computer the ROADS software is running on.

It is intended for invocation from a World-Wide Web CGI program, a cron job, or an at job. Another of the ROADS tools provides an automated server registration feature using this information, and of course the ROADS server maintainer is free to configure their server so as to allow access to this information as they see fit.

OPTIONS

None.

OUTPUT

The contents of ROADS.pm from the ROADS library directory, and the result of doing a uname -a.

SEE ALSO

the manual page for admin-cgi/info.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.19 bin/lc.pl - Perl based HTML/IAFA link checker

NAME

bin/lc.pl - Perl based HTML/IAFA link checker

SYNOPSIS

bin/lc.pl [-acdilPsvux] [-b base_url] [-g guts_dir]
  [-p proxyurl] [-r seconds] [-t templatedir]
  [-w when_changed] [file1 file2 ... fileN]

DESCRIPTION

This program will take a set of URLs on their own, in a set of IAFA templates, or in HTML documents and attempt to check their accessibility. It can be passed a list of file names to examine on the command line or via standard input, e.g.

find . -print | lc.pl -i

or

lc.pl -v *.html > logfile

Normal behaviour is to ignore directories, files whose names begin with a dot ".", and files which do not appear to contain HTML - based on their suffix. This last restriction can be removed with a command line option which tells the program to assume the files are all IAFA templates.

Currently the only URL schemes which can be checked with lc.pl are "http:", "gopher:", "ftp:" and "wais:". A future version may try to check other URL schemes.

lc.pl will not follow links in HTML documents recursively!

PROXIES AND CACHING

It is recommended that a World-Wide Web cache server be used as a go-between in the link checking process. This can be enabled via environment variables, e.g. in the style of csh and tcsh:

setenv http_proxy "http://wwwcache.lut.ac.uk:3128/"
setenv gopher_proxy "http://wwwcache.lut.ac.uk:3128/"
setenv ftp_proxy "http://wwwcache.lut.ac.uk:3128/"
setenv wais_proxy "http://wwwcache.lut.ac.uk:8001/"
setenv no_proxy "lut.ac.uk"

Or in the sh/bash/ksh/zsh style:

http_proxy="http://wwwcache.lut.ac.uk:3128/"
gopher_proxy="http://wwwcache.lut.ac.uk:3128/"
ftp_proxy="http://wwwcache.lut.ac.uk:3128/"
wais_proxy="http://wwwcache.lut.ac.uk:8001/"
no_proxy="lut.ac.uk"
export http_proxy gopher_proxy ftp_proxy wais_proxy no_proxy

The -p and -P options may also be used to affect proxying and hence caching behaviour. Note that if you use -p to specify a single proxy server for all your requests, this must be capable of handling any "wais:" URLs that may be passed to it. You can run lc.pl with the -l option to check for these in advance of doing the actual link check.

In addition to cache support via the proxy HTTP mechanism, URLs which have already been visited during a link checking session will not be requested again in the same session, and the HTTP "HEAD" method is used whenever an "http" URL is requested. The time to sleep between requests is configurable, defaulting to two seconds.
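
The "visit each URL once, then pause" behaviour can be pictured as a filter followed by a loop. This is an illustration only and not lc.pl's own code; unvisited_only and check_each are hypothetical names, and the checker command is a placeholder for whatever actually makes the request:

```shell
# Drop URLs that have already been seen, keeping the first occurrence of
# each in its original order.
unvisited_only() {
    awk '!seen[$0]++'
}

# Run the command named in $1 once per distinct URL read from standard
# input, sleeping $2 seconds (default 2) after each one.
check_each() {
    unvisited_only | while IFS= read -r url; do
        "$1" "$url"
        sleep "${2:-2}"
    done
}

# "echo" stands in for a real checker; the repeated URL is only visited once.
printf '%s\n' 'http://a.example/' 'http://b.example/' 'http://a.example/' |
    check_each echo 0
```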

OPTIONS

OUTPUT FORMAT

The basic format for lc.pl output is

<HTTP response code> <name of file containing URL> <URL>

e.g.

404 SOSIG347 http://www.iss.u-tokyo.ac.jp/center/SSJ.html

Libwww-perl automatically translates the result codes of requests in protocols other than HTTP into their HTTP equivalents. If you use the -v option to get the results of successful requests too, the successful requests will be stamped with a 200 response code, e.g.

200 SOSIG345 http://www.ssd.gu.se/enghome.html

The output generated by the -u and -l options takes the form

<name of file containing URL> <URL>

e.g.

SOSIG345 http://www.ssd.gu.se/enghome.html
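
Because the three-field format is line oriented, it is easy to summarise with standard tools. A minimal sketch, assuming whitespace-separated fields as in the examples above; count_by_code is a hypothetical helper, and two-field lines from -u or -l are skipped:

```shell
# Count how many "<code> <file> <URL>" lines were logged for each response
# code, and print "code count" pairs in ascending order of code.
count_by_code() {
    awk 'NF >= 3 { n[$1]++ } END { for (code in n) print code, n[code] }' |
        sort -n
}

# Prints "200 1" then "404 1" for the two sample lines shown above.
printf '%s\n' \
    '404 SOSIG347 http://www.iss.u-tokyo.ac.jp/center/SSJ.html' \
    '200 SOSIG345 http://www.ssd.gu.se/enghome.html' |
    count_by_code
```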

DEPENDENCIES

The libwww-perl package is used to parse HTML documents, and to check the links themselves. At the time of writing, libwww-perl version 5 and Perl version 5.003 or above are recommended.

TODO

Add support for other protocol schemes ? "finger:" should be easily done via proxy HTTP, but the cache servers don't speak this protocol scheme yet (and neither do many WWW authors?) "mailto:" and "mailserver:" could be done up to a point with code which checked for valid domain names, MX records and so on. An SMTP session to the remote server would be do-able, but then we wouldn't be able to take advantage of the current caching infrastructure... "telnet:" is another case in point. We could check the machine had a working DNS entry, and perhaps try to ping it, or even connect to the listed port. How far to take this is a matter for debate!

SEE ALSO

the manual page for admin-cgi/lc.pl, the manual page for bin/report.pl, the manual page for admin-cgi/report.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.20 bin/lc2sd.pl - convert from Library-Catalog to Subject-Descriptor

NAME

bin/lc2sd.pl - convert from Library-Catalog to Subject-Descriptor

SYNOPSIS

bin/lc2sd.pl [-h] [-s directory] [-u name] 

SUMMARY

The lc2sd.pl program is intended to change any Library-Catalog fields in a set of templates into Subject-Descriptor fields. Older versions of the ROADS software (prior to v0.2.0) generated Library-Catalog fields, which also appeared in several old versions of the Internet Draft describing IAFA templates. This program converts these templates into a format compatible with the latest IAFA Internet Draft.

OPTIONS

A number of options are available for the lc2sd.pl program:

SEE ALSO

the manual page for bin/addsl.pl, the manual page for bin/cullsl.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Jon Knight <jon@net.lut.ac.uk>

3.21 bin/mail_owners.pl - send mail to people whose links have gone stale

NAME

bin/mail_owners.pl - send mail to people whose links have gone stale

SYNOPSIS

bin/mail_owners.pl [-d] [-m mailtemplate] [-t ownertable]

DESCRIPTION

This Perl program takes the results of the link checking tool and uses either a prepared table of maintainers for the various parts of the filesystem or stat to find out who is responsible for bad URLs.

The collected list of failed URLs is then mailed to each of these maintainers, if and only if there are bad URLs on their pages. Hopefully, these users will then take the appropriate action... :)

It is suitable for invocation from a World-Wide Web CGI program, a cron job, or an at job.

OPTIONS

INPUT FORMAT

Link checker summary report in the format

<HTTP-RC> <file> <URL>

e.g.

200 /home/roads/source/SOSIG106 gopher://nisp.ncl.ac.uk:70/

Where HTTP-RC is the HTTP (or equivalent) response code for the request. Non-HTTP response codes will have been translated into HTTP style response codes before the link checker report is dumped out.
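
Grouping this input by file is the first step towards one message per maintainer. A minimal sketch, assuming that any response code other than 200 counts as a failure (this manual page does not say which codes are treated as bad); urls_per_file is a hypothetical helper and the file names below are made up:

```shell
# Collect the failing URLs for each file, printing one "file: url url ..."
# line per file in the order the files first appear.
urls_per_file() {
    awk 'NF >= 3 && $1 != 200 {
             if (!($2 in urls)) order[++n] = $2
             urls[$2] = urls[$2] " " $3
         }
         END { for (i = 1; i <= n; i++) print order[i] ":" urls[order[i]] }'
}

# The 200 line is ignored; the two failures in page1 end up on one line.
printf '%s\n' \
    '404 /home/www/page1 http://one.example/' \
    '200 /home/www/page1 http://fine.example/' \
    '500 /home/www/page2 http://two.example/' \
    '404 /home/www/page1 http://three.example/' |
    urls_per_file
```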

OUTPUT FORMAT

Warning messages to information providers.

BUGS

This is really geared up to WWW server maintainers, rather than ROADS server maintainers. It should have a way of extracting the contact address from the templates if desired.

SEE ALSO

the manual page for bin/lc.pl

COPYRIGHT

Copyright (c) 1988, Mattias Borrell <mattias@munin.ub2.lu.se>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by Lund University NetLab, as part of the DESIRE project. DESIRE is funded under the European Commission Telematics for Research Programme.

AUTHOR

Mattias Borrell <mattias@munin.ub2.lu.se>

3.22 bin/makethes.pl - create thesaurus file or another DB(M) database

NAME

bin/makethes.pl - create thesaurus file or another DB(M) database

SYNOPSIS

bin/makethes.pl [-d] [-f filename]

DESCRIPTION

This program will create a DB(M) database based on a series of whitespace separated attribute/value pairs in a line delimited text file.

OPTIONS

FILES

config/Thesaurus* - default DB(M) database and input files

SEE ALSO

the manual page for admin-cgi/mktemp.pl, the manual page for admin-cgi/dumpdbm.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.23 bin/mkinv.pl - build ROADS database index

NAME

bin/mkinv.pl - build ROADS database index

SYNOPSIS

bin/mkinv.pl [-adhu] [-i directory] [-m minsize]
  [-s directory] [-t directory] [-x stoplist]
  [-y stopattr] [-z alltemps] [handle1 handle2 ... handleN]

DESCRIPTION

The mkinv.pl program generates an index of IAFA templates which can be searched using the search.pl and admin.pl CGI programs. The index is used by these programs to rapidly match keywords and boolean expressions in a large number of IAFA templates.

OPTIONS

A number of options are available to the mkinv.pl program to control where it looks for its files:

If the -a option is not used, the mkinv.pl script expects one or more filenames containing IAFA templates to be given. These files are then processed, and all the templates in them are indexed.

FILES

config/stopattr - default list of attributes to exclude from the index.

config/stoplist - default list of terms to exclude from the index.

guts/index* - index files themselves.

guts/alltemps - list of template handle to filename mappings.

source - the source templates themselves.

SEE ALSO

the manual page for admin-cgi/admin.pl, the manual page for bin/deindex.pl, the manual page for admin-cgi/deindex.pl, the manual page for cgi-bin/search.pl, the manual page for admin-cgi/mktemp.pl

BUGS

The indexer will only correctly index IAFA templates that have a Template-Type attribute first and a Handle attribute second. All other attributes can be in any order. All templates generated by the ROADS software are in this format but the actual IAFA Internet Draft is not as strict. If you are processing templates derived from outside the ROADS system, make sure that these conditions hold before attempting to index them with mkinv.pl.
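
Templates from outside ROADS can be screened for this ordering before they are indexed. A minimal sketch, assuming one template per input with no leading blank lines and attribute names spelled exactly as shown; check_attr_order is a hypothetical helper and EXAMPLE1 is a made-up handle:

```shell
# Succeed (exit status 0) only if the first line is a Template-Type
# attribute and the second line is a Handle attribute.
check_attr_order() {
    awk 'NR == 1 && $0 !~ /^Template-Type:/ { bad = 1 }
         NR == 2 && $0 !~ /^Handle:/        { bad = 1 }
         END { if (bad || NR < 2) exit 1 }'
}

# Prints "order looks fine" because the sample template passes the check.
printf '%s\n' 'Template-Type: SERVICE' 'Handle: EXAMPLE1' 'Title: A title' |
    check_attr_order && echo "order looks fine"
```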

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHORS

Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>

3.24 bin/rebuild.pl - rebuild ROADS index, subject/what's new listings

NAME

bin/rebuild.pl - rebuild ROADS index, subject/what's new listings

SYNOPSIS

bin/rebuild.pl [-adp] [-s source_dir] [-t index_dir]
  [-S subject_listing_view] [-W whats_new_view]
  [handle1 handle2 ... handleN]

DESCRIPTION

To carry out the indexing and to update the subject listings and "What's New" files, the bin/rebuild.pl program must have access to the bin/deindex.pl, bin/mkinv.pl, bin/addsl.pl and bin/addwn.pl scripts.

OPTIONS

FILES

config/subject-listing - subject listing views.

config/whats-new - "What's New" views.

guts/pending - holding area for templates created using the offline mode in the template editor.

guts/index* - index files.

source - templates themselves.

SEE ALSO

the manual page for bin/addsl.pl, the manual page for bin/addwn.pl, the manual page for bin/deindex.pl, the manual page for bin/mkinv.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>

3.25 bin/report.pl - generate report based on link check results

NAME

bin/report.pl - generate report based on link check results

SYNOPSIS

bin/report.pl [-h] [-l logname] [-s sortpath]

DESCRIPTION

This Perl program generates a human digestible summary report of the errors which arose in the specified link checking run, i.e. those requests for which the response was not HTTP 200 or equivalent.

The often cryptic response codes are translated into plain English using the libwww-perl package, and the report is broken into sections, each of which deals with the occurrences of a particular problem.

OPTIONS

OUTPUT

List of link checker problems.

FILES

logs/lc - log file created by link checker run.

DEPENDENCIES

The Unix sort program is used, as is the libwww-perl-5 package. The latter is also a dependency for the link checker itself.

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.26 bin/review.pl - generate template review info breakdown

NAME

bin/review.pl - generate template review info breakdown

SYNOPSIS

bin/review.pl [-dnr] [-a attribute] [-o owner]
  [-s sourcedir] [-v view]

DESCRIPTION

This Perl program checks resource descriptions to see whether they have passed their review date. It is intended for invocation from a World-Wide Web CGI program, a cron job, or an at job.

The report which this tool generates can be customized via a view file, which specifies the attributes which should appear in the listings of templates which are due for review.

USAGE

The review.pl tool lets you automatically search your database for templates which are due to be checked. This works by scanning the To-Be-Reviewed-Date attribute in each template, if present. It has the limitation that it only understands the following two ways of writing the date and time:

Fri Aug  1 23:00:00 1997
Tue, 23 May 98 13:51:41 GMT

To deal with the "year 2000" problem, years which are only two digits will automatically have 1900 added to them. We've tried to make the ROADS software immune to year 2000 bugs - please let us know if you spot any problems in this area so that we can fix them.
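
The year handling described above can be sketched as follows. This is an illustration of the stated rule rather than review.pl's own code; review_year is a hypothetical helper, and it assumes the year is the last field in the first date style and the fourth field in the second:

```shell
# Print the four-digit year for each date line on standard input. A first
# field ending in "," marks the "Tue, 23 May 98 ..." style; otherwise the
# year is taken from the end of the line. Two-digit years get 1900 added.
review_year() {
    awk '{
        y = ($1 ~ /,$/) ? $4 : $NF
        y += 0
        if (y < 100) y += 1900
        print y
    }'
}

# Prints 1997 and then 1998 for the two sample dates shown above.
printf '%s\n' 'Fri Aug  1 23:00:00 1997' 'Tue, 23 May 98 13:51:41 GMT' |
    review_year
```

Under this rule a two-digit year such as 05 comes out as 1905, so four-digit years are the safer choice for review dates after 1999.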

OPTIONS

OUTPUT

Summary report on templates which are due for review.

FILES

config/review-views - alternative sets of attributes to return in review.pl reports.

SEE ALSO

the manual page for admin-cgi/review.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.27 bin/simplethes.pl - simple sample thesaurus plug-in

NAME

bin/simplethes.pl - simple sample thesaurus plug-in

SYNOPSIS

bin/simplethes.pl [-d] [-f filename]

DESCRIPTION

This is a simple example program which is intended to illustrate the possibilities for using Perl and DB(M) databases to perform query expansion. The query to be expanded is passed as an environment variable QUERY_STRING, as per the CGI specification.
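
The calling convention can be sketched without a DB(M) database. The following is an illustration only: expand_query is a hypothetical helper, and the two-column "term expansion" text layout is an assumption made for this sketch rather than the format simplethes.pl actually reads:

```shell
# Look up the contents of QUERY_STRING in a whitespace-separated
# "term expansion" list read from standard input. Print the expansion for
# the first matching term, or the query itself if no line matches.
expand_query() {
    awk -v q="$QUERY_STRING" '
        $1 == q { print $2; found = 1; exit }
        END     { if (!found) print q }'
}

# A CGI-style caller would set QUERY_STRING before invoking the plug-in.
QUERY_STRING=color
printf '%s\n' 'color colour' 'center centre' | expand_query
```

With the sample list above this prints "colour"; a query which is not in the list is passed through unchanged.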

OPTIONS

SEE ALSO

the manual page for bin/wppd.pl, the manual page for admin-cgi/dumpdbm.pl, the manual page for bin/makethes.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Jon Knight <jon@net.lut.ac.uk>

3.28 bin/snarf.pl - do a WHOIS++ search and snarf the resulting handles

NAME

bin/snarf.pl - do a WHOIS++ search and snarf the resulting handles

SYNOPSIS

bin/snarf.pl [-dfl] [-p port] server query

DESCRIPTION

The snarf.pl program performs a WHOIS++ search on the specified server and returns a list of the matching handles on a line by line basis. Note that the search must be structured as per the WHOIS++ query syntax defined in RFC 1835, the WHOIS++ protocol specification.

If the search was performed successfully, snarf.pl returns 0, otherwise it returns -1.

OPTIONS

SEE ALSO

the manual page for cgi-bin/search.pl, the manual page for bin/wppd.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>, Jon Knight <jon@net.lut.ac.uk>

3.29 bin/templateadmin.pl - template editor ACL manager

NAME

bin/templateadmin.pl - template editor ACL manager

SYNOPSIS

bin/templateadmin.pl [-h handle] [-o operation] [-u user]

DESCRIPTION

This program provides a mechanism for adding users to and removing users from the access control lists used by the ROADS template editor. The access control lists (if present) control which users are allowed to update the nominated templates.

OPTIONS

FILES

config/template_users - DB(M) database of template ACLs.

SEE ALSO

the manual page for admin-cgi/tempuserauth.pl, the manual page for admin-cgi/mktemp.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Jon Knight <jon@net.lut.ac.uk>

3.30 bin/wig.pl - gather indexes (centroids)

NAME

bin/wig.pl - gather indexes (centroids)

SYNOPSIS

bin/wig.pl [-d] spec_file

DESCRIPTION

The wig.pl program is used to gather WHOIS++ index and Common Indexing Protocol (CIP) centroids from remote servers. It is intended to be run either from the command line or, more likely, from cron periodically. It implements the protocol described in RFC 1913, and the client side of the Common Indexing Protocol. Please note that at the time of writing, CIP was still under development by the IETF's FIND working group. Please let us know if you find any interoperability problems.

The upshot is that wig.pl lets you configure your ROADS WHOIS++ server to grab the database indexes from other people's WHOIS++ and CIP aware servers, e.g. CNIDR's Iknow and Bunyip's Digger. When a search performed on your server matches information in one or more of these indexes, the client will be returned a "referral" to the relevant server or servers. The ROADS WWW based WHOIS++ client, search.pl, will automatically follow these referrals and search the indexed WHOIS++ servers in addition to your own.

OPTIONS

FILES

config/wig/* - index gatherer specification files

guts/wig/* - per-server centroids

Note that the config file name in config/wig should be the same as the indexed server's WHOIS++ server handle. This is the "Serverhandle" parameter in lib/ROADS.pm. Each server you index must have a unique server handle.

FILE FORMATS

EXAMPLE

To cross search the WHOIS++ server running on sosig.ac.uk, the Social Science Information Gateway at the University of Bristol, you would create the file config/wig/sosigacuk01. As a bare minimum, this file would need to contain the host name of the server to contact, but in practice you will probably want to include the following:

Host-Name: sosig.ac.uk
Host-Port: 8237
Description: Muppet Gateway; lets put on makeup and light up lights.
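
A specification file like this is a list of "Attribute: value" lines, so individual settings are easy to pull out when checking a configuration by hand. A minimal sketch, not taken from wig.pl; spec_value is a hypothetical helper which splits each line at the first ":" only:

```shell
# Print the value of the attribute named in $1 from a specification file
# supplied on standard input. Only the first matching line is used.
spec_value() {
    awk -v want="$1" '{
        i = index($0, ":")
        if (i > 0 && substr($0, 1, i - 1) == want) {
            value = substr($0, i + 1)
            sub(/^[ \t]+/, "", value)   # drop the space after the colon
            print value
            exit
        }
    }'
}

# Prints 8237 for the sample settings shown above.
printf '%s\n' 'Host-Name: sosig.ac.uk' 'Host-Port: 8237' | spec_value Host-Port
```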

It's typically necessary for you to contact the remote server's administrator at this stage, because most WHOIS++ implementations will only let you index a server if you've been given permission to by its administrator. The ROADS WHOIS++ server uses an access control list based on the file config/hostsallow, and comes with some default settings which let the ROADS developers index your server by default. To add a new machine, we recommend that you put both its domain name and IP address into config/hostsallow, e.g.

bork.swedish-chef.org: poll
198.168.254.252: poll
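
Before asking a remote administrator to debug a refused poll, it can help to confirm that the expected entry is really present. This sketch only does an exact match on the name or address before the ":" and looks for the word "poll" after it; any pattern matching the real server may perform is not modelled, and may_poll is a hypothetical helper:

```shell
# Succeed (exit status 0) if the host named in $1 has a line granting
# "poll" in the hostsallow-style entries read from standard input.
may_poll() {
    awk -v host="$1" '{
        i = index($0, ":")
        if (i > 0 && substr($0, 1, i - 1) == host &&
            substr($0, i + 1) ~ /poll/) ok = 1
    }
    END { if (!ok) exit 1 }'
}

# Prints "polling allowed" for the sample entries shown above.
printf '%s\n' 'bork.swedish-chef.org: poll' '198.168.254.252: poll' |
    may_poll 198.168.254.252 && echo "polling allowed"
```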

Once this has been done, the ROADS WHOIS++ server will automatically allow the machine doing the indexing to "poll" it for centroids. Now all you need to do at the local end is run wig.pl, e.g.

bin/wig.pl sosigacuk01

If the index is successful, subsequent searches of your server will result in the centroid from SOSIG also being searched, and referrals being returned for any matches in this.

SEE ALSO

the manual page for bin/wppd.pl

BUGS

If you want to set up an index server which has no local data of its own, you'll still need to build the main ROADS index, e.g. with bin/mkinv.pl. It's debatable whether this is a bug or a feature!

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHORS

Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>

3.31 bin/wppd.pl - LUT WHOIS++ server

NAME

bin/wppd.pl - LUT WHOIS++ server

SYNOPSIS

 bin/wppd.pl [-cCdDiLR] [-a admin-maxhits] [-e expansions]
   [-f maxfull] [-g wgipath] [-h serverhandle]
   [-l logfile] [-m maxhits] [-p portnumber]
   [-r restrictionsfile] [-s sourcedir]
   [-S stoplistfile] [-T thesaurus_prog] [-t indexdir]

DESCRIPTION

This is a WHOIS++ server (see RFC 1835) which can be used to make the contents of the ROADS server's database available for searching over the Internet using the WHOIS++ protocol.

OPTIONS

FILES

config/admin-restrict - search restrictions for admin users.

config/adminpasswd - password(s) for admin users in /etc/passwd format.

config/expansions - list of simple query expansions, e.g. 'color' to 'colour'.

config/hostsallow - TCP wrapper format list of client domain names and IP addresses, and allowed operations.

config/outlines - template outline definitions (schemas).

config/search-restrict - search restrictions for end users.

guts/alltemps - list of template handle to filename mappings.

guts/index* - database index used in searching.

guts/wppd.pid - WHOIS++ server process ID.

source - the actual templates themselves.

SEE ALSO

the manual page for bin/wppdc.pl, the manual page for admin-cgi/wppdc.pl, the manual page for bin/snarf.pl, the manual page for cgi-bin/search.pl, the manual page for admin-cgi/admin.pl, ...

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHORS

Martin Hamilton <martinh@gnu.org>, Jon Knight <jon@net.lut.ac.uk>, with apologies to Tom Christiansen, and Larry Wall :-)

3.32 bin/wppdc.pl - control wppd.pl WHOIS++ server

NAME

bin/wppdc.pl - control wppd.pl WHOIS++ server

SYNOPSIS

bin/wppdc.pl [coldstart|status|restart|start|stop|safetyfirst]

DESCRIPTION

This program lets you drive your LUT WHOIS++ server by remote control, making it possible to have it automatically restarted, shut down and so on from things like cron jobs and WWW CGI programs.
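For use from cron, it can help to wrap the control program so that each run leaves a trace in a log file. The following is a sketch under assumptions: run_wppdc is a hypothetical helper, and it assumes that wppdc.pl should be started from the top of the ROADS installation, as the relative paths used elsewhere in these pages suggest.

```shell
# Hypothetical helper: run one wppdc.pl operation and log the outcome.
run_wppdc() {
    # $1 = ROADS installation directory
    # $2 = operation, e.g. status or restart (see SYNOPSIS)
    # $3 = log file to append to
    (
        cd "$1" || exit 1
        echo "$(date): wppdc.pl $2"
        bin/wppdc.pl "$2"
    ) >> "$3" 2>&1
}

# Example (both paths are placeholders):
#   run_wppdc /usr/local/roads restart /var/log/roads-wppdc.log
```

A crontab entry would then call a small script containing this function and the example line.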

OPTIONS

There is only one option, which is the operation to be performed. This may be one of the following:

BUGS/CAVEATS

This program uses the ps command to find out what processes are running. The options which ps takes and the output it produces typically vary quite a bit between different versions of Unix. If you find that this program fails on your system, please get in touch so that we can fix it!
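If ps gives you trouble, you can check on the server by hand without it. The sketch below is not how wppdc.pl itself works. It relies on the guts/wppd.pid file listed in the wppd.pl manual page, and assumes that the first line of this file holds nothing but the process ID. wppd_running is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the process named in the pid file exists.
wppd_running() {
    # $1 = path to the pid file (guts/wppd.pid in a ROADS installation)
    [ -r "$1" ] || return 1
    pid=$(head -n 1 "$1" | tr -d '[:space:]')
    # Reject anything which is empty or not purely numeric
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 tests for the process without disturbing it. This also
    # fails if you lack permission to signal the process.
    kill -0 "$pid" 2>/dev/null
}

# Example, run from the top of the ROADS installation:
#   wppd_running guts/wppd.pid && echo "server is up"
```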

SEE ALSO

the manual page for bin/wppd.pl, the manual page for admin-cgi/wppdc.pl

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.33 bin/z3950_centroid.pl - extract centroid from NWI/EWI objects

NAME

bin/z3950_centroid.pl - extract centroid from NWI/EWI objects

SYNOPSIS

bin/z3950_centroid.pl [-d] [-h hashtemp1] [-H hashtemp2]
  [-s serverhandle] < filename

DESCRIPTION

This Perl program creates a WHOIS++ compatible centroid from the attributes and values in a collection of NWI/EWI index objects, as created by the Combine harvester. Note that you should give a server handle when invoking this program, or the default value of 'undefined' will be used.

The Combine harvester creates its database in a two level directory hierarchy, with a separate file for each indexed object. You can concatenate these files for feeding into this program using a simple find invocation:

find HDB/hdb -type f -exec cat {} \; | z3950_centroid.pl -s test01

Or perhaps something more complicated!
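One slightly more complicated variant is sketched below. It uses the "+" form of -exec, which hands many files to each cat process instead of starting one cat per object, and this can matter for a large database. cat_objects is a hypothetical helper.

```shell
# Hypothetical helper: concatenate every object file below a directory.
cat_objects() {
    # $1 = top of the object hierarchy (HDB/hdb in the example above)
    find "$1" -type f -exec cat {} +
}

# Example, keeping the centroid in a file (centroid.txt is a placeholder):
#   cat_objects HDB/hdb | z3950_centroid.pl -s test01 > centroid.txt
```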

OPTIONS

BUGS

We could traverse the filesystem and look at the timestamps on the index objects - this would let us do a relative centroid.

We don't do anything special about character sets/encodings.

Not up to date with current CIP specifications - this is really intended for use with a WHOIS++ server which speaks the old RFC 1913 indexing protocol.

SEE ALSO

the manual page for bin/harvest_centroid.pl, RFC 1913

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>

3.34 bin/z3950_shim.pl - search gateway between WHOIS++ and Z39.50 server

NAME

bin/z3950_shim.pl - search gateway between WHOIS++ and Z39.50 server

SYNOPSIS

bin/z3950_shim.pl [-d database] [-h host] [-p port]
  [-z path_to_zbatch]

DESCRIPTION

This program relays WHOIS++ search requests to a Z39.50 server and tries to munge the results back into WHOIS++ result format. It runs from the command line listening to STDIN and writing its results to STDOUT, and hence is suitable for launching via inetd.

Before the WHOIS++ query is passed on to the Z39.50 server, it is munged to remove WHOIS++ search syntax which would confuse that server. The search results, if any, are massaged into WHOIS++ templates using the template type GILS-NWI.
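Because the program reads from STDIN and writes to STDOUT, you can try it out from the shell before wiring it into inetd. The following is a sketch under assumptions: shim_query is a hypothetical helper, the host, port and database in the example are placeholders, and it assumes that a single query line terminated by CRLF is enough to get a response.

```shell
# Hypothetical helper: feed one query line to a command on its STDIN.
shim_query() {
    # $1 = query text; the remaining arguments are the command to run
    query=$1
    shift
    printf '%s\r\n' "$query" | "$@"
}

# Example, run from the top of the ROADS installation:
#   shim_query 'gateway' bin/z3950_shim.pl -h z3950.example.org -p 210 -d Default
```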

OPTIONS

BUGS

This program depends on the zbatch program from the CNIDR Isite distribution - see http://www.cnidr.org. It should be rewritten to include native Z39.50 support!

Z39.50 is a very complex protocol, and it's highly likely that you won't be able to use this tool to talk to an arbitrary Z39.50 server. Be prepared to get your hacking gloves out!

Should be rewritten to allow for operation as a stand-alone server.

SEE ALSO

the manual page for bin/z3950_centroid.pl, RFC 1913

COPYRIGHT

Copyright (c) 1988, Peter Valkenburg <valkenburg@terena.nl>, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHORS

Peter Valkenburg <valkenburg@terena.nl>, Martin Hamilton <martinh@gnu.org>, Jon Knight <jon@net.lut.ac.uk>

