ROADS::Auth - A class to check user authentication for admin tools
    use ROADS::Auth;
    CheckUserAuth("app_users"); # check user against app_users ACL
This class implements a simple access control list mechanism which piggybacks on top of the access controls provided by HTTP. It assumes that the user has been authenticated already by HTTP, and that the authenticated user name is available in the REMOTE_USER environment variable - usually set in the process of launching a CGI program.
Looks in the user registry registry_name, a DB(M) database keyed on user name, for a record matching the value of the REMOTE_USER environment variable. Exits with an error page if authentication fails.
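The lookup described above can be sketched as follows. This is a minimal illustration rather than the ROADS::Auth code itself: the helper name user_in_registry is hypothetical, SDBM_File is assumed as the DBM flavour (the real registries may use a different one), and the sketch returns a status to the caller rather than emitting an error page.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY O_RDWR O_CREAT);
use SDBM_File;

# Hypothetical helper (not part of ROADS::Auth): returns 1 if $user has a
# record in the DBM registry at $registry_path, and 0 otherwise.
# Assumption: the registry is an SDBM database keyed on user name.
sub user_in_registry {
    my ($registry_path, $user) = @_;
    return 0 unless defined $user && length $user;

    my %registry;
    tie(%registry, 'SDBM_File', $registry_path, O_RDONLY, 0644)
        or return 0;                       # unreadable registry: deny access
    my $found = exists $registry{$user} ? 1 : 0;
    untie %registry;
    return $found;
}
```

A CGI program might call it as user_in_registry("config/auth/app_users", $ENV{REMOTE_USER}) and produce its own failure page when the result is 0.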
config/multilingual/*/lib/authfail.html - message returned on an authentication failure
config/auth/* - DBM databases of per-program registry information.
The CheckUserAuth method should return a response code rather than bombing out if the user couldn't be authenticated.
This should really be a class to manipulate authentication objects, rather than just a checker.
the manual page for admin-cgi/mktemp.pl
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>
ROADS::CGIvars - A class to unpack and unescape CGI variables
    use ROADS::CGIvars;
    print unescape("http://www.net.lut.ac.uk/%7Emartin/"), "\n";
    cleavargs;
    print $CGIvar{templatetype}, "\n";
This class implements a method for unpacking the CGI parameters bundled with an HTTP request, and a method for turning hex escapes used for illegal characters back into the characters they represent, e.g. '%7E' to the ASCII tilde character.
This method reads CGI parameters from the environmental variable QUERY_STRING (if called with the environmental variable REQUEST_METHOD set to 'GET') or from STDIN (if REQUEST_METHOD is 'POST'), and adds them to a hash array in the main program namespace - CGIvar. If the REQUEST_METHOD is neither GET nor POST, the method will bomb out with an HTML error message. If the REQUEST_METHOD is 'POST', the number of bytes to read from STDIN will be set by reading the CONTENT_LENGTH variable from the environment. These environmental variables are normally set by HTTP servers when launching CGI programs.
Entries are added with the CGI variable name as their key, and the CGI value as their value. If an entry already exists, it will be appended to, using a comma ',' as delimiter. Note that if the CGIvar hash array already exists, these new elements will be added to the existing entries. There is no return value from this method.
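The merging behaviour described above (new keys are added, repeated keys are appended with a comma) might look roughly like this. The helper name add_cgi_pairs is hypothetical, and the sketch takes the raw parameter string and a hash reference as arguments instead of reading the environment and writing to a global hash as the real method does.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of ROADS::CGIvars): merge the name=value
# pairs from a raw query string into the hash referenced by $vars.
sub add_cgi_pairs {
    my ($raw, $vars) = @_;
    for my $pair (split /&/, $raw) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;

        # Undo the URL encoding of both the name and the value.
        for ($name, $value) {
            tr/+/ /;
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
        }

        if (exists $vars->{$name}) {
            $vars->{$name} .= ",$value";    # repeated name: append with a comma
        } else {
            $vars->{$name} = $value;
        }
    }
    return;
}
```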
This method takes a string as its parameter and performs hex unescaping on it. In addition, any '+' characters in the string will be replaced with spaces. The result is returned.
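A minimal sketch of this kind of unescaping is shown below. The name unescape_sketch is hypothetical and is used only to avoid any confusion with the real unescape method, whose implementation may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the unescaping step: '+' becomes a space, then
# each %XX hex escape becomes the character it represents.
sub unescape_sketch {
    my ($string) = @_;
    $string =~ tr/+/ /;                                   # do this first so %2B stays '+'
    $string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $string;
}
```

Translating '+' before decoding the hex escapes means an escaped plus sign ('%2B') survives as a literal '+'.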
There is no check on the existence of the CONTENT_LENGTH variable in the process' environment.
admin-cgi and cgi-bin programs.
Jon Knight <jon@net.lut.ac.uk>
ROADS::CIPv3 - A class to generate and process CIP centroids
    use ROADS::CIPv3;
    # public methods
    CIPv3PollHandler (FD, $IafaSource, $TargetIndex);
    OutputSchema ($outline, $firstbit);
    SendCIPv3Request (FD);
    ReadCIPv3Response (FD);
    ProcessTaggedIndexObject ($ThisTemplate);
This class implements support for generating Common Indexing Protocol (version 3) style centroids from ROADS databases. It uses some of the code in the ROADS WHOIS++ centroid library module - see the manual page for ROADS::Centroid.
Handle the CIPv3 polling in the ROADS WHOIS++ server. The arguments are the reference to the file handle on which to read in the poll request (usually a socket) and the location of the ROADS index file that the CIP payload should be generated from.
Used by CIPv3PollHandler to dump out the CIP tagged index object IO-Schema definition for a given object type. The first parameter is the ROADS object type we're interested in, and the second is any prefix which should be applied at the start of the IO-Schema block in the resulting tagged index object.
Sends a CIP poll request for the tagged index object type to the CIP aware server connected via the file descriptor FD.
Reads a CIP tagged index object response from the CIP aware server connected via the file descriptor FD.
Used by ReadCIPv3Response to munge the tagged index object into the format used internally by bin/wig.pl. This is passed separately via a temporary file. The parameter is the ROADS object type to use when adding the terms from the tagged index object to the server's centroid.
the manual page for Net::Centroid, the manual page for bin/wppd.pl, RFC 1913, the manual page for bin/wig.pl
Jon Knight <jon@net.lut.ac.uk>
ROADS::Centroid - A class to generate and process centroids
    use ROADS::Centroid;
    # public methods
    @servers_to_ask = doreferrals("sex and drugs");
    PollCommand(FILEHANDLE, "/roads/source", "/roads/guts/index");
    # private methods
    if (ROADS::Centroid::referrallookup("sex and drugs")) { ... }
    ROADS::Centroid::AuthenticatePoll("FULL", "foobar", "yourserverhandle");
This class implements support for generating RFC 1913 style centroids from ROADS databases and for doing referral lookups within a collection of centroids.
Given a query, this method checks any available centroids for servers which can satisfy it. A list of these servers' handles is returned as the result. referrallookup is called behind the scenes for each centroid.
This method is used to process an incoming request (via FILEHANDLE) for a poll of a ROADS database, and generates an RFC 1913 style centroid using the ROADS templates and index specified in the second and third arguments respectively. The centroid is returned on STDOUT.
This method is used to perform the actual check for a given query in a given centroid.
Process authentication information in the poll request. The first argument is the poll type, the second is the authentication data (e.g. clear text password) and the third is the server handle of the polling server. This routine is called behind the scenes by PollCommand.
The referral lookup code only lets you search across all of the centroids which are available - it should let you specify just certain servers' centroids and ideally all centroids except those from certain servers.
The authentication code is a NOOP - but the ROADS WHOIS++ server has its own access control list mechanism based on IP addresses / domain names / password protection. We should also be passing the authentication type, though only "password" (clear text passwords) has been defined so far.
PollCommand assumes that it should be sending its output on STDOUT, which isn't necessarily a good thing.
the manual page for bin/wppd.pl, RFC 1913, the manual page for
bin/wig.pl
Jon Knight <jon@net.lut.ac.uk>
ROADS::DatabaseNames - A class to read in the list of ROADS databases
    use ROADS::DatabaseNames;
    &ReadDBNames;
    # let's see the database details for the 'cross domain' DB
    print <<EOF;
    database: $database{xdomain}
    host: $host{xdomain}
    port: $port{xdomain}
    serverhandle: $serverhandle{xdomain}
    invserverhandle: $invserverhandle{xdomain}
    EOF
This method reads the list of WHOIS++ servers/databases which this ROADS installation knows about and puts their details into hash arrays.
When this method is called, it creates the following hash arrays by parsing the list of known databases :-
database - Holds the Destination tag (if any) used with this database when making a WHOIS++ search. Multiple Destination tags may be used to differentiate between multiple WHOIS++ databases held in a single server.
host - Holds the Internet domain name or address of the host running the WHOIS++ server for this database.
port - Holds the port number of the WHOIS++ server.
serverhandle - Holds the server handle of this WHOIS++ server.
invserverhandle - Converts the server handle back into a friendly service name.
All but the last of these arrays are indexed on the friendly service name, whereas the last is indexed on the server handle. So, in the example above...
$invserverhandle{$serverhandle{"xdomain"}} eq "xdomain"
config/databases - default location for databases list, overridden by dbnames variable if defined.
The following fields are defined in the databases file, in the following order :-
Long/friendly name of the service, e.g. "ROADS-U-LIKE".
Domain name or IP address of the host running the WHOIS++ server.
Port number of the WHOIS++ server.
Tag used in the Destination attribute - used if this server has multiple virtual databases in one collection of templates.
The WHOIS++ server's serverhandle.
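As an illustration of how the five fields above could populate the hash arrays, here is a rough sketch. It is not the ReadDBNames code: the function name parse_database_lines is hypothetical, and the field delimiter is not stated in this manual page, so whitespace-separated fields and '#' comment lines are assumed purely for the sake of the example.

```perl
use strict;
use warnings;

# Sketch only. ASSUMPTION: one database per line, five whitespace-separated
# fields in the documented order (name, host, port, destination tag, server
# handle). The real databases file may use a different delimiter.
sub parse_database_lines {
    my (@lines) = @_;
    my (%database, %host, %port, %serverhandle, %invserverhandle);

    for my $line (@lines) {
        next if $line =~ /^\s*(?:#|$)/;       # skip comments and blank lines (assumed)
        my ($name, $hostname, $portnum, $destination, $handle) = split ' ', $line;
        next unless defined $handle;          # ignore incomplete lines

        $host{$name}              = $hostname;
        $port{$name}              = $portnum;
        $database{$name}          = $destination;
        $serverhandle{$name}      = $handle;
        $invserverhandle{$handle} = $name;    # the only table keyed on server handle
    }

    return {
        database        => \%database,
        host            => \%host,
        port            => \%port,
        serverhandle    => \%serverhandle,
        invserverhandle => \%invserverhandle,
    };
}
```

With tables built this way, looking up a friendly name in serverhandle and feeding the result to invserverhandle returns the friendly name again.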
We should do something cleverer with this list, like have a "database" object which included protocol info and hooks for the methods to use to communicate with it. This would make it easier to link in other sources of info ?
the manual page for cgi-bin/search.pl, the manual page for admin-cgi/admin.pl, the manual page for cgi-bin/tempbyhand.pl, the manual page for admin-cgi/lookupcluster.pl
Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>
ROADS::ErrorLogging - A class to log errors and optionally bomb out
    use ROADS::ErrorLogging;
    WriteToAdminLog("$0", "something weird is #;:<>!%&*");
    WriteToErrorLog("$0", "Uh oh... :-(");
    WriteToErrorLogAndDie("$0", "time for tubby byebye: $!");
This class defines three methods which may be used to log messages and program names to log files. Log file entries are written after the fashion of the common HTTP error logging style, and the log files are locked while active to avoid corruption when multiple processes attempt to write to the same file at the same time.
This method writes message to the admin log, stamped as being from the program progname.
This method writes message to the error log, stamped as being from the program progname.
This method writes message to the error log, stamped as being from the program progname, and then kills itself off. It adds the tag '(FATAL)' to the message, and if the environmental variable GATEWAY_INTERFACE isn't set, also sends the program name and message to STDERR.
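The locking behaviour mentioned above (the log file is locked while a line is being written) can be sketched like this. The function name append_log_line and the line layout are assumptions made for the example, since the exact log format is not reproduced here; the sketch also reports failure to the caller instead of exiting.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

# Hypothetical helper (not part of ROADS::ErrorLogging): append one line to
# $path while holding an exclusive lock. Returns 1 on success, 0 on failure.
# ASSUMPTION: the "[timestamp] progname: message" layout is illustrative only.
sub append_log_line {
    my ($path, $progname, $message) = @_;

    open(my $fh, '>>', $path) or return 0;
    unless (flock($fh, LOCK_EX)) {            # wait for any other writer to finish
        close($fh);
        return 0;
    }
    seek($fh, 0, SEEK_END);                   # another process may have appended meanwhile

    my $stamp = scalar localtime();
    print {$fh} "[$stamp] $progname: $message\n";

    return close($fh) ? 1 : 0;                # closing the handle releases the lock
}
```

A single routine of this shape, taking the log path as a parameter, is one way the three near-identical logging methods could share their code.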
logs/admin - where admin logs are written to.
logs/errors - where error logs are written to.
Both the admin and error log files are structured as follows :-
We should make it possible to specify alternative log files. The code actually understands the ROADS::Logs variable, but the actual log file name is hard coded in at the moment. It probably ought also to be configurable whether we bomb out - which would potentially leave us with just a single logging routine instead of three almost identical ones!
Most of the ROADS tools!
Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>
ROADS::Expand - A class to perform simple query expansions
    use ROADS::Expand;
    $expanded_query = expand("color");
This class defines a simple method to perform limited query expansion. It is intended to cater for the small number of very common word substitutions which typically cause problems with Internet searching, e.g. the use of "colour" versus "color".
This method takes the original_query_string variable and performs query expansion on it, returning the result as a string ready for variable assignment.
config/expansions - list of search terms and expansions, found using the globally scoped variable expansionfile or pre-initialized into the hash array EXPAND.
Each line of the file consists of a term, e.g. "colour", and its expansions, separated by whitespace, e.g.
colour color
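Reading a file in this format into a lookup table might be done along the following lines. The function names load_expansions and expansions_for are hypothetical, the case folding is an assumption, and the sketch only expands in the direction written in the file (from the first word on a line to the words that follow it); how the real expand method combines the alternatives into a query string is not shown.

```perl
use strict;
use warnings;

# Hypothetical helper: build a table mapping each term to its expansions
# from lines of the form "term expansion [expansion ...]".
sub load_expansions {
    my (@lines) = @_;
    my %expand;
    for my $line (@lines) {
        my ($term, @alternatives) = split ' ', $line;
        next unless defined $term && @alternatives;    # skip blank or one-word lines
        push @{ $expand{lc $term} }, @alternatives;    # ASSUMPTION: terms are case-folded
    }
    return %expand;
}

# Hypothetical helper: return the term itself followed by any expansions.
sub expansions_for {
    my ($expand, $term) = @_;
    return ($term, @{ $expand->{lc $term} || [] });
}
```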
Now that we have WGI based thesaurus lookup, this seems anachronistic. Should we make it capable of using a DB(M) lookup, or perhaps junk it?
the manual page for bin/wppd.pl, the manual page for ROADS::Index
Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>
ROADS::HTMLOut - A class to dump out HTML in various forms
    use ROADS::HTMLOut;
    EditorViewSelection;
    $postprocessed_strings = GenericSubs($string);
    $dir = GetMessageDir($program, $view, $language, $charset);
    InitLookup;
    InitLang;
    if (LangFileExists($program, $file, $language, $charset)) {...}
    print ListMissingMandatory;
    OutputHTML($program, $file, $language, $charset);
    print SelectDatabases;
    print SubjectListingSelection;
    print TemplateTypeSelection;
    print WhatsNewSelection;
This class contains a number of methods for turning text containing ROADS specific pseudo-HTML tags into normal HTML using variable interpolation.
This method looks at the keys of the views hash array, and generates an HTML SELECT menu with an element for each of them.
This method knows about a large number of generic substitutions which may be carried out on a string, typically involving replacing a "fake" HTML tag with the results of a variable interpolation. These are listed separately in the ROADS technical documentation.
This method tries to find the most appropriate HTML messages directory to use for a given combination of program name, rendering view, language and character set.
This method seeds a hash array LanguageLookup with the available language details from the ROADS installation, typically config/languages. The array is keyed on the language code and character set, e.g. "en-gb-ISO-8859-1", and the value for a given element is a path relative to the ROADS config directory, or an absolute path. This path points to the outline HTML message files for a particular language and character set combination.
This method initializes the Language and CharSet variables (used to select outline HTML for rendering to the end user) based on the following algorithm :-
If the command line switch -L or -C is set, its value will be used
... otherwise if the HTTP Accept-Language or Accept-Charset header is set, its value will be used
... otherwise if the CGI variable Language or Charset is set, its value will be used
... otherwise the default values of "en" and "iso-8859-1" will be used
The tests for language and character set are actually independent, though we've grouped them together here for simplicity.
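The fallback chain above amounts to taking the first source that has a value, independently for language and character set. A small sketch follows, with hypothetical function names (first_set, choose_language_and_charset) and with the three sources passed in as hash references rather than read from globals. It assumes the CGI parameters are named language and charset, and it uses the header values verbatim, whereas real Accept-Language and Accept-Charset values can list several weighted alternatives that would need further parsing.

```perl
use strict;
use warnings;

# Return the first argument that is defined and non-empty.
sub first_set {
    for my $candidate (@_) {
        return $candidate if defined $candidate && length $candidate;
    }
    return;
}

# Hypothetical helper: apply the documented order (command line switch,
# HTTP header, CGI variable, built-in default) to both settings.
sub choose_language_and_charset {
    my ($opt, $env, $cgi) = @_;
    my $language = first_set($opt->{L}, $env->{HTTP_ACCEPT_LANGUAGE},
                             $cgi->{language}, 'en');
    my $charset  = first_set($opt->{C}, $env->{HTTP_ACCEPT_CHARSET},
                             $cgi->{charset}, 'iso-8859-1');
    return ($language, $charset);
}
```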
This method tests for the existence of a message file for a particular program and language/character set combination.
This method returns a string containing an HTML list structure, each entry of which is one of the elements in the scalar array MissingMandatory. This is normally used by mktemp.pl, the ROADS template editor, to indicate fields which should have been filled in but weren't.
This method tries to send the message file file with any variable substitutions which may be necessary for the program program in the requested language and charset if possible. We try to use HTTP content negotiation to control the directory which is searched in for message files for a given language and charset combination.
Note that this method does not send the HTTP Content-type header. This is something that any code which calls it will have to do itself.
This method returns an HTML SELECT structure each element of which corresponds to a database configured in the ROADS installation.
This method returns an HTML SELECT structure each element of which corresponds to a subject listing view.
This method returns an HTML SELECT structure each element of which corresponds to one of the available template types.
This method returns an HTML SELECT structure with an element for each of the available "What's New" views.
config/languages - specifies the directories where the HTML messages file may be found for a particular language and character set/encoding combination.
config/multilingual/* - default location of outline HTML messages distributed with the ROADS software. Each program has its own subdirectory under this. Programs which support multiple "views" of a data set typically have a directory program-views.
HTML message files are formatted as normal, with additional "pseudo-HTML" tags as described separately in the ROADS technical documentation.
This tag is understood by bin/addwn.pl and bin/cullwn.pl:
Replaced by time at which template was last modified, found by doing a stat(2) of the file it lives in.
This tag is understood by cgi-bin/survey.pl, the user survey program.
Replaced with a unique identifier generated from the current time and the process ID of the running CGI program.
The following tags are handled by the OutputHTML routine. This is quite flexible in terms of the directories it will look in for its HTML outlines - mainly because of the support we are adding for internationalisation.
OutputHTML is invoked with the name of a program and an associated message file, e.g. tempbyhand and nohandle.html. It then checks to see whether there this file is
NB this part is still under development!
This should be named the same as the name of the program, e.g. config/tempbyhand/nohandle.html.
The tags we understand are:
Replaced by $ROADS::WWWAdminCgi.
Replaced by $ROADS::WWWAdminCgi.
Replaced by a SELECT menu of all of the template types available, found by looking at the filenames in the $ROADS::Config/outlines directory. An additional item, ALL, will be added and marked selected by default. See also TEMPLATETYPELIST.
Replaced by a hidden field setting the value of the HTML form variable charset to the value of $CharSet if present, i.e.
<INPUT TYPE="hidden" NAME="charset" VALUE="$CharSet">
Replaced by $ROADS::WWWCgiBin.
Replaced by a SELECT menu of all of the databases which are known to the ROADS server - i.e. present in $ROADS::Config/databases. In this context a database is essentially the combination of WHOIS++ server hostname, port number, and Destination attribute to search on. An extra entry, selected by default, will be added for ALL of the databases. See also REALDATABASES.
Replaced by $Handle if present
Replaced by $ROADS::WWWHtDocs.
Replaced by a hidden field setting the value of the HTML form variable language to the value of $Language if present, e.g.
<INPUT TYPE="hidden" NAME="language" VALUE="$Language">
Replaced by $matches if present.
Replaced by a bullet-point list of the contents of the @MissingMandatory array - used by the template editor to signal mandatory attributes which have not been filled in.
Replaced by $additional if present.
Replaced by $default if present.
Replaced by $CGIvar{mode} if present.
Replaced by $CGIvar{op} if present.
Replaced by $CGIvar{view} if present.
Replaced by $myurl if present.
Replaced by $longname, the full name of this subject category.
Replaced by $CGIvar{originalhandle} if present.
Replaced by $query if present.
Replaced by a SELECT menu of all the subject listing views which are known to the ROADS server - i.e. present in $ROADS::Config/subject-listing/views.
Replaced by a SELECT menu of all of the template editor views for this particular template type which are known to the ROADS server - i.e. present in the appropriate file in $ROADS::Config/mktemp-views/.
Replaced by a SELECT menu of all the What's New views which are known to the ROADS server - i.e. present in $ROADS::Config/whats-new/views.
Replaced by a SELECT menu of all of the databases which are known to the ROADS server - i.e. present in $ROADS::Config/databases. In this context a database is essentially the combination of WHOIS++ server hostname, port number, and Destination attribute to search on. See also DATABASES.
Replaced by $ROADS::DBAdminEmail.
Replaced by $ROADS::ServiceName.
Replaced by $ROADS::SysAdminEmail.
Replaced by $scheme_name, the Subject-Descriptor scheme specified on the command line, or UDC if not present.
Replaced by $CGIvar{templatetype} if present.
Replaced by a SELECT menu of all of the template types available, found by looking at the filenames in the $ROADS::Config/outlines directory. An additional item, ALL, will be added and marked selected by default. See also ALLTEMPLATETYPES.
Creates an HTML form using the POST method, with $myurl as the action, i.e.
<FORM ACTION="$myurl" METHOD="POST">
Note that you must supply the closing
</FORM>
Creates an HTML form using the GET method, with $myurl as the action, i.e.
<FORM ACTION="$myurl" METHOD="GET">
Note that you must supply the closing
</FORM>
The HTML output by the ROADS tools is capable of being internationalized by
allowing a different set of HTML documents to be sent back to the end user
depending upon the language and character set in use. The language and
character set can be specified by (in order of decreasing priority)
browser HTTP headers, CGI parameters, command line options to the scripts
and built in defaults. The CGI parameters are called language
and charset, the HTTP headers are HTTP_ACCEPT_LANGUAGE
and HTTP_ACCEPT_CHARSET and the options are usually -L
and -C. Whilst older browsers rarely allowed the user to specify
the HTTP headers, many of the newest browsers do allow the headers to be
easily configured by the end user using GUI control panels (see your
particular browser's documentation for details of how to do this - there
are far too many browsers in use to permit us to detail this).
Out of the box, ROADS defaults to a language of "en" (International English) and a character set of "iso-8859-1" (ISO Latin 1 - Western European characters). The mapping between these parameters and the actual set of language pages is made using the $ROADS::Config/languages file. This file looks something like this:
    en-uk ISO-8859-1 multilingual/UK-English
    en-gb ISO-8859-1 multilingual/UK-English
    en-us ISO-8859-1 multilingual/UK-English
    en ISO-8859-1 multilingual/UK-English
    en iso-8859-1,*,utf-8 multilingual/UK-English
    de ISO-8859-1 multilingual/Deutsch
Each line has a language, character set and path to a directory. The path can either be an absolute path to anywhere in the filesystems on the machine or a path relative to $ROADS::Config (as shown in the default file above). Inside the directory, each ROADS program has its own subdirectory and it is within these subdirectories that the actual HTML is located. Currently ROADS is distributed with a full set of International English HTML files and a small demonstration subset for the mktemp.pl introduction FORM in German. Hopefully over time, contributed translations of the ROADS HTML will be made available.
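To make the mapping concrete, the following sketch turns lines of the languages file into a lookup table keyed on language and character set, in the "en-gb-ISO-8859-1" style used by the LanguageLookup hash array. It is an illustration under stated assumptions, not the InitLookup code: the function name build_language_lookup is hypothetical, a comma-separated character set field is assumed to produce one key per listed character set, and relative paths are resolved against the config directory straight away, which the real code may defer.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: map "language-charset" keys to message directories.
# ASSUMPTION: fields are whitespace-separated, and a charset field such as
# "iso-8859-1,*,utf-8" yields one key for each listed character set.
sub build_language_lookup {
    my ($config_dir, @lines) = @_;
    my %lookup;
    for my $line (@lines) {
        my ($language, $charsets, $path) = split ' ', $line;
        next unless defined $path;                       # skip blank or short lines

        # Relative paths are taken to be relative to the config directory.
        $path = File::Spec->catdir($config_dir, $path)
            unless File::Spec->file_name_is_absolute($path);

        $lookup{"$language-$_"} = $path for split /,/, $charsets;
    }
    return %lookup;
}
```

Keys are stored exactly as written in the file, so a caller would need to decide separately how to treat differences in case such as "ISO-8859-1" versus "iso-8859-1".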
The use of HTML FORMs within ROADS does currently lead to some problems for internationalisation (I18N). Both the HTML 2.0 standard (RFC1866) and the W3C's HTML 3.2 Recommendation used coded character sets based on the ISO-8859-1 Latin-1 character set. This provides support for most Western European characters. The newer W3C HTML 4.0 recommendation is based upon Unicode and therefore allows a greater range of characters to be represented in HTML documents. It also provides support for detailing the language in use and the direction in which sections of the text should be rendered/read.
Until the development of HTML 4.0, all form data being submitted from web browsers to CGI programs had to consist of ASCII text. Even with HTML 4.0, CGI scripts using the GET method or scripts using the POST method with the widely used application/x-www-form-urlencoded MIME type can only receive ASCII text. Only FORMs using the POST method between HTML 4.0 compliant browsers where the enclosure type is something like multipart/form-data can be used to pass non-ASCII characters. Unfortunately, HTML 4.0 browsers that support these features are currently still quite rare and the HTML 4.0 specification was only released towards the end of the ROADS v2 development phase. It is hoped that ROADS v3 will be able to make use of these new features and by that time the bulk of the web browsers in use will also support them.
In the meantime, although the ROADS indexing software is capable of indexing characters from outside of the ASCII character set, it is very difficult for cataloguers and end users to enter multilingual strings. For this reason we encourage sites that do wish to provide a multilingual service to provide at least an English version of their data, and if possible a Romanized version of the native language form(s) of their data, so that existing browsers can search their databases.
Some confusion over variable scoping. It's also unclear whether programs should need to use the language and character set parameters (and initialize these themselves), or whether these should automatically be initialized to sensible values.
OutputHTML sends its output to the currently selected file descriptor. It might make more sense to have it return its output as a string or scalar array for further processing, e.g. by a user defined module.
admin-cgi and cgi-bin programs.
Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>
ROADS::Index - A class to support searching of ROADS database indexes
    use ROADS::Index;
    %results = index_search($restrictfile, $expansionfile, $stopfile,
                            $iquery, $caseful, $stemming);
    initExpansions($expansionfile);
    initIndexCache($targetindex);
    initRestrictions($restrictionfile);
    initStoplist($stopfile);
    # private
    if ($op = boolop) { ... }
    %results = expr;
    lookup;
    term;
This class defines a series of methods for working with the ROADS database index. The main method which is called by outside code is index_search. This calls the other methods in turn to initialise the working environment, then carry out the search and process the results.
This method takes the query iquery, together with a list of search options, and returns the search results as a list of record handles for later processing with other tools. The search options are :-
This parameter specifies a file listing the templates and template attributes which should be made visible to end users.
This parameter specifies a file listing simple query expansions which should be carried out e.g. "color" as a synonym of "colour".
This parameter specifies a file listing the terms which it should not be possible to search for, such as "a", "the" and so on.
This Boolean variable controls whether the search terms will be matched in a case sensitive or case insensitive way.
This Boolean variable controls whether stemming will be active for this search. When stemming is turned on, query expansion will be performed on the search terms to increase the number of matches.
This method initialises the list of expansions which will be used during query expansion.
This method opens the DB(M) based indirection index used to speed up searching of ROADS databases. This is normally created by mkinv.pl when a database is indexed.
This method initialises the search restrictions which will be in force.
This method initialises the stoplist which will be in force.
This method examines the string variable iquery in the main namespace, and searches for the presence of a Boolean AND or OR operator. If one of these is found it will be returned as the result of the method, otherwise a zero (0) will be returned.
This method also operates on iquery, splitting it recursively into left and right sub-expressions and intervening operators, until there are no more sub-expressions left, and passing each sub-expression on to the term method.
This method tries to find ROADS database entries which match the search term, and returns the resulting handles.
This method examines a search term, and potentially calls expr or lookup to further process it.
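The recursive splitting described for these private methods can be illustrated with a much simplified evaluator. This is not the ROADS::Index code: the function name eval_query is hypothetical, the query is passed as an argument instead of living in a global, the sketch splits at the first AND or OR it finds (so longer queries group to the right), and the index lookup is supplied by the caller as a code reference returning a list of handles.

```perl
use strict;
use warnings;

# Hypothetical sketch: evaluate "term [and|or term ...]" by splitting at the
# first Boolean operator and recursing on the remainder. $lookup is a code
# reference taking a lower-cased term and returning an array ref of handles.
sub eval_query {
    my ($query, $lookup) = @_;

    if ($query =~ /^\s*(.+?)\s+(and|or)\s+(.+?)\s*$/i) {
        my ($left, $op, $right) = ($1, lc $2, $3);
        my %seen  = map { $_ => 1 } eval_query($left, $lookup);
        my @right = eval_query($right, $lookup);

        if ($op eq 'and') {                   # intersection of both sides
            my @both = grep { $seen{$_} } @right;
            return sort @both;
        }
        $seen{$_} = 1 for @right;             # union of both sides
        my @either = keys %seen;
        return sort @either;
    }

    # No operator left: this is a single term, so look it up directly.
    (my $term = $query) =~ s/^\s+|\s+$//g;
    my @handles = @{ $lookup->(lc $term) || [] };
    return sort @handles;
}
```

For example, with a lookup backed by an in-memory hash, eval_query("cats and dogs", $lookup) returns only the handles listed under both terms.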
config/admin-restrict - default search restrictions for admin users.
config/expansions - default query expansions
config/search-restrict - default search restrictions for end users.
config/stoplist - terms which have been stop-listed and should be excluded from searches.
guts/index* - the actual ROADS database index.
guts/alltemps - list of template handle to filename mappings.
Some of this code is very messy, there is a fair bit of duplication of effort, and too much reliance on global variables - often making it hard to tell what's actually going on!
the manual page for bin/wppd.pl
Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>
ROADS::LookupRender - A class to render HTML resulting from a cluster lookup
    use ROADS::LookupRender;
    # Do a WHOIS++ search or three, then ...
    lookuprender($query, $view, @results);
This class defines a mechanism for rendering WHOIS++ templates as HTML - or other formats, though HTML is the primary goal.
lookuprender( query, view, @results )
The WHOIS++ query which generated these results.
The view to use when rendering the results - many of the ROADS tools which generate HTML support multiple versions or 'views' of the same data using different HTML rendering rules.
This is a list of results in the format produced by the wppd code in the ROADS::WPPC class.
config/multilingual/*/mktemp/notemplateoutline.html - if no template outline (schema definition) could be found.
config/lookupcluster-views/* - directory containing HTML rendering rules for each cluster type.
We're not using the generic HTML rendering code for this, but we should be.
the manual page for admin-cgi/mktemp.pl.
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>
ROADS::MeshTraversal - A class to perform centroid mesh traversal
    use ROADS::MeshTraversal;
    # @results is populated by doing a WHOIS++ search, then...
    @newresults = traverse("tubby and toast", 5, 1, @results);
This class defines a method which takes the results of performing a WHOIS++ search and follows up any referrals which may have been returned by the WHOIS++ servers which were queried.
The traverse method takes four arguments :-
The WHOIS++ query which resulted in the original results being returned.
The upper limit on the number of servers to contact when doing mesh traversal - this is to avoid going bonkers and trying to contact every WHOIS++ server on the Internet. We try not to hit any server more than once, so this should be quite effective.
Whether to return debugging output or not.
Original search results which are being expanded, as generated by the ROADS::WPPC WHOIS++ client.
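The server limit and the "no server more than once" rule can be illustrated with a small breadth-first walk. This is a sketch only: the %referrals table below is a hypothetical stand-in for the referrals that real WHOIS++ servers would return, and follow_referrals is not the actual traverse method.

```perl
use strict;
use warnings;

# Hypothetical referral table: server => servers it refers us on to.
# In real use each lookup here would be a WHOIS++ query.
my %referrals = (
    "a.example" => ["b.example", "c.example"],
    "b.example" => ["a.example", "d.example"],
    "c.example" => ["d.example"],
    "d.example" => [],
);

# Follow referrals breadth-first, contacting at most $limit servers
# and never contacting the same server twice.
sub follow_referrals {
    my ($limit, @start) = @_;
    my (%seen, @contacted);
    my @queue = @start;
    while (@queue && @contacted < $limit) {
        my $server = shift @queue;
        next if $seen{$server}++;
        push @contacted, $server;
        push @queue, @{ $referrals{$server} || [] };
    }
    return @contacted;
}

print join(" ", follow_referrals(3, "a.example")), "\n";
```

With a limit of 3 the walk stops before reaching d.example, even though two servers refer to it.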
We're a bit mixed up about where we're storing information, and this whole result processing could do with a re-think. If we're objectifying ROADS a bit more, the obvious thing to do would be to make each result item into an object, and hang the referrals off the bottom of them ?
the manual page for admin-cgi/admin.pl, the manual page for
cgi-bin/search.pl, the manual page for ROADS::WPPC
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
Jon Knight <jon@net.lut.ac.uk>
ROADS::Override - A class to override unusual/odd URL protocol schemes
    use ROADS::Override;
    Override;
    if ($override{"wais"}) {
      $page_to_return = $override{"wais"};
    } else {
      $page_to_return = $regular_page;
    }
This class defines a method which constructs a hash array keyed on the protocol scheme element of a URL. Looking up a protocol scheme which has been overridden returns a filename which should be used instead of the filename which would normally be used. This provides a simple mechanism for insinuating intermediate pages of HTML when (for example) rendering search results into HTML, which can be used to add instructions or additional information as necessary.
This method loads the list of protocols to override and HTML pages to return into the hash array override.
config/protocols unless overridden by the protocols variable.
The protocols file is formatted with one entry per line. Each entry contains the following fields :-
The protocol scheme to override.
The intermediary HTML page to return when this protocol scheme is requested.
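As an illustration of how such a file could be turned into the override hash, here is a minimal sketch. It assumes the two fields are separated by whitespace, which the text above does not actually state, and it works on a list of lines rather than opening config/protocols. The name parse_protocols and the page names are hypothetical.

```perl
use strict;
use warnings;

# Build a scheme => page hash from the lines of a protocols file.
# Assumes two whitespace-separated fields per line; lines without
# both fields (e.g. blank lines) are skipped.
sub parse_protocols {
    my (@lines) = @_;
    my %override;
    foreach my $line (@lines) {
        my ($scheme, $page) = split ' ', $line;
        next unless defined $scheme && defined $page;
        $override{lc $scheme} = $page;
    }
    return %override;
}

my %override = parse_protocols("wais wais.html\n", "\n", "telnet telnet.html\n");
my $page_to_return = $override{"wais"} || "regular.html";
print "$page_to_return\n";
```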
the manual page for admin-cgi/lookupcluster.pl, the manual page
for cgi-bin/search.pl, the manual page for
cgi-bin/tempbyhand.pl, the manual page for
cgi-bin/waylay.pl
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
Martin Hamilton <martinh@gnu.org>
ROADS::Porter - A class to perform stemming using the Porter algorithm.
    use ROADS::Porter;
    print join(" ", stem("wubbleyou")), "\n";
This class defines an implementation of the Porter stemming algorithm.
The stem method operates on a single term at a time, and so must be wrapped by any code which is aiming to stem multiple search terms. It returns an array of the terms found through the stemming algorithm, including the original term itself.
It's not clear that the Porter algorithm is very useful for this sort of thing! Some of the results look very silly.
the manual page for bin/wppd.pl, the manual page for
ROADS::Index
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
Jon Knight <jon@net.lut.ac.uk>
ROADS::PreferredURL - A class to extract the URL we prefer most
    use ROADS::PreferredURL;
    # %TEMPLATE is a hash array we read in earlier...
    $like_this_best = preferredURL(%TEMPLATE);
This class defines a mechanism for examining the URLs contained in a template (loaded into a hash array which is keyed on attribute name), and returning the most desirable one.
This method examines the template attributes encoded in HASH_ARRAY (keys are attribute names, values are attributes' values from template), and discards all those which aren't URIs or URLs. It sorts the values of the remaining attributes according to a simple scheme whereby :-
http is preferred over gopher is preferred over ftp is preferred over telnet is preferred over wais is preferred over mailto
The preferred URL is then returned as a string.
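The preference scheme can be sketched as a lookup table plus a sort. This is not the ROADS implementation: it assumes that URL-bearing attributes have names starting with URI or URL, and that a URL with a scheme outside the list above ranks last. The names preferred_url and scheme_rank, and the example template, are hypothetical.

```perl
use strict;
use warnings;

# Scheme preference from the text above: a lower rank is better.
my @order = qw(http gopher ftp telnet wais mailto);
my %rank;
@rank{@order} = (0 .. $#order);

# Rank a URL by its scheme; unknown schemes sort after all known ones.
sub scheme_rank {
    my ($url) = @_;
    my ($scheme) = $url =~ /^([a-z][a-z0-9+.-]*):/i;
    return scalar @order unless defined $scheme && exists $rank{lc $scheme};
    return $rank{lc $scheme};
}

# Keep only the URI/URL attributes and return the best ranked value,
# or undef if the template contains no such attributes.
sub preferred_url {
    my (%template) = @_;
    my @urls = map { $template{$_} } grep { /^UR[IL]/i } sort keys %template;
    my ($best) = sort { scheme_rank($a) <=> scheme_rank($b) } @urls;
    return $best;
}

my %TEMPLATE = (
    'Title'  => 'cross domain search test',
    'URI-v1' => 'ftp://ftp.example.org/pub/',
    'URI-v2' => 'http://www.example.org/',
);
print preferred_url(%TEMPLATE), "\n";
```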
the manual page for bin/addsl.pl, the manual page for
bin/addwn.pl, the manual page for bin/cullsl.pl
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
Jon Knight <jon@net.lut.ac.uk>
ROADS::Rank - A class to rank WHOIS++ search results
    use ROADS::Rank;
    # @results are the results of a WHOIS++ query done already
    @ranked_results = rank($query, @results);
This class defines a mechanism for sorting the results of a WHOIS++ search according to the number of occurrences of the search terms in the resulting templates.
rank( query, @results )
This method takes an array of WHOIS++ template handles, results, and the original search terms, query, which gave rise to them. It sorts the handles according to the frequency of the search terms in the templates which they point to, and returns the sorted list.
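A rough sketch of frequency-based ranking follows. Unlike the real rank method, which is given handles and looks the templates up itself, this version is passed a hypothetical hash of handle to template text so that it can run on its own. The names term_count and rank_by_frequency are made up, and matching is a plain case-insensitive substring count.

```perl
use strict;
use warnings;

# Count case-insensitive occurrences of each query term in some text.
sub term_count {
    my ($query, $text) = @_;
    my $count = 0;
    foreach my $term (split ' ', $query) {
        $count++ while $text =~ /\Q$term\E/gi;
    }
    return $count;
}

# Sort handles so that templates mentioning the terms most often come
# first; ties are broken by handle name to give a predictable order.
sub rank_by_frequency {
    my ($query, %text_for) = @_;
    my %score;
    $score{$_} = term_count($query, $text_for{$_}) foreach keys %text_for;
    return sort { $score{$b} <=> $score{$a} || $a cmp $b } keys %score;
}

my %text_for = (
    XDOM01 => "Title: cross domain search test\nDescription: this is really just a test",
    XDOM02 => "Title: another search page",
    XDOM03 => "Title: nothing relevant",
);
print join(" ", rank_by_frequency("search test", %text_for)), "\n";
```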
We probably don't cope very well with some of the possible permutations of search terms and punctuation. Perhaps we should strip them down to just alphanumerics before doing the comparison ?
the manual page for admin-cgi/admin.pl, the manual page for
cgi-bin/search.pl
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
Martin Hamilton <martinh@gnu.org>
ROADS::ReadTemplate - A class to read in templates
    use ROADS::ReadTemplate;
    %ALLTEMPS = readalltemps(); # readtemplate should call readalltemps if necessary
    %MYTEMP = readtemplate("XDOM01");
    print $MYTEMP{title}, "\n";
This class implements two methods associated with reading in IAFA templates and template handle to filename mappings.
The readalltemps method reads in the list of template handles and the filenames in which they can be found from the alltemps file, usually found in the ROADS guts directory.
The readtemplate method tries to read in the template with the handle handle, and returns it as a hash array. The hash array has the template's attributes as its keys, and the attributes' values as its values.
The readalltemps method will be used to discover the filename corresponding to the template handle, unless the ALLTEMPS variable has been set already. Filenames with relative paths are assumed to be in the ROADS source directory.
An alternative invocation of readtemplate, this allows the programmer to specify the filename which the template should be loaded from.
guts/alltemps - list of template handle to filename mappings.
source - actual templates themselves
The alltemps file is line structured, with a separate template's entry on each line. The fields are :-
The handle of the template
The filename of the template
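The handle to filename mapping, including the rule that relative filenames live in the source directory, could be sketched as below. The field separator is assumed to be whitespace, which the description above does not confirm, and parse_alltemps and the example paths are hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Map template handles to filenames, resolving relative filenames
# against the source directory. Assumes whitespace-separated fields.
sub parse_alltemps {
    my ($source_dir, @lines) = @_;
    my %alltemps;
    foreach my $line (@lines) {
        my ($handle, $file) = split ' ', $line;
        next unless defined $handle && defined $file;
        $file = File::Spec->catfile($source_dir, $file)
            unless File::Spec->file_name_is_absolute($file);
        $alltemps{$handle} = $file;
    }
    return %alltemps;
}

my %ALLTEMPS = parse_alltemps("/roads/source", "XDOM01 XDOM01\n", "XDOM02 /tmp/XDOM02\n");
print "$_ => $ALLTEMPS{$_}\n" foreach sort keys %ALLTEMPS;
```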
We tend to assume that the template handle and the filename will be the same. This will have to change when we move to a hierarchical source directory structure - which we'll have to do in order to scale ROADS to large numbers of records ?
the manual page for bin/addsl.pl, the manual page for
bin/addwn.pl, the manual page for bin/cullsl.pl, the
manual page for bin/deindex.pl, the manual page for
bin/rebuild.pl, the manual page for cgi-bin/suggest.pl,
the manual page for cgi-bin/search.pl, the manual page for
admin-cgi/admin.pl, the manual page for
admin-cgi/mktemp.pl
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
Jon Knight <jon@net.lut.ac.uk>
ROADS::Render - A class to render HTML outlines + variable substitutions
    use ROADS::Render;
    # Do a WHOIS++ search or three...
    render($query, $view, @results);
This class defines a mechanism for rendering WHOIS++ templates as HTML - or other formats, though HTML is the primary goal.
render( query, view, @results )
The WHOIS++ query which generated these results
The view to use when rendering the results - many of the ROADS tools which generate HTML support multiple versions or 'views' of the same data using different HTML rendering rules.
This is a list of results in the format produced by the wppd code in the ROADS::WPPC class.
config/multilingual/*/scriptname/noconnect.html - HTML returned when connection to server couldn't be established.
config/multilingual/*/scriptname/nohits.html - HTML returned when there were no hits for a given query.
config/multilingual/*/scriptname-views - directory containing alternative views of rendering.
Each view is actually a directory. Views typically consist of
HTML (or whatever...) to return for beginning of page.
HTML (or whatever...) to return for end of page.
Default rendering rules
The following additional custom rendering outlines are available :-
where format is the name of the WHOIS++ response format. This lets you treat, for example, referrals to other servers differently from regular records. You could include a special logo for referrals, for instance.
where serverhandle is the server handle of one of the servers you expect to get results back from. This lets you give all results which come from this particular server their own custom HTML.
where handle is the handle of a resource which you would like to have rendered differently from all the other resources. If you have a few 'stand out' resources this could be a good way of drawing attention to them.
where serverhandle and format are as before. This lets you treat (for example) referrals from this server differently to regular records.
where template_type is the template type of the record. This lets you render different types of records in different ways - e.g. to display a picture icon beside an image.
This lets you customize down to the level of particular types of template from particular servers.
Phew! This lets you customize down to the level of an individual template handle, from a particular server, of a particular type and response format :-)
You probably won't need to tangle with this stuff early on (if ever?) but we've tried to build plenty of flexibility in so that if you do want to use it you can get some quite dramatic results with a minimum of effort. Check out the technical guide for more information about customizing HTML rendering.
This specifies a substitution pattern (see below) which is to be executed for each instance of the attribute specified in its right hand side. The format is
<FOREACH "[default value]"> [substitution pattern]
where the text marked
Replaced with the URL of the WWW server, formed from $ROADS::MyHostname and $ROADS::MyPortNumber.
Replaced with $query, the WHOIS++ query.
Replaced with $ROADS::ServiceName.
Attributes from the template being rendered may be referred to by placing an @ sign in front of the attribute name, e.g. @KEYWORDS refers to the Keywords attribute. If this does not occur within a FOREACH tag, only the first occurrence of the attribute's value will be used.
The format of these references is
<@[attributename] "[default value]">
e.g.
<em>Keywords:</em><br> <@KEYWORDS "no keywords supplied">
Outline HTML files are found in config/search-views. Note that you can only effect substitution for a single attribute per line of your HTML outline file at the moment.
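To make the attribute reference syntax concrete, here is a minimal sketch of the substitution step for a single outline line. It is not the ROADS::Render code: it assumes the template is a hash keyed on lower-case attribute names, it ignores FOREACH, and unlike the real renderer it replaces every reference it finds on the line. The name substitute is hypothetical.

```perl
use strict;
use warnings;

# Replace <@ATTR "default"> references in one outline line with the
# attribute's value from the template, or with the default text if
# the attribute is missing or empty.
sub substitute {
    my ($line, %template) = @_;
    $line =~ s{<\@([\w-]+)\s+"([^"]*)">}{
        my ($attr, $default) = (lc $1, $2);
        defined $template{$attr} && length $template{$attr}
            ? $template{$attr} : $default;
    }ge;
    return $line;
}

my $outline = '<em>Keywords:</em><br> <@KEYWORDS "no keywords supplied">';
print substitute($outline, keywords => "whois, metadata"), "\n";
print substitute($outline), "\n";
```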
The integration of hard coded HTML for the template editor and some aspects of the subject/what's new listings is sub-optimal - to say the least!
admin-cgi and cgi-bin programs, the manual page for
bin/addsl.pl, the manual page for bin/addwn.pl, the
manual page for bin/cullsl.pl, the manual page for
bin/cullwn.pl.
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>
ROADS::WPPC - A class to talk to WHOIS++ servers
    use ROADS::WPPC;
    @results = wppc($host, $port, $request);
    foreach $I (@results) {
      print $I, ": ", $TEMPLATE{$I}, "\n";
    }
This class implements a simple WHOIS++ client which returns raw WHOIS++ responses in a hash array with global scope, together with an index of matching templates and server handles.
Invoking this method causes the following to happen :-
The format of the per-template information returned by wppc will be one of the following :-
serverhandle:handle - if the response is a simple template.
serverhandle:referral - if the response is a referral.
localcount:NN - if the response is a COUNT template containing hit count information.
referralcount:NN - if the response is a COUNT template containing hit count information.
noconnect - if the server couldn't be contacted.
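Code which walks the list returned by wppc has to tell these forms apart. One possible way of doing so is sketched below. It assumes that "referral", "localcount", "referralcount" and "noconnect" appear literally as shown, and that no server handle happens to be called localcount or referralcount. The name classify_result is hypothetical.

```perl
use strict;
use warnings;

# Classify one entry from the list returned by wppc, returning the
# kind of entry followed by whatever details it carries.
sub classify_result {
    my ($entry) = @_;
    return ("noconnect")         if $entry eq "noconnect";
    return ("localcount", $1)    if $entry =~ /^localcount:(\d+)$/;
    return ("referralcount", $1) if $entry =~ /^referralcount:(\d+)$/;
    return ("referral", $1)      if $entry =~ /^(.+):referral$/;
    return ("template", $1, $2)  if $entry =~ /^([^:]+):(.+)$/;
    return ("unknown");
}

foreach my $entry ("MULTICS:xdom01", "MULTICS:referral", "localcount:12", "noconnect") {
    print join(" ", classify_result($entry)), "\n";
}
```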
The format of the individual templates, as indexed by the serverhandle:handle notation, is as they appear on the wire when sent from the WHOIS++ server back to the client. This means that any code which processes them will need to do some extra work to get at individual fields in the templates. This will change in a future version of the ROADS software.
Here's a sample on-the-wire WHOIS++ record :-
    # FULL DOCUMENT MULTICS xdom01
     Title: cross domain search test
     Description: this is really just a test
     URI: http://www.roads.lut.ac.uk/
    # END
Note that it will not include the variant suffixes, since these are generally not used in WHOIS++ implementations. Note also that the four fields on the first line of the record correspond to :-
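The "extra work" needed to get at individual fields might look something like the following sketch. It reads the four fields on the first line as response format, template type, server handle and handle, which is how the WHOIS++ FULL response format is laid out in RFC 1835; that reading is an assumption here rather than something this page spells out. The name parse_record and the upper-case keys used for the header fields are hypothetical, and repeated attributes simply overwrite one another.

```perl
use strict;
use warnings;

# Unpack one on-the-wire WHOIS++ record into a hash. Header fields
# are stored under upper-case keys and attributes under lower-case
# keys, so the two cannot collide.
sub parse_record {
    my ($raw) = @_;
    my %record;
    foreach my $line (split /\r?\n/, $raw) {
        last if $line =~ /^#\s*END\b/;
        if ($line =~ /^#\s*(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$/) {
            @record{qw(FORMAT TEMPLATE_TYPE SERVERHANDLE HANDLE)} = ($1, $2, $3, $4);
        } elsif ($line =~ /^\s*([\w-]+):\s*(.*?)\s*$/) {
            $record{lc $1} = $2;
        }
    }
    return %record;
}

my $raw = "# FULL DOCUMENT MULTICS xdom01\n"
        . " Title: cross domain search test\n"
        . " Description: this is really just a test\n"
        . " URI: http://www.roads.lut.ac.uk/\n"
        . "# END\n";
my %record = parse_record($raw);
print "$record{SERVERHANDLE}:$record{HANDLE} - $record{title}\n";
```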
config/multilingual/*/lib/toomanyhits.html - if the number of hits was so great that it exceeded a pre-defined administrative upper limit.
We shouldn't be trying to do HTML rendering in this code. We also shouldn't be trying to return a list of templates (and other things - all mushed up together!) and also poking around in the shared global namespace at the same time. Searches and search results probably ought to be objects in their own right, with search results being comprised of metadata objects.
It would be neat if we could open connections to multiple servers and use select()/poll() to divide our time between them. Currently we're limited to contacting servers strictly in series :-(
the manual page for admin-cgi/admin.pl, the manual page for
admin-cgi/lookupcluster.pl, the manual page for
cgi-bin/search.pl
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
Martin Hamilton <martinh@gnu.org>