cgi-bin/redirect.pl - CGI program to redirect client while logging access
cgi-bin/redirect.pl
redirect.pl is a Common Gateway Interface (CGI) program which takes a URL as its parameter and tries to redirect its HTTP client to this URL, whilst at the same time logging the URL which is being redirected to. This provides a simple way of logging the accesses to resources which are being catalogued using the ROADS software, and can in fact be used for this purpose with any URL.
Redirected URLs are normally logged to the file redirect-hits, in the ROADS logs directory, though if the program is called using another name, this will be reflected in the filename prefix, e.g. wibble-hits if the program is launched as wibble.
Each redirected URL is logged on a line of its own.
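The logging behaviour described above can be sketched as follows. This is a minimal illustration in Python rather than the program's actual Perl code; the function names hits_log_name and log_redirect are hypothetical, and how an nph- prefix affects the log file name is not covered here because the text does not say.

```python
import os


def hits_log_name(program_path):
    """Derive the log file name from the name the program was launched as.

    redirect.pl -> redirect-hits, wibble -> wibble-hits (as described above).
    """
    base = os.path.basename(program_path)
    prefix, _extension = os.path.splitext(base)
    return prefix + "-hits"


def log_redirect(logs_dir, program_path, url):
    """Append one URL per line to the hits file and return the file's path."""
    if "\r" in url or "\n" in url:
        # Keep the "one URL per line" property of the log intact.
        raise ValueError("URL must not contain line breaks")
    path = os.path.join(logs_dir, hits_log_name(program_path))
    with open(path, "a", encoding="utf-8") as log:
        log.write(url + "\n")
    return path
```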
Depending on the HTTP server you use, it may be necessary to run this program as nph-redirect.pl rather than redirect.pl. Some HTTP servers (e.g. Apache versions post 1.2) automatically detect when a CGI program creates its own HTTP headers, and others require use of the nph- naming convention to indicate a program which will do this.
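To illustrate the difference, here is a hedged Python sketch of the headers a redirecting CGI program might emit in each mode. The helper names is_nph and redirect_response are hypothetical and are not taken from redirect.pl; the choice of HTTP/1.0 and a 302 status is an assumption.

```python
import os


def is_nph(program_path):
    """True if the program was installed under the nph- naming convention."""
    return os.path.basename(program_path).startswith("nph-")


def redirect_response(url, nph=False):
    """Return the raw header text needed to redirect the client to url."""
    if "\r" in url or "\n" in url:
        raise ValueError("URL must not contain line breaks")
    if nph:
        # Non-parsed headers: the program supplies the HTTP status line itself.
        lines = ["HTTP/1.0 302 Found", "Location: " + url]
    else:
        # Parsed headers: the server turns the CGI Location header into a redirect.
        lines = ["Location: " + url]
    return "\r\n".join(lines) + "\r\n\r\n"
```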
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
Jon Knight <jon@net.lut.ac.uk>
cgi-bin/search.pl - user/admin CGI front end to ROADS search
cgi-bin/search.pl [-C charset] [-L language] [-d] [-f form] [-l logfile] [-o protocols] [-u url] [-v view] [-w waylay_url]
aka admin.pl
The search.pl program is a Common Gateway Interface (CGI) program used to provide an end user search front end to ROADS databases. When accessed with no CGI query, the program can return an HTML form to the user to fill in to make a query. This form can be customized by the ROADS administrator and can include a number of options.
When the ROADS software is installed, a symbolic link to the program is made from the ROADS admin-cgi directory under the name admin.pl. You may find that following symbolic links is disabled by default on your server for security reasons, though this can usually be overridden on a per directory basis. We used to actually copy cgi-bin/search.pl over to admin-cgi/admin.pl, but this made maintenance unnecessarily complex.
It is desirable to differentiate between the search program running as an admin user (who will be able to edit, create and delete records) and the search program running as an end user (who will only be able to search for and view records). This differentiation is done in practice by checking the name by which the program was invoked.
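A check of this kind might look like the following Python sketch. The function name run_mode is hypothetical, and treating every name other than admin as end user mode is an assumption made for the illustration.

```python
import os


def run_mode(program_path):
    """Choose admin or end user behaviour from the invocation name.

    program_path is the name the program was invoked under (for example the
    symbolic link admin.pl), not the file the link points to.
    """
    name, _extension = os.path.splitext(os.path.basename(program_path))
    return "admin" if name == "admin" else "search"
```

For example, run_mode("/roads/admin-cgi/admin.pl") gives "admin", while run_mode("/roads/cgi-bin/search.pl") gives "search".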
The ROADS software comes with its own search subsystem, which is capable of dealing with small to medium size databases of tens of thousands of records. This consists of a Common Gateway Interface (CGI) based WWW front end, and as the back end, a WHOIS++ server which uses a simple inverted index. Whilst using our WHOIS++ implementation has benefits for distributed searching, it's not essential that you use this - e.g. we also provide tools to convert your ROADS data into a variety of other formats, such as the Summary Object Interchange Format (SOIF) used by Harvest and Glimpse, the Generic Record Syntax (GRS-1) format used by some Z39.50 servers, and the input format used by Bunyip's Digger WHOIS++ server.
The basic model for searching using the ROADS software is as follows:
In addition, it is possible to constrain a search to a particular WHOIS++ server, search case sensitively or insensitively, display only the titles of the results, rank the results according to relevance, use stemming to match other similar words in addition to the ones supplied in the query, and perform query expansion using a thesaurus. All but the last of these options are configurable at search time, whereas the thesaurus support is either enabled or disabled for searches as a whole.
The default ROADS configuration assumes that you are going to run a single WHOIS++ server to make your ROADS database available for searching. A side effect of the use of WHOIS++ in searching is that it is also possible to search other WHOIS++ servers. You can tell your ROADS installation about other WHOIS++ servers by editing the file config/databases, to add their names and addresses.
With the default configuration we ship, the names of the other WHOIS++ servers your ROADS installation knows about will appear on the HTML returned by the ROADS search tool search.pl. You may wish to alter the HTML outline for this page to list only those WHOIS++ servers you want to make visible to end users. The end user can choose to have their search directed to some or all of these, and search.pl will combine the results and present them in the form of HTML. See also the manual page for bin/wig.pl for a more advanced way of searching across multiple servers using centroids.
Note that whilst you may be able to see multiple WHOIS++ servers using the admin search tool admin.pl, you can only edit ROADS database entries which are held locally.
Within the ROADS search subsystem there are a number of possibilities for local customisation (without modifying any code), substitution of locally written code for individual modules of the search subsystem, and enabling or disabling search features:
It is assumed that you will not want to make all of the information in your templates visible to the world at large. The attributes which can be searched on and the information which appears when a template is rendered into HTML are limited to those attributes and templates which are listed in the file config/search-restrict under the top level ROADS installation directory. The admin.pl program has its own list of restrictions in the file config/admin-restrict.
The defaults shipped have entries for a small subset of the attributes which may be found in the DOCUMENT, SERVICE and USER templates. If you want your users to be able to search on or see the contents of any other templates, you will need to add them to one or both of these lists. More information is provided on the admin.pl manual page and the search.pl manual page.
-C charset - Character set to use.
-L language - Language to use.
-d - Whether to run in debug mode or not; the default is not.
-f form - The default HTML form to return to the end user.
-l logfile - Log file to record search requests and results in.
-o protocols - Protocols to override using the waylay.pl program.
-u url - The URL of this program.
-v view - The search results view to use.
-w waylay_url - The URL of the waylay.pl program. See its documentation for more information.
There are a number of inputs that the form must have for the program to execute correctly; these are listed below. Note that the end user need not necessarily be presented with these on their browser if an input type of "hidden" is used.
It is important to note that there are two ways of composing queries. One way is to use a simple text entry box query, and the other is to use up to three attribute/value pairs, e.g. attrib1 and term1 would comprise one attribute/value pair. In the HTML form which the user fills in to generate a query, the attributes, the values, or even both, may be generated using a combination of HTML elements such as drop down lists and text entry boxes. This can be used to provide (for example) a way of selecting the attribute to search on using an HTML SELECT menu, or to constrain the value being searched for similarly.
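As a rough illustration of the two styles, the Python sketch below builds a query string from a dictionary of submitted form values. Several things here are assumptions rather than documented behaviour: the variable name query for the free text box, giving the free text query precedence, and joining the attribute/value pairs with "and". Only the names attrib1 and term1 (and, by extension, the numbered variants up to three) come from the text above.

```python
def build_query(form):
    """Build a query string from submitted CGI variables (a dict of strings)."""
    # Assumption: a non-empty free text query is used as is.
    free_text = form.get("query", "").strip()
    if free_text:
        return free_text

    clauses = []
    for n in (1, 2, 3):
        attrib = form.get("attrib%d" % n, "").strip()
        term = form.get("term%d" % n, "").strip()
        # Only complete attribute/value pairs contribute to the query.
        if attrib and term:
            clauses.append("%s=%s" % (attrib, term))
    # Assumption: pairs are combined with a logical "and".
    return " and ".join(clauses)
```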
When constructing the query out of attribute/value pairs, these variables
are the attributes corresponding to the terms term
When constructing the query out of a combination of attrib
This is a Boolean variable that specifies whether a search should be case sensitive or not. The value "on" specifies that the search should take notice of the case of the terms, any other value (or none at all) implies that the search will be case insensitive.
The character set to use.
This is a CGI variable that allows the database(s) that are to be searched for the query in this form to be specified. A fake database name of "ALL" tells the search.pl program to search through all the databases it knows about.
This is a Boolean variable which specifies whether the search.pl program should operate in debug mode - in debug mode it generates copious extra HTML documenting its progress.
The HTML form to return to the end user if no query is supplied. The default form is search.html. This will be the name of a file in the config/multilingual/*/search/ directory, or the config/multilingual/*/admin/ directory.
This is a Boolean variable that specifies whether a search should return headlines instead of full template descriptions. It is included for compatibility with previous versions of ROADS, and actually has the effect of setting the results "view" to "headlines".
This is a Boolean variable which specifies whether matches for the original query should be highlighted (rendered in bold) in the search results.
The language to use.
This is the query as entered by the user. This will typically be a text
input element in the form. See also the CGI variables admin
This is a Boolean variable which specifies whether the results should be ranked into order, based on the frequency with which the words in the query occur in the records which were returned as a result of the search.
A Boolean variable specifying whether or not the search.pl program should follow referrals generated in the process of carrying out a WHOIS++ search.
This is a Boolean variable which indicates to search.pl whether the query terms should be stemmed when searching the database. The ROADS software currently implements the Porter stemming algorithm, with hooks for user supplied stemming or thesaurus lookup. If the value "on" is returned, the software will use stemming, otherwise the search terms will be used as is.
This CGI variable permits the end user or ROADS administrator to limit the returned resources down to those that are in an IAFA template of the specified type. A special template type of "ALL" is understood by search.pl to mean all template types. All the template types should be in upper case.
When constructing the query out of attribute/value pairs, these fields are
the values corresponding to the attributes attrib
The name of a "view" to use when rendering the search results into HTML. The default view is "default". This will be the name of a subdirectory of config/multilingual/*/search-views/ or of config/multilingual/*/admin-views/.
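The ranking option described above orders results by how often the query words occur in the returned records. A minimal Python sketch of that idea follows; it is not the ROADS implementation, and the function name rank_records, the whitespace tokenising of the query, and the case-insensitive matching are all assumptions.

```python
import re


def rank_records(query, records):
    """Return records (plain text strings) ordered by query word frequency."""
    words = [word.lower() for word in query.split()]

    def score(text):
        tokens = re.findall(r"\w+", text.lower())
        # Total number of occurrences of any query word in this record.
        return sum(tokens.count(word) for word in words)

    # sorted() is stable, so records with equal scores keep their original order.
    return sorted(records, key=score, reverse=True)
```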
config/databases - known WHOIS++ servers.
config/protocols - protocols to override using waylay.pl.
config/multilingual/*/search/nohits.html - HTML sent to end user when the search produces no hits.
config/multilingual/*/search/noconnect.html - HTML sent to end user when the WHOIS++ server couldn't be contacted.
config/multilingual/*/search/nosearchterm.html - HTML sent to end user when no search term is supplied.
config/multilingual/*/search/search.html - default HTML form sent to end user when no query is specified.
config/multilingual/*/search/syntax.html - default HTML form sent to end user when no query is specified.
config/multilingual/*/search-views/* - HTML rendering rules for search results.
logs/search-hits - searches carried out and result details.
All of the search and search-views files and directories have admin and admin-views equivalents when the program is run as admin.pl.
The format of the search-hits and admin-hits logfiles is as per the WWW Common Log File format :-
If domain name lookups enabled on HTTP server or IP address.
as returned by AUTH/IDENT lookup if enabled on the HTTP server.
as provided by HTTP authentication, if authentication is required by the HTTP server configuration.
i.e. hits resulting from local records on the WHOIS++ servers being queried.
i.e. hits resulting from referrals sent back by the WHOIS++ servers being queried.
This file can be used to assess which terms are being searched for most frequently, how many searches are not matching anything in the available database and other statistics which may provide useful feedback to the ROADS administrator.
the manual page for admin-cgi/mktemp.pl, the manual page for
cgi-bin/tempbyhand.pl
Martin Hamilton <martinh@gnu.org>, Jon Knight <jon@net.lut.ac.uk>
cgi-bin/suggest.pl - suggest a resource for inclusion in the database
cgi-bin/suggest.pl [-C charset] [-f form] [-L language] [-u myurl]
The suggest.pl program is a Common Gateway Interface (CGI) program run from an HTTP daemon. This is a cut down equivalent of the regular ROADS template editor intended for use by end users. It simply renders the HTML form suggest.html (by default) to the end user and, once the form is submitted, returns any fields on the form whose names are prefixed by SUGGEST in an email message to the ROADS database administrator.
It is necessary to include a field called SUGGESTurl on the form, since use of this is hard coded into the program.
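The behaviour described above can be sketched in Python as follows. This is only an illustration: the function name build_suggestion, the sender address, the subject line, and the decision to raise an error when SUGGESTurl is missing are assumptions, not details of suggest.pl.

```python
from email.message import EmailMessage


def build_suggestion(fields, admin_address, sender="www@localhost"):
    """Build the mail message for a submitted suggestion form.

    fields maps form field names to their submitted values.
    """
    if not fields.get("SUGGESTurl"):
        raise ValueError("the form must supply a SUGGESTurl field")

    # Only fields whose names start with SUGGEST are passed on.
    suggested = {k: v for k, v in fields.items() if k.startswith("SUGGEST")}

    message = EmailMessage()
    message["To"] = admin_address
    message["From"] = sender
    message["Subject"] = "Suggested resource: " + suggested["SUGGESTurl"]
    body = "".join("%s: %s\n" % (k, suggested[k]) for k in sorted(suggested))
    message.set_content(body)
    return message
```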
These options are only practically useful for debugging.
-C charset - Character set to use.
-f form - HTML form to return to end user.
-L language - Language to use.
-u myurl - URL of the suggest.pl program.
Character set to use.
HTML form to return to end user. Note that only alphanumeric characters will be used.
Language to use.
config/multilingual/*/suggest/done.html - message returned to end user when template submitted.
config/multilingual/*/suggest/mailerror.html - message returned to end user if mail message couldn't be sent.
config/multilingual/*/suggest/suggest.html - default HTML form returned to end user.
the manual page for admin-cgi/mktemp.pl
Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>
cgi-bin/survey.pl - dump out a questionnaire and record the results
cgi-bin/survey.pl [-C charset] [-f form] [-L language] [-r resultsfile] [-u url]
The survey.pl program is a Common Gateway Interface (CGI) program used to provide a survey form to end users of a ROADS based subject service. The survey form is an HTML file that is presented to the end user by the program if it does not receive any CGI parameters. This form can include multiple choice questions, selections and questions requiring free text answers. When this form is submitted back to the program, the values of the CGI variables are saved in comma separated value format. This result file can then be processed offline by other programs to analyze the survey's returns. The link to the survey.pl program can be made from any of the other HTML in the ROADS system, such as that returned by the addsl.pl, addwn.pl and search.pl programs.
The survey.pl program reads the HTML form for the survey from a file normally called survey.html. This file can contain any HTML. However, within the FORM there is an extra "fake" HTML tag that is required. This is the X-HANDLE tag, which is replaced by a unique handle when the program returns the form to the end user. This fake tag MUST be present in the form as each IAFA-like template must have a unique handle.
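A hedged Python sketch of the handle substitution follows. It assumes that the fake tag is written as <X-HANDLE> in the form, that the tag is replaced by the bare handle text, and that a random UUID is an acceptable unique handle; none of these details are stated above, and the function name fill_handle is hypothetical.

```python
import re
import uuid

# Assumption: the fake tag appears as <X-HANDLE> (any letter case).
HANDLE_TAG = re.compile(r"<X-HANDLE\s*/?>", re.IGNORECASE)


def fill_handle(form_html, handle=None):
    """Replace the X-HANDLE tag in the survey form with a unique handle."""
    if handle is None:
        handle = uuid.uuid4().hex
    if not HANDLE_TAG.search(form_html):
        raise ValueError("survey form is missing the X-HANDLE tag")
    # A function replacement avoids any special treatment of backslashes.
    return HANDLE_TAG.sub(lambda match: handle, form_html)
```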
It is also necessary to list all of the form fields which should be stored in the survey results file, in the order in which they should appear in the file. This list should be stored in the hidden form field list, using commas to separate the fields.
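To show how the hidden list field drives the layout of the results file, here is a small Python sketch using the standard csv module. The function names survey_row and append_result are hypothetical, and writing an empty value for a field that was not submitted is an assumption.

```python
import csv


def survey_row(fields):
    """Return the submitted values in the order named by the 'list' field."""
    names = [name.strip() for name in fields.get("list", "").split(",")]
    # Assumption: fields that were not submitted are stored as empty values.
    return [fields.get(name, "") for name in names if name]


def append_result(out, fields):
    """Write one survey response as a comma separated line to a file object."""
    csv.writer(out).writerow(survey_row(fields))
```

When writing to a real file, the csv module's documentation recommends opening it with newline="" so that line endings are not translated twice.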
These options are only practically useful for debugging.
-C charset - Character set to use.
-f form - HTML form to return to end user.
-L language - Language to use.
-r resultsfile - File to save survey results to.
-u url - URL of the survey.pl program.
Character set to use.
HTML form to return to end user. Note that only alphanumeric characters will be used.
Language to use.
logs/survey-results - the survey results themselves.
config/multilingual/*/survey/survey.html - default HTML form to return to end user.
config/multilingual/*/survey/done.html - HTML message to return to end user once the session is complete.
In addition to the fields specified on the WWW form, the following fields will also be saved for each entry in the survey log :-
* The time in UTC (GMT)
* The HTTP user agent if available
* The client machine's IP address
* The value returned by the IDENT/AUTH server on the remote machine if available
* The domain name of the remote machine if available
Typically domain name and IDENT lookups have to be configured on the WWW server which is running the survey.pl program. Note also that some browsers will withhold HTTP user agent information.
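A CGI program normally obtains these details from the environment variables set by the HTTP server. The Python sketch below gathers them using the standard CGI variable names (HTTP_USER_AGENT, REMOTE_ADDR, REMOTE_IDENT, REMOTE_HOST); the function name request_details, the dictionary keys and the timestamp format are assumptions and do not reflect the layout of the survey log.

```python
import os
import time


def request_details(environ=None, now=None):
    """Collect the per-request details listed above, using '' when unavailable."""
    environ = os.environ if environ is None else environ
    now = time.time() if now is None else now
    return {
        "time_utc": time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now)),
        "user_agent": environ.get("HTTP_USER_AGENT", ""),
        "ip_address": environ.get("REMOTE_ADDR", ""),
        # Only set when the server performs IDENT lookups.
        "ident": environ.get("REMOTE_IDENT", ""),
        # Only set when the server performs domain name lookups.
        "hostname": environ.get("REMOTE_HOST", ""),
    }
```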
the manual page for bin/addsl.pl, the manual page for
bin/addwn.pl, the manual page for cgi-bin/search.pl
Jon Knight <jon@net.lut.ac.uk>
cgi-bin/tempbyhand.pl - given a template handle, render it as HTML
cgi-bin/tempbyhand.pl [-d databases] [-C charset] [-f form] [-L language] [-o protocols] [-u url] [-w waylay_url] [-v view]
The tempbyhand.pl program is a Common Gateway Interface (CGI) program used to return a template to an end user given the handle of the template. It is called from search.pl and admin.pl to display the full details from a template when the user has selected the titles only option.
tempbyhand.pl actually carries out a WHOIS++ search behind the scenes.
These options are intended for debugging use only.
-d databases - File containing list of databases to use.
-C charset - The character set to use.
-f form - HTML form to return to the end user if no handle to look up is supplied.
-L language - The language to use.
-o protocols - URL protocol schemes to override, e.g. wais. See the manual page for cgi-bin/waylay.pl for more information.
-u url - The URL of this program, if not passed as CGI variable; the default is tempbyhand.pl in the nominated CGI executables directory.
-w waylay_url - The URL of the program to use when "waylaying" URLs which are odd or unusual. See the manual page for cgi-bin/waylay.pl for more information.
The tempbyhand.pl program uses two CGI parameters to determine which template to display:
Character set to use.
This is the name of the database from which the template is to be retrieved. This may be a local or remote WHOIS++ database, but it must be listed by name in the file config/databases.
HTML form to return to the end user.
Language to use.
This is a WHOIS++ query to send, usually simply "handle=", followed by the handle of the template to display.
config/databases - list of servers and databases.
config/multilingual/*/tempbyhand/baddbase.html - the database requested couldn't be found.
config/multilingual/*/tempbyhand/nohandle.html - the handle requested couldn't be found.
config/multilingual/*/tempbyhand/noconnect.html - the WHOIS++ server couldn't be contacted.
config/multilingual/*/tempbyhand/nohits.html - there were no hits in the WHOIS++ server.
config/multilingual/*/tempbyhand/tempbyhand.html - the HTML form returned by default to the end user when no handle is supplied.
config/multilingual/*/tempbyhand-views/* - HTML rendering rules for lookup results.
the manual page for admin-cgi/admin.pl, the manual page for
cgi-bin/search.pl
Jon Knight <jon@net.lut.ac.uk>, Martin Hamilton <martinh@gnu.org>
cgi-bin/waylay.pl - generate a page of HTML for a given URL protocol scheme
cgi-bin/waylay.pl [-C charset] [-L language] [-u myurl]
This Perl program is intended for use when rendering ROADS database records which make use of unusual or bizarre protocol schemes. For example, the wais protocol scheme is not widely supported, and the mailto protocol scheme has no attached semantics to tell the user whether they are going to be dealing with a human being or a software agent.
It is assumed that the software which generates URLs referencing this program knows whether or not there is a page of HTML which may be used to describe the protocol scheme in question.
These options are intended for debugging use only.
-C charset - The character set to use.
-L language - The language to return.
-u myurl - The URL of this program, if not passed as CGI variable; the default is waylay.pl in the nominated CGI executables directory.
The character set to use.
The language to return.
The URL of this program - default is waylay.pl in the nominated CGI executables directory.
config/protocols - list of protocol schemes to waylay and message files to return
config/multilingual/*/waylay/*.html - per-protocol explanations
The file config/protocols is line oriented, with two fields on each line, colon delimited:
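A file of this shape could be read with a few lines of Python, as in the sketch below. It assumes that the first field is the protocol scheme and the second is the name of the message file to return, which matches the one-line description of config/protocols given earlier but is not spelled out field by field. Skipping blank lines and lines starting with # is also an assumption, and the function names are hypothetical.

```python
from urllib.parse import urlsplit


def parse_protocols(text):
    """Parse colon delimited 'scheme:message file' lines into a dictionary."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        scheme, separator, message_file = line.partition(":")
        if not separator:
            continue  # not a two-field line
        table[scheme.strip().lower()] = message_file.strip()
    return table


def message_for(url, table):
    """Return the message file for a URL's scheme, or None if not waylaid."""
    return table.get(urlsplit(url).scheme.lower())
```

For example, with the (made up) contents "wais:wais.html" the lookup message_for("wais://example.org/db", table) would give "wais.html", while an http URL would give None.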
Martin Hamilton <martinh@gnu.org>