ROADS Software
Information Gateways
ROADS News
ROADS Liaison
InterOperability
Template Registry
Cataloguing Guidelines
What is ROADS
ROADS Guidebooks
Papers and Reports
Related Initiatives
Resource Organisation And Discovery in Subject-based Services

ROADS

ROADS API: Informal requirements

April to May 1998

The API sessions have now been completed...

The API sessions took place, via IRC, on Wednesday 6th and Thursday 7th of May, 1998.

There are two summaries, produced by Martin, of the sessions:

What is the API?

The API is the interface that sits between the basic body of ROADS software and
  1. the cataloguing interface (otherwise known as the user interface, the user here being the cataloguer, or as the template editor)
  2. other external databases and applications, such as SQL databases
By providing "hooks" in the API, people can write software/modules that can use the ROADS software more easily.

Where is the IRC ROADS channel?

You can download the software to do an IRC session from: http://www.mirc.ac.uk

The ROADS channel runs at:
Server: maple.ilrt.bris.ac.uk
Port: 8282
Channel: #roads
...there is often someone from ROADS and/or a gateway hanging around on that channel. If you want to experiment with this, then send an email to roads-liaison@bris.ac.uk.


API comments and notes: Martin Hamilton

Martin Hamilton wrote some notes before the first IRC session regarding the API

Firstly, let's think about what we want (or might want) APIs for...

1. Reading in a template, for those developing their own code to manipulate ROADS templates
2. Writing out a template (ditto)
3. Template creation
4. Template updating
5. Template deletion
6. Reindexing the database
7. Rebuilding subject listings
8. Rebuilding what's new listings
9. Plugging in alternative search backends to replace the built-in ROADS WHOIS++ server's "database"
10. Plugging in modules to do query expansion, e.g. using a thesaurus to find similar terms
11. Plugging in external back ends to the ROADS WWW based template editor

... before moving on to ...

12. Future developments

This is destined to be a new chapter in the ROADS manual, after some modifications, but I want to put it forward for discussion at the ROADS APIs meetings first. I'd contend that we already have "APIs" of one sort or another for all of the above - with one notable exception which I'll leave until the end!

In the text which follows, references to $ROADS:: refer to variables defined within the ROADS run-time configuration, typically found in lib/ROADS.pm if running ROADS from a single top level directory. See the ROADS manual for more information on this.

1. Reading in a template

We provide a library routine readtemplate in the ROADS::ReadTemplate module for this.

- From its embedded POD documentation:

  use ROADS::ReadTemplate;

  %MYTEMP = readtemplate("XDOM01");
  print $MYTEMP{title}, "\n";

2. Writing out a template

We provide no specific API for writing out a template. This is partly because different ROADS tools process templates in different ways, and partly because writing out a template is a trivial operation:
  Write a properly formatted template to $ROADS::IafaSource

After writing a template out to disk, you will probably want to rebuild the ROADS database index. If you are not planning to rebuild the index immediately you may prefer to write the template out to $ROADS::Guts/pending, which is a holding area for templates that have been added but not entered into the ROADS software proper. See 6. for more information on both of these.

Note that it is advisable for your template to contain at least one URI attribute, URI-v1:, and that it must contain as its first two attributes Template-Type: and Handle:. Templates should follow the WHOIS++ schema definitions in config/outlines if they are to be correctly processed by all of the ROADS tools.
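
As a minimal sketch of the attribute ordering described above, the following assumes that a template is stored as one "Attribute: value" line per attribute. The helper name format_template, the template type SERVICE and the sample values are illustrative assumptions, not part of the ROADS distribution.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of ROADS): serialise a template so that
# Template-Type: and Handle: come first, followed by the other attributes.
# Assumes a simple one-line "Attribute: value" layout.
sub format_template {
    my ($type, $handle, @pairs) = @_;    # @pairs = (name => value, ...)
    my $text = "Template-Type: $type\nHandle: $handle\n";
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        $text .= "$name: $value\n";
    }
    return $text;
}

# Sample values only; a real template would carry real attribute data.
my $template = format_template(
    "SERVICE", "XDOM01",
    "Title"  => "Example resource",
    "URI-v1" => "http://www.example.org/",
);
print $template;
```

In a real installation the resulting text would be written to a file under $ROADS::IafaSource (or $ROADS::Guts/pending) rather than printed.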

3. Template creation

We provide no specific API for creating a template. If you want to create a template, follow the procedure outlined in 2. Note that it's advisable for the template's filename and handle (in the Handle: attribute) to match.

Since there is a danger that other processes may attempt to update templates which you are in the process of modifying, it's advisable to create the file $ROADS::Guts/mktemp.lock and use flock to lock it. This will cause any instances of mktemp.pl (the ROADS WWW based template editor) to block until the file is unlocked again.

For example:

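    use Fcntl qw(:flock);    # symbolic names for the flock constants

    open(MKTEMPLOCK,">>$ROADS::Guts/mktemp.lock")
        or die "Cannot open mktemp.lock: $!";
    flock(MKTEMPLOCK,LOCK_EX);    # exclusive lock, blocks until granted
    # do stuff here
    flock(MKTEMPLOCK,LOCK_UN);    # release the lock
    close(MKTEMPLOCK);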

4. Template updating

Combine the procedures for reading templates into memory and writing them out, as outlined in 1. and 2.
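
The read, modify and write cycle might be sketched as follows. To keep the example self-contained it uses a small stand-in parser instead of readtemplate, and it assumes one-line "Attribute: value" entries. The names parse_template and update_attribute are hypothetical.

```perl
use strict;
use warnings;

# Stand-in parser for illustration only; real code would call readtemplate.
# Assumes one-line "Attribute: value" entries and keeps attribute order.
sub parse_template {
    my ($text) = @_;
    my @pairs;
    for my $line (split /\n/, $text) {
        next unless $line =~ /^([^:]+):\s*(.*)$/;
        push @pairs, [$1, $2];
    }
    return @pairs;
}

# Replace the value of one attribute, leaving the others untouched.
sub update_attribute {
    my ($name, $value, @pairs) = @_;
    for my $pair (@pairs) {
        $pair->[1] = $value if lc $pair->[0] eq lc $name;
    }
    return @pairs;
}

my $old   = "Template-Type: SERVICE\nHandle: XDOM01\nTitle: Old title\n";
my @pairs = update_attribute("Title", "New title", parse_template($old));
my $new   = join "", map { "$_->[0]: $_->[1]\n" } @pairs;
print $new;
```

When the update touches a real template file, it is sensible to hold the mktemp.lock lock described in 3. for the duration of the change.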

5. Template deletion

We have provided a routine, readalltemps, defined in the ROADS::ReadTemplate module (see 1.) - this reads a list of all the template handles and corresponding filenames from $ROADS::Guts/alltemps and stores it in an associative array. Your program can use this to figure out the filename associated with a given template handle, e.g. from its embedded POD documentation:

  use ROADS::ReadTemplate;

  %ALLTEMPS = readalltemps();
  # find filename corresponding to template with handle XDOM01
  $filename = $ALLTEMPS{"XDOM01"};
  # ... and delete it
  unlink($filename);

Since references to the template may still exist in other places, e.g. the ROADS database index and subject or what's new listings, we recommend that you also reindex your database - as per 6.
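
A slightly more defensive variant of the deletion step is sketched below. It assumes only that you hold a handle-to-filename hash of the kind readalltemps returns; the helper name delete_template is hypothetical, and a temporary file stands in for a real template.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: delete the file behind a handle, given a
# handle-to-filename hash like the one readalltemps returns.
# Returns 1 on success, 0 if the handle is unknown or unlink fails.
sub delete_template {
    my ($handle, %alltemps) = @_;
    my $filename = $alltemps{$handle};
    return 0 unless defined $filename && -e $filename;
    return unlink($filename) ? 1 : 0;
}

# Demonstration with a throwaway file standing in for a real template.
my ($fh, $path) = tempfile();
close $fh;
my %ALLTEMPS = (XDOM01 => $path);

my $deleted = delete_template("XDOM01", %ALLTEMPS);
my $missing = delete_template("NOSUCH", %ALLTEMPS);
print "deleted=$deleted missing=$missing\n";
```

Checking the lookup first avoids calling unlink with an undefined filename when the handle is not listed.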

6. Reindexing the database

We provide a program bin/mkinv.pl which may be used to reindex the ROADS database. From its embedded POD documentation:

  # index all templates
  mkinv.pl -a

  # ... or just index the listed templates
  mkinv.pl XDOM01 XDOM02 XDOM03

Note that running this program on its own will not regenerate subject and what's new listings. We also provide a program bin/rebuild.pl which acts as a front end to the process of building the main database index, subject listings and what's new listings. This is intended to be called periodically as a cron job, e.g. on a nightly basis. From its embedded POD documentation:

  # rebuild all database, what's new and subject listing entries
  # incorporating pending templates 
  rebuild.pl -a -p -S CrossROADS -W Default

7. Rebuilding subject listings

We provide a program bin/addsl.pl, which may be used to rebuild just ROADS subject listing breakdowns. For example (from the ROADS manual):

  addsl.pl -ai

This causes the default subject listing view to be regenerated for all of the templates in the database, ignoring timestamps. addsl.pl has a number of other options - see its embedded POD documentation for more information.

In addition, bin/rebuild.pl (see 6.) may be used to build the main ROADS database, subject listings and what's new listings - all in one go.

8. Rebuilding what's new listings

We provide a program bin/addwn.pl, which may be used to rebuild just ROADS what's new listing breakdowns. For example (from the ROADS manual):

  addwn.pl -a

This causes the default what's new view to be regenerated for all of the templates in the database. addwn.pl has a number of other options - see its embedded POD documentation for more information.

In addition, bin/rebuild.pl (see 6.) may be used to build the main ROADS database, subject listings and what's new listings - all in one go.

9. Plugging in alternative search backends

The ROADS WHOIS++ server supports a CGI like feature which we call the WHOIS++ Gateway Interface (or WGI :-). WGI is documented separately - see the ROADS manual for its specification. This may be used to bolt on an alternative search engine, but you will still have to write your own code to parse the WHOIS++ request and turn it into something your search engine can understand. This is necessary because of the wide variety of search engines (and search engine APIs!) in existence.

To use your own WGI backend, simply add a line to lib/ROADS.pm telling bin/wppd.pl (the ROADS WHOIS++ server) where to find it, e.g.

  $WGIPath = "/roads/local/my_search.pl";

In practice, running wppd.pl will be overkill for most people, since all we are doing is listening on a given TCP port for an incoming connection, forking a new process to service the client, retrieving a query from it - and then handing control over to the WGI program. If your environment allows it (most development tools nowadays seem to come with Internet integration!) you may prefer to write your own WHOIS++ server. WHOIS++ (defined in RFC 1835) is a very, very simple protocol.
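
The listen, fork and hand-over cycle described above can be sketched in a few lines of Perl. This is not wppd.pl and not a conforming WHOIS++ server: the response framing that RFC 1835 requires is left out, and the backend is a plain Perl callback standing in for the WGI program, whose real calling convention is defined in the WGI specification. The names serve_client and run_server are hypothetical.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Read one query line from the client and pass it to a backend callback.
# The callback stands in for the WGI program; its calling convention here
# is an assumption, not the WGI specification.
sub serve_client {
    my ($in, $out, $backend) = @_;
    my $query = <$in>;
    return unless defined $query;
    $query =~ s/\r?\n\z//;
    print {$out} $backend->($query);
}

# Accept connections on a TCP port and fork one child per client.
sub run_server {
    my ($port, $backend) = @_;
    local $SIG{CHLD} = 'IGNORE';    # let the system reap finished children
    my $listener = IO::Socket::INET->new(
        LocalPort => $port, Listen => 5, ReuseAddr => 1,
    ) or die "Cannot listen on port $port: $!";
    while (my $client = $listener->accept) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {            # child: service this client, then exit
            serve_client($client, $client, $backend);
            close $client;
            exit 0;
        }
        close $client;              # parent: go back to accepting
    }
}

# Exercise the per-client logic without opening a socket.
my $request = "roads\r\n";
open(my $in, '<', \$request) or die $!;
my $reply = '';
open(my $out, '>', \$reply) or die $!;
serve_client($in, $out, sub { "received: $_[0]\n" });
close $out;
print $reply;
```

Calling run_server with a port number and a callback would start the accept loop; it is left uncalled here so that the sketch terminates.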

10. Plugging in modules to do query expansion

As opposed to replacing the whole of the ROADS search engine, you can also use WGI for query expansion - in this case, the WGI program will be called for each search term supplied in a query, and is expected to return a list of words rather than a WHOIS++ format response.

To use this feature, simply add a line to lib/ROADS.pm telling bin/wppd.pl where to find your query expansion code, e.g.

  $WGIThesaurus = "/roads/local/my_expand.pl";

Note that forking to run an external program for each search term of each query does add significantly to the overhead on your machine!
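
A query expansion program could be as small as the sketch below. How wppd.pl passes the search term to the program and reads the words back is defined by the WGI specification; here it is assumed, purely for illustration, that the term arrives as the first command-line argument and that the words are printed one per line. The thesaurus contents and the name expand_term are hypothetical.

```perl
use strict;
use warnings;

# Toy thesaurus; a real one would be loaded from a file or database.
my %THESAURUS = (
    car   => [qw(automobile vehicle)],
    paper => [qw(article report)],
);

# Return the original term followed by any related terms (no duplicates).
sub expand_term {
    my ($term) = @_;
    my %seen;
    return grep { !$seen{$_}++ } ($term, @{ $THESAURUS{lc $term} || [] });
}

# Assumption: the term arrives as the first command-line argument and the
# expansion is printed one word per line. Check the WGI specification for
# the real calling convention.
my $term = @ARGV ? $ARGV[0] : "car";
print "$_\n" for expand_term($term);
```

Keeping the thesaurus in memory makes each lookup cheap, but it does not remove the cost of starting a new process for every search term.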

11. Plugging in external template editor backends

The ROADS WWW based template editor normally follows the procedure outlined in the previous sections of this document, but you can override this with your own choice of program to run to add or delete a template from the database. These are defined as, respectively, ExtDBAdd and ExtDBDel in lib/ROADS.pm.

Note that these programs should expect to receive the details of the template handle being added/deleted, and the filename to be operated on, as environment variables - $HANDLE and $IAFAFILE respectively. When updating an existing template, the WWW based template editor will try to delete the old copy of the template before adding the new one.
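
The start of an ExtDBAdd program might look like the sketch below. Only the variable names HANDLE and IAFAFILE come from the description above; the helper name template_args, the sample handle and the sample path are hypothetical, and the step that loads the template into an external database is left as a comment.

```perl
use strict;
use warnings;

# Pull the handle and template filename out of an environment hash.
# HANDLE and IAFAFILE are the variable names given above; everything else
# in this sketch is illustrative.
sub template_args {
    my (%env) = @_;
    for my $name (qw(HANDLE IAFAFILE)) {
        die "$name is not set\n"
            unless defined $env{$name} && length $env{$name};
    }
    return ($env{HANDLE}, $env{IAFAFILE});
}

# Fall back to sample values so the sketch can be run on its own.
my %env = (HANDLE => "XDOM01", IAFAFILE => "/tmp/XDOM01", %ENV);
my ($handle, $file) = template_args(%env);
print "add template $handle from $file\n";
# A real ExtDBAdd program would now load $file into the external database.
```

An ExtDBDel program would read the same two variables and remove the corresponding record instead.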

12. Future developments

We're constrained by our management committee's decision that ROADS version 3 should not incorporate major changes - the original plan had been for version 3 to be a complete rewrite of sections of the ROADS codebase. This is now going to be a separate activity...

[From John Kirriemuir: note - this is due to ROADS not wishing to release code not tested enough close to the end of the project funding, which is currently December of this year. However - the future funding situation for ROADS appears to be quite bright, and so we're optimistic of carrying on development and other work into 1999 - see the posting on the RDN mailing list].

It would be a useful and non-disruptive addition to the code base for us to have a "writetemplate" function analogous to "readtemplate". We currently have a number of pieces of code which independently create templates, which this could replace.

A programmatic API for rebuilding the database and/or HTMLized listings might seem attractive at first glance, but essentially we have this already in the form of the mkinv.pl, rebuild.pl, addsl.pl and addwn.pl programs. Most of the common code in these tools has (as of ROADS version 2) been incorporated into library routines in any case. My vote (if I have one!) would be for leaving this alone.

Many people are interested in using SQL based systems (we'll include people running Microsoft Access here) to hold their ROADS databases. We have the problem of finding a common schema and/or defining mappings between schemas in use locally and template types/attributes. Ideally we could provide a single ROADS backend for people using disparate SQL systems and use the Perl DBI module to paper over the cracks - we can't reasonably support N slightly different SQL backends (where N is the number of ROADS installations using SQL). A discussion would be useful here.

Jon has created an unofficial extension to the WHOIS++ protocol to support the editing of templates via WHOIS++ - so that the Tk version of the ROADS template editor can be run on a different machine from the one which is hosting the ROADS server. I propose that we make this a feature of the ROADS "API" if it is successful, so that we have a protocol level mechanism for creating and updating templates.


Requirements and comments collected so far...

  1. // index_template(handle)
    should return an index_term_struct, where
    index_term_struct = ((doc_type,attr,value)+)

  2. boolean init_database();

  3. boolean close_database();

  4. boolean add_to_database(handle){
    index_term_struct=index_template(handle);
    // add terms to database
    }

  5. boolean delete_from_database(handle){
    index_term_struct=index_template(handle);
    // remove stuff from database
    }

  6. result_set=search_database(query_struct){
    // if we're talking relational databases, then
    // query_struct should be readily hackable to an SQL query:
    // a bit abstract, not the raw whois++ query.
    // result_set={handle*}, null => no result.
    // specify the format of result set
    }

  7. The distribution version of ROADS could have a db_api.pm; this could then be directly replaced by a local db_api.pm

  8. Enable us to create/manipulate ROADS templates/data using MS Access (hooks to MS Access database)

  9. Enable volunteers (at other sites) to complete templates and either send these in to us or add these templates themselves to the ROADS database

  10. Enable us to use foreign characters in our templates: eg. Greek and Russian

  11. The current backend API (WGI) is read-only - it's a way of bolting a Whois++ server onto an arbitrary backend database. I guess there has been/will be some talk about making this interface read-write (the Whois++ based API mentioned above?) to allow the ROADS cataloguer tools to access the database? I must admit, this sounds less useful and perhaps not necessary? In the main, where WGI is used to bolt Whois++ onto a database there will already be (non-ROADS) tools for creating records. Or am I missing the point?

  12. A search against a ROADS database using search.pl always returns HTML. Although the format of this HTML is highly configurable, I suspect that it would be difficult or impossible to return something other than HTML - RDF for example? Would it be possible to allow system administrators a formalised way of plugging in their own output routines, to make it easier to return formats other than HTML, similar to the way they can replace the current ranking code for example?

  13. [A more comment-oriented reply] First of all, I do not think you need to create an API. The software works just fine. Any software you create will only satisfy 80% of anybody's need; there will always be something that somebody wants that the software does not do. You are battling something I call "creeping feature-itis." Furthermore, once you create an API, it will only be available to programmer types because it will be implemented as a programming library called from another programming language like Perl or C. Based on my knowledge of the ROADS software an API whose purpose is to provide hooks to other database applications is not necessary, but it would be nice.

    That said, it would be nice if ROADS provided more direct means of adding, editing, reporting on, and deleting records from an SQL database. The WHOIS++ templates are nice and simple. Any database application/programming language combination worth their weight in salt can export these templates. That is how I created ALEX. What is especially nice about the SQL database solution is portability. Once records are saved in a relational database, it will be a trivial task to export them to any number of meta-data formats: MARC, GILS, SOIF, WHOIS++, etc.

    Consequently, your API may need to include functions for adding, editing, and deleting records in a relational database. The Perl library called DBI goes a long way to doing much of this for you. As your fellow ROADS workers know, MySQL may very well be a good SQL database to build on in combination with the DBI library.

    Since I have not used the template editor very much, I do not know a lot about its limitations. I do realize that everybody's list of fields to include in their databases would vary from site to site. If you wanted to get tricky, you could create an HTML form that would allow you to define the fields for your database. The form would then run a script that would create the database, define the fields, and create a data-entry form. If you did this then you would essentially be creating a complete Perl front-end to a database application. I doubt you want that.

    In short, here are some suggestions:

    1. modify the system so it uses a relational SQL database like mSQL or MySQL
    2. allow the librarian to create the structure of the SQL database through an HTML form
    3. create a module that allows data-entry into the database based on the structure defined in Step #2
    4. add a module that reads the SQL database and outputs WHOIS++ templates and alternatively other meta-data formats

    Frankly, I wonder at the need of any of this. Again, the API will be designed for programmers. If an institution has programmers at their disposal, then the programmers will already have figured out how to put things into their database, provide a web-based front-end for data-entry, and then output templates. From there the programmers will use the ROADS software to index the templates and provide access to them.
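
Requirements 1 to 7 above describe a small, replaceable database module. To make the shape of those calls concrete, here is a minimal in-memory sketch. It is not an existing db_api.pm: the data structures, the "attr=value" index keys, the form of the query structure and the sample templates are all assumptions made for illustration, and a local version could replace the hashes with calls to an SQL database.

```perl
use strict;
use warnings;

# In-memory stand-ins for the requested calls. %TEMPLATES plays the part
# of the template files; %INDEX maps "attr=value" to a set of handles.
my %TEMPLATES = (
    XDOM01 => { type => "SERVICE", title => "roads", keywords => "gateway" },
    XDOM02 => { type => "SERVICE", title => "whois", keywords => "gateway" },
);
my %INDEX;

sub init_database  { %INDEX = (); return 1 }
sub close_database { return 1 }

# Returns a list of [doc_type, attr, value] triples for one template.
sub index_template {
    my ($handle) = @_;
    my $t = $TEMPLATES{$handle} or return;
    return map  { [ $t->{type}, $_, lc $t->{$_} ] }
           grep { $_ ne "type" } sort keys %$t;
}

sub add_to_database {
    my ($handle) = @_;
    my @terms = index_template($handle) or return 0;
    $INDEX{"$_->[1]=$_->[2]"}{$handle} = 1 for @terms;
    return 1;
}

sub delete_from_database {
    my ($handle) = @_;
    my @terms = index_template($handle) or return 0;
    delete $INDEX{"$_->[1]=$_->[2]"}{$handle} for @terms;
    return 1;
}

# $query is a hash ref such as { attr => "title", value => "roads" };
# returns a sorted list of matching handles (empty list means no result).
sub search_database {
    my ($query) = @_;
    my $hits = $INDEX{"$query->{attr}=" . lc $query->{value}} || {};
    return sort keys %$hits;
}

init_database();
add_to_database($_) for sort keys %TEMPLATES;
print join(",", search_database({ attr => "keywords", value => "gateway" })), "\n";
delete_from_database("XDOM02");
print join(",", search_database({ attr => "keywords", value => "gateway" })), "\n";
close_database();
```

Keeping the query as an attribute/value structure rather than a raw WHOIS++ query is what would let a local implementation translate it into SQL.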

