| jon | I'm happy with point 1) (no major new features).. Any no's out there?
|
| ecdb | Does 'creating a record' count as a major new item?
|
| martin | We do do it already :-)
|
| ecdb | via API?
|
| jon | I don't think so. It's more of a repackaging of what's already there. I'd taken "no major new features" to mean no new functionality AND an API for it.
|
| ecdb | OK.
|
| jon | Am I right in thinking that (seeing as I was in Brum yesterday!)
|
| jon | Is everyone happy that there are no major new features requiring special APIs?
|
| martin | Yep!
|
| Bob | yes
|
| jasper | uh-huh
|
| jon | Right, then the second conclusion from yesterday was that some of the external record creation/indexing/removal code be packaged up as Perl modules.
|
| jon | Is this OK by everyone?
|
| martin | Yep!
|
| Bob | yes
|
| ecdb | Yep.
|
| ecdb | 'some of the'? Not all of it?
|
| martin | The details are below...
|
| jon | Just the three things in the web page - creation/indexing/removal.
|
| ecdb | Yep, so we're saying that all creation/indexing/removal will go through the modules?
|
| jon | I guess so. Seems a bit pointless otherwise.
|
| martin | :-)
|
| ecdb | (that's why I was puzzled by 'some of the ...' above.)
|
| martin | Don't get hung up on the language - it was just written in a hurry
|
| Bob | just to be pedantic, indexing means taking a template and getting a data structure that is then fed, by whatever mechanism, into the persistent storage system.
|
| ecdb | No problem!
|
| jon | OK "all of" - I was cribbing the line "Some existing functionality in external Perl programs <blah> <blah>"
|
| ecdb | While we're on the topic of pedantry... could we move to saying "record" consistently instead of sometimes "record" and sometimes "template"?
|
| jon | Bob - we can discuss the details of what indexing means later in the meeting.
|
| Bob | ok
|
| jon | Dan - this is a historical hangup but yep, sounds reasonable. Maybe we should use "template" for the unfilled in outlines and "record" for the filled in version
|
| jon | Martin and I are bound to slip in meetings/IRC/phone calls though as we're so used to using them interchangeably
|
| martin | I've replaced the couple of references in the future tense to ROADS templates on the agenda page with ROADS objects
|
| ecdb | Will this be a pain for you in the code? eg. readtemplate()?
|
| martin | s/readtemplate/readobject/ :-)
|
| ecdb | :-)
|
| jon | OK, so I think we agree that all record creation, indexing (whatever that turns out to mean) and removal will be packaged up as perl modules.
|
| martin | Yep!
|
| jon | Onto the last conclusion - that we'll basically clean up the code to get rid of global variables where possible and
|
| jon | reach an agreement on the standard parameters and returns from the API calls/
|
| Bob | umm, i went yesterday just as dan and martin were moving into the areas of support for wppd.
|
| jon | s/\//\./ :-) for the programmers out there
|
| martin | We didn't really move on much further - the agenda is mostly me trying to rationalise how we'd actually do what we discussed on IRC yesterday
|
| martin | do == write code to do :-)
|
| ecdb | As for getting rid of global variables... will that help with getting ROADS running under fastCGI type setups?
|
| martin | Yep!
|
| ecdb | kewl
|
| jon | The globals are mostly due to my sloppy coding/the fact that the ROADS tools have often evolved (mutated?) beyond their original aims
|
| jon | So does that sound OK to people?
|
| martin | Yep!
|
| jasper | sounds good
|
| jon | Right, that seems to move us onto section 3 in the agenda...
|
| jon | OK, the first suggestion is that we have an open database routine
|
| Bob | just a sec, what database, template or index?
|
| jon | Not really too much use with the current ROADS backend but as Martin says, it could well be useful for forming connections to SQL databases
|
| martin | Bob - It was your suggestion :-)
|
| Bob | my ideas were all oriented to wppd/sql interaction for faster data retrieval
|
| Bob | most of the talk yesterday seemed focused on the template side of things, and missed the wppd side
|
| martin | See the "Perl visualization" section later on in the agenda for how I'm picturing this getting used in practice
|
| jon | The fact that you can currently see the ROADS index as separate files from the actual source files is an "implementation detail"...
|
| jon | ...it's all still one database surely?
|
| jon | To think in SQL for a moment....
|
| Bob | if you say so... not convinced
|
| Bob | i think of them as separate
|
| martin | The rationale for "opening the database" specifically is that with session oriented backends you need to get a handle at the start of the session, and will most likely need to pass it to other functions which interact with the database backend.
|
| jon | our library OPAC all exists inside a single Sybase SQL database. That's the MARC data (equivalent of our records) and the indexes (equivalent of our indexes in ROADS/guts)
|
| martin | s/need/may need/ :-)
|
| jon | Does that make sense to people (or am I talking silly?)
|
| martin | I should mention in passing, in case it's not clear, that the things after the dashes "-" in the bullet points for section 3 are what we do at the moment (where appropriate)
|
| ecdb | jon -makes sense to me
|
| Bob | I just think of them as separate as I can envisage having a go at the code to handle templates at a different time to the code for the index
|
| Bob | still, looks like i'm on my own here
|
| ecdb | yesterday, there seemed to be some talk about splitting out the indexing from the main datastore, eg. keeping records in ascii files but indexing using some relational table somehow
|
| jon | That's still an implementation detail. The fact is that the records (ie filled in templates) + the indexing (how ever it is done) is what I'd call a database for this discussion
|
| ecdb | OK.
|
| jon | Of course, you might all disagree with that "fact"
|
| martin | No :-)
|
| Bob | 'for this discussion' then
|
| jon | OK then, in that case does an "open the database" API routine sound reasonable (it does to me - I tended to think of SybPerl's dbopen routines when I read that)
|
| martin | Yep!
|
| jasper | yes
|
| Bob | yes
|
| jon | Rightio, I think we can take it as read that the close database is OK and move onto the get a list of record handles in the database
|
| martin | We didn't explicitly discuss this yesterday - but it's something that a lot of the ROADS tools need
|
| Bob | seems to be a useful function
|
| jon | This is currently done using ROADS::ReadTemplate::readalltemps() - obviously if you plug in Brand X database and use that to store the records you'll need to have some way of retrieving all the handles
|
| ecdb | Oh, i've been using `ls source/` ;-)
|
| martin | tsk!
|
| jon | sort of like (in SQL): select distinct(HANDLES) from RECORD
|
| ecdb | Yes, it would be useful.
|
| Bob | lucky you, I get arg list too long, ...
|
| jasper | me too
|
| Bob | something's gone wrong if you want distinct(handles), but that's the idea
|
| martin | cut -f1 -d' ' guts/alltemps
|
| jon | OK, seeing as we're talking about inputs and outputs later on, can we say that an API to get all the handles is a good idea and move on to...
|
| jon | ...reading a record
|
| jon | That's currently done with ROADS::ReadTemplate::readtemplate()
|
| jon | I guess in the new naming scheme that's going to need to be something like:
|
| ecdb | Soon to be ROADS:ReadRecord:readrecord() ?
|
| martin | :-)
|
| jon | What Dan said... :-)
|
| martin | Or ROADS::API::Record::read()
|
| jon | Or better yet, what Martin said. :-) :-)
|
| Bob | ok so far
|
| jon | Which I guess means that we'll also want a ROADS::API::Record::write()
|
| jon | right?
|
| martin | Yep!
|
| Bob | yup
|
| jasper | yip
|
| ecdb | Are we voting on this? (ie. yep)
|
| John_k | Yes
|
| jon | Now whether this writes out to an ASCII file as we currently do or sticks it into some un-indexed SQL table or BLOB or whatever is again an implementation detail in my mind. Am I right?
|
| martin | Yep!
|
| Bob | yes, a mere trifle
|
| John_k | That is an ecumenical matter
|
| ecdb | Are we talking about a "record" object with a write() method? or a write function thats passed a record?
|
| Bob | now that is a detail for 'ron
|
| ecdb | ok
|
| jon | We can discuss that in the input/output part of the meeting later.
|
| jon | OK then, next - an API call to index record. Something like ROADS::API::Record::index(). I guess this is time to look at your definition of indexing again Bob.
|
| jon | To my mind, indexing is the operation of generating some (implementation defined) data structures that allow you to rapidly locate records matching a user's query
|
| ecdb | Sounds ok to me.
|
| jon | Now, I'd say that it would be "persistent" in that the index would remain intact after the Perl code that created it has exited....
|
| Bob | this comes down to my confusion if the index and the templates are 'the same database' or not
|
| jon | ... but that it isn't persistent in the "be around forever" sort of persistence.
|
| martin | If you only use the ROADS::API code to read/write/index/... your records, it doesn't matter whether they're the same database
|
| martin | Bear in mind that you might - or might not - want to store the objects in the same database you use for indexing and searching
|
| jon | OK, in which case I take it that a ROADS::API::Record::index() is OK by everyone?
|
| martin | yep!
|
| Bob | yes, i'm just trying to reorient my thoughts
|
| Bob | but yes, need an index function or class or whatever
|
| ecdb | yep. I think. having the index implemented in one system and the data stored / cached elsewhere should be OK within this framework. If anyone wants to do that.
|
| martin | Think Glimpse or freeWAIS
|
| jon | Dan - it's one of those cool "implementation details" that as API designers we don't need to worry too much about (as long as the API doesn't go out of its way to make it impossible to do!)
|
| jon | Right, assuming that the need for an indexing call is OK, I think we can move onto the next one: delete a record.
|
| ecdb | ROADS::API::KillEmAll()
|
| jon | Now this does make me think - remove the record AND the index or just the index data or...?
|
| ecdb | Deleting gets my vote.
|
| jon | So I'd like to propose this:
|
| jon | ROADS::API::Record::unindex() and...
|
| jon | ROADS::API::Record::delete()
|
| martin | I'll go for that!
|
| jon | The late would do the former and delete the actual data
|
| jon | s/late/later/
|
| jasper | yep
|
| ecdb | Yep
|
| jon | Most people would probably only use ROADS::API::Record::delete() most of the time, but the other is there in case it's needed
|
| Bob | im still thinking this
|
| Bob | i'm not sure
|
| Bob | I think that the operation delete template has two distinct elements
|
| jasper | Hmm.. shouldn't we keep the two methods separate?
|
| Bob | the template element and the index element, should the functions operating on different bits of the same ( :-) ) database
|
| jon | Jasper - what you mean have ROADS::API::Record::delete() NOT unindex the record? I'm easy on that one.
|
| Bob | be coded within the same function
|
| jon | Bob - I don't think I quite follow you. Could you explain some more please?
|
| Bob | deletefromindex(handle) and deletefromtemplate(handle)
|
| ecdb | I'd prefer both functions to exist, and have a convenience function that did both
|
| Bob | i dont think the delete a template function should have an implicit call to delete the keywords from the index
|
| jon | Ah right... well that's fine by me - any objections?
|
| Bob | is the computer science jargon 'cohesion'
|
| martin | :-)
|
| martin | Can we wrap this up with - you might want to store a template independently of indexing it, e.g. if it hadn't been approved, and just been submitted by a trusted info provider ?
|
| martin | Ditto removal from storage, removal from index
|
| Bob | nice example
|
| jon | Hang on a mo though - deleting a record without un-indexing it can break things surely?
|
| martin | where deleting == deleting from "storage"
|
| Bob | sure, but if i'm writing some code to replace your functions,. thats my problem?
|
| jon | whatever "storage" turns out to be in a particular implementation
|
| Bob | as long as your base distribution does the right thing
|
| jon | Fair enough, but it does seem a little dangerous to give the API built in semantics that can nuke a database's consistency.
|
| jon | And what if someone else later uses your code without realising that the record removal doesn't un-index?
|
| jon | Still, if people are happy with that danger I don't mind pressing on...
|
| Bob | they'll have some fun then won't they :-)
|
| martin | Our base distribution will call deindex/unindex/... what have you when deleting a record :-)
|
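The split discussed above (separate calls for removing a record from storage and from the index, plus a convenience call that does both) could be sketched roughly as follows. This is only a sketch under stated assumptions: the in-memory %store and %index hashes stand in for whatever backend is in use, and the names unindex_record, delete_record and remove_record are hypothetical, not the agreed ROADS::API names.

```perl
use strict;
use warnings;

# Stand-ins for a real backend (assumption):
#   %store maps handle => record, %index maps term => { handle => 1 }.
my %store = ( H1 => { Title => 'Example' }, H2 => { Title => 'Other example' } );
my %index = ( example => { H1 => 1, H2 => 1 }, other => { H2 => 1 } );

# Remove every index entry for a handle, leaving the stored record alone.
sub unindex_record {
    my ($handle) = @_;
    for my $term ( keys %index ) {
        delete $index{$term}{$handle};
        delete $index{$term} unless %{ $index{$term} };    # drop empty terms
    }
    return 1;
}

# Remove the stored record only; the index is deliberately not touched.
sub delete_record {
    my ($handle) = @_;
    return 0 unless exists $store{$handle};
    delete $store{$handle};
    return 1;
}

# Convenience call: unindex first, then delete, so the two stay consistent.
sub remove_record {
    my ($handle) = @_;
    unindex_record($handle);
    return delete_record($handle);
}

remove_record('H2');
print join( ',', sort keys %store ), "\n";    # remaining handles
print join( ',', sort keys %index ), "\n";    # remaining index terms
```

Keeping the two low-level calls separate matches the "store without indexing" case Martin describes, while the convenience call is what a base distribution would normally use.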
| jon | Other opinions on this before we move on?
|
| jon | OK, lets move on to performing a search. I think we can safely say that most people will want this, right?
|
| martin | We don't have to thrash out absolutely everything today - but it would be good to get through as much as possible while we're focussed on what we want
|
| martin | PS Searching is good for me!
|
| ecdb | Searching's good. Are we talking about an API for pulling handles out of the database or distributed whois++y searching. The former I'd have thought.
|
| martin | ROADS::Index::index_search()
|
| martin | i.e. searching a local "database"
|
| Bob | is this a detail, or, will the search function provide a protocol neutral result stream, and will the roads SW format this to a wppd format
|
| ecdb | Yep. Is the proposal that we treat the backend as something that understands whois++ queries? ie. just send in w++ formatted query
|
| ecdb | (+ what Bob said)
|
| martin | that's an input/output issue - can we deal with it later ?!
|
| Bob | ok then
|
| ecdb | OK. Last thing on searching: what about referrals? Is this something we consider all backends as potentially capable of doing?
|
| Bob | not mine
|
| ecdb | or is the forward knowledge stuff separate?
|
| martin | a) that's outputs, and b) see sample Perl at end of agenda !
|
| Bob | but the wppd on front may
|
| ecdb | OK. Section4 then...
|
| jon | Up to you I would guess, but the API should allow it. As Martin said we can discuss that later.
|
| jon | OK, to speed things up in Section 4, can I just put out each headline and then you all yell out "OK" if its fine or discuss it otherwise?
|
| martin | OK
|
| Bob | OK
|
| jasper | ok
|
| jon | open database
|
| ecdb | OK
|
| martin | Yep, er OK
|
| ecdb | Sounds handy
|
| John_k | ok
|
| jasper | ok
|
| Bob | local or remote databases important here?
|
| Bob | as this is inputs, thinking about arguments
|
| Bob | or is this trying to be too clever (or too silly) by half
|
| ecdb | guess it could be remote. Could you use URIs to identify (eg whois++://omni:8327/ )
|
| jon | I was going to say local but Martin is just convincing me that it should be both. In which case I guess we need to have a structure of some sort to pass hostname, port, protocol, etc
|
| ecdb | Howabout just identify the database with URI, and let ROADS throw an exception if it can't talk to what you've asked for.
|
| jon | URI might be a good idea in some cases but what about SQL databases? Sybase://liba.lut.ac.uk/DB=prod_talis/ ??
|
| Bob | nick odbc style ideas
|
| Bob | as jons example
|
| martin | The intention when writing this document was that it should be for opening a back end database, and that you should use ROADS::WPPC::wppc if you wanted to talk WHOIS++ to someone. It might, however, be useful to be able to wrap up other things in a single API... Can we concentrate on the backend database for now, though ?
|
| martin | Bearing in mind that ideally ROADS::WPPC will become Net::WHOIS++ or something like eventually....
|
| ecdb | OK. So there's no issue of identifying which database as there's only one per installation?
|
| jon | Bearing in mind of course that the world being what it is, a "backend" database might not actually be on the machine that the Perl code is running on.
|
| ecdb | yep
|
| martin | I'd like us to be able to ship support for multiple contributed backends, e.g. Glimpse, freeWAIS, ...
|
| martin | And for people to be able to choose in say ROADS.pm which one to use
|
| ecdb | SWISH-E is half finished - need to do something about fielded searching...
|
| martin | SQL type people might need to pass user names and passwords too
|
| ecdb | How about... each ROADS installation has a set of convenience names for backends it knows about
|
| martin | My reading is that we probably couldn't reach closure on arguments today, but we can protect ourselves with the Database => $database type approach (see Perl fragments below)
|
| martin | below == below in the agenda :-)
|
| ecdb | and per-backend configuration files appropriate to the type of backend we're dealing with.
|
| jon | SQL people will also need to pass database names as a single SQL server typically hosts multiple independent databases.
|
| ecdb | I agree.
|
| jon | Sounds nice and flexible to me. I guess the proof of the pudding is in the eating though... :-)
|
| ecdb | So... give ROADS a notion of named backends "GRAPEVINESQLBACKEND" and leave it to the implementors to handle the various other parameters associated with that backend type and that particular database within the backend.
|
| martin | Something like that!
|
| Bob | yep,
|
| ecdb | This is what we've done with grapevine more or less. An adhoc config file mapping fieldnames etc.
|
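The "named backends" idea above could look something like the sketch below. Everything here is an assumption for illustration: the %BACKENDS table, its keys (type, server, database, user, directory), the host name sql.example.org and the open_database function are all hypothetical. Only the name GRAPEVINESQLBACKEND and the Database => $database calling style come from the discussion.

```perl
use strict;
use warnings;

# Hypothetical per-installation table of named backends. In practice this
# might be loaded from per-backend configuration files.
my %BACKENDS = (
    GRAPEVINESQLBACKEND => {
        type     => 'sql',
        server   => 'sql.example.org',
        database => 'grapevine',
        user     => 'roads',
    },
    LOCALFILES => { type => 'files', directory => 'source' },
);

# Look up a backend by its convenience name, using named arguments so
# that further parameters can be added later without breaking callers.
sub open_database {
    my (%args) = @_;
    my $name = $args{Database};
    return { ok => 0, error => 'no Database argument given' }
        unless defined $name;
    my $conf = $BACKENDS{$name}
        or return { ok => 0, error => "unknown backend: $name" };
    return { ok => 1, error => '', name => $name, %$conf };
}

my $db  = open_database( Database => 'GRAPEVINESQLBACKEND' );
my $msg = $db->{ok}
    ? "opened $db->{name} ($db->{type})"
    : "failed: $db->{error}";
print "$msg\n";
```

Callers only ever see the convenience name; user names, passwords and SQL database names stay in the per-backend configuration.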
| ecdb | Shall we move on?
|
| Bob | yes
|
| jon | OK, lets move on. close database inputs.
|
| jasper | ok
|
| martin | OK
|
| jon | OK
|
| Bob | any assumption for SQL users about commit and stuff
|
| jon | I assume a $database->close(); will commit any unterminated transactions if it can or return a failure code.
|
| ecdb | up to them I'd guess. if their notion of writing a template involves several stages, they can wrap it in SQLese to make it an atomic operation.
|
| Bob | i'm just flagging thoughts here
|
| Bob | next point?
|
| jon | read record
|
| ecdb | I don't like the filename bit.
|
| ecdb | Is there a need for it?
|
| martin | Seemed like a good idea at the time :-)
|
| martin | This is what ROADS::readtemplate::readtemplate() currently does, though we might not want to advertise it
|
| Bob | as this is an api, handle is enuf, then the local implementor can do it to read a file if wanted
|
| ecdb | Can we have a quick vote? Does anyone feel that we need to say it's a filename? (of course it could be, behind the scenes)
|
| Bob | no
|
| jasper | no
|
| jon | I'd go along with Bob - the filename was a hack that made life easier for us but we can work round it.
|
| ecdb | no
|
| jon | (that's a no folks)
|
| martin | In the current implementation the filename is a separate argument which you can supply. If you don't, the template with the specified handle will have its handle mapped to a filename in guts/alltemps.
|
| martin | That's a long winded way of agreeing with everyone :-)
|
| ecdb | OK. What about writerecord... could you sketch how variants and clusters work? and how attribute ordering happens?
|
| jon | OK then, onto: write record
|
| martin | Dan - it's up to the backend implementor to decide how to store the object
|
| ecdb | Yep. So there's no ordering of attributes other than what you know separately from knowing the outline files
|
| jasper | what happens if the hash doesn't contain elements that are already in the record?
|
| jon | In our case I'd say that the basic code from mktemp.pl would just use the config/outlines to determine the attribute ordering (not that it really matters)
|
| Bob | just a thought, we have not thought about the template structure, martin has a possible assoc.
|
| Bob | ie the representation within the api
|
| martin | I've got two socks :-)
|
| Bob | same color?
|
| martin | Actually, yes!
|
| jon | Colour even.
|
| martin | Can we discuss object structure "later" ?
|
| Bob | ok
|
| Bob | non-linear or what
|
| jon | OK then folks - are we OK to go onto: index record?
|
| martin | One thing - Jasper - I'd say that an update of an object *should* overwrite an existing one, so any existing bits which aren't in the updated version *should* be lost
|
| jasper | ok
|
| ecdb | Seems simple enough to remember
|
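The overwrite rule Martin describes (an update replaces the stored object, so attributes missing from the update are lost) can be illustrated with a small sketch. The %store hash and the write_record function are assumptions standing in for a real backend and for ROADS::API::Record::write().

```perl
use strict;
use warnings;

my %store;    # stand-in for the backend's storage (assumption)

# Writing replaces the stored record wholesale; nothing is merged.
sub write_record {
    my ( $handle, $record ) = @_;
    $store{$handle} = {%$record};    # shallow copy of the new attributes
    return 1;
}

write_record( 'H1', { Title => 'Old title', Keywords => 'roads api' } );
write_record( 'H1', { Title => 'New title' } );

# Keywords was not in the update, so it is no longer stored.
print join( ',', sort keys %{ $store{H1} } ), "\n";
```

A caller that wants to keep existing attributes has to read the record, change it, and write the whole thing back.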
| ecdb | What about locking?
|
| jon | Which is not how my CHANGE operation in the WHOIS++ server for v3 works I might add... :-)
|
| martin | Locking is up to the backend implementor ?
|
| Bob | yes locking
|
| Bob | might make it optimum
|
| ecdb | How will the backend know when a record is being 'checked out' for editing?
|
| martin | That could be part of the ROADS object structure ?
|
| jon | The reason being (for CHANGE) that not all of a template might be returned to a template editor for editing.
|
| Bob | does it matter? delete record, delete terms
|
| Bob | edit in browser, insert template, insert terms
|
| jon | Dan - it doesn't. However the CHANGE protocol allows you to dump the templates in to a pending directory so that a cron/web based indexer can pick them up later
|
| jon | But enough of this aside on CHANGE... I shouldn't have mentioned it really. :-)
|
| ecdb | OK. Guess it won't hurt in practice.
|
| jon | Right, is everyone comfy with the inputs to index records?
|
| martin | OK
|
| jasper | ok
|
| ecdb | ok
|
| Bob | yes
|
| jon | Ok then: delete record
|
| jon | (I guess this is actually two now - unindex and delete)
|
| jasper | ok
|
| ecdb | handle is fine
|
| Bob | yes
|
| jon | What about a * to unindex/delete all records. Martin doesn't like it but it seems to me that some people might like it. Opinions?
|
| Bob | no
|
| ecdb | dangerously appealing. no.
|
| martin | you might just type it by accident somewhere... :-)
|
| Bob | perform search then
|
| Bob | no for whois query format
|
| jon | OK, so I think that we can safely say no to a * in un-index/delete and use handles. OK with all?
|
| martin | OK
|
| Bob | ok
|
| jasper | ok
|
| jon | Right: perform search
|
| ecdb | agree with Bob. No to whois++ format. We should start to lose the assumption that everything/one will end up using WHOIS++ :-(
|
| Bob | a bit more of an abstract format than wppd
|
| jon | Reverse Polish Notation? :-)
|
| ecdb | Yep. Simple though...
|
| martin | Sick!
|
| ecdb | (yep was to bob not to poland)
|
| ecdb | We should look at whether we've any requirements of a query interface that W++ didn't meet.
|
| ecdb | eg. "please expand this query with your thesaurus"
|
| martin | you had your opportunity :-)
|
| ecdb | or ability to search for records where Subject-clas-fication-scheme is DDC and value is 321
|
| ecdb | I'm not saying implement this. Just that if we move away from W++ it might look feasible to think about this more
|
| martin | We've got a query language already - if we're going to replace it, we need to replace it with something. What ?
|
| jon | OK, so if not WHOIS++ or RPN parse trees, what then?
|
| ecdb | Well it could be the ROADS query language which just happens to look a lot like Whois++.
|
| martin | :-))
|
| Bob | actually, an rpn tree structure may not be such a bad representation
|
| martin | which bit of WHOIS++ query language don't you like ?
|
| ecdb | I've no good answer to the what-would-be-better. There's SQL3 and OQL to read up on...
|
| martin | I'll get me coat :-)
|
| Bob | easy to parse and turn into any (?) format
|
| jon | SQL is a no-no as it relies on you knowing the E-R model to form sensible queries.
|
| ecdb | I don't like the fact that there are things I wanted to ask of my server that whois++ didn't allow. It knows nothing about the variant/cluster structures we use to organise the data.
|
| jon | And an immediate <CLUNK> to anyone who suggests Z39.50 style ASN.1 :-)
|
| ecdb | :-)
|
| ecdb | There are noises in Z39.50 land about a Z39.50 lite, and possibly aligning that with a future RDF query language...
|
| jon | That's because the variant/cluster feature of IAFA templates was, after we started to use them, considered to be broken.
|
| ecdb | And RDF will probably pinch something that already exists...
|
| Bob | umm, is the search "behind" the wppd, or parallel to it?
|
| ecdb | Oh, you never told us it was broken.
|
| martin | You never asked :-)
|
| jon | I didn't think it was for what we were using it for. But the more you try to push it to do....
|
| ecdb | Yep. It's not great for complex data. Multilingual with multiple URLs and 3 classification schemes, two authors etc...
|
| Bob | thats not what i called the variant/cluster "feature" of IAFA
|
| martin | Hang on, folks...
|
| ecdb | OK.
|
| martin | We're talking about query language here
|
| martin | Just to keep us on terra firma :-)
|
| ecdb | Which is bound up with the logical structure of the thing being queried.
|
| ecdb | no?
|
| martin | The logical structure and how its written down are two different things
|
| martin | Logically the variants and clusters thing is very nice
|
| ecdb | Agreed.
|
| martin | In practice it's a pain for the reasons we've discussed from time time
|
| martin | ...time to time
|
| ecdb | More like, whois++ doesn't understand the nice logical structures IAFA gives us.
|
| jon | OK, here's a proposal to get us out of this quagmire: we pass in an argument of the search syntax to use as well as the query
|
| Bob | sorry to repeat myself, but I've assumed that the search function would be behind the wppd, is that the current working assumption
|
| martin | Bob - you mean this is what wppd calls when it gets a query ?
|
| Bob | yes
|
| martin | That's my assumption too. Though it could be used behind something else - e.g. LDAP, Z39.50, ...
|
| martin | :-)
|
| ecdb | Yes
|
| jon | Something like $result = new ROADS::API::Result $database->search("WHOIS++", "sex and drugs and rock and roll");
|
| martin | Harvest, ...
|
| Bob | no jon
|
| ecdb | Which brings us to the syntax of the stuff that comes back...
|
| martin | No - that's later
|
| Bob | i think roads should make some effort to parse the query
|
| jon | OK, so if we don't want to have multiple syntaxes, then we need to decide on the syntax to pass into this API call.
|
| Bob | i liked the idea of a tree structure
|
| jon | Though what about $result = new ROADS::API::Result $database->search("ROADSParsed", "(((sex drugs AND) rock AND) roll AND)");
|
| jon | for those that want it?
|
| Bob | i'll be happy with that, perhaps ought to write that chunk of code in lisp, but ....
|
| ecdb | Query languages hurt my head
|
| martin | What do we gain by doing this ?
|
| jon | We take the parsing of what ever we get in from the punter into a separate routine I guess so that the backend code only really needs to know about its particular query language(s)
|
| jon | Different backends might share the same "query normalisation" code though
|
| Bob | roads parses the query to an independent format, and parsing the query is the hard part. the implementor takes the abstract representation and creates SQL or whatever, relatively easy
|
| ecdb | sounds plausible to me
|
| jasper | what about options (case sensitive, etc)?
|
| jon | Well, I don't know about that Bob - I don't really think there's much difference in complexity between the WHOIS++ and the RPN if you're going to have to translate it into SQL. But that's just me.
|
| Bob | agreed that whois is not too bad and could be used, just thinking what I'd _like_ here
|
| jon | I guess that for case sensitivity, etc, you'd actually need something that did quadruples rather than (RPN style) triples:
|
| jon | (sex drugs AND "case;expand;turnpurple")
|
| ecdb | do you want case/expand/etc properties per bit of the query or per query?
|
| martin | and if we're to incorporate support for multiple languages, character sets, and character set encodings...
|
| jon | Per bit of the query would allow local constraints (a bit of the WHOIS++ spec we currently ignore)
|
| martin | ... and which is being written out of the updated WHOIS++ spec
|
| jon | And someone is *bound* to want it for something in a year or so's time.
|
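To make the parenthesised RPN idea concrete, here is a rough sketch of a parser for queries of the form "(left right OP)" with an optional fourth element carrying per-group options, as in jon's "case;expand" example. It is an assumption-laden illustration, not an agreed ROADS format: the function names tokenize, parse_node, parse_query and to_infix, the tree layout (term / op / left / right / options) and the quoting rule for the options string are all hypothetical.

```perl
use strict;
use warnings;

# Split the query into "(", ")", quoted option strings and bare words.
sub tokenize {
    my ($text) = @_;
    my @tokens = $text =~ /\(|\)|"[^"]*"|[^\s()"]+/g;
    return @tokens;
}

# Parse one node: a bare term, or a group "(left right OP [options])".
sub parse_node {
    my ($tokens) = @_;
    my $tok = shift @$tokens;
    die "unexpected end of query\n" unless defined $tok;
    die "unexpected ')'\n" if $tok eq ')';
    return { term => $tok } unless $tok eq '(';

    my @parts;
    push @parts, parse_node($tokens) while @$tokens && $tokens->[0] ne ')';
    die "missing ')'\n" unless @$tokens;
    shift @$tokens;    # consume the closing ')'

    die "expected (left right OP [options])\n"
        unless @parts == 3 || @parts == 4;
    my ( $left, $right, $op, $opts ) = @parts;
    die "operator must be a bare word\n" unless defined( $op->{term} );

    # Options arrive as one quoted, semicolon-separated string.
    my $options = ( $opts && defined( $opts->{term} ) ) ? $opts->{term} : '';
    $options =~ s/^"|"$//g;
    return {
        op      => uc( $op->{term} ),
        left    => $left,
        right   => $right,
        options => [ split /;/, $options ],
    };
}

sub parse_query {
    my ($text) = @_;
    my @tokens = tokenize($text);
    my $tree   = parse_node( \@tokens );
    die "trailing tokens after query\n" if @tokens;
    return $tree;
}

# Render the tree as an infix string, as a backend translator might do
# on the way to SQL or another query language.
sub to_infix {
    my ($node) = @_;
    return $node->{term} if exists $node->{term};
    return '('
        . to_infix( $node->{left} )
        . " $node->{op} "
        . to_infix( $node->{right} ) . ')';
}

my $tree = parse_query('(((sex drugs AND) rock AND) roll AND "case;expand")');
print to_infix($tree), "\n";
print join( ',', @{ $tree->{options} } ), "\n";
```

Because each group carries its own options list, this layout allows per-bit-of-the-query constraints; a per-query design would keep a single options list at the root instead.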
| ecdb | Updated spec? WHOIS++++? I thought W++ was 'resting' now.
|
| martin | No, draft-ietf-asid-whois*
|
| martin | The main thing is (going back to query languages :-) that we have WHOIS++ already, but if we do something else it means that we have to do lots of work.
|
| martin | Laziness, Hubris, ...
|
| martin | :-)
|
| ecdb | OK, appreciate that. Only concern is that ROADS isn't seen as just a whois++ system.
|
| ecdb | Z39.50 blah blah RDF blah blah LDAP etc. etc.
|
| martin | So, if we do move away from WHOIS++ as a query language, we need to be very clear on why we're putting lots of effort into doing this
|
| ecdb | Yes. Shall we continue this another time?
|
| Bob | yep
|
| jasper | ok
|
| martin | OK
|
| jon | Move the discussion to open-roads?
|
| jon | or roads-hackers more probably?
|
| ecdb | yep
|
| martin | OK
|
| martin | (either :-)
|
| jon | OK, roads-hackers it is then. Onto the gorgeously chunky section 5 of the agenda....
|
| jon | Outputs. Again, I'll call them out - OK if they're fine, argue (discuss) otherwise
|
| jon | open database
|
| martin | OK - though see database object stuff below :-)
|
| jasper | ok
|
| Bob | could have a simple boolean, or try to pick up the SQL result code
|
| Bob | (if you use sql)
|
| Bob | pick up -> return
|
| jon | I think we need a stand "Yes it worked"/"No it didn't" with a follow on implementation defined error code/message if needed
|
| jon | s/stand/standard/
|
| martin | I'm thinking that in practice we'll probably want to return an object that encapsulates any private info which is needed by the back end for that database, or "undef" if the open failed
|
| martin | This didn't make it into the agenda, coz I forgot to put it in!
|
| martin | e.g. might need to hold a socket/file descriptor, process ID, Unix domain socket's filename, ...
|
| jon | How about always return an object (no undefs) but have a OK/Not OK flag in it so that you can then look into the object for the error code/message if things go pear shaped
|
| martin | OK!
|
| Bob | yo
|
| jasper | ok
|
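The return convention just agreed (always hand back an object, with an OK/Not OK flag and an implementation-defined error message inside it) might be sketched like this. The package name ROADS::API::Database, the field names and the ok/error accessors are assumptions made for illustration; the log settles the behaviour, not the names.

```perl
use strict;
use warnings;

package ROADS::API::Database;    # hypothetical package name

# Always returns an object, never undef; callers check ->ok afterwards.
sub open {
    my ( $class, %args ) = @_;
    my $self = bless {
        ok      => 0,
        error   => '',
        is_open => 0,
        name    => $args{Database},
    }, $class;
    if ( !defined( $args{Database} ) || $args{Database} eq '' ) {
        $self->{error} = 'no database named';    # implementation-defined
        return $self;
    }

    # A real backend would connect here and keep its socket, file
    # descriptor, process ID, etc. as private data inside the object.
    $self->{is_open} = 1;
    $self->{ok}      = 1;
    return $self;
}

# close reports its status in the same way as open.
sub close {
    my ($self) = @_;
    if ( !$self->{is_open} ) {
        $self->{ok}    = 0;
        $self->{error} = 'database is not open';
        return $self;
    }
    $self->{is_open} = 0;
    $self->{ok}      = 1;
    $self->{error}   = '';
    return $self;
}

sub ok    { return $_[0]{ok} }
sub error { return $_[0]{error} }

package main;

my $db = ROADS::API::Database->open( Database => 'LOCALFILES' );
print "open ok\n" if $db->ok;

my $bad = ROADS::API::Database->open();
print "open failed: ", $bad->error, "\n" unless $bad->ok;
```

Since the object always exists, callers never have to test for undef before asking what went wrong.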
| jon | OK, then: close database
|
| Bob | same for close ?
|
| jon | How about more or less the same thing
|
| jon | Right Bob!
|
| jasper | ok
|
| martin | OK
|
| jon | OK, straight onto: read record
|
| Bob | it looks like a difference between a totally exposed assoc. type structure or an encapsulated object?
|
| martin | if it's an object, we get to give it methods, and we *could* hide the internal data structure
|
| ecdb | :-)
|
| jon | We've got a hash now and that sort of works. A Perl object would be neater though. I'd go with the latter were it not for the fact that our glorious exec committee didn't like the idea of us tweaking all the code to use objects
|
| Bob | but it's a real pain writing OO code correctly
|
| martin | In Perl ?
|
| Bob | any form of OO, java, perl c++
|
| martin | Perl is very... forgiving :-)
|
| jon | Perl is the foam padded banana seat in the trike of OO programming languages
|
| martin | Suggest people check out the use of the $record object in the sample Perl to get a flavour for how I saw it being used
|
| ecdb | Using other people's Perl objects is a joy. Writing OO perl modules yourself is less fun I think.
|
| Bob | 'cause you can ditch/bypass all OO constraints when you want to
|
| martin | I've been reimplementing ssh in Perl, and it's not at all bad actually
|
| martin | I find myself wanting to turn everything into a Perl object :-)
|
| Bob | should that be on alt.sex.deviant
|
| jon | Advanced Programming in Perl makes objects seem like fun actually (OK, so I'm a sick puppy)
|
| ecdb | Yeah? Did I mention that there's a perl language binding for the ILU corba/dist-obj system...
|
| martin | I just discovered (reading Usenet in another window... ) talk.underwear.veg :-)
|
| jon | No, not the CORBA word, please!
|
| martin | Can we come back to the result of reading a template in later and just zip through the rest of section 5 ?
|
| ecdb | OK. How about the HTTP-NG word?
|
| ecdb | Yep.
|
| martin | Wash your mouth out!
|
| jon | Anyway, objects or hashes? Which is it to be?
|
| ecdb | Results of reading a template: objects, which contain hashes for now.
|
| jasper | I say objects
|
| martin | objects
|
| Bob | well, if jon and martin are so keen on OO then objects
|
| martin | :-)
|
| martin | And we're making work for ourselves too
|
| ecdb | OK. Will you need to ask permission of the exec committee on Monday to work on this rocket science OO stuff?
|
| jon | Well objects from me as well. In which case JohnK and Martin will need to inform the Exec Ctte of this overwhelming user demand... :-)
|
| martin | I think John has already pencilled me in for a session to explain what we did and what we decided...
|
| ecdb | Yep. Biz/ed officially requests Objects. Year2000 compliant objects at that ;-)
|
| jon | Expect them in 2014 then... :-)
|
| ecdb | Modelled in RDF and exposed over CORBA... ahem. sorry. swallowed a marketing manual...
|
| martin | One point - objects vs. complete re-write. They weren't keen on version 3 being a complete re-write, but if we need objects to make APIs work properly...
|
| jon | <BANG> <thud>
|
| ecdb | Ouch!
|
| ecdb | Have we finished now? (i've lost my agenda url)
|
| jon | OK folks, objects it is and to hell with the committee. On to write record
|
| martin | You won't get away that easily - http://www.roads.lut.ac.uk/roads-meetings/api-agenda.html
|
| martin | write record - OK by me
|
| ecdb | OK.
|
| jon | OK
|
| Bob | ok
|
| ecdb | A record being, like, an object or something?
|
| martin | a thingy
|
| Bob | same as open database for status code
|
| jon | A doodah
|
| jon | OK Bob - sounds good to me.
|
| Bob | it's an object now then, isn't it
|
| ecdb | yep
|
| jon | An objecty doodah
|
| jon | OK then: index record(s)
|
| martin | OK
|
| ecdb | OK
|
| jon | OK
|
| Bob | an encapsulated structure with a status code, as before, and some result structure
|
| jon | So that's an OK is it?
|
| ecdb | ok
|
| martin | OK
|
| martin | :--)
|
| jasper | ok
|
| Bob | ok
|
| jon | delete record(s)/deindex record(s)
|
| ecdb | killemall
|
| ecdb | sorry must stop typing that
|
| ecdb | ok
|
| martin | * err... OK
|
| jon | (and a special redrum option for Dan :-) )
|
| jasper | o k
|
| Bob | yo
|
| jon | perform search
|
| martin | OK
|
| jon | (this is the fun one)
|
| martin | to be defined later ... :-) in section 6
|
| ecdb | A set of ROADS doobries from the native database, describing the end resources.
|
| ecdb | And a set of ROADS doobries that describe referrals to remote searchable resources (ie. referrals)
|
| ecdb | Remotesearchabledoobries being resource descriptions like anything else
|
| Bob | here we need to worry about the result stream going back into wppd.
|
| Bob | plus status info
|
| ecdb | Why? Doobries will have a toWHOIS() method or something, no?
|
| jon | wppd.pl will be doing the formatting of these ROADS results objects that the search API returns
|
| jon | Yep Dan
|
| ecdb | Where, if anywhere, should a ROADS database store descriptions of the remote services it has forward knowledge of?
|
| ecdb | eg if I know whois++://omni,.ac.yukj:// has a Title "OMNI - medical stuff" and a description ".....", where do we put this?
|
| martin | That's a separate issue
|
| ecdb | OK.
|
| ecdb | Can searches return referrals?
|
| jon | At the moment ROADS stores the centroids in DBM files and the metadata about the remote service that supplied them in a config file. These seem pretty efficient. However, do people want to have the ability to put these into some other database?
|
| jon | I think for the purposes of this API, we should say no. (oops looks like Bob's fallen in the water)
|
| ecdb | I think they can live fine where they are. I was thinking about how we deal with referrals
|
| martin | We do need to be able to deal with referrals in search results
|
| jon | That way we separate our database of records (and associated index) from the centroid information (which is what you use to generate referrals)
|
| ecdb | if the search api has some notion of a referral, i'd like the api to also present me with a mechanism for finding out about the resource I've been referred to
|
| martin | We should bear in mind that at the moment centroids are completely separate from the main ROADS database thingy
|
| martin | (as Jon has just reminded me :-)
|
| ecdb | Quick, w.r.t. Jon's point. We have both the database of centroids and the database of centroid-server-descriptions...
|
| jon | And I'd like to keep it that way!
|
| ecdb | Me too.
|
| martin | I have no opinion either way and will swing with the crowd :-)
|
| jon | So in which case the search API we're talking about here doesn't need any referral stuff in it.
|
| martin | OK by me
|
| jon | But we might want to put centroids in another backend database...
|
| martin | :-)
|
| jon | ... or more than one :-)
|
| ecdb | We could probably re-use the search API for searching the referral database? And just treat the records that come back (eg. whois++// urls) as referrals?
|
| jon | I guess so - in which case I guess I'd have to go back on what I just said and let the search results include a special referral return type.
|
| martin | ... which as I now realise begs the question of how we'd put centroids into the API ?!
|
| jon | Exactly
|
| martin | Strange that it didn't occur to me before. Must have been sunspot activity or something...
|
| jon | We need a ROADS::Centroid:: hierarchy
|
| jon | ROADS::Centroid::Open()
|
| jon | ROADS::Centroid::Close()
|
| jon | ROADS::Centroid::AddTerm()
|
| jon | ROADS::Centroid::RemoveTerm()
|
| jon | ROADS::Centroid::Search()
|
| jon | :-)
|
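A rough sketch of what such a hierarchy might look like, under loudly stated assumptions: the five method names come from the list above, but the package name `Sketch::Centroid`, the argument lists, and the in-memory storage (field name mapped to a set of terms) are all invented for illustration. A real implementation would sit on DBM files or some other backend.

```perl
use strict;
use warnings;

# Hypothetical centroid handle; storage is a plain in-memory hash of
# field => { term => 1 }, standing in for a real backend.
package Sketch::Centroid;

sub Open {
    my ($class, %args) = @_;
    return bless { name => $args{Name}, terms => {}, open => 1 }, $class;
}

sub Close {
    my ($self) = @_;
    $self->{open} = 0;
    return 1;
}

sub AddTerm {
    my ($self, $field, $term) = @_;
    my ($f, $t) = (lc $field, lc $term);    # assumption: case-insensitive terms
    $self->{terms}{$f}{$t} = 1;
    return 1;
}

sub RemoveTerm {
    my ($self, $field, $term) = @_;
    my ($f, $t) = (lc $field, lc $term);
    return 0 unless exists $self->{terms}{$f};
    return defined(delete $self->{terms}{$f}{$t}) ? 1 : 0;
}

# Search: true if the term is listed under the field, i.e. the remote
# service is worth a referral for that term.
sub Search {
    my ($self, $field, $term) = @_;
    my ($f, $t) = (lc $field, lc $term);
    return 0 unless exists $self->{terms}{$f};
    return exists $self->{terms}{$f}{$t} ? 1 : 0;
}

package main;

my $centroid = Sketch::Centroid->Open(Name => 'example-service');
$centroid->AddTerm('Title', 'medicine');
my $hit_before = $centroid->Search('title', 'Medicine');
$centroid->RemoveTerm('Title', 'medicine');
my $hit_after = $centroid->Search('title', 'medicine');
$centroid->Close;

print "before removal: $hit_before, after removal: $hit_after\n";
```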
| jon | Jasper, Dan (if you're still there), martin - does this sound reasonable?
|
| martin | My theory is that we need to ask ourselves whether people need to write their own code to munge centroids
|
| martin | If they do, we should roll centroids and referrals into the API
|
| jasper | just catching up - interruptions
|
| martin | If not, we should leave things as they currently are :-)
|
| jon | Well seeing as it wasn't suggested in the API requirements, let's say that they don't for the time being.
|
| martin | :-))
|
| jon | If we treat centroids as separate things that we do internally we can bring them out into the light of day via an API at a later stage if people want them
|
| martin | Ok by me!
|
| jon | Jasper?
|
| jasper | [dan] - i'm in jasper room - in passing briefly - could you mail these Qs to roads-hackers? I'd like some support for building centroids. eg if i get a list of unique words per field into perl arrays, can ROADS help build a legal CIP object?
|
| jon | Ah, I'd like to steer clear of CIP. Pain in the ass.
|
| jasper | Is CIP a dead duck then? [dan]
|
| jasper | The basic idea seems sound. It's just a list of lists of words, per field, right?
|
| jon | Probably not, but for what people are currently using centroids for in ROADS it's overkill.
|
| jon | No. CIP is a super complex indexing system that is open ended and lets you send more or less any thing you like (using MIME index types to say what it is)
|
| jasper | [dan] OK. What instead? Centroids (CIPv1)?
|
| jon | I've implemented tagged-index-objects using CIPv3 and it's a lot more complex than centroids were
|
| jon | For the time being. If CIPv3 takes off in a big way we've got the code to munge it into our centroids structure if needs be now.
|
| jon | Let's say that if the world of CLUMPS comes up with Z39.50 servers using CIPv3 to permit some sort of forward knowledge we should be ready for it (and hopefully are now) but that I'd encourage ROADS users to stick to the conceptually simpler centroids.
|
| jasper | [DAN] been looking at the centroid methods suggested above; is the idea that there would be a getCentroid(string DATABASEIDENTIFIER) function which we could call and would give us a handle to a centroid for each database backend that roads knows about?
|
| jasper | [dan] w.r.t. Z39.50, I think we've got a bit of time till they catch up.
|
| martin | Yep, keep banging the rocks together guys :-)
|
| jon | Er no. I was thinking that we'd still have the wig.pl style config files that we have now but instead of just looking
|
| jon | in DBM databases we might look in (for example) an SQL database for a particular centroid.
|
| jasper | huh?
|
| jon | That way an implementation is free to either have one centroid per database or stuff them all into one database
|
| jon | Huh to which bit?
|
| jasper | getCentroid(DBID) would allow this? Huh to: didn't know what you meant about DBM and SQL in the same sentence [Dan being dumb, not Jasper]
|
| martin | Something else we need to think about - we have an API for getting records in and out of back end databases, and searching/indexing/... - but not for generating a centroid of the records in the database. We probably need that, regardless of whether we have full blown centroid/CIP manipulation in our API ?
|
| jon | OK: you might (for some reason best known to an implementor) want to keep centroids for SOSIG in our DBM style databases but OMNI and ADAM ones in some SQL based database.
|
| jon | The wig.pl config file will need to tell the code where this backend database is (both for centroid gathering in
|
| jon | wig.pl and looking at in wppd.pl)
|
| jasper | [dan] agree with martin. need to be able to get some metadata about a database, including bulk metadata like a centroid.
|
| jon | Yep, I agree with Martin too.
|
| jon | $database->centroid()?
|
| martin | sounds good to me!
|
| jasper | [is-dan] centroid() returning what? A centroid textfile? Or a centroid Object(tm)?
|
| jon | We could have a $database->cipv3("application/index-obj-tagged;DSI=32.3.4.5.2.213") call in the future as well. :-)
|
| martin | I think we should have the option of getting a CIPv3 type centroid out of them, perhaps by specifying protocol/version as an argument. Backends shouldn't be required to support centroids or CIP, but should be encouraged to support at least vanilla WHOIS++ centroids.
|
| jasper | oh, right. $database implying some kind of Roads object for database?
|
| jon | Or something like that - needs a bit more thought if we do do it but I'd leave it at the moment.
|
| martin | The question is - is generation of centroids (and/or other manipulation of centroids) a core API requirement ?
|
| jon | $database is returned from ROADS::API::Database->new(BackEnd => "Glimpse"); in the examples in the agenda. It's the object with the status of the open, internal guff, error code/msg, etc
|
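A small sketch of how a `centroid()` method on such a database object might generate a centroid from the records it holds, taking a centroid to be the list of unique words per field as described earlier. Everything here is an assumption for illustration: the package name `Sketch::API::Database`, the `add_record` helper, the in-memory record list, and the choice to return a hash of sorted word lists rather than a text file. Only the `new(BackEnd => ...)` call shape and the `centroid` method name come from the discussion.

```perl
use strict;
use warnings;

# Hypothetical database object; records are kept in memory as plain hashes.
package Sketch::API::Database;

sub new {
    my ($class, %args) = @_;
    return bless { backend => $args{BackEnd}, records => [], ok => 1 }, $class;
}

# Illustrative helper for loading sample records (not part of any agreed API).
sub add_record {
    my ($self, $record) = @_;
    push @{ $self->{records} }, $record;
    return $self;
}

# Build field => [sorted unique lowercase words] across all records.
sub centroid {
    my ($self) = @_;
    my %seen;
    for my $record (@{ $self->{records} }) {
        for my $field (keys %$record) {
            for my $word (grep { length } split /\W+/, $record->{$field}) {
                $seen{$field}{ lc($word) } = 1;
            }
        }
    }
    my %centroid;
    for my $field (keys %seen) {
        $centroid{$field} = [ sort keys %{ $seen{$field} } ];
    }
    return \%centroid;
}

package main;

my $database = Sketch::API::Database->new(BackEnd => 'Glimpse');
$database->add_record({ Title => 'OMNI medical stuff', Description => 'Medical gateway' });
$database->add_record({ Title => 'Medical images' });

my $centroid = $database->centroid();
for my $field (sort keys %$centroid) {
    print "$field: @{ $centroid->{$field} }\n";
}
```

Returning a data structure rather than formatted text leaves the WHOIS++ or CIP wire format to whatever sits in front of the API, which matches the point made below that a protocol-level interface is still needed to get centroids back to somebody.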
| martin | Nobody's mentioned it until now... :-)
|
| jasper | [is-dan] Can/should database have a description of some sort associated with it? Title, Description etc? Subject...
|
| jon | I'd say generation of the centroid from a backend database is core, but I don't know about the actual centroid data itself (the stuff we've gathered rather than what we're generating)
|
| jon | Dunno. We've got a database name at the moment. I don't know whether a description or other metadata will be of much use.
|
| jon | I suppose we could include an API call to get the metadata though.
|
| jasper | [dan] I think what I'm really after is richer metadata headers in the CIP things themselves.
|
| jasper | or centroid things.
|
| jasper | so if I get a referral to some Finnish library, I know what it is before I go querying it and find out it's all in Finnish [dan]
|
| jon | Not in centroids you won't - we more or less do everything in there as it is. CIPv3, well, you can define your
|
| jon | own index MIME type... :-)
|
| martin | We still need to have a WHOIS++/CIP protocol level interface to get them back to somebody...
|
| jasper | ok
|
| jon | I think what you're after is what Z39.50 people call EXPLAIN. And which they are currently getting very confused about.
|
| jasper | [dan] Yep. self describing resources, more or less.
|
| jon | (basically because nobody knows how to implement it properly to make database servers self describing)
|
| jon | I think we should put that down as a "do in the future" rather than an API thing now (though feel free to hack at it if you can see a clean way to do it)
|
| jasper | [dan] Yep. With Z39.50 it's more of an immediate problem since a 'z client' knows very little about a resource just by being told it's a Z39.50 server. Knowing something is a whois++ server is a lot more, er, empowering.
|
| martin | So - have we wound up the API discussions for now ? :-)
|
| jasper | Yep. agree. I may hack some alternative non-CIP ish centroid format in XML with metadata headers. Just to see...
|
| jasper | [dan + jasper] Yep.
|
| jon | OK!
|
| jon | Right, time for some choccy then me thinks. Thanks for the input fellas!
|