Monday 23 August 2010

Shortcuts or rambles

As both a cataloguer and a catalogue user, I know that I come to the catalogue with a range of different needs and expectations. Sometimes I know exactly what I want - I may even know that it is in stock - so I am using the catalogue to check where I can find it or to make sure that there is a copy on the shelf when I call into the library on the way home. At other times I'm not at all sure what I want.

It seems to me that we cater very well for the person with a clear idea of what they are looking for. If you know the name of the person who created it and the title, you'll probably be able to find it straight off. If you return a number of hits, you'll be able to filter them by date, or by format - if you want the most up-to-date travel guide, or the DVD rather than the Blu-Ray version. Our catalogues are designed to make it as easy as possible to get straight to your destination, and that's great (as I said in the previous post) for people who know what they want.

If I have no idea at all of what I want, even then the catalogue will help, by providing lists of new acquisitions, or links to shortlists or titles of current interest - the "what's new" or "what's hot" kind of lists. These are pretty much the equivalent of library displays - something to catch your eye when you arrive on the site. There is usually scope for doing a lot more of this kind of thing, but at least most catalogues offer something.

However, let's suppose I want some books about Florence, because I am going to be spending a long weekend there and I want to plan my visit. If I were to go into my local library and speak to a real live librarian, then pretty soon we would be having a conversation about my holiday, and what I might like to do while I am there, and with luck I would be steered in the right direction to find a book that satisfied me.

But if I go to the catalogue, and type "Florence" in the keyword box - I am going to get a great mass of stuff and very little help in trying to sort it out. First of all, I am going to find that I've got stuff about Florence Nightingale, and whole series of the Magic Roundabout (it might take me a while to work out why), and I have to filter these out, and there is probably no easy or obvious way of doing it. Even if I ignore all these, then there is no helpful guide asking me whether I am going to be spending all my time in galleries, or restaurants, or watching the football - or am I going to be hiring a bicycle and would I like a guide to cycle touring?

Of course there used to be a tool that did just this and it was called a subject index. It collocated distributed relatives - so it was very useful for refining a search (making it possible to define which aspect of a subject you were looking for) but it was also brilliant for reminding you of aspects of a subject you might not have thought of and which the library had in stock. It was the nearest you could get to having a conversation with a friendly and informed librarian.
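To make the idea concrete, here is a minimal sketch of how a subject index collocates distributed relatives: one index term gathers the different aspects of a subject that the classification scatters around the shelves. Everything in it is an assumption for illustration. The entries, the class marks and the function names are invented, and the class marks have not been checked against any schedule.

```python
from collections import defaultdict

# Hypothetical index entries: (index term, aspect, class mark).
# The class marks are illustrative placeholders, not verified numbers.
ENTRIES = [
    ("Florence", "History", "945.51"),
    ("Florence", "Travel guides", "914.551"),
    ("Florence", "Art galleries", "708.551"),
    ("Florence", "Cycle touring", "796.64"),
    ("Nightingale, Florence", "Biography", "610.73092"),
]


def build_subject_index(entries):
    """Group every aspect of a subject under its index term."""
    index = defaultdict(list)
    for term, aspect, class_mark in entries:
        index[term.lower()].append((aspect, class_mark))
    for aspects in index.values():
        aspects.sort()  # alphabetical by aspect, like a printed index
    return dict(index)


def look_up(index, term):
    """Return the (aspect, class mark) pairs filed under a term."""
    return index.get(term.strip().lower(), [])


index = build_subject_index(ENTRIES)
for aspect, class_mark in look_up(index, "Florence"):
    print(f"Florence : {aspect:<15} {class_mark}")
```

Looking up "Florence" lists history, galleries, travel guides and cycle touring side by side, which is the prompt a keyword box does not give, while Florence Nightingale stays under her own heading.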

So why don't any of our wonderful new online catalogues incorporate subject indexes? The most help they offer is a "Did you mean...?" which covers little more than mistyping. Why can't we make it as easy to find and choose between Florentine history and Tuscan cycle tours, as between the 2nd and 3rd edition of a book or between the Blu-Ray and DVD versions of Shrek 2?

3 comments:

  1. I think the current next-gen OPACs could do some of this as they already index the "library-approved" subjects contained in bib records. They may not assign any special extra weight to the subject terms or their synonyms, but they do already know what they are.

    At MPOW we have Innovative's Encore which presents a single Google-like search box. Following your search, Encore shows you a "refine by tag" tag cloud which is built from the LCSH in the bib records. In my MSc research on next-gen OPACs I found patrons easily discover this cloud and take suggestions from it. For example, following a search for "labour party", the cloud presents terms such as "australian labor party", "great britain" or "independent labour party", which suggest new approaches to the search, or limits that could be added or removed, much as a subject index could (for example: http://encore.ulrls.lon.ac.uk/iii/encore/search/C|Slabour+party|Orightresult|U1?lang=eng&suite=pearl).

    This is not very sophisticated. In particular, no attempt is made to group related subjects, so you get a kind of LCSH soup. Flickr's clusters deal with suggestions for tags in a much clearer way, e.g. the clusters for "bow" show three different meanings and the tags related to them rather well: http://www.flickr.com/photos/tags/bow/clusters/

    My research participants all understood that the Encore tag cloud suggests "relevant" terms drawn from records for items the library has, but were unclear on where these actually came from. Obviously no-one understood they were based on LCSH, but some understood a vague idea of "keywords" perhaps assigned by library catalogers. I could certainly imagine an LCSH-based tag cloud for the entire catalogue or subsets within it (based on discipline or classification) to assist during the "don't know what you don't know" stage of the search process.

  2. Many thanks for this - and the Flickr example is an interesting one.

    I agree that the new Opacs attempt to provide access via LCSH, which is a help - as long as your library uses LCSH, and has always used LCSH (and keeps its LCSH up to date). My library doesn't - we only have LCSH in imported records, we don't add them to original cataloguing or maintain or check them in any way.

    Even if we did, as you say, there is no structure to the keyword-LCSH approach of the new Opacs at the moment as the hierarchy of NT, BT and RT isn't included.

    While I absolutely agree that users come to a subject search via language and not via classification (which is why I regard the subject index as an essential front end), the classification does provide a structured and hierarchical subject approach and I wonder why LMS designers don't embrace classification as the underlying structure for subject retrieval. Being numerical and language-free, I'd have thought it was a natural for machine handling. Is it because designers of LMSs are not library professionals and because even library professionals have fallen for "keywords" and no longer recognise the value of classification?

  3. What about drawing on context to enhance discovery? Of course this is what you already pointed at, but I take context to include all words from a bibliographic "record", i.e. title, subtitle, subject headings (whatever system they are taken from), maybe tags assigned by users etc. I imagine you could use the vector space model to help determine relevance, or you could use a different approach, outlined here http://semantosoph.net/2010/3/19/dude-where-s-my-context. This does justice to the fact that, as you say, "users come to a subject search via language".
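    As a rough sketch of the vector space idea, each record can be turned into a bag of words drawn from its title, subject headings and tags, and then ranked against the query by cosine similarity. The records, field names and function names below are all hypothetical, and plain term counts are used where a real system would more likely use some form of term weighting.

```python
import math
import re
from collections import Counter

# Hypothetical bibliographic records, invented for illustration.
RECORDS = {
    "r1": {"title": "Florence: a city guide",
           "subjects": ["Florence (Italy) -- Guidebooks"],
           "tags": ["travel"]},
    "r2": {"title": "Florence Nightingale: a life",
           "subjects": ["Nightingale, Florence -- Biography",
                        "Nursing -- History"],
           "tags": []},
    "r3": {"title": "Cycling in Tuscany",
           "subjects": ["Cycling -- Italy -- Tuscany -- Guidebooks"],
           "tags": ["travel", "florence"]},
}


def tokenize(text):
    """Lower-case the text and keep only alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def record_vector(record):
    """Count every word in the title, subject headings and tags."""
    words = tokenize(record["title"])
    for field in ("subjects", "tags"):
        for value in record[field]:
            words.extend(tokenize(value))
    return Counter(words)


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank(query, records):
    """Return ids of matching records, most similar first."""
    query_vector = Counter(tokenize(query))
    scored = [(cosine(query_vector, record_vector(rec)), rec_id)
              for rec_id, rec in records.items()]
    return [rec_id for score, rec_id in sorted(scored, reverse=True)
            if score > 0]


print(rank("florence guidebooks travel", RECORDS))
```

With these made-up records, the extra words in the query pull the city guide and the Tuscan cycling book above the Florence Nightingale biography, even though all three match "florence".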
