ResolveRef : looking at the logs

One of the nice features of Google App Engine is that you can easily view the logs for your application and quickly see which requests are generating errors. Browsing the logs of ResolveRef, I’ve been able to identify a few classes of query which, for one reason or another, weren’t working.

Firstly, there is the “just testing and don’t actually have a citation on hand to key in” class of users, who tried requests like:

/ref/xx/2007//

Not much sympathy here … it’s pretty much like dialing a random phone number and hoping someone will pick up.

Then there is a class of users who appear to have sensible intentions, but provide incomplete ResolveRef URLs, eg:

/ref/Organic%20Letters/2000//

Maybe I described ResolveRef poorly in the initial announcement, maybe the documentation in the “About” box on the ResolveRef site is unclear, or maybe these users just didn’t read the docs in the first place. When I described the service as “A RESTful way to do PubMed searches”, maybe it would have been more accurate to say “A simple, RESTful way to resolve a single journal article using only the human-readable citation information”. ResolveRef does not give a list of results to a PubMed search; it forwards to a single hit (ideally the requested article), or gives an error if it can’t be found. By the looks of it, many users want to use ResolveRef as a way to retrieve a list of results. While this goes against the original spirit of ResolveRef being a resolver for an [almost] unique identifier for journal articles (akin to Noel’s OpenRef proposal), I may be tempted to update ResolveRef to return a list of hits in the future (or just forward to the HubMed or PubMed results page).
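To make the “incomplete URL” case concrete, here is a minimal sketch of how a request path could be checked before any search is attempted. It is not the actual ResolveRef code: the field order (journal, year, volume, page) is inferred from the example URLs in this post, and the function name parse_ref_path is made up for illustration.

```python
from urllib.parse import unquote

# Assumed field order, inferred from URLs like /ref/Protein%20Sci/1999/8/689
FIELDS = ("journal", "year", "volume", "page")


def parse_ref_path(path):
    """Split a /ref/... path into citation fields and list the empty ones."""
    parts = path.split("/")
    # parts[0] is "" (from the leading slash) and parts[1] should be "ref"
    if len(parts) < 2 or parts[0] != "" or parts[1] != "ref":
        raise ValueError("not a /ref/ path: %r" % path)
    # Decode %20 etc. and keep at most one value per expected field
    values = [unquote(p).strip() for p in parts[2:2 + len(FIELDS)]]
    # Pad with empty strings so a short path still yields every field
    values += [""] * (len(FIELDS) - len(values))
    citation = dict(zip(FIELDS, values))
    missing = [name for name in FIELDS if not citation[name]]
    return citation, missing


citation, missing = parse_ref_path("/ref/Organic%20Letters/2000//")
print(citation["journal"], missing)
```

With a check like this, a request that lacks a volume and page could get a friendly “please supply the volume and page” message (or a list of hits) instead of a failed lookup.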

There are also some actual bugs which throw nasty Python backtraces (I think this one was actually me trying to use ResolveRef to look up a reference at work):


/ref/Protein%20Sci/1999/8/689

This threw an error since ResolveRef (stupidly) assumed that every PubMed record has an associated DOI … however for some reason this Protein Science article does not have a DOI recorded in PubMed, so it fails to resolve with ResolveRef. This is (yet another) drawback to using PubMed as a backend. I’m thinking I may need to make ResolveRef interface with CrossRef somehow too, since that may act as a backup (or complete replacement) for these cases.
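The missing-DOI case can be handled without a backtrace by treating the DOI as optional. The sketch below is an illustration only, not the ResolveRef source: it assumes the parsed PubMed record is available as a dict with an optional "doi" key (a hypothetical structure), and the DOI in the usage line is a placeholder rather than a real article.

```python
from urllib.parse import quote


def doi_redirect_url(record):
    """Return a DOI resolver URL for a record, or None if no DOI is recorded.

    `record` is assumed to be a dict parsed from a PubMed result; the "doi"
    key may be absent, None, or an empty string.
    """
    doi = (record.get("doi") or "").strip()
    if not doi:
        # Caller decides what to do next: show an error page, or try a
        # backup source such as CrossRef.
        return None
    # Keep "/" unescaped, since it separates the DOI prefix from the suffix
    return "https://doi.org/" + quote(doi, safe="/")


print(doi_redirect_url({"doi": "10.1000/example"}))  # placeholder DOI
print(doi_redirect_url({"title": "record without a DOI"}))
```

Returning None rather than raising keeps the “no DOI” outcome an ordinary result that the request handler has to deal with explicitly.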

There also seem to be occasional errors generated when the HTTP connection from the Google App servers to PubMed fails; my fault entirely … that type of exception should always be anticipated and caught in a networked application.
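For the connection failures, the general pattern is to wrap the outgoing request and turn any network error into a value the handler can check. This is a rough sketch using the standard library in modern Python 3, not the App Engine URL Fetch API that ResolveRef itself would use; the function name fetch and the opener parameter (there only so the function can be exercised without a network) are my own inventions.

```python
import http.client
import urllib.request


def fetch(url, timeout=10, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the request fails."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except (OSError, http.client.HTTPException) as exc:
        # OSError covers urllib.error.URLError, HTTPError, timeouts and
        # refused or reset connections; HTTPException covers malformed or
        # truncated responses.
        print("request to %s failed: %s" % (url, exc))
        return None
```

A handler would then call something like `body = fetch(search_url)` and show a “PubMed is unreachable, please try again” page when the result is None, rather than letting the exception escape.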

Apart from guessing how people may like to use the application by examining the logs, edoardo.marcora also suggested that autocomplete/suggest for the journal field would be nice. I agree … this was a feature I was working on prior to the initial release, but it was taking too long so I just launched ResolveRef without it.
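As a rough idea of what a “suggest” helper for the journal field might look like (this is not the implementation I’m working on, just a sketch), the function below tries a case-insensitive prefix match first and falls back to a fuzzy match for near-miss spellings. The three-title list is a stand-in for a real list of PubMed journal titles, and the 0.8 cutoff is an arbitrary assumption.

```python
import difflib

# Stand-in sample; a real list would come from PubMed's journal titles
JOURNALS = [
    "Journal of Chemical Information and Modeling",
    "Organic Letters",
    "Protein Sci",
]


def suggest(text, titles=JOURNALS, limit=5):
    """Suggest journal titles for partially typed or slightly misspelled text."""
    query = text.strip()
    if not query:
        return []
    # First choice: titles that start with what the user has typed so far
    prefix_hits = [t for t in titles if t.lower().startswith(query.lower())]
    if prefix_hits:
        return prefix_hits[:limit]
    # Fallback: closest titles by similarity ratio (case-sensitive)
    return difflib.get_close_matches(query, titles, n=limit, cutoff=0.8)


print(suggest("org"))
print(suggest("Journal of Chemical Information and Modelling"))
```

The prefix pass suits type-ahead completion, while the fuzzy pass is what would rescue a fully typed title that differs from PubMed’s spelling by a letter or two.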

There is a new version in the pipeline, and it will be ready for release soon. I’ll also put it on Google Code, warts and all. I already have the “suggest” functionality working, and once I resolve the few bugs discussed above, I’ll push out an update. Stay tuned.

4 thoughts on “ResolveRef : looking at the logs”

  1. Great idea. Adding CrossRef as a backup could be useful. For example, my search for:

    Journal of Chemical Information and Modelling 2007 47, 1727

    gave no result on ResolveRef.

    I’ve thought many times about building a service like this using CrossRef exclusively.

  2. Thanks Rich. I have considered using CrossRef as a backup (or even exclusively), but from comments to your post I got the impression that CrossRef may throttle requests too much to rely on that service exclusively (even if ResolveRef is ‘non-commercial’).

    Many searches fail because the journal title needs to be exactly the one PubMed uses. The new suggest feature should help with this. In your case, “Modelling” is actually spelled “Modeling” (single ‘L’) in the PubMed title :). Don’t worry, I’m one of those who speak the Queen’s English and spell “Modelling” the correct way too.

  3. Hi,

    CrossRef doesn’t throttle requests at all so I would suggest giving the OpenURL interface a try – http://www.crossref.org/requestaccount/

    There are some limitations to the metadata (we don’t always get all authors from some publishers, sometimes there are character encoding issues) but we do a fuzzy match on journal titles so they don’t have to be exact.

    Ed Pentz, Executive Director, CrossRef

  4. Thanks Ed … the fuzzy match on journal titles could really help make ResolveRef a lot more user friendly.

    It’s not like ResolveRef is actually getting enough traffic to generate many CrossRef API requests at the moment anyway … but I can always dream 🙂
