Software review: producing two-dimensional diagrams of membrane proteins

E. coli LamB, presented using TMRPres2D. Note that the cytoplasmic/extracellular labels are incorrect; they should read extracellular/periplasmic.

I recently needed to make a simple, two-dimensional figure of a beta-barrel membrane protein. I went hunting for programs that might take a sequence and/or structure and produce a pretty-looking diagram, to save me constructing everything by hand. Here are two I found and tried.


ResolveRef updated: now with auto-suggest and source code

I updated ResolveRef last night and checked the latest source code into svn at Google Code.

New features include:

ResolveRef, now prettier, with a comments box by Disqus.

  • Suggest/autocomplete for journal title field, using the journal title lists provided by PubMed.
  • A “Verify” button. Allows a ResolveRef URL to be constructed with the web form and verified as working and valid without actually forwarding the user to the article.
  • Some bugfixes (handled the case where there is no DOI in the PubMed record, handled network timeouts to PubMed)
  • Refreshed visuals
  • Disqus comments box for feedback
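The two bugfix cases in the list above can be sketched roughly as follows. This is a minimal illustration in modern Python, not the actual ResolveRef code: the function names `extract_doi` and `fetch_record` are hypothetical, the caller supplies the URL, and the only format assumption is that a PubMed XML record carries its DOI in an `ArticleId` element with `IdType="doi"`.

```python
import socket
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET


def extract_doi(pubmed_xml):
    """Return the DOI from a PubMed XML record, or None if there is none."""
    root = ET.fromstring(pubmed_xml)
    for article_id in root.iter("ArticleId"):
        if article_id.get("IdType") == "doi" and article_id.text:
            return article_id.text.strip()
    # No DOI in the record: report that explicitly instead of raising.
    return None


def fetch_record(url, timeout=5):
    """Fetch a record from a caller-supplied URL; return None on network failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except (urllib.error.URLError, socket.timeout):
        # Covers connection errors and timeouts alike.
        return None


# Hypothetical sample records, trimmed to the relevant elements.
RECORD_WITH_DOI = """<PubmedArticle><PubmedData><ArticleIdList>
<ArticleId IdType="pubmed">12345</ArticleId>
<ArticleId IdType="doi">10.1000/example</ArticleId>
</ArticleIdList></PubmedData></PubmedArticle>"""

RECORD_WITHOUT_DOI = """<PubmedArticle><PubmedData><ArticleIdList>
<ArticleId IdType="pubmed">12345</ArticleId>
</ArticleIdList></PubmedData></PubmedArticle>"""
```

Returning `None` in both failure cases lets the calling code show a friendly "could not resolve" message rather than a server error.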

In the interest of getting something working quickly, I implemented the suggest feature in the laziest, possibly most RAM- and CPU-hungry way possible: the “jQuery Suggest” code queries the web app with a substring each time you type a character, and on the server side the app uses a regex to scan a ~1.5 MB list of journal titles held in RAM. I’ve already noticed a few “This request used a high amount of CPU” warnings in the logs, with the threat “High CPU requests have a small quota, and if you exceed this quota, your app will be temporarily disabled”. If my nasty hack starts heating up Google’s datacentre too much, I might have to disable the ‘suggest’ feature until I can implement it “properly”.
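To make the trade-off concrete, here is a rough sketch in Python of the regex-scan approach next to one cheaper alternative. Neither function is the actual ResolveRef code. The title list is a tiny hypothetical sample standing in for the PubMed journal list, and the function names are made up for illustration. The alternative assumes that matching only on the start of a title is acceptable, which lets a binary search over a pre-sorted list replace the full scan.

```python
import bisect
import re

# Hypothetical sample; the real list would come from PubMed's journal titles.
JOURNAL_TITLES = [
    "Journal of Molecular Biology",
    "Journal of Biological Chemistry",
    "Nature",
    "Nature Structural & Molecular Biology",
    "Protein Science",
]


def suggest_regex(fragment, titles, limit=10):
    """Naive approach: regex-scan every title on every keystroke."""
    pattern = re.compile(re.escape(fragment), re.IGNORECASE)
    return [t for t in titles if pattern.search(t)][:limit]


# Built once at startup: lowercase keys sorted alongside the original titles.
_SORTED = sorted((t.lower(), t) for t in JOURNAL_TITLES)
_KEYS = [key for key, _ in _SORTED]


def suggest_prefix(fragment, limit=10):
    """Cheaper alternative: binary search for titles that start with fragment."""
    key = fragment.lower()
    start = bisect.bisect_left(_KEYS, key)
    matches = []
    for candidate, title in _SORTED[start:start + limit]:
        if not candidate.startswith(key):
            break
        matches.append(title)
    return matches
```

The regex version finds a fragment anywhere in a title but touches every entry per request. The prefix version does a logarithmic lookup and then reads at most `limit` entries, at the cost of missing matches in the middle of a title.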


ResolveRef: looking at the logs

One of the nice features of Google App Engine is that you can easily view the logs for your application and quickly see which requests are generating errors. Browsing the logs of ResolveRef, I’ve been able to identify a few classes of query which, for one reason or another, weren’t working.


FoldIt – Crowdsourcing to solve the protein folding problem

David Baker’s lab and friends have recently released a new ‘experiment’ in protein folding called FoldIt. Essentially, individuals or teams can compete online to manually fold protein structures, guided by the internal energy function within the game (it very likely uses code from the impressive ab initio folding software Rosetta under the hood). The interface is designed as a game to make it accessible to everyone, not just experts in protein folding. While it’s pretty simplified compared with your average molecular structure editing software, I think designers of scientific software (often scientists themselves) should take note: a good, clean interface can really help get a specific job done painlessly. I haven’t played enough with it yet, but I get the feeling that FoldIt could be a nice way to introduce some protein structure concepts to undergraduates too.

There were the usual complaints on Slashdot that FoldIt doesn’t have a Linux version. Well, I’m happy to report that it seems to run alright using Wine (on Ubuntu Hardy Heron). I couldn’t log in to try the competitive puzzles, but I suspect the server is just in the midst of a Slashdotting. I’ll try later.

FoldIt screenshot, running under Wine

From the FoldIt FAQ:

Can humans really help computers fold proteins?
We’re collecting data to find out if humans’ pattern-recognition and puzzle-solving abilities make them more efficient than existing computer programs at pattern-folding tasks. If this turns out to be true, we can then teach human strategies to computers and fold proteins faster than ever!

Not sure where I saw it, but I remember reading an argument that the future of crowdsourcing would be to not just blindly trust the whole crowd, but also to identify experts in the crowd and weight their predictions more strongly. I’d say this will be the case with ‘manual’ protein folding – just as some players become l33t at first-person shooters (like my favorite, RTCW: Enemy Territory, which despite enjoying, I’m not so l33t at) and could beat any AI player that doesn’t cheat, some people will probably become pretty good at folding up proteins. Maybe FoldIt will identify them, and they can make their gaming skills useful and teach their tricks to software to automate the process. Or maybe it will just remain a fun-ish puzzle game 🙂

texshade: useful, and still kickin’

I’ve been looking at doing an analysis with some protein subfamily sequence logos, using Eric Beitz’s texshade. While it’s a little strange that it does the actual analysis part (rather than just the rendering) using LaTeX, it’s the only implementation of the method I know of, and it beats reimplementing it from the paper.

Although it was published in 2006 (and earlier in 2000), and the original URLs are now dead, I noticed that the latest update to the version of texshade in CTAN (v1.18) was on the 15th of April, 2008, i.e. texshade was updated just 14 days ago!

It happens all too often that published bioinformatics tools cease to be updated, or even disappear from the Web, not long after the peer-reviewed publication is released. Kudos to Eric for not abandoning his software.