Naming in molecular biology: get comfortable with meaninglessness!

I noticed an interesting post over on BoingBoing: “Comfort with meaninglessness the key to good programmers”. It outlines some research by Dehnadi and Bornat on attributes that can predict aptitude in computer programming. They conclude that a “deep comfort with meaninglessness” is an important predictor of programming aptitude.

I think comfort with meaninglessness is an important skill in studying biology (and probably other sciences too). Many times, during the description of a system, various acronyms are thrown about as labels for entities (or ‘actors’) in that system. An important skill of the scientist is being able to follow how all the actors in the system relate to each other, without necessarily knowing anything about the specific properties of those actors. There are lots of protein and gene names which often bear very little meaning relative to the biological entity that they label, and fixating on what ‘the name’ means simply distracts from the true nature of the entity.

Example: TPR proteins are a superfamily defined by a shared protein fold, and are often involved in protein-protein interactions. I have sometimes been asked at poster presentations, or the occasional talk: “What does TPR stand for?”. “TPR” is an acronym for “tetratricopeptide repeat” … you may be able to glean from that expansion that the protein fold is composed of repeat sequences 34 amino acids long – but that is only one small aspect of the family, and isn’t the important point. Yet many molecular biologists appear uncomfortable with an “undefined” acronym, and insist on having it expanded to reveal the full name. TPR is just a convenient label for the superfamily … it could equally have been called GrratBlat or 5450520A, and it would still be the same thing. The point is, you shouldn’t have to ask what TPR stands for. Sure, it’s a curiosity, and some protein names can be amusing (Sonic Hedgehog, or “Just Another Kinase” come to mind). The name may also carry some meaning, but first and foremost it’s a label – something to link the entity to all the other descriptive information about its structure, function, localisation and regulation. As with many classes of protein, the original name was given at a time when little was actually known about the thing, and typically the meaning embedded in the name should be ignored lest it bias our interpretation of what that protein really does.
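For readers who think in code, the same point can be sketched in a few lines of Python. This is only an illustration: the `registry` dictionary, the `relabel` helper and the field values are hypothetical, paraphrased from the description above rather than taken from any real database.

```python
# Hypothetical record: values paraphrased from the post, not from a real database.
tpr_metadata = {
    "fold": "repeats of a 34-amino-acid sequence",
    "typical_role": "protein-protein interactions",
}


def relabel(registry, old, new):
    """Return a copy of the registry with one label swapped.

    Only the key changes; the record it points to is the same object.
    """
    renamed = dict(registry)
    renamed[new] = renamed.pop(old)
    return renamed


registry = {"TPR": tpr_metadata}
renamed = relabel(registry, "TPR", "GrratBlat")

# The label changed, but the entity behind it did not.
assert renamed["GrratBlat"] is registry["TPR"]
print(renamed["GrratBlat"]["typical_role"])
```

Everything useful lives in the record; the key is just a handle for looking it up.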

Summary of opinion: Molecular biologists should become comfortable with the notion that a name is just a label – meaningless without the associated metadata.

All of this is probably second nature to those who studied philosophy (or computer science, or linguistics) … I’m guessing it is an issue of semantics. I really should have taken some of those subjects back in my undergrad days … 🙂

(Postscript: Sadly, not everything that comes out of Middlesex University is good)
