Proteins are Nature's Robots: they carry out most of the activities of life and also make up some of its major structural building blocks. Many of them fold into structurally stable conformations, which can then form crystals. Crystallographers “solve” the three-dimensional structure of a protein by computing the coordinates of each of its constituent atoms from the X-ray diffraction pattern of these crystals. Thousands of such structures have been deposited in the Protein Data Bank (PDB). These beautiful macromolecules constitute The Machinery of Life; author and illustrator David S. Goodsell highlights some of the fascinating stories of how they work in his Molecule of the Month column.
Clearly, only a finite number of proteins exist among all the living organisms on the Earth today. How far have we gotten in exploring the space of proteins? From a historical point of view, we can judge the extent to which a culture had explored the surface of the Earth at a given time by looking at the maps drawn by people of that culture. Had they encountered all the continents? How fine was the level of detail for each continent?
Current maps of protein space can be found at SCOP: Structural Classification of Proteins and CATH: Protein Structure Classification Database. For simplicity I'll discuss SCOP, although CATH is similar. From the Introduction to SCOP, “the different major levels in the hierarchy are: Family: Clear evolutionary relationship; Superfamily: Probable common evolutionary origin; and Fold: Major structural similarity.” Let's look at the growth of each of these categories in SCOP over time:
In each case I've set the height of the Y-axis at three times the value in July 2001 when SCOP version 1.55 was released. The rate of finding new folds (the “continents” of protein space) is neither taking off (as we might expect if there had been the same kinds of fundamental advances in protein structure determination as in sequencing) nor levelling off (as we might expect if we had explored most of the existing space). The conclusion is clear: there's plenty of unknown territory left for the pioneers of tomorrow.
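The reasoning behind those charts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 25% tolerance, and the idea of comparing the growth rate in the earlier half of the record with the rate in the later half are my own choices for this sketch, not anything SCOP provides, and no real SCOP counts are included.

```python
def y_axis_limit(baseline_count, factor=3):
    """Y-axis height used in the post: a fixed multiple of the baseline count."""
    return factor * baseline_count


def growth_trend(years, counts, tolerance=0.25):
    """Compare the average growth rate in the early and late parts of a series.

    years  -- release dates as numbers (for example fractional years)
    counts -- cumulative number of folds (or superfamilies, or families)
    tolerance -- hypothetical threshold: how far the late/early rate ratio
                 may drift from 1 before the trend stops counting as steady
    """
    if len(years) != len(counts) or len(counts) < 3:
        raise ValueError("need at least three (year, count) pairs")

    mid = len(counts) // 2
    early_rate = (counts[mid] - counts[0]) / (years[mid] - years[0])
    late_rate = (counts[-1] - counts[mid]) / (years[-1] - years[mid])

    if early_rate <= 0:
        # No early growth to compare against: any later growth is a take-off.
        return "taking off" if late_rate > 0 else "steady"

    ratio = late_rate / early_rate
    if ratio > 1 + tolerance:
        return "taking off"
    if ratio < 1 - tolerance:
        return "levelling off"
    return "steady"
```

Given a list of release dates and the cumulative fold count at each release, `growth_trend` returns "steady" when the late and early rates are within the tolerance of each other, which is the situation described above for folds.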
Sequence space and the ongoing expansion of the protein universe http://www.nature.com/nature/journal/vaop/ncurrent/full/nature09105.html#/