With recent advances in bioengineering, scientists are designing novel proteins from scratch that perform some of biology's most powerful functions.

Imagine that a pathogen is released in North America and is spreading like wildfire from person to person. To stop it, the government calls on a specialized team of scientists, who race against the clock to engineer a protein capable of deactivating the pathogen. The human race is spared. Engineering proteins, the intricately folded three-dimensional machines that form the foundation of a cell’s infrastructure, defense systems, communications network, and manufacturing capabilities, is being aggressively pursued by scientists around the world. Synthetic proteins aren’t ready to save the day just yet, but major players in academia, industry, and the federal government (DARPA’s ambitious Protein Design Processes program among them) see the potential in this technology. The hope is that protein engineering will let us customize the cellular assembly line, seamlessly integrating both natural and unnatural components to enable the production of just about anything that chemistry will allow.

Since 2000, the total number of deposits in the Protein Data Bank, a publicly accessible online repository of detailed protein structure data, has quadrupled to more than 56,000. These types of resources are helping to illuminate the complex relationship between the sequence of a chain of amino acids, the shape into which that chain will ultimately fold, and the function executed by the resulting protein. This knowledge is in turn fueling the rapid growth of a community of protein “hackers” who are learning how to exploit these principles. “I always thought it was sort of a fascinating extension from observing nature to think about how to improve on it,” says University of Zurich researcher Andreas Plückthun, a pioneer from protein engineering’s early days a decade ago.

Until recently, the confounding complexity of proteins demanded a “combinatorial” approach to protein design: Vast libraries of protein variants are repeatedly screened to select for molecules with desirable properties in a process known as directed evolution. Plückthun has developed numerous innovative large-scale, directed evolution-based screening methods and applied them toward the development of an array of useful molecules, including synthetic antibodies — proteins designed to bind tightly to specific targets, such as tumor cells, and then label them or even mark them for destruction.
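
The screen-and-select cycle behind directed evolution can be illustrated with a toy simulation. The sketch below is not Plückthun's method or any lab's actual protocol: the target sequence, the scoring function, and every parameter are hypothetical, chosen only to show a library of variants being built, screened, and narrowed round after round.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 standard amino acids
TARGET = "MKTAYIAKQR"  # hypothetical "ideal" sequence, used only to score variants


def fitness(seq):
    """Toy stand-in for a lab screen: count positions that match TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET))


def mutate(seq, rng):
    """Return a copy of seq with one randomly chosen position substituted."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def directed_evolution(start, rounds=30, library_size=200, keep=10, seed=0):
    """Repeatedly build a variant library, screen it, and keep the best few."""
    rng = random.Random(seed)
    parents = [start]
    history = []
    for _ in range(rounds):
        # Build a library of single-point variants of the current parents.
        library = [mutate(rng.choice(parents), rng) for _ in range(library_size)]
        # Screen parents and variants together (duplicates removed, order kept).
        pool = list(dict.fromkeys(parents + library))
        pool.sort(key=fitness, reverse=True)
        parents = pool[:keep]
        history.append(fitness(parents[0]))
    return parents[0], history


best, history = directed_evolution("A" * len(TARGET))
print(best, history[-1])
```

Because each round's parents compete alongside their own variants, the best score can never fall from one round to the next. A real screen replaces the `fitness` function with a laboratory measurement, which is where most of the time and cost goes.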

The growing stockpile of protein data has, in the last few years, enabled development of smarter computer algorithms that are giving scientists the power to redesign proteins virtually and with greater speed and accuracy. These “computational design” algorithms are able to scan through truly staggering numbers of potential sequence and structure variations (far more than would be possible in a lab) and serve up the best designs. “I think that combinatorial biology has more successes to its name than pure computational design,” says Brian Kuhlman, an associate professor of biochemistry at the University of North Carolina at Chapel Hill, “but there are some things that you would be very hard-pressed to do with only combinatorial biology, such as design an entirely new protein structure.” Kuhlman was a postdoctoral fellow in David Baker’s lab at the University of Washington, where the first computational design of a protein not found in nature was achieved.

The Baker Lab used Rosetta, free software initially developed to predict the structure of existing proteins. But since 2005, the lab has been using Rosetta to identify amino acid sequences that provide the best fit for their target structure, while simultaneously introducing tiny tweaks that make a resulting protein as stable and likely to fold properly as possible. “One of the most interesting things that has helped the field is the realization that tools that are developed for protein-structure prediction are equally useful for protein design,” says Kuhlman.

Kuhlman, in collaboration with chemist Klaus Hahn, is now using Rosetta at UNC-Chapel Hill to build a cellular signaling protein that can be switched on using light. “What’s really cool about this particular switch is that it actually causes the cell to grow in a certain direction,” says Kuhlman. “If you shine a laser on the left-hand side of the cell, the cell will grow towards the left.” Such “switchable” proteins offer incredibly precise control over when and where a particular protein is active in a cell or patch of cells, providing scientists with a potential means for experimentally disrupting cellular processes in real time and even manipulating them for therapeutic purposes, for example by putting a damper on tumor growth.

Computational design is also speeding up the enormously promising field of enzyme engineering. Enzymes are specialized proteins that catalyze specific chemical reactions within cells, and if properly engineered, they have the potential to transform cells into powerful chemists. Duke University scientist Bruce Donald and colleagues recently announced the successful application of their K* algorithm to engineer new versions of an enzyme involved in the production of a natural antibiotic called gramicidin S. K* explores as many subtle variations as possible in an amino acid chain, so that only the most promising designs reach the test tube. “The idea is to measure twice and cut once,” says Donald. “The students and postdocs at the wet bench work very hard, and we want to give them designs with a good chance of working.” Modifications like these could allow scientists to effectively change the machinery that bacteria use to make antibiotics, and pave the way toward more efficient redesign of old compounds to foil drug resistance in germs.

The holy grail of enzyme engineering is the capability to design an enzyme that catalyzes any reaction imaginable, even those not performed in nature. Two recent articles published by Baker’s team in collaboration with several other leading protein research groups reveal exciting early progress on this front, including the successful design of a novel enzyme that catalyzes the Kemp elimination, a chemical reaction involving the deprotonation of a carbon atom. This may not sound sexy, but it represents a landmark achievement: the engineering of an entirely new protein that catalyzes a reaction no known enzyme performs, and does so at 100,000 times the rate at which the reaction would occur on its own. Based on this proof of concept, it may soon be possible to build enzymes that can recognize and destroy environmental pollutants, transform plant matter into energy, synthesize revolutionary biomaterials, or do just about anything else to which an ambitious chemist might aspire. “There are a huge number of proteins in nature that do all kinds of marvelous things,” says Baker. “But there’s an even larger set of proteins which nature never explored that could do even more marvelous things.”

The past decade has seen the field of protein engineering go from thought exercises and ginger tinkering to creating entirely novel proteins not found in nature. But in order for protein engineering to reach its full potential, several hurdles need to be cleared. Screening vast numbers of sequence and structure permutations demands heavy hardware firepower, putting ambitious design projects out of the reach of many labs. “In my laboratory alone, we have 230 processor cores to our cluster,” says Donald. “[Our designs] might take half to all of that cluster for between a day and a month depending on the design, so it’s a lot of time, and an expensive proposition.” Baker’s group has turned to distributed computing, enlisting processing support from more than 230,000 volunteers via their Rosetta@Home initiative. But the real hope is that as biological knowledge of proteins improves, so will the efficiency of algorithms. That way, 10^50 different structural arrangements can be processed at a time, rather than 10^150 at a time, which will require less hardware grunt work.
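
To get a feel for numbers of this size, a short back-of-the-envelope calculation helps. The sketch below is an illustration only: the figures quoted above refer to structural arrangements, whereas this calculation counts amino acid sequences, on the simplifying assumption that each position in a chain can hold any of the 20 standard amino acids. The function names are made up for this example.

```python
import math


def log10_sequence_count(length, alphabet=20):
    """log10 of the number of possible chains: alphabet ** length."""
    return length * math.log10(alphabet)


def shortest_length_reaching(log10_target, alphabet=20):
    """Smallest chain length with at least 10 ** log10_target possible sequences."""
    return math.ceil(log10_target / math.log10(alphabet))


# Python integers are arbitrary precision, so this ratio is exact.
reduction_factor = 10**150 // 10**50

print("chain length reaching 10^50 sequences: ", shortest_length_reaching(50))
print("chain length reaching 10^150 sequences:", shortest_length_reaching(150))
print("search space shrinks by a factor of 10^%d" % round(math.log10(reduction_factor)))
```

Under that assumption, a chain of only a few dozen amino acids already has more than 10^50 possible sequences, which is why exhaustive enumeration is off the table and smarter pruning matters more than raw hardware.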

“Our knowledge is still imperfect in a lot of areas,” admits Baker, especially in understanding the physical principles underlying enzymatic catalysis. For this reason, the experimental process remains essential to successful protein design, and even the best computational designs require extensive testing and refinement in the laboratory. “The winning recipe will be to combine the powers of [directed] evolution and more advanced methods in design,” says Plückthun. Baker is equally enthusiastic about this point — his group is collaborating extensively with directed-evolution specialists, including Plückthun and Dan Tawfik at Israel’s Weizmann Institute, and he believes that a two-pronged strategy will be the winning one. “We’re really in the learning phase, and what we’re learning is not only how to design proteins, but also a lot about how proteins work and how biology works,” says Baker. “It’s really fun.”