Protein engineering experiments and Phi(F)-value analysis of SH3 domains reveal that their transition state ensemble (TSE) is conformationally restricted, i.e. the fluctuations in the transition state (TS) structures are small. In the TS of src SH3 and alpha-spectrin SH3 the distal loop and the associated hairpin are fully structured, while the rest of the protein is relatively disordered. If native structure predominantly determines the folding mechanism, the findings for SH3 folds raise the question: What are the features of the native topology that determine the nature of the TSE? We propose that the presence of stiff loops in the native state that connect local structural elements (such as the distal hairpin in SH3 domains) conformationally restricts the TSE. We validate this hypothesis using simulations of a "control" system (the 16-residue beta-hairpin-forming C-terminal fragment of the GB1 protein) and its variants. In these fragments the role of bending rigidity in determining the nature of the TSE can be directly examined without complications arising from interactions with the rest of the protein. The TSE structures in the beta-hairpins are determined computationally using cluster analysis and limited Phi(F)-value analysis. Both techniques show that the conformational heterogeneity decreases as the bending rigidity of the loop increases. To extend this finding to SH3 domains, a measure of bending rigidity based on loop curvature, which utilizes native structures in the Protein Data Bank (PDB), is introduced. Using this measure we show that, with few exceptions, the ordering of stiffness of the distal, n-src, and RT loops in the 29 PDB structures of SH3 domains is conserved. Combining the simulation results for beta-hairpins and the analysis of PDB structures for SH3 domains, we propose that the stiff distal loop restricts the conformational fluctuations in the TSE.
We also predict that constraining the distal loop to be preformed in the denatured ensemble should not alter the nature of the TSE. On the other hand, if the amino and carboxy termini are cross-linked to form a circular polypeptide chain, the pathways and TSs are altered. These contrasting scenarios are illustrated using simulations of cross-linked WT beta-hairpin fragments. Computations of bending rigidities for immunoglobulin-like domain proteins reveal no clear separation in the stiffness of their loops. In the beta-sandwich proteins, which have large fractions of non-local native contacts, the nature of the TSE apparently cannot be determined using purely local structural characteristics. Nevertheless, the measure of loop stiffness still provides qualitative predictions of the ordered regions in the TSE of Ig27 and TenFn3.
Thermal unfolding (or folding) in many proteins occurs in an apparent two-state manner, suggesting that only two states, unfolded and folded, are populated. At the melting temperature, $T_m$, the two states coexist. Using lattice models with side chains we show that individual residues become structured at temperatures that deviate from $T_m$, which implies that partially folded conformations make a substantial contribution to the thermodynamic properties of two-state proteins. We also find that the folding cooperativity for a given residue is linked to its accessible surface area. These results are consistent with the experiments on a GCN4-like zipper peptide, which showed that local melting temperatures differ from $T_m$. Analysis of thermal unfolding of six proteins shows that $\Delta T/T_m \sim N^{-1}$, where $\Delta T$ is the transition width and $N$ is the number of residues. This scaling allows us to conclude that, when corrected for finite size effects, folding cooperativity can be captured using coarse-grained models.
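A scaling exponent of this kind is commonly estimated from a straight-line fit in log-log space. The sketch below is illustrative only: the function name `fit_power_law_exponent` is hypothetical, and the data are synthetic points generated from an exact 1/N law, not the six proteins analyzed in the abstract.

```python
import math


def fit_power_law_exponent(sizes, widths):
    """Least-squares slope of log(width) versus log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(w) for w in widths]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# Synthetic (NOT experimental) data obeying width = c / N exactly.
chain_lengths = [50, 75, 100, 150, 200]
relative_widths = [2.0 / n for n in chain_lengths]

exponent = fit_power_law_exponent(chain_lengths, relative_widths)
print(f"fitted exponent: {exponent:.3f}")
```

With real data the fitted slope would carry an uncertainty, and six points give only a rough estimate of the exponent.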
Neurodegenerative diseases induced by transmissible spongiform encephalopathies are associated with prions. The most spectacular event in the formation of the infectious scrapie form, referred to as PrP(Sc), is the conformational change from the predominantly alpha-helical conformation of PrP(C) to the PrP(Sc) state that is rich in beta-sheet content. Using sequence alignments and structural analysis of the available nuclear magnetic resonance structures of PrP(C), we explore the propensities of helices in PrP(C) to be in a beta-strand conformation. Comparison of a number of structural characteristics (such as solvent accessible area, distribution of (Phi, Psi) angles, mismatches in hydrogen bonds, nature of residues in local and nonlocal contacts, distribution of regular densities of amino acids, clustering of hydrophobic and hydrophilic residues in helices) between PrP(C) structures and a databank of "normal" proteins shows that the most unusual features are found in helix 2 (H2) (residues 172-194) followed by helix 1 (H1) (residues 144-153). In particular, the C-terminal residues in H2 are frustrated in their helical state. The databank of normal proteins consists of 58 helical proteins, 36 alpha+beta proteins, and 31 beta-sheet proteins. Our conclusions are also substantiated by gapless threading calculations that show that the normalized Z-scores of prion proteins are similar to those of other alpha+beta proteins with low helical content. Application of the recently introduced notion of discordance, namely, incompatibility of the predicted and observed secondary structures, also points to the frustration of H2 not only in the wild type but also in mutants of human PrP(C). This suggests that the instability of PrP(C) proteins may play a role in their being susceptible to the profound conformational change. 
Our analysis shows that, in addition to the previously proposed role for the segment (90-120) and possibly H1, the C-terminus of H2 and possibly its N-terminus may play a role in the alpha-to-beta transition. An implication of our results is that the ease of polymerization depends on the unfolding rate of the monomer. Sequence alignments show that helices in avian prion proteins (chicken, duck, crane) are better accommodated in a helical state, which might explain the absence of PrP(Sc) formation over finite time scales in these species. From this analysis, we predict that correlated mutations that reduce the frustration in the second half of helix 2 in mammalian prion proteins could inhibit the formation of PrP(Sc).
Folding of RNA into an ordered, compact structure requires substantial neutralization of the negatively charged backbone by positively charged counterions. Using a native gel electrophoresis assay, we have examined the effects of counterion condensation upon the equilibrium folding of the Tetrahymena ribozyme. Incubation of the ribozyme in the presence of mono-, di- and trivalent ions induces a conformational state that is capable of rapidly forming the native structure upon brief exposure to Mg2+. The cation concentration dependence of this transition is directly correlated with the charge of the counterion used to induce folding. Substrate cleavage assays confirm the rapid onset of catalytic activity under these conditions. These results are discussed in terms of classical counterion condensation theory. A model for folding is proposed which predicts effects of charge, ionic radius and temperature on counterion-induced RNA folding transitions.
Molecular chaperones are required to assist folding of a subset of proteins in Escherichia coli. We describe a conceptual framework for understanding how the GroEL-GroES system assists misfolded proteins to reach their native states. The architecture of GroEL consists of double toroids stacked back-to-back. However, most of the fundamentals of the GroEL action can be described in terms of the single ring. A key idea in our framework is that, with coordinated ATP hydrolysis and GroES binding, GroEL participates actively by repeatedly unfolding the substrate protein (SP), provided that it is trapped in one of the misfolded states. We conjecture that the unfolding of SP becomes possible because a stretching force is transmitted to the SP when the GroEL particle undergoes allosteric transitions. Force-induced unfolding of the SP puts it on a higher free-energy point in the multidimensional energy landscape from which the SP can either reach the native conformation with some probability or be trapped in one of the competing basins of attraction (i.e., the SP undergoes kinetic partitioning). The model shows, in a natural way, that the time scales in the dynamics of the allosteric transitions are intimately coupled to folding rates of the SP. Several scenarios for chaperonin-assisted folding emerge depending on the interplay of the time scales governing the cycle. Further refinement of this framework may be necessary because single molecule experiments indicate that there is a great dispersion in the time scales governing the dynamics of the chaperonin cycle.
We describe a conceptual framework for understanding the way large RNA molecules fold based on the notion that their free-energy landscape is rugged. A key prediction of our theory is that RNA folding can be described by the kinetic partitioning mechanism (KPM). According to KPM, a small fraction of molecules folds rapidly to the native state, whereas the remaining fraction is kinetically trapped in a low free-energy non-native state. This model provides a unified description of the way RNA and proteins fold. Single-molecule experiments on Tetrahymena ribozyme, which directly validate our theory, are analyzed using KPM. We also describe the earliest events that occur on microsecond time scales in RNA folding. These must involve collapse of RNA molecules mediated by counterion condensation. Estimates of time scales for the initial events in RNA folding are provided for the Tetrahymena ribozyme.
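One simple way to picture kinetic partitioning is as a sum of two exponential decays, one for the fast-folding fraction and one for the trapped fraction. The sketch below assumes that two-channel form; the names `phi`, `tau_fast` and `tau_slow` and all parameter values are hypothetical and are not taken from the abstract.

```python
import math


def unfolded_fraction(t, phi, tau_fast, tau_slow):
    """Fraction of molecules not yet native at time t.

    phi      : fraction that folds directly (fast channel), assumed in [0, 1]
    tau_fast : time constant of the direct channel
    tau_slow : time constant for escape from the kinetic trap
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    fast = phi * math.exp(-t / tau_fast)
    slow = (1.0 - phi) * math.exp(-t / tau_slow)
    return fast + slow


# Illustrative parameters only (arbitrary time units, not fitted to any experiment).
for t in (0.0, 1.0, 10.0, 100.0):
    value = unfolded_fraction(t, phi=0.1, tau_fast=1.0, tau_slow=100.0)
    print(f"t = {t:6.1f}  unfolded fraction = {value:.4f}")
```

When `tau_slow` is much larger than `tau_fast`, the decay shows a small fast burst followed by a long tail, which is the qualitative signature the abstract attributes to KPM.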
Using exhaustive simulations of lattice models with side-chains, we show that optimized two-state folders reach the native state by a nucleation-collapse mechanism with multiple folding nuclei (MFN). For both the full model and the Go version, there are certain contacts that on average participate in the critical nuclei with higher probability than the others. The high-probability (≥ 0.5) contacts are largely determined by the structure of the native state. Comparison of the results for the full sequence and the Go model shows that non-native interactions compromise the degree of cooperativity and stability of the native state. From an extremely detailed analysis of the folding kinetics, we find that non-native interactions are present in the folding nuclei. The folding times decrease if the non-native interactions in the folding nuclei are made neutral or repulsive. Using cluster analysis and making no prior assumption about the reaction coordinate, we show that both full and Go models have three distinct transition states that give a structural description for the MFN. In the transition states, on average, about two-thirds of the sequence is structured, whereas the rest is disordered, reminiscent of the polarized transition state in the SH3 domain. Our studies show that Go models cannot describe the transition state characteristics of two-state folders at the molecular level. As a byproduct of our investigations, we establish that our method of computing the transition state ensemble is numerically equivalent to the technique based on the stochastic separatrix, which also does not require a priori knowledge of the folding reaction coordinate.
Condensed counterions contribute to the stability of compact structures in RNA, largely by reducing electrostatic repulsion among phosphate groups. Varieties of cations induce a collapsed state in the Tetrahymena ribozyme that is readily transformed to the catalytically active structure in the presence of Mg2+. Native gel electrophoresis was used to compare the effects of the valence and size of the counterion on the kinetics of this transition. The rate of folding was found to decrease with the charge of the counterion. Transitions in monovalent ions occur 20- to 40-fold faster than transitions induced by multivalent metal ions. These results suggest that multivalent cations yield stable compact structures, which are slower to reorganize to the native conformation than those induced by monovalent ions. The folding kinetics are 12-fold faster in the presence of spermidine3+ than [Co(NH3)6]3+, consistent with less effective stabilization of long-range RNA interactions by polyamines. Under most conditions, the observed folding rate decreases with increasing counterion concentration. In saturating amounts of counterion, folding is accelerated by addition of urea. These observations indicate that reorganization of compact intermediates involves partial unfolding of the RNA. We find that folding of the ribozyme is most efficient in a mixture of monovalent salt and Mg2+. This is attributed to competition among counterions for binding to the RNA. The counterion dependence of the folding kinetics is discussed in terms of the ability of condensed ions to stabilize compact structures in RNA.
Large ribozymes typically require very long times to refold into their active conformation in vitro, because the RNA is easily trapped in metastable misfolded structures. Theoretical models show that the probability of misfolding is reduced when local and long-range interactions in the RNA are balanced. Using the folding kinetics of the Tetrahymena ribozyme as an example, we propose that folding rates are maximized when the free energies of forming independent domains are similar to each other. A prediction is that the folding pathway of the ribozyme can be reversed by inverting the relative stability of the tertiary domains. This result suggests strategies for optimizing ribozyme sequences for therapeutics and structural studies.
Thermodynamics and kinetics of off-lattice models with side chains for the beta-hairpin fragment of immunoglobulin-binding protein and its variants are reported. For all properties (except the refolding time tau(F)) there are no qualitative differences between the full model and the Go version. The validity of the models is established by comparison of the calculated native structure with the Protein Data Bank coordinates and by reproducing the experimental results for the degree of cooperativity and tau(F). For the full model, tau(F) is approximately 2 μs at the folding temperature (the experimental value is 6 μs); the Go model folds 50 times faster. Upon refolding, structural changes take place over three time scales. On the collapse time scale, compact structures with an intact hydrophobic cluster form. Subsequently, hydrogen bonds form, predominantly originating from the turn by a kinetic zipping mechanism. The assembly of the hairpin is complete when most of the interstrand contacts are formed (the rate-limiting step). The dominant transition state structure (located by using cluster analysis) is compact and structured. We predict that when the hydrophobic cluster is moved to the loop, tau(F) marginally increases, whereas moving the hydrophobic cluster closer to the termini results in a significant decrease in tau(F) relative to wild type. The mechanism of hairpin formation is predicted to depend on turn stiffness.
Single-molecule manipulation techniques reveal that stretching unravels individually folded domains in the muscle protein titin and the extracellular matrix protein tenascin. These elastic proteins contain tandem repeats of folded domains with beta-sandwich architecture. Herein, we propose by stretching two model sequences (S1 and S2) with four-stranded beta-barrel topology that unfolding forces and pathways in folded domains can be predicted by using only the structure of the native state. Thermal refolding of S1 and S2 in the absence of force proceeds in an all-or-none fashion. In contrast, phase diagrams in the force-temperature (f,T) plane and steered Langevin dynamics studies of these sequences, which differ in the native registry of the strands, show that S1 unfolds in an all-or-none fashion, whereas unfolding of S2 occurs via an obligatory intermediate. Force-induced unfolding is determined by the native topology. After proving that the simulation results for S1 and S2 can be calculated by using native topology alone, we predict the order of unfolding events in Ig domain (Ig27) and two fibronectin III type domains ((9)FnIII and (10)FnIII). The calculated unfolding pathways for these proteins, the location of the transition states, and the pulling speed dependence of the unfolding forces reflect the differences in the way the strands are arranged in the native states. We also predict the mechanisms of force-induced unfolding of the coiled-coil spectrin (a three-helix bundle protein) for all 20 structures deposited in the Protein Data Bank. Our approach suggests a natural way to measure the phase diagram in the (f,C) plane, where C is the concentration of denaturants.
Considerable insights into the mechanisms and timescales of protein folding have been obtained from detailed studies of minimal off-lattice models. These models are coarse-grained representations of polypeptide chains. Many novel predictions of the mechanisms and timescales of the folding of proteins have been made using simulations of off-lattice models. The concepts derived from these simulations have been used to analyze the recent experiments and simulations of proteins and peptides.
The chaperonin system of Escherichia coli, GroEL and GroES, enables certain proteins to fold under conditions in which spontaneous folding is too slow to compete with other non-productive channels such as aggregation. We investigated the plausible mechanisms of GroEL-mediated folding using simple lattice models. In particular, we have investigated protein folding in a confined environment, such as that offered by GroEL, to decipher whether rate and yield enhancement can occur when the substrate protein is allowed to fold within the cavity of the chaperonins. The GroEL cavity is modeled as a cubic box and a simple bead model is used to represent the substrate chain. We consider three distinct characteristics of the confining environment. First, the cavity is taken to be a passive Anfinsen cage in which the walls merely reduce the available conformation space. We find that at temperatures at which the native conformation is stable, the folding rate is retarded in the Anfinsen cage. We then assumed that the interior of the wall is hydrophobic. In this case the folding times exhibit a complex behavior. When the strength of the interaction between the polypeptide chain and the cavity is too strong or too weak, we find that the rates of folding are retarded compared to spontaneous folding. There is an optimum range of the interaction strength that enhances the rates. Thus, above this value there is an inverse correlation between the folding rates and the strength of the substrate-cavity interactions. Optimally hydrophobic walls essentially pull on the kinetically trapped states, which leads to a smoother energy landscape. It is known that upon addition of ATP and GroES the interior cavity of GroEL offers a hydrophilic-like environment to the substrate protein. In order to mimic this within the context of the dynamic Anfinsen cage model, we allow for changes in the hydrophobicity of the walls of the cavity.
The duration for which the walls remain hydrophobic during one cycle of ATP hydrolysis is allowed to vary. These calculations show that frequent cycling of the wall hydrophobicity can dramatically reduce the folding times and also increase the yield under non-permissive conditions. Examination of the structures of the substrate proteins before and after the change in hydrophobicity indicates that there is global unfolding involved. In addition, it is found that a fraction of the molecules kinetically partition to the native state in accordance with the iterative annealing mechanism. Thus, frequent "unfoldase" activity of chaperonins leading to global unfolding of the polypeptide chain results in enhancement of the folding rates and yield of the folded protein. We suggest that chaperonin efficiency can be greatly enhanced if the cycling time is reduced. The calculations are used to interpret a few experiments on chaperonin-mediated protein folding.
Folding of the Tetrahymena self-splicing RNA into its active conformation involves a set of discrete intermediate states. The Mg2+-dependent equilibrium transition from the intermediates to the native structure is more cooperative than the formation of the intermediates from the unfolded states. We show that the degree of cooperativity is linked to the free energy of each transition and that the rate of the slow transition from the intermediates to the native state decreases exponentially with increasing Mg2+ concentration. Monovalent salts, which stabilize the folded RNA nonspecifically, induce states that fold in less than 30 s after Mg2+ is added to the RNA. A simple model is proposed that predicts the folding kinetics from the Mg2+-dependent change in the relative stabilities of the intermediate and native states.
We examine the similarities and differences between two widely used knowledge-based potentials, which are expressed as contact matrices (consisting of 210 elements) that give a scale for interaction energies between the naturally occurring amino acid residues. These are the Miyazawa-Jernigan contact interaction matrix M and the potential matrix S derived by Skolnick J et al., 1997, Protein Sci 6:676-688. Although the correlation between the two matrices is good, there is a relatively large dispersion between the elements. We show that when Thr is chosen as a reference solvent within the Miyazawa and Jernigan scheme, the dispersion between the M and S matrices is reduced. The resulting interaction matrix B gives hydrophobicities that are in very good agreement with experiment. The small dispersion between the S and B matrices, which arises from differing reference states, is shown to have a dramatic effect on the predicted native states of lattice models of proteins. These findings and other arguments are used to suggest that for reliable predictions of protein structures, pairwise additive potentials are not sufficient. We also establish that optimized protein sequences can tolerate relatively large random errors in the pair potentials. We conjecture that three-body interactions may be needed to predict the folds of proteins in a reliable manner.
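The figure of 210 elements follows from counting unordered pairs of the 20 amino acids, self-pairs included: 20 × 21 / 2 = 210. A short enumeration makes the count explicit; the helper name `unique_pairs` is hypothetical, and no energy values from either matrix are reproduced here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def unique_pairs(residues):
    """All unordered residue pairs (i <= j), including self-pairs."""
    return [(a, b) for i, a in enumerate(residues) for b in residues[i:]]


pairs = unique_pairs(AMINO_ACIDS)
print(len(pairs))  # 20 * 21 / 2 = 210
```

These pairs are the independent entries of a symmetric 20 × 20 contact matrix, i.e. its upper triangle plus the diagonal.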
Single-molecule force spectroscopy reveals unfolding of domains in titin on stretching. We provide a theoretical framework for these experiments by computing the phase diagrams for force-induced unfolding of single-domain proteins using lattice models. The results show that two-state folders (at zero force) unravel cooperatively, whereas stretching of non-two-state folders occurs through intermediates. The stretching rates of individual molecules show great variations reflecting the heterogeneity of force-induced unfolding pathways. The approach to the stretched state occurs in a stepwise "quantized" manner. Unfolding dynamics and forces required to stretch proteins depend sensitively on topology. The unfolding rates increase exponentially with force f till an optimum value, which is determined by the barrier to unfolding when f = 0. A mapping of these results to proteins shows qualitative agreement with force-induced unfolding of Ig-like domains in titin. We show that single-molecule force spectroscopy can be used to map the folding free energy landscape of proteins in the absence of denaturants.
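An exponential dependence of the unfolding rate on force is often summarized by a Bell-type expression, k(f) = k0 exp(f Δx / kBT). That form is an assumption brought in here for illustration; the abstract does not state which expression describes the lattice-model results, and it notes that the exponential growth holds only up to an optimum force. The names `k0` and `delta_x_nm` and the numerical values below are hypothetical.

```python
import math

KB_T_PN_NM = 4.11  # thermal energy near 298 K, in pN*nm


def bell_unfolding_rate(force_pn, k0, delta_x_nm, kbt=KB_T_PN_NM):
    """Bell-type rate: k0 is the zero-force rate, delta_x_nm the distance
    to the transition state along the pulling direction (both assumed)."""
    return k0 * math.exp(force_pn * delta_x_nm / kbt)


# Illustrative values only; not fitted to titin or to the lattice simulations.
for force in (0.0, 50.0, 100.0, 150.0):
    rate = bell_unfolding_rate(force, k0=1e-3, delta_x_nm=0.3)
    print(f"f = {force:5.1f} pN  ->  k = {rate:.3e} per second")
```

On a semi-logarithmic plot of rate against force this gives a straight line whose slope is delta_x / kBT.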
BACKGROUND: Over the past few years novel folding mechanisms of globular proteins have been proposed using minimal lattice and off-lattice models. The factors determining the cooperativity of folding in these models and especially their explicit relation to experiments have not been fully established, however.
RESULTS: We consider equilibrium folding transitions in lattice models with and without sidechains. A dimensionless measure, $\Omega_c$, is introduced to quantitatively assess the degree of cooperativity in lattice models and in real proteins. We show that larger values of $\Omega_c$, resembling the values seen in proteins, are obtained in lattice models with sidechains. The enhanced cooperativity of such models results from possible denser packing of sidechains in the interior of the model polypeptide chain. We also establish that $\Omega_c$ correlates extremely well with $\sigma_T = (T_\theta - T_f)/T_\theta$, where $T_\theta$ and $T_f$ are the collapse and folding transition temperatures, respectively. These theoretical ideas are used to analyze folding transitions in two-state folders (RNase A, chymotrypsin inhibitor 2, fibronectin type III modules and tendamistat) and three-state folders (apomyoglobin and lysozyme). The values of $\Omega_c$ extracted from experiments show a correlation with $\sigma_T$ (suitably generalized when folding is induced by denaturants or acid).
CONCLUSIONS: A quantitative description of the cooperative transition of real proteins can be made by lattice models with sidechains. The degree of cooperativity in minimal models and real proteins can be expressed in terms of the single parameter $\sigma_T$, which can be estimated from experimental data.
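The single parameter named in the conclusions is the relative separation between the collapse and folding transition temperatures. A minimal sketch of that arithmetic follows; the function and argument names are hypothetical, and the temperatures are placeholder values in arbitrary units, not data from the proteins listed above.

```python
def relative_temperature_gap(t_collapse, t_fold):
    """(T_collapse - T_fold) / T_collapse, dimensionless."""
    if t_collapse <= 0:
        raise ValueError("collapse temperature must be positive")
    return (t_collapse - t_fold) / t_collapse


# Placeholder temperatures (arbitrary units), for illustration only.
print(relative_temperature_gap(1.0, 0.75))
print(relative_temperature_gap(1.0, 0.95))
```

A value near zero means collapse and folding occur at almost the same temperature.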
The hydrophobic hydration in a series of hydrocarbons is probed by using molecular dynamics simulations. The solutes considered range from methane to octane. Examination of the shapes of the hydration shell suggests that there is no single stable structure surrounding these solutes. The structure of the water molecules around the solute is not significantly perturbed, even for octane, and the hydrogen bond network is essentially preserved. The solutes are accommodated in the voids of the tetrahedral network of water in such a way as to leave the local environment almost intact. The hydrophobic hydration arises primarily because of the plasticity of the hydrogen bond network. Even for octane we find very little evidence for water-mediated interactions between nonbonded carbon atoms, leading us to suggest that the transition to globular conformations can only occur for very long, linear hydrocarbon chains.
The nature of the nucleation-collapse mechanism in protein folding is probed using 27-mer and 36-mer lattice models. Three different forms for the interaction potentials are used. Three of the four 27-mer sequences have identical, maximally compact native states, while the other has a non-compact native conformation. All the sequences fold thermodynamically and kinetically by a two-state process. Analysis of individual trajectories for each sequence using a self-organizing neural net algorithm shows that upon formation of a critical set of contacts the polypeptide chain rapidly reaches the native conformation, which is consistent with a nucleation-collapse mechanism. The algorithm, which reduces the identification of the folding nucleus for each trajectory to one of pattern recognition, is used to show that there are multiple folding nuclei. There is a distribution of nucleation contacts in the transition states, with some of them occurring with higher probability (when averaged over the denatured ensemble) than others. We also show that there is a distribution in the size of the nuclei, with the average number of residues in the folding nuclei being less than about one-third of the chain size. The fluctuations in the sizes of the nuclei are large, suggestive of a broad transition region. The folding nuclei, whose structures are the corresponding transition states, have varying degrees of overlap with the native conformation. The distribution of the radius of gyration of the transition states shows that these structures are an expanded form (by about 25% in the radius of gyration) of the native conformation. Local contacts are most dominant in the folding nuclei, while a certain fraction of non-local contacts is necessary to stabilize the transition states. The search for the critical nuclei initially involves the formation of local contacts, while non-local contacts are formed later.
The fractional values of PhiF for the two 27-mer mutants found by using the protein engineering protocol are consistent with the microscopic picture of partial formation of structures involving these residues in the transition state. These observations lead to a multiple folding nuclei (MFN) model for nucleation-collapse mechanism in protein folding. The major implication of the MFN model is that, even if the residues whose tertiary interactions are formed nearly completely in the transition state are mutated, it does not disrupt the nature of the nucleation-collapse mechanism. We analyze the experiments on chymotrypsin inhibitor 2 and alpha-spectrin SH3 domain and two circular permutants in light of the MFN model. It is shown that the PhiF-value analysis for these proteins gives considerable support to the MFN model. The theoretical and experimental studies give a coherent picture of the nucleation-collapse mechanism in which there is a distribution of folding nuclei with some more probable than others. The formation of any specific nucleus is not necessary for efficient two-state folding.
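The expansion of the transition states relative to the native conformation is quoted above in terms of the radius of gyration. As a reminder of how that quantity is computed for a bead model, the sketch below uses two hypothetical 8-bead arrangements on a cubic lattice; they are toy coordinates chosen for illustration, not conformations from the 27-mer or 36-mer simulations.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of equal-mass beads from their centroid."""
    n = len(coords)
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    sq_sum = sum(
        sum((c[k] - center[k]) ** 2 for k in range(3)) for c in coords
    )
    return math.sqrt(sq_sum / n)


# Toy bead positions (hypothetical): a 2x2x2 cube and a straight rod.
compact = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
extended = [(i, 0, 0) for i in range(8)]

rg_compact = radius_of_gyration(compact)
rg_extended = radius_of_gyration(extended)
print(f"compact Rg  = {rg_compact:.3f}")
print(f"extended Rg = {rg_extended:.3f}")
print(f"relative expansion = {rg_extended / rg_compact - 1.0:.2f}")
```

The same ratio, taken between a transition-state structure and the native structure, is the fractional expansion referred to in the text.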