Selected Publications
A Newfound Cancer-activating Mutation Reshapes the Energy Landscape of Estrogen-binding Domain
J. Chem. Theory Comput. ASAP article, doi:10.1021/ct500313e (2014)
Wei Huang, Krishna M Ravikumar, and Sichun Yang
A simple coarse-grained energy model correctly predicts the large-scale switching between two well-defined conformations.
[PDF]    [HTML]    [Supporting Info]    [Supporting Movie]

Abstract: The ligand-binding domain (LBD) of an estrogen receptor undergoes a large conformational switch from an inactive to an active state in response to hormone stimuli. Very recently, a novel D538G mutant has been identified to be active in advanced breast cancer tumors. Here, we ask if molecular simulations can provide insight into its mechanistic impact on the receptor's activation status. It has been challenging for ab initio modeling to identify two distinct conformations of a single amino acid sequence as large as that of the LBD. Using a coarse-grained (CG) model, we are able to correctly reproduce this LBD conformational switching. Furthermore, we found that the D538G mutation reshapes the energy landscape by stabilizing both active and inactive conformations, but preferring the active by 1.5 kcal/mol. This observation is consistent with the concept of a mutation-shifting landscape and provides a structural explanation for the oncogenic D538G mutation at the detailed conformational level.
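
To put the 1.5 kcal/mol preference in perspective, the sketch below converts a free-energy gap into relative populations with a Boltzmann factor. It assumes a simple two-state picture and a temperature of 300 K; neither is stated in the abstract, and the function names are illustrative only.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def population_ratio(delta_g_kcal, temperature_k=300.0):
    """Favored:disfavored population ratio for a given free-energy gap."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


def favored_fraction(delta_g_kcal, temperature_k=300.0):
    """Fraction of molecules in the favored state of a two-state system."""
    ratio = population_ratio(delta_g_kcal, temperature_k)
    return ratio / (1.0 + ratio)


# Assumed inputs: 1.5 kcal/mol gap (from the abstract), 300 K (assumption).
ratio = population_ratio(1.5)
fraction = favored_fraction(1.5)
print(f"active:inactive ratio ~ {ratio:.1f}, active fraction ~ {fraction:.2f}")
```

Under these assumptions, a gap of this size corresponds to roughly a twelve-fold population bias toward the active state.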

Quantitative Topological Mapping of Protein Structure by Hydroxyl Radical Footprinting Mediated Structural Mass Spectrometry
Submitted (2014)
Wei Huang, Krishna M Ravikumar, Mark Chance, and Sichun Yang
Introduced the P-factor analysis for topological mapping of protein structures utilizing footprinting data.
[PDF] [HTML]

Abstract: Protein footprinting, an increasingly popular technique, provides information about the solvent accessibility of specific protein sites. Traditional analysis approaches have focused on comparing differences at individual sites, e.g., between a free and a ligand-bound state. What is unclear is whether the information can be used for absolute comparisons across multiple sites. Here, we present a new analytic approach to convert a measured rate constant to a protection factor (termed P-factor). This P-factor analysis not only explains kinetic aspects of the biophysical process itself, but also enables a novel topological picture of protein structures utilizing footprinting data.
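
The abstract does not spell out how a rate constant becomes a protection factor. The sketch below shows one plausible reading, by analogy with hydrogen-exchange protection factors: divide an intrinsic (unprotected) reactivity by the measured rate. That definition, the function name, and all numbers are assumptions for illustration, not values or formulas taken from the paper.

```python
import math


def protection_factor(intrinsic_reactivity, measured_rate):
    """Assumed definition: PF = intrinsic reactivity / measured rate."""
    if intrinsic_reactivity <= 0 or measured_rate <= 0:
        raise ValueError("rates must be positive")
    return intrinsic_reactivity / measured_rate


# Hypothetical sites: (intrinsic reactivity, measured footprinting rate),
# both in the same arbitrary units. These are made-up numbers.
sites = {
    "site_A": (10.0, 5.0),
    "site_B": (10.0, 0.1),
    "site_C": (2.0, 1.0),
}

# Log protection factors put chemically different sites on one scale.
log_pf = {name: math.log(protection_factor(r, k)) for name, (r, k) in sites.items()}
ranked = sorted(log_pf, key=log_pf.get, reverse=True)
print(ranked[0], {name: round(value, 2) for name, value in log_pf.items()})
```

In this toy example, site_A and site_C have different raw rates but the same protection factor. This is the kind of cross-site comparison that raw rates alone do not support.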

Methods for SAXS-based Structure Determination of Biomolecular Complexes
Advanced Materials (impact factor: 14.8), published online, doi: 10.1002/adma.201304475 (2014)
Sichun Yang (Invited Review - Research News)
[PDF]    [HTML]
(On the occasion of celebrating the 10th-year anniversary of Shanghai Synchrotron Radiation Facility, also known as 上海光源).

Abstract: Measurements from small-angle X-ray scattering (SAXS) are highly informative for determining the structures of biomolecular complexes in solution. Here, current and recent SAXS-driven developments are described, with an emphasis on computational modeling. In particular, accurate methods for computing a theoretical scattering profile from a given structure model are discussed, with a key focus on structure factor coarse-graining and hydration contribution. Methods for reconstructing topological structures from an experimental SAXS profile are currently under active development. We report on several modeling tools designed for conformation generation that make use of either atomic-level or coarse-grained representations. Furthermore, since large, flexible biomolecules can adopt multiple well-defined conformations, a traditional single-conformation SAXS analysis is inappropriate, so we also discuss recent methods that utilize the concept of ensemble optimization, weighing the SAXS contributions of a heterogeneous mixture of conformations. These tools will ultimately extend the usefulness of SAXS data beyond a simple space-filling approach by providing a reliable structure characterization of biomolecular complexes under physiological conditions.

Cross-talk between the ligand- and DNA-binding domains of estrogen receptor
Proteins, 81 (11) 1900-1909, (2013)
Wei Huang, Geoffrey L Greene, Krishna M Ravikumar, and Sichun Yang
Used a set of push-pull-release (PPR) simulations based on a simple coarse-grained/atomic-level model to predict the multidomain conformations of estrogen receptor.
[PDF]    [HTML]

Abstract: Estrogen receptor alpha (ERα) is a hormone-responsive transcription factor that contains several discrete functional domains including a ligand-binding domain (LBD) and a DNA-binding domain (DBD). Despite a wealth of knowledge about the behaviors of individual domains, the molecular mechanisms of cross-talk between LBD and DBD during signal transduction from hormone to DNA-binding of ERα remain elusive. Here, we apply a multi-scale approach combining coarse-grained (CG) and atomistically-detailed simulations to characterize this cross-talk mechanism via an investigation of the ERα conformational landscape. First, a CG model of ERα is built based on crystal structures of individual LBDs and DBDs, with more emphasis on their inter-domain interactions. Second, molecular dynamics (MD) simulations are implemented and enhanced sampling is achieved via the push-pull-release (PPR) strategy in the search for different LBD-DBD orientations. Third, multiple energetically stable ERα conformations are identified on the landscape. A key finding is that estradiol-bound LBDs utilize the well-described activation helix H12 to pack and stabilize LBD-DBD interactions. Our results suggest that the estradiol-bound LBDs can serve as a scaffold to position and stabilize the DBD-DNA complex, consistent with experimental observations of enhanced DNA binding with the LBD. Final assessment using atomic-level simulations shows that these CG-predicted models are significantly stable within a 15-ns simulation window and specific pairs of lysine residues in close proximity at the domain interfaces could serve as candidate sites for chemical cross-linking studies. Together, these simulation results provide a molecular view of the role of ERα domain interactions in response to hormone binding.

Energy evaluation of β-strand packing in a fibril-forming SH3 domain
J. Phys. Chem. B, 117 (42) 13051–13057, (2013)
Sichun Yang*, Krishna M Ravikumar, and Herbert Levine* (*Corresponding authors)
Used a simple energy function to model the fibril state of an SH3 domain.
[PDF]    [HTML]

Abstract: We examine the energetics of β-strand packing in a fibril-forming SH3 domain using a simple sequence-based energy model. First, we describe this packing energy function and then apply it to three model systems: Aβ, HET-s prion, and SH3 domain. The packing results of Aβ and HET-s are compared to and are consistent with available experimental and computational results. Moreover, our results show that a native β-strand in SH3 is strongly disfavored to pack with any other strand, in accord with recent NMR data. Finally, based on packing energy calculations, several SH3 models of β-strand packing are proposed that fit well with known electron microscopy maps.

Fast-SAXS-pro: A unified approach to computing SAXS profiles of DNA, RNA, protein, and their complexes
J. Chem. Phys. 138 (2) 024112 (7 pages), (2013)
Krishna M Ravikumar, Wei Huang and Sichun Yang
Presented a generalized SAXS computing method for a biomolecular complex of protein and DNA/RNA.
[PDF]    [HTML]

Note: A webserver of SAXS computing is available at fast-SAXS-pro

Abstract: A generalized method, termed Fast-SAXS-pro, for computing small angle x-ray scattering (SAXS) profiles of proteins, nucleic acids, and their complexes is presented. First, effective coarse-grained structure factors of DNA nucleotides are derived using a simplified two-particle-per-nucleotide representation. Second, SAXS data of an 18-bp double-stranded DNA is measured and used for the calibration of the scattering contribution from excess electron density in the DNA solvation layer. An additional test on a 25-bp DNA duplex validates this SAXS computational method and suggests that DNA has a different contribution from its hydration surface to the total scattering compared to RNA and protein. To account for such a difference, a sigmoidal function is implemented for the treatment of non-uniform electron density across the surface of a protein/nucleic-acid complex. This treatment allows differential scattering from the solvation layer surrounding protein/nucleic-acid complexes. Finally, the applications of this Fast-SAXS-pro method are demonstrated for protein/DNA and protein/RNA complexes.

Identification of a novel LXXLL motif in alpha actinin 4 (ACTN4) spliced isoform that is critical for its interaction with estrogen receptor alpha and co-activators
J. Biol. Chem. 287 (42), 35418-35429, (2012)
Simran Khurana, Sharmistha Chakraborty, Xuan Zhao, Yu Liu, Dongyin Guan, Minh Lam, Wei Huang, Sichun Yang and Hung-Ying Kao
[PDF]    [HTML]
  

Abstract: Alpha actinins (ACTNs) are a family of proteins crosslinking actin filaments to maintain cytoskeletal organization and cell motility. Recently, it has also become clear that ACTN4 can function in the nucleus. In this report, we found that ACTN4 (full-length) and its spliced isoform, ACTN4 (Iso), possess an unusual LXXLL nuclear receptor interacting motif. Both ACTN4 (full-length) and ACTN4 (Iso) potentiate basal transcription activity and directly interact with ERα, although ACTN4 (Iso) binds ERα more strongly. We have also found that both ACTN4 (full-length) and ACTN4 (Iso) interact with the ligand-independent and the ligand-dependent activation domains of ERα. While ACTN4 (Iso) interacts efficiently with transcriptional co-activators such as PCAF and SRC-1, the full-length ACTN4 protein either does not or does so weakly. More importantly, the flanking sequences of the LXXLL motif are important not only for interacting with nuclear receptors but also for the association with co-activators. Taken together, we have identified a novel extended LXXLL motif that is critical for interactions with both receptors and co-activators. This motif functions more efficiently in a spliced isoform of ACTN4 than it does in the full-length protein.

Coarse-Grained Simulations of Protein-Protein Association: An Energy Landscape Perspective
Biophys. J. 103 (4) 837–845 (2012)
Krishna M Ravikumar, Wei Huang and Sichun Yang
Introduced the push-pull-release (PPR) sampling algorithm for the simulations of protein-protein interactions.
[PDF]    [HTML]

Abstract: Understanding protein-protein association is crucial in revealing the molecular basis of many biological processes. Here, we describe a theoretical simulation pipeline to study protein-protein association from an energy landscape perspective. First, a coarse-grained (CG) model is implemented and its applications are demonstrated via molecular dynamics simulations for several protein complexes. Second, an enhanced search method is utilized to efficiently sample a broad range of protein conformations. Third, multiple conformations are identified and clustered from simulation data and further projected on a three-dimensional globe specifying protein orientations and interacting energies. Results from several complexes indicate that the crystal-like conformation is favorable on the energy landscape even if the landscape is relatively rugged with meta-stable conformations. A closer examination of molecular forces shows that the formation of associated protein complexes can be primarily electrostatics-driven, hydrophobics-driven, or a combination of both in stabilizing specific binding interfaces. Taken together, these results suggest that the CG simulations and analyses provide a new tool-set to study protein-protein association occurring in functional biomolecular complexes.

EROS: Better than SAXS!
Structure. 19 (1), 3-4 (2011)
Sichun Yang and Benoît Roux
[PDF]    [HTML]
  

Preview: Revealing the three-dimensional organization of large dynamic protein complexes in solution is challenging. To tackle this problem, Rozycki and colleagues (2011) design a method combining small angle X-ray scattering (SAXS) data with the results of computer simulations. Their study offers new insights into the conformational transition induced by salt that occurs in an endosome-associated ESCRT-III CHMP3 domain.

Multidomain Assembled States of Hck Tyrosine Kinase in Solution
Proc. Natl. Acad. Sci. USA 107, 15757-15762 (2010)
Sichun Yang, Lydia Blachowicz, Lee Makowski, and Benoît Roux
Developed the basis-set supported SAXS (BSS-SAXS) reconstruction method.
[PDF]    [HTML]   

Featured in a News & Views article by Bernadó and Blackledge in Nature (2010).
Structural biology: Proteins in dynamic equilibrium
Nature 468, 1046-1048 (2010)
[PDF]    [HTML]   

An approach combining small-angle X-ray solution scattering (SAXS) data with coarse-grained (CG) simulations is developed to characterize the assembly states of Hck, a member of the Src-family kinases, under various conditions in solution. First, a basis set comprising a small number of assembly states is generated from extensive CG simulations. Second, a theoretical SAXS profile for each state in the basis set is computed by using the Fast-SAXS method. Finally, the relative population of the different assembly states is determined via a Bayesian-based Monte Carlo procedure seeking to optimize the theoretical scattering profiles against experimental SAXS data. The approach reveals the structural organization of Hck in solution and the different shifts in the equilibrium population of assembly states upon the binding of different signaling peptides. The study establishes the concept of basis-set supported SAXS (BSS-SAXS) reconstruction combining computational and experimental techniques.
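
The final step described above, finding state populations whose weighted profiles match the data, can be sketched with a toy Metropolis-style search over weights. This is a simplified stand-in, not the paper's Bayesian procedure: the proposal move, the likelihood exp(-chi²/2), and the synthetic four-point profiles are all assumptions made for illustration.

```python
import math
import random


def chi_square(weights, basis_profiles, experimental, sigma):
    """Chi-square between the population-weighted profile and the data."""
    chi2 = 0.0
    for k, (i_exp, s) in enumerate(zip(experimental, sigma)):
        i_model = sum(w * p[k] for w, p in zip(weights, basis_profiles))
        chi2 += ((i_model - i_exp) / s) ** 2
    return chi2


def estimate_populations(basis_profiles, experimental, sigma,
                         n_steps=20000, step=0.05, seed=0):
    """Average weights visited by a Metropolis-style walk on the simplex."""
    rng = random.Random(seed)
    n = len(basis_profiles)
    weights = [1.0 / n] * n
    chi2 = chi_square(weights, basis_profiles, experimental, sigma)
    totals = [0.0] * n
    for _ in range(n_steps):
        # Perturb each weight, clip at zero, renormalize to sum to one.
        trial = [max(w + rng.uniform(-step, step), 0.0) for w in weights]
        norm = sum(trial)
        if norm > 0.0:
            trial = [w / norm for w in trial]
            trial_chi2 = chi_square(trial, basis_profiles, experimental, sigma)
            delta = trial_chi2 - chi2
            if delta <= 0.0 or rng.random() < math.exp(-0.5 * delta):
                weights, chi2 = trial, trial_chi2
        totals = [t + w for t, w in zip(totals, weights)]
    return [t / n_steps for t in totals]


# Synthetic two-state example: "data" built from a 70/30 mixture.
state_a = [1.0, 0.8, 0.5, 0.2]
state_b = [1.0, 0.6, 0.2, 0.05]
data = [0.7 * a + 0.3 * b for a, b in zip(state_a, state_b)]
errors = [0.01] * len(data)
populations = estimate_populations([state_a, state_b], data, errors)
print([round(p, 2) for p in populations])
```

Because the clip-and-renormalize proposal is not symmetric, this walk is only an approximate sampler. It is meant to show the shape of the optimization, not to serve as a substitute for the published method.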

RNA Structure Determination Using SAXS Data
J. Phys. Chem. B. 114 (31), 10039-10048 (2010)
Sichun Yang, Marc Parisien, François Major, and Benoît Roux
[PDF]    [HTML]   

Note: A webserver of SAXS computing is available at fast-SAXS-pro

We present a coarse-grained method for rapidly computing small-angle X-ray scattering (SAXS) profiles from ribonucleic acid (RNA) three-dimensional structures, and subsequent comparison and evaluation against experimental SAXS data. The coarse-graining makes effective use of a two-particle-per-nucleotide model. One effective particle is assigned for the sugar-phosphate group; the other is dedicated and tailored for each RNA sidechain type: adenine, cytosine, guanine and uracil. Furthermore, the RNA molecule is soaked in explicit water to account for the contribution from the hydration layer to the scattering. A coarse-grained representation of both the water molecules and the RNA molecule takes advantage of the intrinsically coarse-grained and low-resolution nature of SAXS data, and thus allows for a fast computation of scattering profiles. We call this method Fast-SAXS-RNA. To assess the usefulness of the method, we use the MC-Fold and MC-Sym pipeline to generate three-dimensional decoy models for a tRNA and for a group I intron P4-P6 fragment, two widely used model RNA molecules. The decoys generated cover most of the fold space and range in RMSD under and beyond the accepted resolution of SAXS. We find that our method, combined with concrete SAXS data, is able to filter out a large portion of the models, such that those with the best fit to the SAXS data are also congruent to the native structure. These results demonstrate that the incorporation of SAXS data as global constraints into modeling is positioned to provide a powerful tool for ranking models of the RNA structures.

Atomistic view of the conformational activation of Src kinase using the string method with swarms-of-trajectories
Biophysical Journal 97 (4), L8-L10 (2009)
Wenxun Gan, Sichun Yang, and Benoît Roux
[PDF]    [HTML]   

The inactive-to-active conformational transition of the catalytic domain of human c-Src tyrosine kinase is characterized using the string method with swarms-of-trajectories with all-atom explicit solvent molecular dynamics (MD) simulations. The activation process occurs in two main steps in which the activation loop (A-loop) opens first, followed by the rotation of the alphaC helix. The computed potential of mean force energy along the activation pathway displays a local minimum, which allows the identification of an intermediate state. These results show that the string method with swarms-of-trajectories is an effective technique to characterize complex and slow conformational transitions in large biomolecular systems.

A rapid coarse residue-based computational method for X-ray solution scattering characterization of protein folds and multiple conformational states of large protein complexes
Biophysical Journal 96, 4449-4463 (2009)
Sichun Yang, Sanghyun Park, Lee Makowski, and Benoît Roux
[PDF]    [HTML]    [DOI]

Note: A webserver of SAXS computing is available at fast-SAXS-pro, with new features added to Fast-SAXS, developed for complexes formed by protein and RNA/DNA.

We present a coarse residue-based computational method to rapidly compute the solution scattering profile from a protein with dynamical fluctuations. The method is built upon a coarse-grained (CG) representation of the protein. This CG representation takes advantage of the intrinsic low-resolution and CG nature of solution scattering data. It allows rapid scattering determination from a large number of conformations that can be extracted from CG simulations to obtain scattering characterization of protein conformations. The method includes several important elements: effective residue structure factors derived from the Protein Data Bank, explicit treatment of water molecules in the hydration layer at the surface of the protein, and an ensemble average of scattering from a variety of appropriate conformations to account for macromolecular flexibility. This simplified method is calibrated and illustrated to accurately reproduce the experimental scattering curve of hen egg white lysozyme. We then illustrate the applications of this CG method by computing the solution scattering patterns of several representative protein folds and multiple conformational states. The results suggest that solution scattering data, when combined with the reliable computational method that we developed, show great potential for a better structural description of multidomain complexes in different functional states, and for recognizing structural folds when sequence similarity to a protein of known structure is low.
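
As a rough illustration of what computing a scattering profile from a coarse-grained structure involves, the sketch below evaluates the standard Debye formula over a set of beads. It assumes one bead per residue with a constant scattering factor. The method described above instead uses q-dependent residue structure factors, hydration-layer water, and ensemble averaging, none of which are reproduced here.

```python
import math


def debye_intensity(coords, factors, q):
    """Debye sum: I(q) = sum_i sum_j f_i f_j sin(q r_ij) / (q r_ij)."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        total += factors[i] ** 2  # self term (r_ii = 0, sinc = 1)
        for j in range(i + 1, n):
            x = q * math.dist(coords[i], coords[j])
            sinc = 1.0 if x == 0.0 else math.sin(x) / x
            total += 2.0 * factors[i] * factors[j] * sinc
    return total


def scattering_profile(coords, factors, q_values):
    """Intensity at each requested momentum transfer q (1/Angstrom)."""
    return [debye_intensity(coords, factors, q) for q in q_values]


# Toy model: two unit-weight beads 10 Angstrom apart (assumed values).
beads = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
weights = [1.0, 1.0]
profile = scattering_profile(beads, weights, [0.0, math.pi / 10.0])
print(profile)
```

The pairwise sum scales with the square of the number of beads, which is one reason a per-residue representation is much cheaper than an all-atom one.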

Mapping the conformational transition in Src activation by cumulating the information from multiple molecular dynamics trajectories
Proc. Natl. Acad. Sci. USA 106, 3776-3781 (2009)
Sichun Yang, Nilesh K. Banavali, and Benoît Roux
[HTML]    [PDF]    [Animation]   

The Src-family kinases are allosteric enzymes that play a key role in the regulation of cell growth and proliferation. In response to cellular signals, they undergo large conformational changes to switch between distinct inactive and active states. A computational strategy for characterizing the conformational transition pathway is presented to bridge the inactive and active states of the catalytic domain of Hck. The information from a large number (78) of independent all-atom molecular dynamics trajectories with explicit solvent is combined together to assemble a connectivity map of the conformational transition. Two intermediate states along the activation pathways are identified, and their structural features are characterized. A coarse free-energy landscape is built in terms of the collective motions corresponding to the opening of the activation loop (A-loop) and the rotation of the αC helix. This landscape shows that the protein can adopt a multitude of conformations in which the A-loop is partially open, while the αC helix remains in the orientation characteristic of the inactive conformation. The complete transition leading to the active conformation requires a concerted movement involving further opening of the A-loop, the relative alignment of N-lobe and C-lobe, and the rotation of the αC helix needed to recruit the residues necessary for catalysis in the active site. The analysis leads to a dynamic view of the full-length kinase activation, whereby transitions of the catalytic domain to intermediate configurations with a partially open A-loop are permitted, even while the SH2-SH3 clamp remains fully engaged. These transitions would render Y416 available for the transphosphorylation event that ultimately locks down the active state. The results provide a broad framework for picturing the conformational transitions leading to kinase activation.

Src kinase conformational activation: Thermodynamics, pathways, and mechanisms
PLoS Computational Biology 4, e1000047 (14 pages) (2008)
Sichun Yang and Benoît Roux
[HTML]    [PDF]

Tyrosine kinases of the Src-family are large allosteric enzymes that play a key role in cellular signaling. Conversion of the kinase from an inactive to an active state is accompanied by substantial structural changes. Here, we construct a coarse-grained model of the catalytic domain incorporating experimental structures for the two stable states, and simulate the dynamics of conformational transitions in kinase activation. We explore the transition energy landscapes by constructing a structural network among clusters of conformations from the simulations. From the structural network, two major ensembles of pathways for the activation are identified. In the first transition pathway, we find a coordinated switching mechanism of interactions among the αC helix, the activation-loop, and the β strands in the N-lobe of the catalytic domain. In a second pathway, the conformational change is coupled to a partial unfolding of the N-lobe region of the catalytic domain. We also characterize the switching mechanism for the αC helix and the activation-loop in detail. Finally, we test the performance of a Markov model and its ability to account for the structural kinetics in the context of Src conformational changes. Taken together, these results provide a broad framework for understanding the main features of the conformational transition taking place upon Src activation.

Folding time predictions from all-atom replica exchange simulations
J. Mol. Biol. 372, 756-763 (2007)
Sichun Yang, José N. Onuchic, Angel E. García, and Herbert Levine
[HTML]    [PDF]

We present an approach to predicting the folding time distribution from all-atom replica exchange simulations. This is accomplished by approximating the multidimensional folding process as stochastic reaction-coordinate dynamics for which effective drift velocities and diffusion coefficients are determined from the short-time replica exchange simulations. Our approach is applied to the folding of the second β-hairpin of the B domain of protein G. The folding time prediction agrees quite well with experimental measurements. Therefore, we have in hand a fast numerical tool for calculating the folding kinetic properties from all-atom “first-principles” models.

Effective stochastic dynamics on a protein folding energy landscape
J. Chem. Phys. 125, 054910 (8 pages) (2006)
Sichun Yang, José N. Onuchic, and Herbert Levine
[HTML]    [PDF]

We present an approach to protein folding kinetics using stochastic reaction-coordinate dynamics, in which the effective drift velocities and diffusion coefficients are determined from microscopic simulation data. The resultant Langevin equation can then be used to directly simulate the folding process. Here, we test this approach by applying it to a toy two-state dynamical system and to a funnel-like structure-based (Go-type) model. The folding time predictions agree very well with full simulation results. Therefore, we have in hand a fast numerical tool for calculating the folding kinetic properties, even when full simulations are not feasible. In addition, the local drift and diffusion coefficients provide an alternative way to compute the free energy profile in cases where only local sampling can be achieved.
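
The idea of simulating folding directly from a one-dimensional Langevin equation can be sketched as follows. The code integrates dx = v(x) dt + sqrt(2 D(x) dt) ξ with a simple Euler-Maruyama step and records first-passage times to a target value. The constant drift and diffusion used here are placeholder assumptions; in the approach described above they would be estimated from simulation data. The sketch also assumes the supplied drift already contains any correction needed for a position-dependent diffusion coefficient.

```python
import math
import random


def first_passage_time(drift, diffusion, x0, x_target, dt=1e-3,
                       max_steps=10_000_000, rng=None):
    """Time for x(t) to first reach x_target, starting from x0."""
    if rng is None:
        rng = random.Random(0)
    x, t = x0, 0.0
    for _ in range(max_steps):
        if x >= x_target:
            return t
        noise = rng.gauss(0.0, 1.0)
        x += drift(x) * dt + math.sqrt(2.0 * diffusion(x) * dt) * noise
        t += dt
    raise RuntimeError("target not reached within max_steps")


def mean_first_passage_time(drift, diffusion, x0, x_target,
                            n_traj=200, dt=1e-3, seed=0):
    """Average first-passage time over independent trajectories."""
    rng = random.Random(seed)
    times = [first_passage_time(drift, diffusion, x0, x_target, dt, rng=rng)
             for _ in range(n_traj)]
    return sum(times) / n_traj


# Placeholder model: unit drift toward the "folded" value x = 1.
mfpt = mean_first_passage_time(lambda x: 1.0, lambda x: 0.05, 0.0, 1.0)
print(f"mean first-passage time ~ {mfpt:.2f}")
```

For constant drift v over a distance L, the mean first-passage time should be close to L/v, which gives a quick sanity check on the integrator.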

Prion disease: Exponential growth requires membrane binding
Biophys. J. 90, L77-L79 (2006)
Daniel L. Cox, Rajiv R. P. Singh, and Sichun Yang
[HTML]    [PDF]

A hallmark feature of prions, whether in mammals or yeast and fungi, is exponential growth associated with fission or autocatalysis of protein aggregates. We have applied a rigorous kinetic analysis to recent data from transgenic mice lacking a glycosylphosphatidylinositol membrane anchor to the normal cellular PrPC protein, which show that toxicity requires the membrane binding. We find as well that the membrane is necessary for exponential growth of prion aggregates; without it, the kinetics is simply the quadratic-in-time growth characteristic of linear elongation as observed frequently in in vitro amyloid growth experiments with other proteins. This requires both (i) a substantial intercellular concentration of anchorless PrPC, and (ii) a concentration of small scrapies seeding aggregates from the inoculum, which remains relatively constant with time and exceeds the concentration of large polymeric aggregates. We also can explain via this analysis why mice heterozygous for the anchor-full/anchor-free PrPC proteins have more rapid incubation than mice heterozygous for anchor-full/null PrPC, and contrast the mammalian membrane associated fission or autocatalysis with the membrane free fission of yeast and fungal prions.

Structure of infectious prions: Stabilization by domain swapping
FASEB. J. 19, 1778-1782 (2005)
Sichun Yang, Herbert Levine, José N. Onuchic, and Daniel L. Cox
[HTML]    [PDF]

A candidate structure for the minimal prion infectious unit is a recently discovered protein oligomer modeled as a β-helical prion trimer (BPT); BPTs can stack to form cross-β fibrils and may provide insight into protein aggregates of other amyloid diseases. However, the BPT lacks a clear intermonomer binding mechanism. Here we propose an alternative domain-swapped trimeric prion (DSTP) model and show with molecular dynamics (MD) that the DSTP has more favorable intermonomer hydrogen bonding and proline dihedral strain energy than the BPT. This new structural proposal may be tested by lysine and N terminus fluorescence resonance energy transfer (FRET) either directly on recombinant prion protein amyloid aggregates or on synthetic constructs that contain the proline/lysine-rich hinge region critical for domains to swap. In addition, the domain swapping may provide 1) intrinsic entanglement, which can contribute to the remarkable temperature stability of the infectious prion structure and help explain the absence of PrPSc monomers, 2) insight into why specific prolines are potentially relevant to three inherited forms of prion disease, and 3) a simple explanation of prion strains assuming the strain is encoded in the monomer number of the oligomers.

Protein oligomerization through domain swapping: Role of inter-molecular interactions and protein concentration
J. Mol. Biol. 352, 202-211 (2005)
Sichun Yang, Herbert Levine, and José N. Onuchic
[HTML]    [PDF]

Domain swapping has been shown to be an important mechanism controlling multiprotein assembly and has been suggested recently as a possible mechanism underlying protein aggregation. Understanding oligomerization via domain swapping is therefore of theoretical and practical importance. By using a symmetrized structure-based (Gō) model, we demonstrate that in the free-energy landscape of domain swapping, a large free-energy barrier separates monomeric and domain-swapped dimeric configurations. We investigate the effect of finite monomer concentration, by implementing a new semi-analytical method, which involves computing the second virial coefficient, a thermodynamic indicator of inter-molecular interactions. This method, together with the symmetrized structure-based (Gō) model, minimizes the need for expensive many-protein simulations, providing a convenient framework to investigate concentration effect. Finally, we perform direct simulations of domain-swapped trimer formation, showing that this modeling approach can be used for higher-order oligomers.
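
For readers unfamiliar with the second virial coefficient, the sketch below evaluates its textbook definition for an isotropic pair potential, B2 = -2π ∫ (exp(-U(r)/kT) - 1) r² dr, by midpoint integration. The hard-sphere and square-well potentials are generic test cases chosen for illustration. They are not the effective protein-protein interaction used in the paper, and the semi-analytical method described above is not reproduced here.

```python
import math


def second_virial_coefficient(potential, kT=1.0, r_max=10.0, n_bins=20000):
    """B2 = -2*pi * integral of (exp(-U(r)/kT) - 1) * r^2 dr on [0, r_max]."""
    dr = r_max / n_bins
    integral = 0.0
    for i in range(n_bins):
        r = (i + 0.5) * dr  # midpoint of the i-th bin
        mayer_f = math.exp(-potential(r) / kT) - 1.0
        integral += mayer_f * r * r * dr
    return -2.0 * math.pi * integral


def hard_sphere(r, diameter=1.0):
    """Infinite repulsion inside the diameter, no interaction outside."""
    return math.inf if r < diameter else 0.0


def square_well(r, diameter=1.0, width=0.5, depth=1.0):
    """Hard core plus an attractive well of the given depth (units of kT)."""
    if r < diameter:
        return math.inf
    return -depth if r < diameter + width else 0.0


b2_hard_sphere = second_virial_coefficient(hard_sphere)
b2_square_well = second_virial_coefficient(square_well)
print(f"hard sphere: {b2_hard_sphere:.4f}, square well: {b2_square_well:.4f}")
```

A positive B2 signals net repulsion and a negative B2 net attraction, which is why it serves as a compact indicator of inter-molecular interactions at finite concentration.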

Domain swapping is a consequence of minimal frustration
Proc. Natl. Acad. Sci. USA 101, 13786-13791 (2004)
Sichun Yang*, Samuel S. Cho*, Yaakov Levy, Margaret S. Cheung, Herbert Levine, Peter G. Wolynes, and José N. Onuchic
[HTML]    [PDF]

The same energy landscape principles associated with the folding of proteins into their monomeric conformations should also describe how these proteins oligomerize into domain-swapped conformations. We tested this hypothesis by using a simplified model for the epidermal growth factor receptor pathway substrate 8 src homology 3 domain protein, both of whose monomeric and domain-swapped structures have been solved. The model, which we call the symmetrized Gō-type model, incorporates only information regarding the monomeric conformation in an energy function for the dimer to predict the domain-swapped conformation. A striking preference for the correct domain-swapped structure was observed, indicating that overall monomer topology is a main determinant of the structure of domain-swapped dimers. Furthermore, we explore the free energy surface for domain swapping by using our model to characterize the mechanism of oligomerization.