563 Publications

Fused regression for multi-source gene regulatory network inference

K Lam, Z Westrick, C. Müller, L Christiaen, R. Bonneau

Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms) and single cell types. We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships, such as orthology, incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross-species network inference. Last, we demonstrate our method’s utility in learning from data collected on different experimental platforms.
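The fused regression idea in this abstract can be illustrated with a minimal sketch. It assumes two data sources (for example, two species), a ridge-style squared fusion penalty that admits a closed-form solution, and hypothetical names (`fused_ridge`, `pairs`, `lam_ridge`, `lam_fuse`) that do not come from the paper; the authors' actual penalty and solver may differ.

```python
import numpy as np

def fused_ridge(X1, y1, X2, y2, pairs, lam_ridge=0.1, lam_fuse=1.0):
    """Jointly fit two linear models whose paired coefficients are pulled together.

    Minimizes ||y1 - X1 b1||^2 + ||y2 - X2 b2||^2
              + lam_ridge * (||b1||^2 + ||b2||^2)
              + lam_fuse * sum over (i, j) in pairs of (b1[i] - b2[j])^2.
    `pairs` lists (regulator index in source 1, regulator index in source 2),
    e.g. orthologous regulators (an assumption made for this illustration).
    """
    n1, p1 = X1.shape
    n2, p2 = X2.shape

    # Block-diagonal design: each source keeps its own coefficient vector.
    design = np.zeros((n1 + n2, p1 + p2))
    design[:n1, :p1] = X1
    design[n1:, p1:] = X2
    target = np.concatenate([y1, y2])

    # One row per fused pair, encoding the difference b1[i] - b2[j].
    fuse = np.zeros((len(pairs), p1 + p2))
    for row, (i, j) in enumerate(pairs):
        fuse[row, i] = 1.0
        fuse[row, p1 + j] = -1.0

    # Closed-form solution of the penalized least-squares problem.
    penalty = lam_ridge * np.eye(p1 + p2) + lam_fuse * fuse.T @ fuse
    beta = np.linalg.solve(design.T @ design + penalty, design.T @ target)
    return beta[:p1], beta[p1:]
```

Setting `lam_fuse` to zero reduces this to two independent ridge regressions, while a large value forces paired coefficients toward equality, which is one way to picture how information is shared when conservation is only partial.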

Rotamer libraries for the high-resolution design of beta-amino acid foldamers

A Watkins, D. Renfrew, T Craven, P Arora, R. Bonneau

β-amino acids offer attractive opportunities to develop biologically active peptidomimetics, either employed alone or in conjunction with natural α-amino acids. Owing to their potential for unique conformational preferences that deviate considerably from α-peptide geometries, β-amino acids greatly expand the possible chemistries and physical properties available to polyamide foldamers. Complete in silico support for designing new molecules incorporating nonnatural amino acids typically requires representing their side chain conformations as sets of discrete rotamers for model refinement and sequence optimization. Such rotamer libraries are key components of several state-of-the-art design frameworks. Here we report the development, incorporation into the Rosetta macromolecular modeling suite, and validation of rotamer libraries for β3-amino acids.

November 8, 2016

Accurate de novo design of hyperstable constrained peptides

G Bhardwaj, V Mulligan, C Bahl, J Gilmore, P Harvey, O Cheneval, G Buchko, S Pulavarti, Q Kaas, A Eletsky, P Huang, W Johnsen, P Greisen, G Rocklin, Y Song, T Linsky, A Watkins, S Rettie, X Xu, L Carter, R. Bonneau, J Olson, E Coutsias, C Correnti, T Szyperski, D Craik, D Baker

Naturally occurring, pharmacologically active peptides constrained with covalent crosslinks generally have shapes evolved to fit precisely into binding pockets on their targets. Such peptides can have excellent pharmaceutical properties, combining the stability and tissue penetration of small molecule drugs with the specificity of much larger protein therapeutics. The ability to design constrained peptides with precisely specified tertiary structures would enable the design of shape-complementary inhibitors of arbitrary targets. Here we describe the development of computational methods for de novo design of conformationally-restricted peptides, and the use of these methods to design 15–50 residue disulfide-crosslinked and heterochiral N-C backbone-cyclized peptides. These peptides are exceptionally stable to thermal and chemical denaturation, and twelve experimentally-determined X-ray and NMR structures are nearly identical to the computational models. The computational design methods and stable scaffolds presented here provide the basis for development of a new generation of peptide-based drugs.

October 20, 2016

A Global Genetic Interaction Network Maps a Wiring Diagram of Cellular Function

M Costanzo, B VanderSluis, E Koch, A Baryshnikova, C Pons, G Tan, W Wang, M Usaj, J Hanchard, S Lee, O. Troyanskaya, I Stagljar, T Xia, Y Ohya, A Gingras, B Raught, M Boutros, L Steinmetz, C Moore, A Rosebrock, A Caudy, C Myers, B Andrews, C Boone

We generated a global genetic interaction network for Saccharomyces cerevisiae, constructing more than 23 million double mutants, identifying about 550,000 negative and about 350,000 positive genetic interactions. This comprehensive network maps genetic interactions for essential gene pairs, highlighting essential genes as densely connected hubs. Genetic interaction profiles enabled assembly of a hierarchical model of cell function, including modules corresponding to protein complexes and pathways, biological processes, and cellular compartments. Negative interactions connected functionally related genes, mapped core bioprocesses, and identified pleiotropic genes, whereas positive interactions often mapped general regulatory connections among gene pairs, rather than shared functionality. The global network illustrates how coherent sets of genetic interactions connect protein complex and pathway modules to map a functional wiring diagram of the cell.

September 23, 2016

EGRINs (Environmental Gene Regulatory Influence Networks) in rice that function in the response to water deficit, high temperature, and agricultural environments

O Wilkins, C Hafemeister, A Plessis, M Holloway-Phillips, G Pham, A Nicotra, G Gregorio, K Jagadish, E Septiningsih, R. Bonneau, M Purugganan

Environmental Gene Regulatory Influence Networks (EGRINs) coordinate the timing and rate of gene expression in response to environmental signals. EGRINs encompass many layers of regulation, which culminate in changes in accumulated transcript levels. Here, we inferred EGRINs for the response of five tropical Asian rice (Oryza sativa) cultivars to high temperatures, water deficit, and agricultural field conditions by systematically integrating time series transcriptome data, patterns of nucleosome-free chromatin, and the occurrence of known cis-regulatory elements. First, we identified 5,447 putative target genes for 445 transcription factors (TFs) by connecting TFs with genes harboring known cis-regulatory motifs in nucleosome-free regions proximal to their transcriptional start sites. We then used network component analysis to estimate the regulatory activity for each TF based on the expression of its putative target genes. Finally, we inferred an EGRIN using the estimated TF activity (TFA) as the regulator. The EGRINs include regulatory interactions between 4,052 target genes regulated by 113 TFs. We resolved distinct regulatory roles for members of the heat shock factor family, including a putative regulatory connection between abiotic stress and the circadian clock. TFA estimation using network component analysis is an effective way of incorporating multiple genome-scale measurements into network inference.
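As a rough illustration of estimating TF activity from the expression of putative target genes, the sketch below uses ordinary least squares as a simpler stand-in for network component analysis; it is not the authors' procedure. The names `estimate_tfa`, `prior`, and `expression` are hypothetical, and the model assumes expression is approximately the prior connectivity matrix multiplied by the unknown activities.

```python
import numpy as np

def estimate_tfa(prior, expression):
    """Least-squares estimate of TF activities.

    prior:      genes x TFs matrix of putative regulatory links
                (nonzero where a TF is connected to a target gene).
    expression: genes x samples matrix of transcript levels.
    Returns a TFs x samples matrix A that minimizes ||expression - prior @ A||.
    """
    activity, *_ = np.linalg.lstsq(prior, expression, rcond=None)
    return activity
```

The estimated activities, rather than TF transcript levels, would then serve as the regulator variables in a downstream network inference step.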

September 17, 2016

An expanded evaluation of protein function prediction methods shows an improvement in accuracy

Y Jiang, R. Bonneau, et al.

Background
A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging.

Results
We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2.

Conclusions
The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific and that different performance metrics can be used to probe the nature of accurate predictions, and it highlighted the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.

September 7, 2016

PPII Helical Peptidomimetics Templated by Cation–π Interactions

T Craven, R. Bonneau, K Kirshenbaum

Poly-proline type II (PPII) helical PXXP motifs are the recognition elements for a variety of protein–protein interactions that are critical for cellular signaling. Despite development of protocols for locking peptides into α-helical and β-strand conformations, there remains a lack of analogous methods for generating mimics of PPII helical structures. We describe herein a strategy to enforce PPII helical secondary structure in the 19-residue TrpPlexus miniature protein. Through sequence variation, we showed that a network of cation–π interactions could drive the formation of PPII helical conformations for both peptide and N-substituted glycine peptoid residues. The achievement of chemically diverse PPII helical scaffolds provides a new route towards discovering peptidomimetic inhibitors of protein–protein interactions mediated by PXXP motifs.

August 19, 2016

Side-Chain Conformational Preferences Govern Protein–Protein Interactions

A Watkins, R. Bonneau, P Arora

Protein secondary structures serve as geometrically constrained scaffolds for the display of key interacting residues at protein interfaces. Given the critical role of secondary structures in protein folding and the dependence of folding propensities on backbone dihedrals, secondary structure is expected to influence the identity of residues that are important for complex formation. Counter to this expectation, we find that a narrow set of residues dominates the binding energy in protein–protein complexes independent of backbone conformation. This finding suggests that the binding epitope may instead be substantially influenced by the side-chain conformations adopted. We analyzed side-chain conformational preferences in residues that contribute significantly to binding. This analysis suggests that preferred rotamers contribute directly to specificity in protein complex formation and provides guidelines for peptidomimetic inhibitor design.

Racemization barriers of atropisomeric 3,3′-bipyrroles: an experimental study with theoretical verification

S. Chatterjee, G.L. Butterfoss, M. Mandal, B. Paul, S. Gupta, R. Bonneau, P. Jaisankar

The significant rotational energy barrier about the stereogenic carbon–carbon bond of axially chiral 3,3′-bipyrroles has been investigated by electronic circular dichroism (ECD) spectroscopy, time-dependent HPLC analysis, and computational modeling. The results elucidate pathways and transition states involved in configurational inversion, thereby confirming that 3,3′-bipyrrole derivatives can exist in stable and isolable atropisomeric forms.

GIANT API: An Application Programming Interface for Functional Genomics

GIANT API provides biomedical researchers programmatic access to tissue-specific and global networks in humans and model organisms, and to associated tools, including functional re-prioritization of existing genome-wide association study (GWAS) data. Using tissue-specific interaction networks, researchers are able to predict relationships between genes specific to a tissue or cell lineage, identify the changing roles of genes across tissues, and uncover disease-gene associations. Additionally, GIANT API enables computational tools like NetWAS, which leverages tissue-specific networks for re-prioritization of GWAS results. The web services covered by the API include 144 tissue-specific functional gene networks in human, global functional networks for human and six common model organisms, and the NetWAS method. GIANT API conforms to the REST architecture, which makes it stateless, cacheable, and highly scalable. It can be used by a diverse range of clients including web browsers, command terminals, programming languages, and standalone apps for data analysis and visualization. The API is freely available for use at http://giant-api.princeton.edu.
