J. Thomas, N. Ramakrishnan, and C. Bailey-Kellogg, "Graphical models of protein-protein interaction specificity from correlated mutations and interaction data", Proteins, 2009, 76:911-929. [paper]

Protein-protein interactions are mediated by complementary amino acids defining complementary surfaces. Typically not all members of a family of related proteins interact equally well with all members of a partner family; thus analysis of the sequence record can reveal the complementary amino acid partners that confer interaction specificity. This paper develops methods for learning and using probabilistic graphical models of such residue "cross-coupling" constraints between interacting protein families, based on multiple sequence alignments and information about which pairs of proteins are known to interact. Our models generalize traditional consensus sequence binding motifs, and provide a probabilistic semantics enabling sound evaluation of the plausibility of new possible interactions. Furthermore, predictions made by the models can be explained in terms of the underlying residue interactions. Our approach supports different levels of prior knowledge regarding interactions, including both one-to-one (e.g., pairs of proteins from the same organism) and many-to-many (e.g., experimentally identified interactions), and we present a technique to account for possible bias in the represented interactions. We apply our approach in studies of PDZ domains and their ligands, fundamental building blocks in a number of protein assemblies. Our algorithms are able to identify biologically interesting cross-coupling constraints, to successfully identify known interactions, and to make explainable predictions about novel interactions.
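As a rough illustration of the residue "cross-coupling" idea, the sketch below scores every pair of alignment columns, one from each family, by mutual information over sequence pairs known to interact. This is a minimal sketch under stated assumptions, not the paper's algorithm: mutual information is used here as a common stand-in for a correlated-mutation score, only the one-to-one pairing case is covered, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two paired alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must pair up one-to-one")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log2(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi


def cross_coupling_scores(family_a, family_b):
    """Score each (position in A, position in B) pair.

    Assumption: family_a[k] and family_b[k] are the aligned sequences of
    the k-th interacting pair (the one-to-one case only).
    """
    cols_a = list(zip(*family_a))
    cols_b = list(zip(*family_b))
    return {
        (i, j): column_mutual_information(ca, cb)
        for i, ca in enumerate(cols_a)
        for j, cb in enumerate(cols_b)
    }


# Hypothetical toy alignments, not real PDZ domain or ligand sequences.
domains = ["GHK", "GHR", "AHK", "AHR"]
ligands = ["TV", "TL", "SV", "SL"]
scores = cross_coupling_scores(domains, ligands)

# List position pairs from most to least coupled.
for (i, j), s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"domain pos {i} / ligand pos {j}: {s:.2f} bits")
```

High-scoring position pairs are candidates for cross-coupling constraints. A probabilistic graphical model of the kind described in the abstract would go further, turning selected constraints into a joint distribution that can score new candidate interactions.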
