X. Ye, P.K. O'Neil, A.N. Foster, M.J. Gajda, J. Kosinski, M.A. Kurowski, A.M. Friedman, and C. Bailey-Kellogg, "Probabilistic cross-link analysis and experiment planning for high-throughput elucidation of protein structure", Protein Sci., 2004, 13:3298-3313. [paper]

Emerging high-throughput techniques for the characterization of protein and protein complex structures yield noisy data with sparse information content, placing a significant burden on computation to properly interpret the experimental data. One such technique employs cross-linking (chemical or by cysteine oxidation) to confirm or select among proposed structural models (e.g. from fold recognition, ab initio prediction, or docking) by testing the consistency between cross-linking data and model geometry. This paper develops a probabilistic framework for analyzing the information content in cross-linking experiments, accounting for anticipated experimental error. This framework supports a mechanism for planning experiments to optimize the information gained. We evaluate potential experiment plans using explicit trade-offs among key properties of practical importance — discriminability, coverage, balance, ambiguity, and cost. We devise a greedy algorithm that considers those properties and, from a large number of combinatorial possibilities, rapidly selects sets of experiments expected to efficiently discriminate pairs of models. In an application to residue-specific chemical cross-linking, we demonstrate the ability of our approach to effectively plan experiments involving combinations of cross-linkers and introduced mutations. We also describe an experiment plan for the bacteriophage lambda Tfa chaperone protein in which we plan dicysteine mutants for discriminating threading models by disulfide formation. Preliminary results from a subset of the planned experiments are consistent and demonstrate the practicality of planning. Our methods provide the experimenter with a valuable tool (available from the authors) for understanding and optimizing cross-linking experiments.
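
As a rough illustration of the greedy planning idea described above, the following minimal sketch selects experiments that discriminate pairs of models. It is not the authors' implementation. It assumes each candidate experiment can be reduced to a yes/no cross-link prediction per model, treats a model pair as discriminated when the two predictions differ, and scores each candidate by newly discriminated pairs per unit cost. The names `greedy_plan`, `discriminated_pairs`, `redundancy`, and the example experiment labels and costs are all hypothetical. The probabilistic error model, balance, and ambiguity from the paper are left out.

```python
from itertools import combinations


def discriminated_pairs(predictions):
    """Return the model pairs (i, j), i < j, whose predicted outcomes differ."""
    return {
        (i, j)
        for i, j in combinations(range(len(predictions)), 2)
        if predictions[i] != predictions[j]
    }


def greedy_plan(experiments, costs, n_models, redundancy=1):
    """Greedily pick experiments until every model pair is discriminated
    by at least `redundancy` chosen experiments, or no candidate helps.

    experiments: dict mapping an experiment name to a tuple of booleans,
                 one per model (True = model predicts a cross-link).
    costs:       dict mapping an experiment name to a positive cost.
    Returns (plan, uncovered): the chosen names in order, and the model
    pairs that still lack the requested number of discriminating experiments.
    """
    # How many more discriminating experiments each model pair still needs.
    need = {pair: redundancy for pair in combinations(range(n_models), 2)}
    covers = {name: discriminated_pairs(p) for name, p in experiments.items()}
    remaining = set(experiments)
    plan = []

    def gain(name):
        # Newly useful pairs per unit cost (an assumed scoring rule).
        return sum(1 for pair in covers[name] if need[pair] > 0) / costs[name]

    while remaining and any(v > 0 for v in need.values()):
        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break  # no remaining experiment discriminates an uncovered pair
        plan.append(best)
        remaining.discard(best)
        for pair in covers[best]:
            if need[pair] > 0:
                need[pair] -= 1

    uncovered = [pair for pair, v in need.items() if v > 0]
    return plan, uncovered


# Hypothetical example: three models, three candidate experiments.
experiments = {
    "K12-K45": (True, False, False),
    "K12-K80": (True, True, False),
    "C30-C77": (False, True, False),
}
costs = {"K12-K45": 1.0, "K12-K80": 1.0, "C30-C77": 2.0}

plan, uncovered = greedy_plan(experiments, costs, n_models=3)
print(plan, uncovered)
```

Raising `redundancy` asks for several independent experiments per model pair, which is one simple way to hedge against experimental error without modeling it explicitly.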