C. Bailey-Kellogg, J.J. Kelley, III, R. Lilien, and B.R. Donald, "Physical geometric algorithms for structural molecular biology", Invited paper, ICRA, 2001. [preprint]

A wealth of interesting computational problems arises in proposed methods for discovering new pharmaceuticals. This paper surveys our recent work in three key areas, using a Physical Geometric Algorithm (PGA) approach to data interpretation, experiment planning, and drug design:

    Data-directed computational protocols for high-throughput protein structure determination. A key step in structure determination through nuclear magnetic resonance (NMR) is assigning spectral peaks. We are developing a novel approach, called Jigsaw, to automated secondary structure determination and main-chain assignment. Jigsaw consists of two main components: graph-based secondary structure pattern identification in unassigned heteronuclear (15N-labeled) NMR data, and assignment of spectral peaks by probabilistic alignment of identified secondary structure elements against the primary sequence.

    Experiment planning and data interpretation algorithms for reducing mass degeneracy in mass spectrometry (MS). MS offers many advantages for high-throughput assays (e.g., small sample size and large mass limits), but it faces the potential problem of mass degeneracy: indistinguishable masses for multiple biopolymer fragments (e.g., from a limited proteolytic digest). We are studying the use of selective isotopic labeling to substantially reduce potential mass degeneracy, especially in the context of structural determination of protein-protein and protein-DNA complexes.

    Computer-aided drug design (CADD). We are developing new CADD tools and applying them to the design of an inhibitor for the Core-Binding Factor-Beta oncoprotein (CBF-Beta-MYH11), a fusion protein involved in some forms of Acute Myelomonocytic Leukemia (AMML). Computational-structural studies of CBF help determine the molecular basis for its function and assist in the development of therapeutic strategies. A key issue in such studies is geometric modeling of protein flexibility; our approach attempts to account for flexibility by using an ensemble of structures representing low-energy conformations as determined by solution NMR.
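The mass-degeneracy problem described for MS above can be illustrated with a small sketch. This is not the authors' algorithm: it is a toy example that assumes a handful of hypothetical peptide fragments, standard monoisotopic residue masses, a 0.01 Da matching tolerance, and 15N labeling of a chosen set of amino acid types. The names `fragment_mass` and `degenerate_groups` are invented for illustration.

```python
# Toy illustration of mass degeneracy and selective 15N labeling.
# Residue masses are standard monoisotopic values (Da); the fragments,
# tolerance, and function names are assumptions made for this sketch.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "I": 113.08406, "K": 128.09496, "E": 129.04259,
}
# Nitrogen atoms per residue (lysine carries an extra side-chain nitrogen).
NITROGENS = {"G": 1, "A": 1, "S": 1, "V": 1, "L": 1, "I": 1, "K": 2, "E": 1}
WATER = 18.01056      # added once per free peptide
N15_SHIFT = 0.99703   # mass difference between 15N and 14N (Da)


def fragment_mass(seq, labeled=frozenset()):
    """Mass of a peptide fragment; residues in `labeled` are 15N-enriched."""
    mass = WATER
    for aa in seq:
        mass += RESIDUE_MASS[aa]
        if aa in labeled:
            mass += N15_SHIFT * NITROGENS[aa]
    return mass


def degenerate_groups(fragments, labeled=frozenset(), tol=0.01):
    """Return groups of fragments whose masses lie within `tol` Da of a neighbor."""
    ordered = sorted(fragments, key=lambda f: fragment_mass(f, labeled))
    groups, current = [], []
    for frag in ordered:
        if current and (fragment_mass(frag, labeled)
                        - fragment_mass(current[-1], labeled)) > tol:
            if len(current) > 1:
                groups.append(sorted(current))
            current = []
        current.append(frag)
    if len(current) > 1:
        groups.append(sorted(current))
    return groups


# Hypothetical digest fragments: I/L isomers and a sequence permutation.
FRAGMENTS = ["AIK", "ALK", "GVSE", "LGK", "KGL"]

print(degenerate_groups(FRAGMENTS))                          # unlabeled
print(degenerate_groups(FRAGMENTS, labeled=frozenset("L")))  # 15N-leucine
```

In this toy data set, labeling leucine separates AIK from ALK, since only one of them contains a labeled residue, while KGL and LGK remain indistinguishable by mass because they share the same composition. Labeling of this kind therefore reduces degeneracy rather than eliminating it, which is why choosing which residue types to label is an experiment-planning question.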

Our long-range goal is the structural and functional understanding of biopolymer interactions in systems of significant biochemical as well as pharmacological interest. The research surveyed here represents a set of important steps toward that goal.