CS 88/188 (Bioinformatics), Spring 2005: Schedule |
||||
| This page will be updated daily to weekly with current/upcoming topics. The details are subject to change and are provided without warranty. I typically provide pointers to much more material than is needed; read what you find useful in following up on the lectures. I welcome additional pointers -- please email me or post directly on the class message board. | ||||
| Date | Assignments | Topic | Reading | |
| Sequences I | ||||
| Mar. | 29 | HW0 out | Course overview Pairwise sequence alignment by dynamic programming |
AI and Molecular Biology, Ch. 1 Pearson's ISMB-00 tutorial Tompa's lecture notes PSC tutorial Classic papers: Smith and Waterman; Needleman and Wunsch; Gotoh BLAST: Altschul et al., JMB, 1990 (classic papers archive); Altschul et al., NAR, 1997 (paper) FASTA: Pearson and Lipman, PNAS, 1988 (paper) |
| 31 | Sequence profiles by hidden Markov models |
Cline, Barrett, Karplus, ISMB-99 tutorial Krogh, "An introduction to hidden Markov models for biological sequences" (paper) Eddy, "Profile hidden Markov models", Bioinformatics, 1998 (paper) Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition", Proc. IEEE, 1989 (paper) Software (and more papers): SAM and HMMER |
||
| Apr. | 5 | HW0 due; HW1 out |
Sequence motifs by expectation-maximization and Gibbs sampling |
Bailey and Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", Proc. ISMB, 1994 (paper) and other papers Dempster, Laird, Rubin, "Maximum likelihood from incomplete data via the EM algorithm", J. Royal Statistical Society B, 1977 (paper) Farid, "Fundamentals of Image Processing", description of and Matlab example for EM Lawrence et al., "Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment", Science, 1993 (paper) Lawrence, Overview of the Gibbs Motif Sampler MEME software PROSITE database |
| 7 | Guest lecture: Prof. Bruce Donald |
Lilien, Farid, and Donald, "Probabilistic disease classification of expression-dependent proteomic data from mass spectrometry of human serum", J. Comp. Biol., 2003 (paper) Lilien, Bailey-Kellogg, Anderson, and Donald, "A subgroup algorithm to identify cross-rotation peaks consistent with non-crystallographic symmetry", Acta Cryst., 2004 (paper) Lilien, Stevens, Anderson, and Donald, "A novel ensemble-based scoring and search algorithm for protein redesign, and its application to modify the substrate specificity of the gramicidin synthetase A phenylalanine adenylation enzyme", Proc. RECOMB, 2004 (paper) |
||
| Structures I | ||||
| 12 | Structure comparison by branch-and-bound and heuristic search |
Various protein structure tutorials by EXPASY, Birkbeck PDB: molecule of the month; education Holm and Sander, "Mapping the protein universe", Science, 1996 (paper) Shindyalov and Bourne, "Protein structure alignment by incremental combinatorial extension (CE) of the optimal path", Protein Engineering, 1998 (paper) Murzin et al., "SCOP: a structural classification of proteins database for the investigation of sequences and structures", J. Mol. Biol., 1995 (paper) Web sites: DALI/FSSP, CE, SCOP CVonline has a section on pose estimation (e.g. 3D-3D); so does the Kwon3D Motion Analysis Web (rotation matrices) |
||
| 14 | HW1 due | Structure prediction by constraint optimization |
Viewpoint: Baker and Sali, "Protein structure prediction and structural genomics", Science, 2001 (paper) Survey: Marti-Renom et al., "Comparative protein structure modeling of genes and genomes", Annu. Rev., 2000 (paper) Sali and Blundell, "Comparative protein modelling by satisfaction of spatial restraints", JMB, 1993 (paper) Lathrop and Smith, "Global optimum protein threading with gapped alignment and empirical pair score functions", JMB, 1996 (paper) Simons, Kooperberg, Huang, and Baker, "Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions", JMB, 1997 (paper) Bystroff and Baker, "Prediction of local structure in proteins using a library of sequence-structure motifs", JMB, 1998 (paper) Bystroff, Thorsson, and Baker, "HMMSTR: a hidden Markov model for local sequence-structure correlations in proteins", JMB, 2000 (paper) Homology modelers: MODELLER, Swiss-model Fold prediction meta-server, with links to other fold recognition servers I-Sites/HMMSTR/Rosetta server CASP web page |
|
| 19 | HW2 out | Docking by pattern matching |
Ewing and Kuntz, "Critical evaluation of search algorithms for automated molecular docking and database screening", J. Comp. Chem., 1997 (paper) Ewing, Makino, Skillman, and Kuntz, "DOCK 4.0: search strategies for automated molecular docking of flexible molecule databases," J. CAMD, 2001 (paper) Gabb, Jackson, and Sternberg, "Modelling protein docking using shape complementarity, electrostatics and biochemical information", JMB, 1997 (paper) Chen and Weng, "A novel shape complementarity scoring function for protein-protein docking", Proteins, 2003 (paper) Halperin, Ma, Wolfson, Nussinov, "Principles of docking: An overview of search algorithms and a guide to scoring functions", Proteins, 2002 (paper) Some geometric hashing papers: Proteins, 1999, Wolfson's archive (e.g. Salzberg book chapter and IEEE CiSE) Example software: DOCK, AutoDock, FT Dock, ZDock CAPRI prediction website |
|
| 20x | Catch up | |||
| Functions I | ||||
| 26 | Microarray data analysis by clustering and classification |
Eisen et al., "Cluster analysis and display of genome-wide expression patterns", PNAS, 1998 (paper) Tamayo et al., "Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation", PNAS, 1999 (paper) Golub et al., "Molecular classification of cancer: class discovery and class prediction by gene expression monitoring", Science, 1999 (paper; website) Brown et al., "Knowledge-based analysis of microarray gene expression data by using support vector machines", PNAS, 2000 (paper; website) Burges, "A tutorial on support vector machines for pattern recognition", Data Mining and Knowledge Discovery, 1998 (paper). Other tutorials available at kernel-machines.org Alter, Brown, and Botstein, "Singular value decomposition for genome-wide expression data processing and modeling", PNAS, 2000 (paper) SOM toolbox for Matlab Stanford microarray database Example software (and data): Eisen, ExpressionProfiler Flash animation of DNA microarray technology |
||
| 28 | HW2 due | Regulatory network inference with motifs and graphical models |
Tavazoie et al., "Systematic determination of genetic network architecture", Nature Genetics, 1999 (paper) Segal et al., "Genome-wide discovery of transcriptional modules from DNA sequence and gene expression", Bioinformatics, 2003 (paper) Segal et al., "Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data", Nature Genetics, 2003 (paper) Bayes net tutorials: Heckerman, Murphy, Moore, selections from Jordan's book |
|
| May | 3 | Project proposal due; HW3 out |
Mass spectrometry for proteomics |
Aebersold, Mann, "Mass spectrometry-based proteomics", Nature, 2003 (paper) Mann, Hendrickson, Pandey, "Analysis of proteins and proteomes by mass spectrometry", Annual Reviews, 2001 (paper) Pevzner, Dancik, Tang, "Mutation-tolerant protein identification by mass spectrometry", J. Comp. Biol., 2000 (paper) Potluri et al., "Geometric analysis of cross-linkability for protein fold discrimination", Proc. PSB, 2004 (paper) Ye et al., "Probabilistic cross-link analysis and experiment planning for high-throughput elucidation of protein structure", Protein Sci., 2004 (paper) Lilien, Farid, Donald, "Probabilistic disease classification of expression-dependent proteomic data from mass spectrometry of human serum", J. Comp. Biol., 2003 (paper) Ask me if you want links to any of the other papers in the lecture. Some MS tutorials: Scripps, Leeds, Davidson Example MS software: Protein Prospector, PeptideMass, PROWL |
| Sequences II | ||||
| 5 | Phylogeny |
Tree of Life Phylip software and links to many more ClustalW: Thompson, Higgins, Gibson, "CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment...", NAR, 1994 (paper) Gusfield, "An overview of combinatorial methods for haplotype inference", LNCS, 2004 (paper) HapMap Eisen, "Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis", Genome Research, 1998 (paper) Tatusov, Koonin, and Lipman, "A genomic perspective on protein families", Science, 1997 (paper) Tatusov et al., "The COG database: new developments in phylogenetic classification of proteins from complete genomes", NAR, 2001 (paper) COG |
||
| 10 | Genome Analysis |
GLIMMER: Salzberg et al., "Microbial gene identification using interpolated Markov models," NAR, 1998 (paper) GenScan: Burge and Karlin, "Prediction of complete gene structures in human genomic DNA," J. Mol. Biol., 1997 (paper) MUMmer: Delcher et al., "Alignment of whole genomes", NAR, 1999 (paper) Some suffix tree stuff: intro and demo; overview; links; Gusfield's software |
||
| Structures II | ||||
| 12 | HW3 due | Molecular dynamics |
Steered MD: example papers on unfolding titin and survey Blue Gene Folding @ Home: example papers on beta hairpin folding and ensemble dynamics Teodoro, Phillips Jr., and Kavraki, "A dimensionality reduction approach to modeling protein flexibility", Proc. RECOMB, 2002 (paper; web site) |
|
| 17 | Protein design |
Lovell et al., "The Penultimate Rotamer Library", Proteins, 2000: paper, web page Dahiyat and Mayo, "De novo protein design: fully automated sequence selection," Science, 1997 (paper) Pierce, Spriet, Desmet, and Mayo. "Conformational splitting: a more powerful criterion for dead-end elimination," J. Comp. Chem., 2000 (paper) Voigt, Gordon, and Mayo, "Trading accuracy for speed: A quantitative comparison of search algorithms in protein sequence design," JMB, 2000 (paper) Koehl and Delarue, "Application of a self-consistent mean field theory to predict protein side-chains conformation and estimate their conformational entropy," JMB, 1994 (paper) Looger, Dwyer, Smith, and Hellinga, "Computational design of receptor and sensor proteins with novel functions", Nature, 2003 (paper) Lilien, Stevens, Anderson, and Donald, "A novel ensemble-based scoring and search algorithm for protein redesign, and its application to modify the substrate specificity of the gramicidin synthetase A phenylalanine adenylation enzyme", Proc. RECOMB, 2004 (paper) |
||
| Functions II | ||||
| 19 | HW4 out | Interaction network characterization |
Tong et al., "A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules", Science, 2002 (paper) Interaction data papers: Ho et al. (mass spec): Nature 2002; Ito et al. (two hybrid): PNAS 2001; Uetz et al. (two hybrid): Nature 2000 Databases: BIND, DIP, etc. Jeong et al., "Lethality and centrality in protein networks", Nature, 2001 (paper) Goldberg and Roth, "Assessing experimentally derived interactions in a small world", PNAS, 2003 (paper) Sharan et al., "Conserved patterns of protein interaction in multiple species", PNAS, 2005 (paper) Scott et al., "Efficient algorithms for detecting signaling pathways in protein interaction networks", Proc. RECOMB, 2005 (paper) |
|
| 24 | Systems biology |
Ideker et al., "Integrated genomic and proteomic analyses of a systematically perturbed metabolic network," Science, 2001 (paper) Ideker, Galitski, and Hood, "A new approach to decoding life: systems biology", Annu. Rev. Genomics Hum. Genet., 2001 (paper) Hood's Institute for Systems Biology |
||
| Wrap Up | ||||
| 26 | Project presentations | Chris and Sara; Sheng | ||
| 31 | HW4 due | Project presentations | Eric and Mark; Doug; Fei | |
CS 88/188 Chris Bailey-Kellogg Last modified: Tue May 24 10:27:37 EDT 2005 |