This page will be updated daily to weekly with current/upcoming topics and references. The details are subject to change and are provided without warranty.
| Date (Notes) | Assignments | Topic | References | |
| Aug. | 26 | P0 out | Overview |
www.python.org AI and Molecular Biology, Ch. 1 |
| 28 | Bio basics |
The Biology Project: biochemistry and molecular biology MIT 7.001 (Intro Biology) PDB: molecule of the month Amino acid properties; Amino acid wikipedia entry Stryer, Biochemistry, 5th ed., parts of Ch. 1-7 (Reserved) Alberts et al., Molecular Biology of the Cell, 4th ed., parts of Ch. 1-7 (Reserved) Branden & Tooze, Introduction to Protein Structure, Ch. 1-2 (Reserved) | ||
| Sequences | ||||
| Sept. | 2 | P0 due H1 out | Pairwise alignment |
Pearson's ISMB tutorial PSC tutorial Classic articles: Smith and Waterman; Needleman and Wunsch; Gotoh BLAST: Altschul et al., JMB, 1990 (classic articles archive); Altschul et al., NAR, 1997 (paper) FASTA: Pearson and Lipman, PNAS, 1988 (paper) Durbin, Eddy, Krogh, Mitchison, Biological Sequence Analysis, Ch. 2 (Reserved) Gusfield, Algorithms on Strings, Trees, and Sequences, Ch. 11, 15 (Reserved) Mount, Bioinformatics: Sequence and Genome Analysis, Ch. 3, 7 (Reserved) |
| 4 | Multiple alignment |
MSA: Lipman, Altschul, Kececioglu, PNAS, 1989 (paper); server seems to be missing ClustalW: Thompson, Higgins, Gibson, NAR, 1994 (classic articles archive) Durbin, Eddy, Krogh, Mitchison, Biological Sequence Analysis, Ch. 6 (Reserved) Gusfield, Algorithms on Strings, Trees, and Sequences, Ch. 14 (Reserved) Mount, Bioinformatics: Sequence and Genome Analysis, Ch. 4 (Reserved) |
||
| 9 | H1 due P1 out | Profile HMMs |
Cline, Barrett, Karplus, ISMB-99 tutorial Krogh, "An introduction to hidden Markov models for biological sequences" (paper) Eddy, "Profile hidden Markov models", Bioinformatics, 1998 (paper) Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition", Proc. IEEE, 1989 (paper) Software (and more papers): SAM and HMMER Durbin, Eddy, Krogh, Mitchison, Biological Sequence Analysis, Ch. 3, 5 (Reserved) |
|
| 11 | Motifs |
Lawrence et al., "Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment", Science, 1993 (paper) Bailey and Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", Proc. ISMB, 1994 (paper) and other papers Dempster, Laird, Rubin, "Maximum likelihood from incomplete data via the EM algorithm", J. Royal Statistical Society B, 1977 (paper) Lawrence, Overview of the Gibbs Motif Sampler Farid, "Fundamentals of Image Processing", description of and Matlab example for EM MEME software PROSITE database Mount, Bioinformatics: Sequence and Genome Analysis, Ch. 4 (Reserved) |
||
| 16 | Phylogeny |
Tree of Life Phylip software and links to many more Eisen, "Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis", Genome Research, 1998 (paper) Tatusov, Koonin, and Lipman, "A genomic perspective on protein families", Science, 1997 (paper) Tatusov et al., "The COG database: new developments in phylogenetic classification of proteins from complete genomes", NAR, 2001 (paper) Pellegrini et al., "Assigning protein functions by comparative genome analysis: protein phylogenetic profiles", PNAS, 1999 (paper) COG Durbin, Eddy, Krogh, Mitchison, Biological Sequence Analysis, Ch. 7 (Reserved) Gusfield, Algorithms on Strings, Trees, and Sequences, Ch. 17 (Reserved) Mount, Bioinformatics: Sequence and Genome Analysis, Ch. 6 (Reserved) |
||
| 18 | Suffix trees |
Suffix trees (includes demo) Some resources An overview Gusfield's software Gusfield, Algorithms on Strings, Trees, and Sequences, Ch. 5-9 (Reserved; handout) | ||
| Structures | ||||
| 23 | P1 due | Structure prediction basics |
Various protein structure tutorials by EXPASY, Birkbeck PDB: molecule of the month; education Lovell et al., "The Penultimate Rotamer Library", Proteins, 2000: paper, web page Various MM tutorials by Cross, Glactone, Stote, NIH Branden and Tooze, Introduction to Protein Structure (Reserved) Leach, Molecular Modelling: Principles and Applications (Reserved) | |
| 25 | Threading | Guest Lecturer: Daisuke Kihara Bowie, Luthy, and Eisenberg, "A method to identify protein sequences that fold into a known three-dimensional structure", Science, 1991 (abstract; paper) Godzik, Kolinski, and Skolnick, "Topology fingerprint approach to the inverse protein folding problem", JMB, 1992 (abstract) Skolnick and Kihara, "Defrosting the frozen approximation: PROSPECTOR--a new approach to threading", Proteins, 2001 (abstract and paper; server) | ||
| 30 | H2 out | Homology modeling and ab initio prediction |
Viewpoint: Baker and Sali, "Protein structure prediction and structural genomics", Science, 2001 (paper) Survey: Marti-Renom et al., "Comparative protein structure modeling of genes and genomes", Annu. Rev., 2000 (paper) Sali and Blundell, "Comparative protein modelling by satisfaction of spatial restraints", JMB, 1993 (paper) Simons, Kooperberg, Huang, and Baker, "Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions", JMB, 1997 (paper) Bystroff and Baker, "Prediction of local structure in proteins using a library of sequence-structure motifs", JMB, 1998 (paper) Bystroff, Thorsson, and Baker, "HMMSTR: a hidden Markov model for local sequence-structure correlations in proteins", JMB, 2000 (paper) Homology modelers: MODELLER, Swiss-model I-Sites/HMMSTR/Rosetta server CASP web page | |
| Oct. | 2 | P2 out | Structure alignment |
Holm and Sander, "Mapping the protein universe", Science, 1996 (paper) Shindyalov and Bourne, "Protein structure alignment by incremental combinatorial extension (CE) of the optimal path", Protein Engineering, 1998 (paper) Murzin et al., "SCOP: a structural classification of proteins database for the investigation of sequences and structures", J. Mol. Biol., 1995 (paper) Web sites: DALI/FSSP, CE, SCOP CVonline has a section on pose estimation (e.g. 3D-3D); so does the Kwon3D Motion Analysis Web (rotation matrices) Mount, Bioinformatics: Sequence and Genome Analysis, Ch. 9 (Reserved) |
| 7 | H2 due | Molecular Dynamics |
Steered MD: example papers on unfolding titin and survey Blue Gene: example papers on Blue Gene and Blue Gene/L Folding @ Home: example papers on beta hairpin folding and ensemble dynamics Teodoro, Phillips Jr., and Kavraki, "A dimensionality reduction approach to modeling protein flexibility", Proc. RECOMB, 2002 (paper; web site) | |
| 9 | Review, Project discussion | |||
| 14 | October Break | |||
| 16 | Docking |
Ewing and Kuntz, "Critical evaluation of search algorithms for automated molecular docking and database screening", J. Comp. Chem., 1997 (paper) Ewing, Makino, Skillman, and Kuntz, "DOCK 4.0: search strategies for automated molecular docking of flexible molecule databases," J. CAMD, 2001 (paper) Gabb, Jackson, and Sternberg, "Modelling protein docking using shape complementarity, electrostatics and biochemical information", JMB, 1997 (paper) Chen and Weng, "A novel shape complementarity scoring function for protein-protein docking", Proteins, 2003 (paper) Halperin, Ma, Wolfson, Nussinov, "Principles of docking: An overview of search algorithms and a guide to scoring functions", Proteins, 2002 (paper) Some geometric hashing papers: Proteins, 1999, Wolfson's archive (e.g. Salzberg book chapter and IEEE CiSE) Example software: DOCK, AutoDock, FT Dock, ZDock |
||
| 21 | P2 due H3 out | Protein design |
Lovell et al., "The Penultimate Rotamer Library", Proteins, 2000: paper, web page Dahiyat and Mayo, "De novo protein design: fully automated sequence selection," Science, 1997 (paper) Pierce, Spriet, Desmet, and Mayo, "Conformational splitting: a more powerful criterion for dead-end elimination," J. Comp. Chem., 2000 (paper) Voigt, Gordon, and Mayo, "Trading accuracy for speed: A quantitative comparison of search algorithms in protein sequence design," JMB, 2000 (paper) Koehl and Delarue, "Application of a self-consistent mean field theory to predict protein side-chains conformation and estimate their conformational entropy," JMB, 1994 (paper) Looger, Dwyer, Smith, and Hellinga, "Computational design of receptor and sensor proteins with novel functions", Nature, 2003 (paper) |
|
| Networks | ||||
| 23 | Microarrays 1 Clustering |
Eisen et al., "Cluster analysis and display of genome-wide expression patterns", PNAS, 1998 (paper) Tamayo et al., "Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation", PNAS, 1999 (paper) SOM toolbox for Matlab Stanford microarray database Example software (and data): Eisen, ExpressionProfiler Flash animation of DNA microarray technology |
||
| 28 | H3 due P3 out | Microarrays 2 Classification; SVD |
Golub et al., "Molecular classification of cancer: class discovery and class prediction by gene expression monitoring", Science, 1999 (paper; website) Brown et al., "Knowledge-based analysis of microarray gene expression data by using support vector machines", PNAS, 2000 (paper; web page) Burges, "A tutorial on support vector machines for pattern recognition", Data Mining and Knowledge Discovery, 1998 (paper) Other tutorials available at kernel-machines.org Alter, Brown, and Botstein, "Singular value decomposition for genome-wide expression data processing and modeling", PNAS, 2000 (paper) |
|
| 30 | Proposal due | Regulatory network inference |
Tavazoie et al., "Systematic determination of genetic network architecture", Nature Genetics, 1999 (paper) Segal et al., "From promoter sequence to expression: a probabilistic framework", Proc. RECOMB, 2002 (paper) Segal et al., "Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data", Nature Genetics, 2003 (paper) Bayes net tutorials: Heckerman, Murphy, Moore, selections from Jordan's book |
|
| Nov. | 4 | Mass spectrometry for proteomics |
Aebersold, Mann, "Mass spectrometry-based proteomics", Nature, 2003 (paper) Mann, Hendrickson, Pandey, "Analysis of proteins and proteomes by mass spectrometry", Annual Reviews, 2001 (paper) Ask me if you want more info on any of the other MS applications in the lecture. Lilien, Farid, Donald, "Probabilistic Disease Classification of Expression-Dependent Proteomic Data from Mass Spectrometry of Human Serum", J. Comp. Biol., 2003 (paper) Some MS tutorials: Purdue, Scripps, Leeds, Davidson Example MS software: Protein Prospector, PeptideMass, PROWL |
|
| 6 | Protein-protein interactions |
Tong et al., "A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules", Science, 2002 (paper) Bader and Hogue, "An automated method for finding molecular complexes in large protein interaction networks", BMC Bioinformatics, 2003 (paper) Other example interaction papers: Ho et al. (mass spec): Nature 2002; Ito et al. (two hybrid) PNAS 2001; Uetz et al. (two hybrid): Nature 2000 Databases: BIND, DIP, etc. |
||
| 11 | P3 due | Network simulation |
StochSim homepage Morton-Firth and Bray, "Predicting temporal fluctuations in an intracellular signalling pathway", J. Theor. Biol., 1998 (paper) Morton-Firth, Shimizu, and Bray, "A free-energy-based stochastic simulation of the Tar receptor complex", J. Mol. Biol., 1999 (paper) Shimizu, Aksenov, and Bray, "A spatially extended stochastic model of the bacterial chemotaxis signalling pathway", J. Mol. Biol., 2003 (paper) Survey article on other simulation approaches (for regulation): de Jong, "Modeling and Simulation of Genetic Regulatory Systems: A Literature Review", J. Comp. Biol., 2002 (paper) |
|
| 13 | H4 out | Systems biology |
Ideker et al., "Integrated genomic and proteomic analyses of a systematically perturbed metabolic network," Science, 2001 (paper) Ideker, Galitski, and Hood, "A new approach to decoding life: systems biology", Annu. Rev. Genomics Hum. Genet., 2001 (paper) Hood's Institute for Systems Biology |
|
| Home Stretch | ||||
| 18 | Class choice Gene finding |
GLIMMER: Salzberg et al., "Microbial gene identification using interpolated Markov models," NAR, 1998 (paper) GenScan: Burge and Karlin, "Prediction of complete gene structures in human genomic DNA," J. Mol. Biol., 1997 (paper) Mount, Bioinformatics: Sequence and Genome Analysis, Ch. 8 (Reserved) |
||
| 20 | H4 due H5 out | Class choice Mutation clustering |
S.W. Lockless and R. Ranganathan, "Evolutionarily conserved pathways of energetic connectivity in protein families", Science, 1999 (paper) G.M. Suel, S.W. Lockless, M.A. Wall, R. Ranganathan, "Evolutionarily conserved networks of residues mediate allosteric communication in proteins", Nat Struct Biol, 2003 (paper) O. Olmea, B. Rost, and A. Valencia, "Effective use of sequence correlation and conservation in fold recognition", J. Mol. Biol., 1999 (paper) O. Schueler-Furman and D. Baker, "Conserved residue clustering and protein structure prediction", Proteins, 2003 (paper) |
|
| 25 | Thanksgiving Break | |||
| 27 | Thanksgiving Break | |||
| Dec. | 2 | H5 due H6 out | Presentations | Mehmet, Sheetal/Hetu, Wenhui |
| 4 | Presentations | Carrie/Mingwu, Chris/Mohamed, Manish/Srinivasan | ||
| 9 | Presentations | Jiangtian, Xiaoduan, Evans/Otto | ||
| 11 | H6 due | Presentations | Megan/Scott, Chetak, Robert | |
| 18 | Project due | |||
CS 590B Last modified: Fri Nov 21 11:34:01 EST 2003 |