CHOPPI integrates a number of data sources — genome, transcriptome, secretome, and immunome — to identify immunogenicity risks from host contaminant proteins (HCPs) in CHO-based protein production.
CHOPPI is described in:
C. Bailey-Kellogg, A.H. Gutierrez, L. Moise, F. Terry, W. Martin, A. S. De Groot, "CHOPPI: a web tool for the analysis of immunogenicity risk from host cell proteins in CHO-based protein production", Biotechnology & Bioengineering, 2014, in press.
Please cite this paper if you use CHOPPI for your work.
Earlier CHOPPI-based analysis is presented in:
A.H. Gutierrez, L. Moise, F. Terry, K. Dasilva, C. Bailey-Kellogg, W. Martin, A. S. De Groot, "Immunoinformatic Analysis of Chinese Hamster Ovary (CHO) Protein Contaminants in Therapeutic Protein Formulations", Immunoinformatics and Computational Immunology Workshop, 2012.
The CHOPPI web server was developed by Chris Bailey-Kellogg in collaboration with Annie De Groot, Bill Martin, and the other collaborators listed on the paper above. Thanks to EpiVax for use of EpiMatrix and whole-protein immunogenicity evaluation.
We welcome all feedback on how to improve the site and help you apply it to your work.
Search for a CHO protein by its name, id (gi or gb), or amino acid sequence (BLAST).
Identify proteins in intersections of the various -omes. Percent identity is measured against the closest homolog in the specified set, counting only alignments that cover at least the specified percentage of the query protein.
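The identity and coverage rule can be sketched as follows. This is a minimal illustration, not the CHOPPI implementation: the `Hit` record, its field names, and the `closest_homolog` function are hypothetical, and it assumes each alignment already carries a percent identity and a percent coverage of the query.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Hit:
    """One alignment of the query against a protein in the chosen set (hypothetical record)."""
    subject_id: str
    percent_identity: float  # identity over the aligned region, 0-100
    query_coverage: float    # percent of the query covered by the alignment, 0-100


def closest_homolog(hits: Iterable[Hit], min_coverage: float) -> Optional[Hit]:
    """Return the highest-identity hit whose alignment covers enough of the query.

    Hits below the coverage threshold are ignored, so a short, highly
    identical fragment cannot be reported as the closest homolog.
    """
    eligible = [h for h in hits if h.query_coverage >= min_coverage]
    return max(eligible, key=lambda h: h.percent_identity, default=None)


# Made-up hits for illustration only.
hits = [
    Hit("frag", percent_identity=95.0, query_coverage=40.0),
    Hit("full", percent_identity=80.0, query_coverage=90.0),
]
best = closest_homolog(hits, min_coverage=80.0)
```

With a coverage threshold of 80, the 95%-identical fragment is excluded because it covers only 40% of the query, and the full-length 80% hit is returned instead.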
A protein page (e.g., beta-2-microglobulin) provides a full analysis of a protein:
- id (gi and gb)
- We have analyzed CHO K1 [bioproject 69991] and 17A/GY [bioproject 189319]. As protein sequences become available for other strains, we will include them too [alert us].
- To focus on proteins expressed by CHO, we have compared the genes against contigs in one of the CHO transcriptome projects [bioproject 66543].
- To further focus on proteins that have been translated, we have compared the genes against the sequences (both proteins and glycoproteins) identified by the proteomic analysis of Baycin-Hizal et al.
- mouse secretome
- To focus on the proteins likely to be secreted, we have compared the genes against mouse secreted proteins identified in the LOCATE database and UniProt.
- As a complementary assessment of secretion, we employed SignalP (v4.0, default settings) to identify which genes have predicted signal peptides.
- validated HCPs
- We have collected from the literature some experimentally identified CHO host contaminant proteins. You may submit more for us to include.
- To check homology with the human genome, we have compared against the UniProt Reviewed database (downloaded 2012-10-22).
- immunogenicity score
- Scores are on a scale where values below -20 indicate low risk of immunogenicity and values above 20 indicate high risk.
- epitope analysis
- We provide a summary of the MHC class II epitopes predicted by EpiMatrix within the protein.
In addition to listing the total number of immunogenic 9mers, we also report separate counts for 9mers that have a degenerate human 9mer and for those that do not. Those without a human counterpart are more likely to pose an immunogenicity risk.
Finally, we provide the percentage of 9mer frames (number of amino acids minus 8) that are in epitopes or in unique-to-CHO epitopes. This helps calibrate epitope "density".
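The arithmetic behind the score thresholds and the epitope density can be sketched as follows. This is an illustration under stated assumptions, not the EpiMatrix or CHOPPI code: the function names are hypothetical, the "intermediate" label for scores between -20 and 20 is our own placeholder (the description above names only the low and high ends), and the example numbers are made up.

```python
def count_9mer_frames(protein_length: int) -> int:
    """Number of overlapping 9mer frames in a protein: length minus 8 (never negative)."""
    return max(protein_length - 8, 0)


def epitope_density(frames_in_epitopes: int, protein_length: int) -> float:
    """Percentage of 9mer frames that fall in predicted epitopes."""
    frames = count_9mer_frames(protein_length)
    if frames == 0:
        return 0.0  # proteins shorter than 9 residues have no frames
    return 100.0 * frames_in_epitopes / frames


def risk_band(score: float) -> str:
    """Map an immunogenicity score to a coarse band using the -20 / 20 cutoffs."""
    if score < -20:
        return "low"
    if score > 20:
        return "high"
    return "intermediate"  # assumed label; the scale description does not name this range


# Made-up example: a 108-residue protein has 100 frames, 25 of them in epitopes.
density = epitope_density(frames_in_epitopes=25, protein_length=108)
band = risk_band(-35.2)
```

The same `epitope_density` call applies to the unique-to-CHO percentage by passing the count of frames in epitopes that lack a human counterpart.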