Automatic Pairing of Chromosomes

Alisha D’Souza

Thayer School of Engineering

Introduction

A karyotype [1] is the set of characteristics that describe the chromosomes in a cell. An ordered depiction of the karyotype as an image in a standard format is called a karyogram; in it, chromosomes are arranged in pairs by size (in decreasing order) and centromere position. The study of karyograms is at the heart of cytogenetics. These analyses contribute greatly to the study of chromosomal abnormalities and aberrations, genetic disorders, taxonomic relationships, and related questions.

In humans, somatic cells have 23 classes of chromosome and a total of 46 chromosomes per cell, that is, 23 pairs of chromosomes in each cell. To develop a karyogram, cells arrested at the metaphase stage of cell division are stained with a dye such as Giemsa [2] and imaged. The chromosomes then need to be arranged in pairs in order of decreasing size. This process of pairing and karyotyping is usually done manually and requires considerable time from an expert. Automating these steps is an active field of research [3].

Objective

The goal of this project is to automatically pair chromosomes from a karyogram.

Dataset

The Lisbon-K1 dataset [3, 15], which consists of chromosomes from bone marrow cells of leukemia patients and was developed by technicians at the Institute of Molecular Medicine of Lisbon, will be used for this project. The dataset contains 200 karyograms (9200 chromosomes).

Method

The initial steps of an automatic pairing or classification algorithm involve feature extraction. Features commonly used in the literature are dimensions, geometry, and band profile [3]. In prior research, I preprocessed and geometrically corrected the chromosomes using the method described in [4]. In this project, the full feature vectors will be extracted and pairing will be performed as described below.
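As a rough illustration of the kind of features mentioned above, the following minimal sketch computes a length, an area, and a fixed-length band profile from a single geometrically corrected chromosome image. It is not the feature set of [3] or [4]. The function names (extract_features, resample), the convention that background pixels are 0, and the assumption that the chromosome's long axis runs vertically are all hypothetical choices made for this example.

```python
def resample(values, n):
    """Linearly interpolate a list of numbers to exactly n samples."""
    if len(values) == 1 or n == 1:
        return [values[0]] * n
    out = []
    for i in range(n):
        pos = i * (len(values) - 1) / (n - 1)   # fractional index into values
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def extract_features(image, profile_length=8):
    """Compute simple features from a straightened chromosome image.

    image: list of rows of grayscale values, long axis vertical.
    Assumption (hypothetical): background pixels are exactly 0.
    """
    # Keep only chromosome pixels, and only rows that cross the chromosome.
    rows = [[p for p in row if p > 0] for row in image]
    rows = [r for r in rows if r]
    if not rows:
        raise ValueError("image contains no chromosome pixels")

    length = len(rows)                              # extent along the long axis
    area = sum(len(r) for r in rows)                # number of chromosome pixels
    raw_profile = [sum(r) / len(r) for r in rows]   # mean intensity per row
    band_profile = resample(raw_profile, profile_length)
    return {"length": length, "area": area, "band_profile": band_profile}


if __name__ == "__main__":
    toy = [
        [0, 0, 0],
        [0, 10, 20],
        [0, 30, 0],
        [0, 0, 0],
    ]
    print(extract_features(toy, profile_length=3))
```

Resampling the band profile to a fixed number of samples is one simple way to make profiles of chromosomes with different lengths directly comparable; other normalizations are possible.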

For pairing based on classification, numerous methods of classifier design have been proposed in the literature. Examples include hidden Markov models [5], template matching [6], neural networks and multilayer perceptrons [7]–[12], and wavelet [13], fuzzy [6], and Bayes [9] classifiers. Classification success with these methods is usually in the range of 70% to 80%, which is much lower than the accuracy of 99.70% achieved by a human expert [3]. Khmelinskii et al. propose an algorithm that pairs chromosomes directly, without first classifying them accurately; instead, the pairing is assisted by a rough classification performed by a Support Vector Machine (SVM) classifier [14]. This is the pairing method that I propose to use for this project.
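To make the idea of classifier assistance concrete, the sketch below shows one plausible way a rough class label could influence a pairing distance: a fixed penalty is added whenever two chromosomes receive different rough labels. This is an illustrative assumption and not the metric defined in [14]. The names assisted_distance and best_partner, the penalty value, and the input rough labels (which would come from some classifier such as an SVM) are hypothetical.

```python
import math


def assisted_distance(feat_a, feat_b, class_a, class_b, penalty=5.0):
    """Euclidean feature distance, increased when the rough labels disagree.

    The additive penalty is a hypothetical design choice for illustration.
    """
    base = math.dist(feat_a, feat_b)
    return base if class_a == class_b else base + penalty


def best_partner(index, features, rough_classes, penalty=5.0):
    """Return the index of the chromosome closest to `index` under the assisted distance."""
    others = [j for j in range(len(features)) if j != index]
    return min(
        others,
        key=lambda j: assisted_distance(
            features[index], features[j],
            rough_classes[index], rough_classes[j],
            penalty,
        ),
    )


if __name__ == "__main__":
    features = [[10.0], [10.1], [10.5]]   # toy one-dimensional feature vectors
    rough_classes = [1, 2, 1]             # toy labels from a rough classifier
    print(best_partner(0, features, rough_classes))               # label-aware choice
    print(best_partner(0, features, rough_classes, penalty=0.0))  # pure distance
```

Because the penalty is additive rather than an outright ban, a wrong rough label makes a true pair less likely to be chosen but does not make it impossible, which matches the spirit of using the classification only as assistance.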

Timeline

By Milestone: Extract the feature set and pair chromosomes by a distance-based approach (without assistance from a rough classification performed by an SVM classifier).
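A distance-based pairing of this kind could be prototyped as below. This is a minimal sketch that assumes each chromosome is already described by a numeric feature vector and that uses plain Euclidean distance with a greedy matching rule. Both are stand-ins chosen for simplicity, not the metric or the optimization used in [3] or [14], and the function names are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_pairing(features):
    """Pair chromosomes greedily, closest pair first.

    features: list of feature vectors, one per chromosome.
    Returns a list of (i, j) index pairs; with an odd count one index stays unpaired.
    """
    # All candidate pairs, sorted by increasing distance.
    candidates = sorted(
        (euclidean(features[i], features[j]), i, j)
        for i, j in combinations(range(len(features)), 2)
    )
    paired = set()
    pairs = []
    for _, i, j in candidates:
        if i not in paired and j not in paired:
            pairs.append((i, j))
            paired.update((i, j))
    return pairs


if __name__ == "__main__":
    # Toy vectors (length, centromere index); the values are made up.
    features = [[10.0, 0.40], [3.0, 0.20], [10.2, 0.41], [3.1, 0.22]]
    print(greedy_pairing(features))
```

Greedy matching is easy to implement but is not guaranteed to minimize the total pairing distance; a global assignment formulation would be the natural next step if the greedy result proves too error-prone.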

By May 30th: Obtain complete results of paired chromosomes by the method described in [14].

References

[1] http://en.wikipedia.org/wiki/Karyotype#cite_note-3

[2] http://en.wikipedia.org/wiki/Giemsa_stain

[3] A. Khmelinskii, R. Ventura, and J. Sanches, “Chromosome Pairing for Karyotyping Purposes using Mutual Information,” in Proceedings of the 5th IEEE International Symposium on Biomedical Imaging, 14-17 May 2008, pp. 484–487.

[4] S. Khan, A. D’Souza, J. Sanches, and R. Ventura, “Geometric Correction of Deformed Chromosomes for Automatic Karyotyping” (under review).

[5] J. M. Conroy, R. L. Jr. Becker, W. Lefkowitz, K. L. Christopher, R. B. Surana, T. O’Leary, D. P. O’Leary, and T. G. Kolda, “Hidden Markov models for chromosome identification,” in Proceedings of the 14th IEEE Symposium of Computer-Based Medical Systems, July 2001, pp. 473–477.

[6] A. M. Badawi, K. G. Hasan, E. A. Aly, and R. A. Messiha, “Chromosomes classification based on neural networks, fuzzy rule based, and template matching classifiers,” in Proceedings of the 46th IEEE International Midwest Symposium on Circuits and Systems, Dec. 2003, vol. 1, pp. 383–387.

[7] J. R. Stanley, M. J. Keller, P. Gader, and W. C. Caldwell, “Data-driven homologue matching for chromosome identification,” IEEE Transactions on Medical Imaging, vol. 17, no. 3, pp. 451–462, 1998.

[8] M. Zardoshti-Kermani and A. Afshordi, “Classification of chromosomes using higher-order neural networks,” in IEEE International Conference on Neural Networks, Nov.-Dec. 1995, vol. 5, pp. 2587–2591.

[9] B. Lerner, H. Guterman, I. Dinstein, and Y. Romem, “A comparison of multilayer perceptron neural network and Bayes piecewise classifier for chromosome classification,” IEEE International Conference on Computational Intelligence, vol. 6, pp. 3472–3477, June-July 1994.

[10] B. Lerner, M. Levinstein, B. Rosenberg, H. Guterman, I. Dinstein, and Y. Romem, “Feature selection and chromosome classification using a multilayer perceptron neural network,” IEEE International Conference on Neural Networks, vol. 6, pp. 3540–3545, Jun.-Jul. 1994.

[11] B. Lerner, “Toward a completely automatic neural-network-based human chromosome analysis,” IEEE Transactions on Systems, Man and Cybernetics, Part B, vol. 28, no. 4, pp. 544–552, Aug. 1998.

[12] J. M. Cho, “Chromosome classification using backpropagation neural networks,” IEEE Engineering in Medicine and Biology Magazine, vol. 19, no. 1, pp. 28–33, Jan.-Feb. 2000.

[13] Q. Wu and K. R. Castleman, “Automated chromosome classification using wavelet-based band pattern descriptors,” in 13th IEEE Symposium on Computer-Based Medical Systems, June 2000, pp. 189–194.

[14] A. Khmelinskii, R. Ventura, and J. Sanches, “Classifier-assisted metric for chromosome pairing,” IEEE International Conference of Engineering in Medicine and Biology Society, 2010, pp. 6729–6732.

[15] http://mediawiki.isr.ist.utl.pt/wiki/Lisbon-K_Chromosome_Dataset