TY - GEN
T1 - DNA base-calling using polynomial classifiers
AU - Mohammed, Omniyah G.
AU - Assaleh, Khaled T.
AU - Husseini, Ghaleb A.
AU - Majdalawieh, Amin F.
AU - Woodward, Scott R.
PY - 2010
Y1 - 2010
N2 - Base-calling is one of many problems that can be solved using pattern recognition, the act of classifying raw data into various classes based on prior or statistical information extracted from the data. In this paper, we propose a new framework using polynomial classifiers to model electropherogram traces obtained from ABI sequencing machines to perform base-calling. Initially, pre-processing, which includes segmented normalization and peak sharpening, is performed to reduce the imperfections introduced into a trace by the chemistry involved. Discriminative feature vectors are then extracted from the chromatogram traces and expanded to a higher-dimensional space by second-order polynomial expansion. A linear classifier is then trained and the bases are classified accordingly. The chromatogram traces chosen for analysis belong to Homo sapiens, Saccharomyces mikatae and Drosophila melanogaster. Simulation results indicated an accuracy of up to 99.2% upon testing three different chromatogram traces consisting of about 600 to 800 bases each. The proposed model's performance was compared to the existing standards, ABI and PHRED, in terms of insertion, deletion and substitution errors. Simulation evidence indicated that the designed model performs comparably to or slightly better than ABI in terms of deletion and insertion errors. Moreover, the polynomial classifier resulted in negligible substitution errors compared to ABI. The polynomial classifier was also observed to perform comparably to PHRED in terms of deletion and substitution errors. The results obtained demonstrate the potential of this model to perform base-calling.
AB - Base-calling is one of many problems that can be solved using pattern recognition, the act of classifying raw data into various classes based on prior or statistical information extracted from the data. In this paper, we propose a new framework using polynomial classifiers to model electropherogram traces obtained from ABI sequencing machines to perform base-calling. Initially, pre-processing, which includes segmented normalization and peak sharpening, is performed to reduce the imperfections introduced into a trace by the chemistry involved. Discriminative feature vectors are then extracted from the chromatogram traces and expanded to a higher-dimensional space by second-order polynomial expansion. A linear classifier is then trained and the bases are classified accordingly. The chromatogram traces chosen for analysis belong to Homo sapiens, Saccharomyces mikatae and Drosophila melanogaster. Simulation results indicated an accuracy of up to 99.2% upon testing three different chromatogram traces consisting of about 600 to 800 bases each. The proposed model's performance was compared to the existing standards, ABI and PHRED, in terms of insertion, deletion and substitution errors. Simulation evidence indicated that the designed model performs comparably to or slightly better than ABI in terms of deletion and insertion errors. Moreover, the polynomial classifier resulted in negligible substitution errors compared to ABI. The polynomial classifier was also observed to perform comparably to PHRED in terms of deletion and substitution errors. The results obtained demonstrate the potential of this model to perform base-calling.
UR - https://www.scopus.com/pages/publications/79959456492
U2 - 10.1109/IJCNN.2010.5596983
DO - 10.1109/IJCNN.2010.5596983
M3 - Conference contribution
AN - SCOPUS:79959456492
SN - 9781424469178
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2010 IEEE World Congress on Computational Intelligence, WCCI 2010 - 2010 International Joint Conference on Neural Networks, IJCNN 2010
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2010 6th IEEE World Congress on Computational Intelligence, WCCI 2010 - 2010 International Joint Conference on Neural Networks, IJCNN 2010
Y2 - 18 July 2010 through 23 July 2010
ER -