TY - GEN
T1 - DNA base-calling using artificial neural networks
AU - Khan, Omniyah Gul M.
AU - Assaleh, Khaled T.
AU - Husseini, Ghaleb A.
AU - Majdalawieh, Amin F.
AU - Woodward, Scott R.
PY - 2011
Y1 - 2011
N2 - Information acquired from any species' genomic sequence is expected to contribute massively to advances in various fields, such as medicine, forensics and agriculture. This huge impact of DNA sequencing creates a need for efficient automation of mapping chromatogram traces to their corresponding strings of bases through base-calling. This paper attempts to solve the problem of base-calling by modeling traces using Artificial Neural Networks (ANN). Traces belonging to Homo sapiens, Saccharomyces mikatae and Drosophila melanogaster undergo pre-processing, which includes de-correlation, de-convolution and normalization, to minimize or eliminate data imperfections. Representative features are then extracted for training and testing the ANN base-caller. The results are then compared with the existing standards, the PHRED and ABI KB base-callers, in terms of deletion, insertion and substitution errors. Simulation results indicate that the proposed model achieves higher base-calling accuracy than PHRED and performance comparable to ABI KB. The results obtained validate the potential of the proposed model for efficient DNA base-calling.
AB - Information acquired from any species' genomic sequence is expected to contribute massively to advances in various fields, such as medicine, forensics and agriculture. This huge impact of DNA sequencing creates a need for efficient automation of mapping chromatogram traces to their corresponding strings of bases through base-calling. This paper attempts to solve the problem of base-calling by modeling traces using Artificial Neural Networks (ANN). Traces belonging to Homo sapiens, Saccharomyces mikatae and Drosophila melanogaster undergo pre-processing, which includes de-correlation, de-convolution and normalization, to minimize or eliminate data imperfections. Representative features are then extracted for training and testing the ANN base-caller. The results are then compared with the existing standards, the PHRED and ABI KB base-callers, in terms of deletion, insertion and substitution errors. Simulation results indicate that the proposed model achieves higher base-calling accuracy than PHRED and performance comparable to ABI KB. The results obtained validate the potential of the proposed model for efficient DNA base-calling.
UR - https://www.scopus.com/pages/publications/79957914644
U2 - 10.1109/MECBME.2011.5752074
DO - 10.1109/MECBME.2011.5752074
M3 - Conference contribution
AN - SCOPUS:79957914644
SN - 9781424470006
T3 - 2011 1st Middle East Conference on Biomedical Engineering, MECBME 2011
SP - 96
EP - 99
BT - 2011 1st Middle East Conference on Biomedical Engineering, MECBME 2011
T2 - 2011 1st Middle East Conference on Biomedical Engineering, MECBME 2011
Y2 - 21 February 2011 through 24 February 2011
ER -