Biological Sequence Data Preprocessing for Classification: A Case Study in Splice Site Identification

Baten, AKMA; Halgamuge, SK; Chang, B; Wickramarachchi, N

dc.contributor.author	Baten, AKMA
dc.contributor.author	Halgamuge, SK
dc.contributor.author	Chang, B
dc.contributor.author	Wickramarachchi, N
dc.date.accessioned	2013-10-21T02:28:50Z
dc.date.available	2013-10-21T02:28:50Z
dc.description.abstract	The rapid growth of biological sequence data demands better and more efficient analysis methods. Effective detection of regulatory signals in these sequences requires knowledge of the characteristics, dependencies, and relationships of the nucleotides in the region surrounding each signal. A higher-order Markov model is generally regarded as a useful technique for modeling higher-order dependencies among nucleotides. However, its implementation requires estimating a large number of parameters, which is computationally expensive. In this paper, we propose a hybrid method consisting of a first-order Markov model for sequence data preprocessing and a multilayer perceptron neural network for classification. The Markov model captures the compositional features and dependencies of nucleotides as probabilistic parameters, which are used as inputs to the classifier. The classifier combines the Markov probabilities nonlinearly for signal detection. When applied to the splice site detection problem using three widely used data sets, the proposed hybrid method is observed to model higher-order dependencies with better classification accuracy.
dc.identifier.issue	4492
dc.identifier.journal	Lecture Notes in Computer Science
dc.identifier.pgnos	1221-1230
dc.identifier.uri	http://dl.lib.mrt.ac.lk/handle/123/8561
dc.identifier.volume	4492
dc.identifier.year	2007
dc.language	en
dc.title	Biological Sequence Data Preprocessing for Classification: A Case Study in Splice Site Identification
dc.type	Article-Abstract
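
The preprocessing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the position-specific form of the first-order Markov model, the pseudocount smoothing, and the function names fit_first_order_markov and markov_features are all assumptions made here for illustration. The sketch estimates P(x_i | x_{i-1}) at each position from aligned training sequences and then turns a sequence into a vector of those probabilities, which is the kind of input the abstract says is passed to the classifier.

```python
ALPHABET = "ACGT"


def fit_first_order_markov(sequences, pseudocount=1.0):
    """Estimate position-specific first-order transition probabilities.

    Assumes all sequences have the same length and contain only A, C, G, T.
    Returns a list of tables; tables[i - 1][prev][cur] approximates
    P(x_i = cur | x_{i-1} = prev) for positions i = 1 .. length - 1.
    The pseudocount is an illustrative smoothing choice, not from the paper.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    tables = []
    for i in range(1, length):
        # Start every (prev, cur) count at the pseudocount to avoid zeros.
        counts = {p: {c: pseudocount for c in ALPHABET} for p in ALPHABET}
        for seq in sequences:
            counts[seq[i - 1]][seq[i]] += 1
        # Normalize each row so the probabilities for a given prev sum to 1.
        table = {}
        for prev, row in counts.items():
            total = sum(row.values())
            table[prev] = {cur: n / total for cur, n in row.items()}
        tables.append(table)
    return tables


def markov_features(seq, tables):
    """Map a sequence to its per-position transition probabilities."""
    if len(seq) != len(tables) + 1:
        raise ValueError("sequence length does not match the fitted model")
    return [tables[i - 1][seq[i - 1]][seq[i]] for i in range(1, len(seq))]
```

One plausible use, again as an assumption, is to fit one set of tables on true splice sites and another on decoy sites, concatenate the two feature vectors for each sequence, and train a multilayer perceptron on the result (for example scikit-learn's MLPClassifier).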

Collections

Articles authored by UoM staff