A data driven binning method to recover more nucleotide sequences of species in a metagenome

dc.contributor.author: Vimukthi, K
dc.contributor.author: Wimalasiri, G
dc.contributor.author: Bandara, P
dc.contributor.author: Herath, D
dc.contributor.editor: Weeraddana, C
dc.contributor.editor: Edussooriya, CUS
dc.contributor.editor: Abeysooriya, RP
dc.date.accessioned: 2022-08-09T06:27:21Z
dc.date.available: 2022-08-09T06:27:21Z
dc.date.issued: 2020-07
dc.description.abstract: Metagenomics has accelerated the study of different species and their dynamics in multiple environments. A key step in a metagenomic study is to group nucleotide sequences belonging to an individual species or to closely related species, a process often termed binning. Multiple machine learning techniques have been adopted for binning metagenomic sequences. Specifically, unsupervised learning is used in most of the recent binning methods. This work considers data-driven methods for binning metagenomic sequences and discusses such approaches in detail. Furthermore, it explores increasing the number of metagenomic sequences binned while maintaining a reasonable binning accuracy. Consequently, a dissimilarity-based approach is proposed to increase the number of contigs binned by an existing binning method. It is shown to result in a 10% increase in the number of contigs binned compared to the original approach. Accordingly, this work suggests that the effective use of observed data that might otherwise be discarded as outliers may result in improved binning performance. [en_US]
dc.identifier.citation: K. Vimukthi, G. Wimalasiri, P. Bandara and D. Herath, "A Data Driven Binning Method to Recover More Nucleotide Sequences of Species in a Metagenome," 2020 Moratuwa Engineering Research Conference (MERCon), 2020, pp. 307-312, doi: 10.1109/MERCon50084.2020.9185388. [en_US]
dc.identifier.conference: Moratuwa Engineering Research Conference 2020 [en_US]
dc.identifier.department: Engineering Research Unit, University of Moratuwa [en_US]
dc.identifier.doi: 10.1109/MERCon50084.2020.9185388 [en_US]
dc.identifier.email: kasunchamara@eng.pdn.ac.lk [en_US]
dc.identifier.email: geethp@eng.pdn.ac.lk [en_US]
dc.identifier.email: okprabhath@gmail.com [en_US]
dc.identifier.email: damayanthiherath@eng.pdn.ac.lk [en_US]
dc.identifier.faculty: Engineering [en_US]
dc.identifier.pgnos: pp. 307-312 [en_US]
dc.identifier.place: Moratuwa, Sri Lanka [en_US]
dc.identifier.proceeding: Proceedings of Moratuwa Engineering Research Conference 2020 [en_US]
dc.identifier.uri: http://dl.lib.uom.lk/handle/123/18573
dc.identifier.year: 2020 [en_US]
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.relation.uri: https://ieeexplore.ieee.org/document/9185388 [en_US]
dc.subject: Metagenomics [en_US]
dc.subject: binning [en_US]
dc.subject: data driven [en_US]
dc.subject: mahalanobis distance measure [en_US]
dc.title: A data driven binning method to recover more nucleotide sequences of species in a metagenome [en_US]
dc.type: Conference-Full-text [en_US]
