A data driven binning method to recover more nucleotide sequences of species in a metagenome
dc.contributor.author | Vimukthi, K | |
dc.contributor.author | Wimalasiri, G | |
dc.contributor.author | Bandara, P | |
dc.contributor.author | Herath, D | |
dc.contributor.editor | Weeraddana, C | |
dc.contributor.editor | Edussooriya, CUS | |
dc.contributor.editor | Abeysooriya, RP | |
dc.date.accessioned | 2022-08-09T06:27:21Z | |
dc.date.available | 2022-08-09T06:27:21Z | |
dc.date.issued | 2020-07 | |
dc.description.abstract | Metagenomics has accelerated the study of different species and their dynamics in multiple environments. A key step in a metagenomic study is to group nucleotide sequences belonging to an individual species or to closely related species, a process often termed binning. Multiple machine learning techniques have been adopted for binning metagenomic sequences; in particular, unsupervised learning is used in most recent binning methods. This work considers data-driven methods for binning metagenomic sequences and discusses such approaches in detail. Furthermore, it explores increasing the number of metagenomic sequences binned while maintaining reasonable binning accuracy. To that end, a dissimilarity-based approach is proposed to increase the number of contigs binned by an existing binning method. It is shown to result in a 10% increase in the number of contigs binned compared to the original approach. Accordingly, this work suggests that the effective use of observed data that might otherwise be discarded as outliers may improve binning performance. | en_US |
dc.identifier.citation | K. Vimukthi, G. Wimalasiri, P. Bandara and D. Herath, "A Data Driven Binning Method to Recover More Nucleotide Sequences of Species in a Metagenome," 2020 Moratuwa Engineering Research Conference (MERCon), 2020, pp. 307-312, doi: 10.1109/MERCon50084.2020.9185388. | en_US |
dc.identifier.conference | Moratuwa Engineering Research Conference 2020 | en_US |
dc.identifier.department | Engineering Research Unit, University of Moratuwa | en_US |
dc.identifier.doi | 10.1109/MERCon50084.2020.9185388 | en_US |
dc.identifier.email | kasunchamara@eng.pdn.ac.lk | en_US |
dc.identifier.email | geethp@eng.pdn.ac.lk | en_US |
dc.identifier.email | okprabhath@gmail.com | en_US |
dc.identifier.email | damayanthiherath@eng.pdn.ac.lk | en_US |
dc.identifier.faculty | Engineering | en_US |
dc.identifier.pgnos | pp. 307-312 | en_US |
dc.identifier.place | Moratuwa, Sri Lanka | en_US |
dc.identifier.proceeding | Proceedings of Moratuwa Engineering Research Conference 2020 | en_US |
dc.identifier.uri | http://dl.lib.uom.lk/handle/123/18573 | |
dc.identifier.year | 2020 | en_US |
dc.language.iso | en | en_US |
dc.publisher | IEEE | en_US |
dc.relation.uri | https://ieeexplore.ieee.org/document/9185388 | en_US |
dc.subject | Metagenomics | en_US |
dc.subject | binning | en_US |
dc.subject | data driven | en_US |
dc.subject | Mahalanobis distance measure | en_US |
dc.title | A data driven binning method to recover more nucleotide sequences of species in a metagenome | en_US |
dc.type | Conference-Full-text | en_US |