A rule-based lemmatizing approach for Sinhala language

dc.contributor.author: Nandathilaka, M
dc.contributor.author: Ahangama, S
dc.contributor.author: Weerasuriya, GT
dc.contributor.editor: Wijesiriwardana, CP
dc.date.accessioned: 2022-12-05T05:39:43Z
dc.date.available: 2022-12-05T05:39:43Z
dc.date.issued: 2018
dc.description.abstract (en_US): Research in speech recognition, natural language processing, language translation and deep learning is bridging the communication gap between humans, as well as between humans and machines. Sinhala is a native language of Sri Lanka, used by approximately 19 million people. Natural language processing tools for Sinhala have developed more slowly than those for European and other Asian languages. A lemmatizer for Sinhala can be used for morphological analysis and is an essential module in Sinhala language processing mechanisms. Lemmatizing is a complex process in morphological analysis in which the base or root form of a word is derived. Little published work focuses on lemmatizer approaches for Sinhala. This paper presents a rule-based lemmatizing approach that can be used to determine the base form of Sinhala words with an accuracy of 77.3%. It differs from similar work in that the data used in the research were extracted from social media.
dc.identifier.citation (en_US): M. Nandathilaka, S. Ahangama and G. T. Weerasuriya, "A Rule-based Lemmatizing Approach for Sinhala Language," 2018 3rd International Conference on Information Technology Research (ICITR), 2018, pp. 1-5, doi: 10.1109/ICITR.2018.8736134.
dc.identifier.conference (en_US): 3rd International Conference on Information Technology Research 2018
dc.identifier.department (en_US): Information Technology Research Unit, Faculty of Information Technology, University of Moratuwa.
dc.identifier.doi (en_US): doi: 10.1109/ICITR.2018.8736134
dc.identifier.email (en_US): praba.14@itfac.mrt.ac.lk
dc.identifier.email (en_US): supunmali@uom.lk
dc.identifier.email (en_US): thiliniw@uom.lk
dc.identifier.faculty (en_US): IT
dc.identifier.proceeding (en_US): Proceedings of the 3rd International Conference in Information Technology Research 2018
dc.identifier.uri: http://dl.lib.uom.lk/handle/123/19643
dc.identifier.year (en_US): 2018
dc.language.iso (en_US): en
dc.publisher (en_US): Information Technology Research Unit, Faculty of Information Technology, University of Moratuwa, Sri Lanka
dc.relation.uri (en_US): https://ieeexplore.ieee.org/document/8736134
dc.subject (en_US): Sinhala Morphology
dc.subject (en_US): Lemmatization
dc.subject (en_US): Inflection
dc.subject (en_US): Rule-based
dc.subject (en_US): Social media data
dc.title (en_US): A rule-based lemmatizing approach for Sinhala language
dc.type (en_US): Conference-Full-text
