Domain specific named entity recognition in Tamil

dc.contributor.authorMurugathas, R
dc.contributor.authorThayasivam, U
dc.contributor.editorRathnayake, M
dc.contributor.editorAdhikariwatte, V
dc.contributor.editorHemachandra, K
dc.date.accessioned2022-10-27T08:33:53Z
dc.date.available2022-10-27T08:33:53Z
dc.date.issued2022-07
dc.description.abstractThis paper presents a domain-specific Tamil Named Entity Recognizer for the history domain. The system uses a manually annotated corpus of 23k tokens, tagged with 36 tags related to the history domain. The NER model for Tamil is trained with Conditional Random Fields (CRF), using features extracted based on the domain of interest and the language. Hyperparameter tuning with a random search algorithm is applied to find the best hyperparameters for the model. Tamil is a low-resource and morphologically rich language, which makes the task challenging. Despite that, the system achieved fair results, with micro-averaged Precision, Recall and F1-score of 87.9%, 67.1% and 76.1% respectively.en_US
dc.identifier.citationR. Murugathas and U. Thayasivam, "Domain specific Named Entity Recognition in Tamil," 2022 Moratuwa Engineering Research Conference (MERCon), 2022, pp. 1-6, doi: 10.1109/MERCon55799.2022.9906295.en_US
dc.identifier.conferenceMoratuwa Engineering Research Conference 2022en_US
dc.identifier.departmentEngineering Research Unit, University of Moratuwaen_US
dc.identifier.doi10.1109/MERCon55799.2022.9906295en_US
dc.identifier.emailrubika.murugathas.19@cse.mrt.ac.lk
dc.identifier.emailrtuthaya@cse.mrt.ac.lk
dc.identifier.facultyEngineeringen_US
dc.identifier.pgnos******en_US
dc.identifier.placeMoratuwa, Sri Lankaen_US
dc.identifier.proceedingProceedings of Moratuwa Engineering Research Conference 2022en_US
dc.identifier.urihttp://dl.lib.uom.lk/handle/123/19268
dc.identifier.year2022en_US
dc.language.isoenen_US
dc.publisherIEEEen_US
dc.relation.urihttps://ieeexplore.ieee.org/document/9906295/en_US
dc.subjectNamed entity recognitionen_US
dc.subjectNERen_US
dc.subjectTamilen_US
dc.subjectHistoryen_US
dc.subjectCRFen_US
dc.subjectConditional random fieldsen_US
dc.subjectNatural Language Processingen_US
dc.titleDomain specific named entity recognition in Tamilen_US
dc.typeConference-Full-texten_US
