Building a WordNet for Sinhala

dc.contributor.author: Wijesiri, I
dc.contributor.author: Gallage, M
dc.contributor.author: Gunathilaka, B
dc.contributor.author: Lakjeewa, M
dc.contributor.author: Wimalasuriya, DC
dc.contributor.author: Dias, G
dc.contributor.author: Paranavithana, R
dc.contributor.author: De Silva, N
dc.date.accessioned: 2017-01-16T04:01:19Z
dc.date.available: 2017-01-16T04:01:19Z
dc.description.abstract: Sinhala is one of the official languages of Sri Lanka and is used by over 19 million people. It belongs to the Indo-Aryan branch of the Indo-European languages, and its origins date back at least 2000 years. It has developed into its current form over a long period of time, with influences from a wide variety of languages including Tamil, Portuguese and English. As for any other language, a WordNet is extremely important for Sinhala to take it into the digital era. This paper is based on the project to develop a WordNet for Sinhala based on the English (Princeton) WordNet. It describes how we overcame the challenges in adding Sinhala-specific characteristics, which were deemed important by Sinhala language experts, to the WordNet while keeping the structure of the original English WordNet. It also presents the details of the crowdsourcing system we developed as a part of the project, consisting of a NoSQL database in the backend and a web-based frontend. We conclude by discussing the possibility of adapting this architecture for other languages and the road ahead for the Sinhala WordNet and Sinhala NLP. [en_US]
dc.identifier.email: gihan@uom.lk [en_US]
dc.identifier.email: nisansadds@cse.mrt.ac.lk [en_US]
dc.identifier.journal: Volume editors [en_US]
dc.identifier.pgnos: 100 [en_US]
dc.identifier.uri: http://dl.lib.mrt.ac.lk/handle/123/12222
dc.identifier.year: 2014 [en_US]
dc.relation.uri: http://www.aclweb.org/anthology/W/W14/W14-0114.pdf [en_US]
dc.source.uri: http://www.aclweb.org/anthology/W/W14/W14-0114.pdf [en_US]
dc.title: Building a WordNet for Sinhala [en_US]
dc.type: Article-Abstract [en_US]
