Browsing by Author "Dias, G"


item: Thesis-Abstract
A Focus group study on local language computing in SMEs
Perera, MWL; Dias, G
Information and Communication Technologies (ICTs) are regarded as a powerful tool for gaining competitive advantage in any industry. Even though our population has a high literacy level, many Sri Lankans are not fully-fledged computer users. Therefore the advantages of using these technologies are limited to a privileged few. Although most ICT related products and services are based on the English language, the majority of the society who would benefit from these technologies is not English conversant. This is a principal reason behind the digital divide in society where some social groups lack access to ICT. One social group that does not use computers and the internet consists of individuals and organizations that can afford them but lack English language knowledge and also lack awareness of available local language ICTs. The objective of this research is to find out whether the ICT penetration level in the country may be increased using local language ICT. As the first part of the research we conducted a survey of the current state of Sinhala language computer applications in Sri Lanka. We have identified the major local language application vendors, and their products. We also obtained the views of the vendors on the issues facing the industry. In the second part of the research, we surveyed the use of local language computing (LLC) by a number of small and medium-scale industries (SMEs) outside Colombo in several sectors. We also obtained their views on the obstacles to the greater use of LLC. We then analyzed whether the lack of local language computing products is an obstacle to the greater use of ICTs in this country. Based on these surveys, we conducted a SWOT analysis and derived recommendations for strategies to be taken to improve local language computing in the country.
item: Thesis-Abstract
A Market-based Web bandwidth management system
Disanayake, DABC; Dias, G
The World Wide Web (WWW) is the most popular service of the Internet, which is used by millions of people in almost every country for their day-to-day operations. Since the demand for WWW is increasing rapidly, the infrastructure for this service, such as bandwidth, needs to be developed and upgraded regularly. This is not affordable for developing countries like Sri Lanka, because of the cost and the technological deficiencies. Providing a satisfactory service to users by 'managing the existing bandwidth' is the best alternative for this. But it is very difficult to achieve both with the available bandwidth management mechanisms. We have designed and implemented a proxy-based system, which allows each user to request and obtain a desired amount of bandwidth for web access. The server allocates the bandwidth for users considering the available bandwidth and demand. Bandwidth is priced dynamically based on the demand at any given time. The user is charged based on the bandwidth level and the duration of usage. This allows users in a bandwidth-constrained environment to prioritize their web usage, and encourages them to carry out bandwidth-intensive applications during off-peak hours. This user involvement in bandwidth allocation is the main innovation of this system. This feature was implemented by adding new functions to the Squid web cache server. These functions allow adding or removing IP addresses of users to or from the relevant delay pool of Squid according to the user requested bandwidth. Users log in to the proxy server through an intermediate server, which keeps the user accounts and login details. In addition to the user requested bandwidth allocation, this system provides user management and billing functions. According to our experience, this new system is highly suitable for Internet Service Providers (ISPs) to offer a better quality, user-satisfying service while managing their available bandwidth resources.
item: Conference-Extended-Abstract
A New prosodic phrasing model for Sinhala language
(2010) Bandara, WMC; Lakmal, VMS; Liyanagama, TD; Bulathsinghala, SV; Dias, G; Jayasena, S
This paper describes a new model for predicting prosodic phrase breaks in the Sinhala language in order to improve the quality of the existing TTS Sinhala voices. In a Text To Speech (TTS) system, the quality of the synthetic voice is mainly dependent on how well its prosodic model is implemented. The prosodic model adjusts the phrasing and the pitch of the voice while applying suitable durations and tones for words and diphones. Out of these, phrasing and pitch of the voice carry much importance since appropriate phrase breaking helps listeners clearly understand the synthesized voice. In a real world scenario, when we speak a sentence, we automatically divide it into small segments and apply pauses at those breaks. Also the pitch of the voice gets lowered near a break and gets increased in the other segments automatically. But in a TTS system, we do not have that advantage and therefore need to be precise with the phrase breaks. Otherwise it will create wrong meanings as well as produce unnatural speech.
item: Thesis-Abstract
A Study on the choice of free & open source software for government sector enterprise applications in Sri Lanka
Jayawardena, KMSP; Dias, G
This document presents a study on the selection of software for government sector enterprise applications. Many factors could influence the choice of software in government sector enterprise applications. The research is based on the following problem: what factors influence the choice of software for government sector enterprise applications, in the Sri Lankan context. Information systems (IS) projects in several selected government sector organizations have been studied in depth during the course of the research. Around 30% of the investigated government sector information systems projects have been found to have used FOSS. Several factors have been identified to have affected the choice of software for government sector enterprise IS. Out of these, technical compliance, cost, bidders'/developers' expertise and maintenance/support options were some of the most commonly indicated factors. Cost was highlighted as an important factor in a majority of the investigated IS. However, the analysis revealed that cost did not influence the choice between FOSS and proprietary software when implementing the IS. This was quite unusual given the common perception that FOSS is used to lower costs. It was concluded from the analysis that certain other factors, including bidders'/developers' expertise, technical compatibility with legacy proprietary systems and maintenance/support options, override the cost factor when selecting software. Based on the analysis and conclusions, several recommendations have been made to leverage the benefits of FOSS in government sector enterprise IS. These recommendations include ways to achieve cost advantages, especially in large scale replication. It is recommended to nurture a FOSS ecosystem and to develop internal FOSS expertise within government organizations, in order to leverage the advantages of FOSS in government sector enterprise IS.
item: Conference-Extended-Abstract
An architecture for advanced proxying based on user requests
(2002) Fernando, MSD; Dias, G
Much research has been done on issues related to congestion over a data communication link connecting one or more networks to the Internet or a similar network. The main driving force behind such research has been to avoid congestion over high demand links. With the rapid growth of the Internet and its wide range of applications, the need for such schemes is a higher priority in order to maximize the utilization of the link bandwidth.
item: Article-Abstract
Automated Assessment of Multi-Step Answers for Mathematical Word Problems
Dias, G; Ranathunga, S; Kadupitiya, JCS
We present a system to automatically grade mathematical word questions. The questions that we currently consider are at the level of the GCE (General Certificate of Education) Ordinary Level (O/L) Mathematics paper standard in Sri Lanka. The solutions to these questions are open-ended multi-step answers. The system uses a regular expression based information retrieval approach to validate the expressions in the answers. The implemented system properly evaluates student answers using a marking rubric and awards full/partial marks. We have tested the performance of the system using 500 answer scripts for five different questions from 50 students. The grades given by the system were compared against the manual grading marks and only one answer was graded wrongly. Therefore the accuracy of the system is 99.8%.
item: Conference-Extended-Abstract
Automatic assessment and error identification of multi-step answers for matrix questions
(2017) Thirunavukkarasu, N; Selvarasa, A; Rajendran, N; Yogalingam, C; Ranathunga, S; Dias, G
This paper presents an automatic assessment and error identification system for student answers that contain matrix expressions and may have multiple steps. Teacher’s intervention is needed only during the question set-up stage, to provide the marking rubric. The system currently supports four types of matrix questions: multiplying a matrix by a constant number, matrix addition and subtraction, finding unknown elements within a matrix, and finding the unknown matrix from an equation. A CAS (Computer Algebra System) is used to evaluate each step of the student’s answer. The system is capable of giving full/partial marks according to a marking rubric. Errors commonly made by students were identified and categorized by analyzing sample student answers. Using this categorization, the system is capable of identifying the exact error(s) made by a student.
item: Conference-Abstract
Automatic assessment of student answers for geometric theorem proving questions
(2017) Mendis, C; Lahiru, D; Pamudika, N; Madushanka, S; Ranathunga, S; Dias, G
In this paper, we present a system to automatically assess multi-step answers for geometric theorem proving questions in high school Mathematics. The system is capable of allocating partial marks for steps considering a marking rubric. Moreover, the system evaluates the natural language reasoning part in each step. Currently, 30 theorems related to straight lines have been implemented as inference rules. The system has been tested with 100 student answers for two geometric theorem proving questions.
item: Article-Abstract
Automatic Creation of a Sentence Aligned Sinhala-Tamil Parallel Corpus
Hameed, RA; Pathirennehelage, N; Ihalapathirana, A; Mohamed, MZ; Ranathunga, VSD; Jayasena, S; Dias, G; Fernando, S
A sentence aligned parallel corpus is an important prerequisite in statistical machine translation. However, manual creation of such a parallel corpus is time consuming, and requires experts fluent in both languages. Automatic creation of a sentence aligned parallel corpus using parallel text is the solution to this problem. In this paper, we present the first ever empirical evaluation carried out to identify the best method to automatically create a sentence aligned Sinhala-Tamil parallel corpus. Annual reports from Sri Lankan government institutions were used as the parallel text for aligning. Despite both Sinhala and Tamil being under-resourced languages, we were able to achieve an F-score value of 0.791 using a hybrid approach that makes use of a bilingual dictionary.
item: Conference-Abstract
Automatic creation of a word aligned Sinhala-Tamil parallel corpus
(2017) Mohamed, MZ; Ihalapathirana, A; Hameed, RA; Pathirennehelage, N; Ranathunga, S; Jayasena, S; Dias, G
A parallel corpus aligned at both sentence and word level is an important prerequisite in statistical machine translation. However, manual creation of such a parallel corpus is time consuming, and requires experts fluent in both languages. This paper presents the first ever empirical evaluation carried out to identify the best unsupervised word alignment technique for Sinhala and Tamil. It also presents a novel approach that combines the output of individual aligners, which outperforms the solitary use of these aligners. Sentence aligned parallel text from annual reports and letters of Sri Lankan Government institutions, and order papers from the Parliament of Sri Lanka were used in the evaluation.
item: Thesis-Full-text
Automatic evaluation and error identification of solutions to single-variable algebraic questions
Erabadda, ELBH; Ranathunga, S; Dias, G
There are two types of single-variable equation solving questions that are present in the Ordinary Level mathematics curriculum in Sri Lanka: linear equations with fractions and quadratic equations. Answers to these questions are open-ended and multi-step in nature. This thesis describes a mechanism that evaluates answers to these two types of questions and awards full/partial credit. It is quite common that students make mistakes in their answers, which results in partial credit. They may repeat the same errors if they do not receive feedback on their mistakes. Therefore feedback on student errors is important for any subject. This thesis introduces a method to automatically identify the errors that the students make in their answers for the aforementioned two types of questions. To the best of our knowledge, this is the first work on automatically identifying student errors in complex multi-step solutions to single-variable equation solving questions. Our evaluations show that the system we have implemented is capable of awarding full/partial credit to student answers according to a marking scheme and also of identifying errors in student answers with minimal teacher intervention. These evaluations were carried out using student answers from different sources.
item: Conference-Full-text
Bissa - a scalable and distributed tuple space
(Computer Science & Engineering Society c/o Department of Computer Science and Engineering, University of Moratuwa., 2010-09) Wickramarachchi, CD; Sumanasena, D; Fernando, PR; Wckramasinghe, US; Dias, G; Perera, S; Weerawarana, S; Gunasekara, C; Wijegunawardana, P; Pavalanathan, U
The idea of tuple spaces is based on the whiteboard design pattern and made its first appearance in the late 1980s. A tuple space provides a content-addressed, associative shared memory abstraction for the processes accessing it. Tuple spaces can be used for time- and space-decoupled communication between processes. In our work, we have implemented a distributed and scalable tuple space middleware infrastructure called BISSA that can be used for decoupled communication between applications. The BISSA application scope spans from browser based applications to Java applications. This capability is given by two major implementations: a distributed hash table (DHT) based peer to peer tuple space implementation and a web browser based tuple space implementation. In this paper we present and discuss our implementation methodology, test results and possible applications of the middleware.
item: Article-Abstract
Building a WordNet for Sinhala
Wijesiri, I; Gallage, M; Gunathilaka, B; Lakjeewa, M; Wimalasuriya, DC; Dias, G; Paranavithana, R; De Silva, N
Sinhala is one of the official languages of Sri Lanka and is used by over 19 million people. It belongs to the Indo-Aryan branch of the Indo-European languages and its origins date back to at least 2000 years. It has developed into its current form over a long period of time with influences from a wide variety of languages including Tamil, Portuguese and English. As for any other language, a WordNet is extremely important for Sinhala to take it into the digital era. This paper is based on the project to develop a WordNet for Sinhala based on the English (Princeton) WordNet. It describes how we overcame the challenges in adding Sinhala specific characteristics which were deemed important by Sinhala language experts to the WordNet while keeping the structure of the original English WordNet. It also presents the details of the crowdsourcing system we developed as a part of the project - consisting of a NoSQL database in the backend and a web-based frontend. We conclude by discussing the possibility of adapting this architecture for other languages and the road ahead for the Sinhala WordNet and Sinhala NLP.
item: Article-Abstract
Challenges of enabling IT in the Sinhala Language
Dias, G
Although Sinhala is the national language of Sri Lanka, even in 2004 most computer operating systems, databases and applications were in English and only a handful of Sinhala websites existed. Many people thought that “computers don’t work in Sinhala”. Sinhala was included in Unicode in 1998, but there were no implementations even by 2002. This paper first examines why IT in Sinhala was slow to develop. It then describes the development of Sinhala computing over the last two years, and the challenges faced. One reason for the non-adoption of Unicode was that the Unicode standard, as published, did not specify some common symbols and ligatures, which led to a perception that Unicode did not properly support Sinhala. Unicode was also complex, and difficult to understand. Another issue was the lack of operating system support, especially in Microsoft Windows. We were successful not only in defining the full representation of Sinhala script in Unicode, but also the specification and deployment of Sinhala computer keyboards. We learned that perception and prioritisation are as important as technical issues in enabling IT in a language.
item: Article-Abstract
Cloud To Cloud: Enabling Content Transfer among Personal Cloud Instances
Erabadda, B; Baddegama, V; Dias, G
With increasing globalization, it has become essential to share digital content with various parties. Meanwhile, it is important to preserve confidentiality and have control over how a particular party’s personal content is maintained. Although public clouds enable users to share files with anyone, privacy and confidentiality of client data is highly questionable with public cloud vendors as client data lies with external parties. As a result, personal cloud solutions are being introduced so that people can maintain their own clouds and have control over their data. But with personal clouds, it is not possible to share content among cloud instances as they operate individually and separately from each other. Cloud To Cloud is a solution which enables content transfer among two or more personal cloud instances. For the purpose of explaining the feasibility of the solution, we have implemented the solution using ownCloud, the best existing personal cloud solution with many features. The solution can be extended to interconnect any number of ownCloud instances. The same methodology can be adapted to any preferred personal cloud solution.
item: Article-Abstract
COMBINED SYMBOLIC AND NUMERICAL METHODS FOR SOLVING EQUATIONAL SYSTEMS
Dias, G; Nanayakkara, V
In this paper we discuss systems developed to assist scientists and engineers to solve their problems based on mathematical models using numerical and symbolic methods. Numerical library routines for numerical algorithms and Computer Algebra Systems (CASs) for symbolic computation are both now well established areas. The recent research interest in these areas is oriented towards combining these solvers, rather than on improving the individual solvers.
item: Conference-Full-text
Combined symbolic and numerical methods for solving equational systems
(1998) Nanayakkara, V; Dias, G
In this paper we discuss systems developed to assist scientists and engineers to solve their problems based on mathematical models using numerical and symbolic methods. Numerical library routines for numerical algorithms and Computer Algebra Systems (CASs) for symbolic computation are both now well established areas. The recent research interest in these areas is oriented towards combining these solvers, rather than on improving the individual solvers. We analyse the state of research in numerical and symbolic solvers as well as the area of these combining methods or coupled (or combined) systems as they are commonly referred to. We evaluate these coupled systems to identify their essential features and various methods of integration. Based on our analysis we then propose a new strategy for integration which can enhance the facilities provided by the existing symbolic and numerical systems.
item:
Comparison Between Performance of Various Database Systems for Implementing a Language Corpus
Upeksha, D; Wijayarathna, C; Siriwardena, M; Lasandun, L; Wimalasuriya, C; de Silva, NHND; Dias, G
Data storage and information retrieval are some of the most important aspects when it comes to the development of a language corpus. Currently most corpora use either relational databases or indexed file systems. When selecting a data storage system, the most important factors to consider are the speeds of data insertion and information retrieval. Other than the aforementioned two approaches, there are currently various database systems with different strengths that can be more useful. This paper compares the performance of data storage and retrieval mechanisms which use relational databases, graph databases, column store databases and indexed file systems for various steps such as inserting data into the corpus and retrieving information from it, and tries to suggest an optimal storage architecture for a language corpus.
item:
Comprehensive Part-Of-Speech Tag Set and SVM Based POS Tagger for Sinhala
Fernando, S; Ranathunga, S; Jayasena, S; Dias, G
This paper presents a new comprehensive multi-level Part-Of-Speech tag set and a Support Vector Machine based Part-Of-Speech tagger for the Sinhala language. The currently available tag set for Sinhala has two limitations: the unavailability of tags to represent some word classes and the lack of tags to capture inflection based grammatical variations of words. The new tag set presented in this paper overcomes both of these limitations. The accuracy of available Sinhala Part-Of-Speech taggers, which are based on Hidden Markov Models, still falls far behind the state of the art. Our Support Vector Machine based tagger achieved an overall accuracy of 84.68%, with 59.86% accuracy for unknown words and 87.12% for known words, when the test set contains 10% unknown words.
item:
CONSTRUCTION OF A MULTILINGUAL PLACE NAME DATABASE FOR SRI LANKA
Weerasinghe, A; Dias, G
Although the national languages of Sri Lanka are Sinhala and Tamil, due to the fact that most of the computer systems operate in English, almost all the databases in Sri Lanka have been implemented in English. This paper discusses the necessity for such computer systems to have the ability to capture the input in any of the three languages Sinhala, Tamil or English and to output them in the user-preferred language.