Browsing by Author "Wimalawarne, K"


item: Conference-Full-text
AxiCo2 Concurrency coordination runtime on top of Apache Axis2
(Computer Science & Engineering Society c/o Department of Computer Science and Engineering, University of Moratuwa., 2011-11) Mudannayaka, S; Bandara, L; Gunawardena, V; Weerasooriya, C; Perera, S; Wimalawarne, K; De Silva, R; Weerawardhana, S; Madusanka, A; Dilrukshi, T; Aravinda, H
AxiCo2 is an Application Programming Interface (API) designed to simplify the implementation of concurrency both in the local environment and in the invocation of web services. It thereby reduces many of the difficulties developers face when programming multi-threaded applications. The higher-level API provided by AxiCo2 hides the complexities associated with concurrency constructs and web service invocations. As a framework for concurrency and coordination among threads, AxiCo2 provides asynchronous communication among the threads used for local tasks and service invocations through “Ports”, which are subdivided into Local Ports and Service Ports. AxiCo2 maintains an internal thread pool, eliminating the overheads inherent in the thread-per-task approach. Apart from being a high-level API that hides the complexities of concurrency, AxiCo2 lets the developer configure applications to respond to partial success through the variety of Receivers it provides. This set consists of Join, Choice, Multiple Item and Timeout Receivers, which are used to implement various logical constraints between tasks. AxiCo2 offers benefits in both programmability and performance.
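The coordination patterns named in this abstract (Join, Choice and Timeout over a shared thread pool) can be illustrated with a minimal sketch. It is written in Python with the standard concurrent.futures module and is not the AxiCo2 API, which is not shown in this record. The names join_receiver, choice_receiver and slow_square are hypothetical and exist only for illustration.

```python
import time
from concurrent.futures import (
    ALL_COMPLETED,
    FIRST_COMPLETED,
    ThreadPoolExecutor,
    wait,
)


def slow_square(x, delay):
    """Hypothetical task: stands in for a local task or a service call."""
    time.sleep(delay)
    return x * x


def join_receiver(futures, timeout=None):
    """Join: succeed only if every task finishes within the timeout."""
    _, pending = wait(futures, timeout=timeout, return_when=ALL_COMPLETED)
    if pending:
        raise TimeoutError(f"{len(pending)} task(s) did not finish in time")
    # Results are returned in submission order.
    return [f.result() for f in futures]


def choice_receiver(futures, timeout=None):
    """Choice: succeed as soon as any one task finishes."""
    done, _ = wait(futures, timeout=timeout, return_when=FIRST_COMPLETED)
    if not done:
        raise TimeoutError("no task finished in time")
    # If several tasks finished together, an arbitrary one is picked.
    return next(iter(done)).result()


if __name__ == "__main__":
    # One shared pool replaces a thread-per-task design.
    with ThreadPoolExecutor(max_workers=4) as pool:
        tasks = [pool.submit(slow_square, n, 0.01 * n) for n in (1, 2, 3)]
        print(join_receiver(tasks, timeout=2.0))
```

In this sketch the timeout is a parameter of each receiver rather than a separate receiver type, which is a simplification made for brevity.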
item: Conference-Full-text
GPU acceleration of logistic regression with CUDA
(Computer Science & Engineering Society c/o Department of Computer Science and Engineering, University of Moratuwa., 2011-11) Madhawa, PKK; Jeevananda, MS; Malmi, PMBC; Sandaruwan, URV; Wimalawarne, K; Weerawardhana, S; Madusanka, A; Dilrukshi, T; Aravinda, H
Logistic regression (LR) is a widely used machine learning algorithm. It is regarded as unsuitably slow for high-dimensional problems compared to other machine learning algorithms such as SVMs, decision trees and the Bayes classifier. In this paper we exploit the data-parallel nature of the algorithm to implement it on NVIDIA GPUs. We have implemented this GPU-based LR on the newest generation of GPUs with Compute Unified Device Architecture (CUDA). Our GPU implementation is based on the BFGS optimization method. The implementation was extended to multi-GPU and cluster environments. This paper describes the performance gain obtained by using a GPU environment.
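The data-parallel structure mentioned in this abstract can be seen in how the logistic regression loss and gradient decompose into independent per-sample terms followed by a sum. The sketch below is plain Python for illustration only and is not the paper's CUDA code. It assumes the standard mean log-loss objective, and the names sample_terms and loss_and_gradient are hypothetical. On a GPU, each call to sample_terms could plausibly be assigned to its own thread, with the sum done as a parallel reduction. A quasi-Newton optimizer such as BFGS would then consume the returned loss and gradient.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sample_terms(w, x, y):
    """Loss and gradient contribution of one sample.

    Each sample depends only on w and its own (x, y), so these calls
    are independent of one another: this is the unit of parallel work.
    """
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = sigmoid(z)
    # Log-loss in stable form: log(1 + exp(z)) - y * z
    loss = max(z, 0.0) + math.log1p(math.exp(-abs(z))) - y * z
    grad = [(p - y) * xi for xi in x]
    return loss, grad


def loss_and_gradient(w, X, Y):
    """Mean log-loss and its gradient: a map over samples, then a sum."""
    total_loss = 0.0
    total_grad = [0.0] * len(w)
    for x, y in zip(X, Y):
        loss, grad = sample_terms(w, x, y)  # independent per sample
        total_loss += loss                  # reduction step
        total_grad = [g + gi for g, gi in zip(total_grad, grad)]
    n = len(X)
    return total_loss / n, [g / n for g in total_grad]
```

The sequential loop here only stands in for the map and reduce steps; it makes no claim about how the authors organised their kernels or memory transfers.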