Master of Science in Artificial Intelligence
Permanent URI for this collection: http://192.248.9.226/handle/123/15823
Recent Submissions
- item: Thesis-Abstract: Excavation using decentralized swarm robots for off-earth habitation (2022) Gunasekara LKIK; Fernando KSD. Civilization has progressed over time as a result of the exploration of previously undiscovered parts of the Earth and the subsequent exploitation of new resources. As a natural continuation of this process, humans will now explore and exploit new places beyond Earth, i.e. in space. The Moon has been partially investigated, but it still needs to be fully explored and colonized; Mars and beyond will be the next phase. Outposts must be established on these new space bodies before they can be inhabited, but such extraterrestrial constructions are expensive, time-consuming, and risky. In this work, building habitats by excavation is proposed and shown to be more convenient. Because physical excavation tasks pose many challenges, only a few researchers have tried to innovate and improve excavation-related technologies. Among these, some have proposed adopting robotics for the excavation aspects of the construction process, but none has been tested practically. In our work, we introduce a novel approach to excavation using swarm robotics. The behaviour of a robotic swarm is collective and aimed at solving a problem through that collective conduct, similar to the natural swarm behaviour of bees, ants, and termites. Even though much research and development has been done in the field of swarm robotics, the concept has not yet made its way into industrial environments. To colonize planets and moons, it is necessary to build a surface structure a few meters thick to protect living beings from solar and cosmic radiation, meteoroid impacts, and extreme temperature variations. However, creating a structure of such thickness has been a research challenge for decades due to the practical limitations of building it off-Earth. This research proposes our approach, Cave Construction using Swarm Robots (CCSR), a practical method to excavate the ground to create subsurface habitats using decentralized robot swarms. Our CCSR design uses vision, RFID, and orientation sensor data to decide the action to be taken by each robot. Robots can perform basic actions such as traversing, removing, and dumping regolith. The swarm consists of a set of robots that are practical to implement, with limited visibility and limited communication abilities. Having only a local view of the terrain, robots in the swarm excavate a given 3D shape in collaboration with the other robots in the swarm. With the application of swarm concepts in an improved manner, the swarm is able to construct the given shape through excavation, displaying true parallelism, which in turn improves construction time.
- item: Thesis-Abstract: Multi-target multi-camera tracking optimization using probabilistic target search (2022) Wijayasekara KS; Premachandra C; Fernando S. Smart surveillance has become an important feature of smart cities, used for resource utilization and city-wide security. Multi-target multi-camera tracking (MTMCT) is one of the core areas of smart surveillance, since overlapping fields of view between cameras cannot be expected in real-world scenarios. The inefficiency of MTMCT has prevented this feature from being used in real-time applications; hence, making vehicle re-identification feature-signature matching efficient in multi-target multi-camera tracking has become a research problem. This research introduces a trajectory-based probabilistic search algorithm to reduce the target search space and increase the efficiency of MTMCT. The solution consists of a YOLO v4 based object detection module, an IoU-based single-camera tracking module, an OSNet-based feature extraction module, and a cross-camera identification module using the probabilistic target search algorithm. The system takes video streams as input and outputs the global trajectory of each target. Tracking accuracy was evaluated using the identification F1 score, and efficiency was measured as the number of frames processed per second.
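The abstract gives no implementation details, so the following is a minimal Python sketch of how such a pipeline could be wired together. The `Track` dataclass and the `candidate_cameras` / `cross_camera_match` helpers are hypothetical names introduced for illustration, and the YOLO v4 detection, IoU tracking, and OSNet embedding stages are assumed to be wrapped elsewhere; the point shown is restricting cross-camera re-identification to cameras that learned trajectory statistics deem probable.

```python
# Hypothetical sketch of the probabilistic search-space reduction; not the thesis's code.
from dataclasses import dataclass
import numpy as np

@dataclass
class Track:
    feature: np.ndarray      # appearance embedding, e.g. produced by OSNet
    last_camera: str         # camera where the target was last seen

def candidate_cameras(track, camera_graph, transition_probs, threshold=0.1):
    """Keep only cameras the target is likely to reach next, using transition
    probabilities learned from historical trajectories."""
    probs = transition_probs[track.last_camera]          # P(next camera | current camera)
    return [cam for cam in camera_graph[track.last_camera]
            if probs.get(cam, 0.0) >= threshold]

def cross_camera_match(track, galleries, cameras, min_similarity=0.6):
    """Match the track's appearance feature only against galleries of probable cameras."""
    best = (None, None, min_similarity)                  # (camera, gallery id, similarity)
    for cam in cameras:
        for gid, feat in galleries[cam].items():         # gallery: track id -> embedding
            sim = float(np.dot(track.feature, feat) /
                        (np.linalg.norm(track.feature) * np.linalg.norm(feat)))
            if sim > best[2]:
                best = (cam, gid, sim)
    return best
```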
- item: Thesis-Abstract: Fabric defect detection using one-class classifier (2022) Madhusanka VKN; Fernando S. Textiles form a large and critically important industry, supplying many products for day-to-day life, for example clothing, wipes, transportation materials, and housing materials. The quality of these products is essential to their demand, so defect identification during production is very important and allows manufacturers to maintain a better price for their output. Fabric defect detection and identification is therefore a very important part of the textile industry's quality control process. Currently, many manual inspection methods are used to identify defects and, to enhance efficiency, manual inspection needs to be replaced by an automatic inspection method. Machine vision is diversifying and expanding into defect detection using deep learning. Traditional systems, such as detecting and classifying defects using image segmentation, defect detection, and image classification, have limitations such as requiring a large amount of defective data for the training process and needing pre-identification of defects in the datasets. However, it is very difficult to obtain a large amount of real defective data from real-time processes. One-class classification is a classical machine learning problem that has recently received considerable attention for fabric defect detection. In this scenario, only non-defective class data are available in the training process, avoiding the requirement for defective samples during training. However, state-of-the-art deep neural network models with one-class classifiers are still unable to achieve high accuracy. This research proposes our approach for identifying defective fabric with higher accuracy using features of the non-defective fabric. The implications of this research can serve as an initiative for such applications. The approach consists of a pre-trained VGG-16 framework and a trainable network with a new loss function to increase the accuracy of defect detection.
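As a rough illustration of a one-class setup over pre-trained VGG-16 features: the thesis's custom loss is not specified in the abstract, so a Deep-SVDD-style compactness loss stands in for it here, and the module name and dimensions are assumptions.

```python
# Hedged sketch: train on non-defective fabric only; at test time the distance to the
# center is the defect score. Assumes a recent torchvision with the weights enum API.
import torch
import torch.nn as nn
from torchvision import models

class OneClassHead(nn.Module):
    def __init__(self, feat_dim=512 * 7 * 7, embed_dim=128):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        self.backbone = vgg.features.eval()              # frozen pre-trained feature extractor
        for p in self.backbone.parameters():
            p.requires_grad = False
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(feat_dim, embed_dim))

    def forward(self, x):                                # x: (B, 3, 224, 224)
        return self.head(self.backbone(x))

def compactness_loss(embeddings, center):
    """Pull embeddings of non-defective samples towards a fixed center vector."""
    return ((embeddings - center) ** 2).sum(dim=1).mean()
```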
- item: Thesis-Abstract: Automated tourism knowledge graph and intent generation from audio content extracted from videos, by utilizing NLP (2022) Seneviratne SS; Sumathipala S. Generating a knowledge graph for a chatbot is a time-consuming exercise that needs the help of an expert in the relevant field. This thesis presents our approach to automating the creation of a knowledge graph and intents for a chatbot. Currently, creating a knowledge graph and intents for a chatbot is a tedious process, and this process does not extract data from videos. Developing a chatbot also requires the support of experienced software engineers. This platform allows a user to build a customized chatbot according to a specific requirement in any field, without the intervention of experts. It also allows the seamless development of a comprehensive knowledge graph from video content through a simple and less tedious approach. The platform uses Natural Language Processing (NLP) machine learning models such as Naive Bayes and Logistic Regression and grammar correction techniques to supplement the experience of the users. The proposed system works in two stages: knowledge extraction and knowledge base generation. The user inserts keywords related to the chatbot's domain as the first step of the process, and the system retrieves the search results from YouTube. Finally, NLP is used to retrieve the data contained in the videos to create a preliminary knowledge graph and intents for a chatbot. A scheduler is then activated automatically from time to time to update the knowledge graph and intents. The generated knowledge graph and intents have been tested on a chatbot created using the Rasa framework, with the chatbot giving correct answers when questioned by a user.
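To make the Naive Bayes intent-classification step concrete, here is a minimal scikit-learn sketch; the sentences, intent labels, and pipeline configuration are illustrative assumptions rather than the platform's actual setup, which would be trained on sentences extracted from video transcripts.

```python
# Illustrative intent classifier over transcript sentences; labels are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

sentences = ["what are the best beaches in the south",
             "how do I get a visa on arrival",
             "which months are good for whale watching"]
intents = ["ask_attraction", "ask_visa", "ask_season"]

intent_model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
intent_model.fit(sentences, intents)
print(intent_model.predict(["is a visa needed for a short stay"]))   # likely ['ask_visa']
```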
- item: Thesis-Abstract: 2D human animation synthesis from videos using generative adversarial neural networks (2022) Udawatta PN; Fernando S. Synthesizing 2D human animation has many industrial applications, yet it is currently done manually by animators at a cost in time and resources. Therefore, much research has been conducted to synthesize human animation using artificial intelligence techniques. However, these approaches lack quality as well as the capability to generalize to various visual styles. Thus, synthesizing high-quality human animations across different visual styles remains a research challenge. We hypothesize that, given video references for motion and appearance, synthesizing high-quality human animations across a variety of visual styles can be achieved via generative adversarial networks. Here we propose a solution known as HumAS-GAN, an acronym for Human Animation Synthesis Generative Adversarial Networks. HumAS-GAN accepts video references for motion and appearance and synthesizes 2D human animations. HumAS-GAN has three main modules: motion extraction, motion synthesis, and appearance synthesis. In motion extraction, motion information is extracted via a pre-trained human pose extraction model [21]. The motion synthesis module synthesizes a motion representation matching the target human's body structure, which is then combined with the human pose coordinates and used by the appearance synthesis module to generate the human animation. HumAS-GAN is focused on improving the quality of the animation as well as the ability to use cross-domain/visual-style references to generate animation. This solution will be beneficial for many multimedia-based industries, as it is capable of generating high-quality human animations and quickly switching to any preferred visual style. HumAS-GAN is evaluated against other methods using a custom dataset and a set of three experiments designed to evaluate the capability of generating human animations across various visual styles. Evaluation results demonstrate the superiority of HumAS-GAN over other methods in synthesizing high-quality 2D human animations across a variety of visual styles.
- item: Thesis-Abstract: Using web scraping in social media to determine market trends with product feature-based sentiment analysis (2022) Nanayakkara T; Sumathipala S. Customer product reviews are openly available online and are now widely used to judge the quality of a product or service, to determine market trends, and to influence users' decision making. Due to the availability of a massive number of customer reviews on the web, summarizing them requires a fast classification system. Compared to supervised and unsupervised machine learning techniques for binary classification of reviews, fuzzy logic can provide a simple and comparatively faster way to model the fuzziness between sentiment polarity classes caused by the uncertainty present in most natural languages, yet fuzzy logic techniques have received little attention in this domain. This thesis proposes a model that measures product market value using sentiment analysis of online product reviews collected from the well-known e-commerce website Amazon. A fuzzy logic approach is used to calculate the final product market demand. Hence, we propose a fine-grained classification of customer reviews into weak positive, average positive, strong positive, weak negative, average negative, and strong negative classes using a fuzzy logic model based on the widely used sentiment lexicon SentiWordNet. By creating rules and relationships between fuzzy membership functions and linguistic variables, we can analyze customer opinions towards online products. The proposed model aims to provide more reliable sentiment analysis by addressing the problems identified in related past research. The outcomes can allow business organizations to understand their customers' sentiments and improve customer loyalty and retention techniques in order to increase customer value and profits. Fine-grained classification accuracy of approximately 74% to 77% was obtained in experiments conducted on datasets of electronic products containing reviews of smartphones, TVs, and laptops.
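The following is a hedged Python sketch of the fuzzy fine-grained classification step, assuming a sentence-level polarity score in [-1, 1] aggregated from SentiWordNet; the triangular membership breakpoints are illustrative, not the thesis's tuned values.

```python
# Illustrative fuzzy classification of a polarity score into the six fine-grained classes.
def triangular(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

CLASSES = {                                   # (a, b, c) per linguistic class (assumed values)
    "strong negative":  (-1.2, -1.0, -0.6),
    "average negative": (-1.0, -0.6, -0.2),
    "weak negative":    (-0.6, -0.2,  0.0),
    "weak positive":    ( 0.0,  0.2,  0.6),
    "average positive": ( 0.2,  0.6,  1.0),
    "strong positive":  ( 0.6,  1.0,  1.2),
}

def classify(score):
    memberships = {c: triangular(score, *p) for c, p in CLASSES.items()}
    return max(memberships, key=memberships.get), memberships

print(classify(0.45))   # highest membership in "average positive"
```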
- item: Thesis-Abstract: Topological pruner: a neural network pruner using topological data analysis (2022) Perera WMMJU; Fernando S; Amarasinghe A. Architectural damage due to neural network pruning has been a research problem. To recover the accuracy lost after pruning, the pruned neural network needs to be trained further for a certain period. If the damage done by the pruning process is severe, some layers can collapse and, at worst, the entire model may become untrainable. Therefore, the pruning process needs to be carried out carefully to prevent significant damage to the neural network. Although some existing approaches overcome this issue by identifying the salience of a neuron with respect to the overall architecture, they are not computationally efficient. Further, the existing solutions do not account for the topological meaning of the neural network architecture during the pruning process. We believe that identifying the salience of a neuron with respect to its layer is sufficient to avoid severe damage to the overall architecture. Topology, the mathematics of shape, is introduced to solve this problem. We introduce the 'Topological Pruner', a novel pruner that uses a genetic algorithm powered by a topological fitness function to identify removable neurons in each layer of a pre-trained neural network. After pruning, the model is retrained so that the parameters of the remaining neurons can be readjusted to recover the model. To the best of our knowledge, this is the first attempt to use persistent homology, a topological tool, for pruning. The number of parameters, FLOPs, and recovery time of the new pruner are evaluated on the CIFAR-10 dataset with the VGG-16 architecture against the L1Filter Pruner, L2Filter Pruner, and FPGM Pruner. Evaluation results show that the new pruner competes well with the existing pruners. We conclude that topological data analysis can be used to explain recoverability and mitigate the damage caused by neural network pruning.
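A simplified sketch of the genetic search over per-layer pruning masks follows; the persistent-homology fitness is abstracted behind `topological_fitness`, a placeholder stand-in (here a magnitude-based proxy so the sketch runs), since its exact definition is not given in the abstract, and the GA operators are illustrative.

```python
# Illustrative genetic search for a binary pruning mask over one layer's neurons.
import random

def topological_fitness(mask, weights):
    """Placeholder stand-in for the thesis's persistent-homology fitness:
    here it simply rewards keeping high-magnitude neurons so the sketch runs."""
    return sum(abs(w) for w, keep in zip(weights, mask) if keep)

def evolve_mask(weights, keep_ratio=0.5, pop_size=20, generations=30):
    n = len(weights)

    def random_mask():
        m = [1] * int(n * keep_ratio) + [0] * (n - int(n * keep_ratio))
        random.shuffle(m)
        return m

    population = [random_mask() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=lambda m: topological_fitness(m, weights), reverse=True)
        parents = scored[: pop_size // 2]                 # selection
        children = []
        while len(children) < pop_size - len(parents):    # one-point crossover + mutation
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n)
            child = a[:cut] + b[cut:]
            i = random.randrange(n)
            child[i] = 1 - child[i]
            children.append(child)
        population = parents + children
    return max(population, key=lambda m: topological_fitness(m, weights))

print(evolve_mask([0.9, 0.1, 0.7, 0.05, 0.6, 0.2]))       # mask of neurons to keep
```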
- item: Thesis-Abstract: Dynamic ontology based Q&A system for pandemic situations: case study COVID-19 pandemics (2022) Subasinghe SAHP; Silva ATP. In dynamic pandemic situations like COVID-19, many write-ups, reviews, and articles are published every day. Rapidly updated data leads to information overload, which makes it difficult for the public to keep up with the latest data on the pandemic situation. This work focuses on introducing an efficient Q&A system for dynamic pandemic situations that helps the public stay updated with real-time data. Several approaches, including basic ontologies, expert knowledge bases, and linguistic knowledge, have been used to model the knowledge bases of Q&A systems. However, these approaches rely mainly on expert knowledge and human interaction during knowledge acquisition, handle multimodal data poorly, and suffer from inefficient inferencing. Even though there are a number of solutions that help the public stay updated with pandemic data, there are no fully automated, real-time updated systems. The intention is therefore to introduce a fully automated, multimodal-data-based, real-time updated system. To achieve this goal, a fully automated dynamic ontology-based Q&A system was designed, developed, and evaluated for pandemic situations such as COVID-19. The solution was designed so that users can enter a question related to the COVID-19 pandemic and retrieve a real-time answer. The system is based on two main modules: a dynamic ontology module, which uses web scraping for real-time data extraction and a process to map changes in the data, and a Q&A module, which simplifies questions into RDF-triple-based normal forms that are easily handled by database querying. The system was evaluated in two ways: by evaluating the dynamic ontology module and by evaluating the question-and-answer module. Both evaluations considered time and precision.
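As an illustration of the RDF-triple querying behind the Q&A module, here is a minimal rdflib sketch; the namespace, class, property names, and values are invented for the example and do not reflect the thesis's actual ontology schema.

```python
# Minimal sketch: a question simplified to a triple pattern and answered over an RDF graph.
from rdflib import Graph, Literal, Namespace, RDF

COVID = Namespace("http://example.org/covid#")            # illustrative namespace
g = Graph()
g.add((COVID.SriLanka, RDF.type, COVID.Country))
g.add((COVID.SriLanka, COVID.dailyCases, Literal(312)))   # value the scraper would refresh

# "How many cases were reported in Sri Lanka today?" reduced to a triple pattern:
query = """
PREFIX covid: <http://example.org/covid#>
SELECT ?cases WHERE { covid:SriLanka covid:dailyCases ?cases . }
"""
for row in g.query(query):
    print(f"Daily cases: {row.cases}")
```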
- item: Thesis-Abstract: Personalised movie recommendation based on multi-model data integration (2022) Madushanki JGI; Silva ATP. Recommendation systems play an essential role in the modern era and are part of routine life, guiding users in a personalised manner towards interesting and useful objects within a large collection of possible options. The aim of a movie recommendation system is to help movie lovers by generating suggestions on what movie to watch. Without movie recommender systems, movie lovers need to spend time choosing a movie by going through long lists of movies, which is a time-consuming task. Therefore, a lot of research has been conducted to generate movie recommendations using different approaches, including pure recommendation techniques and hybrid techniques. However, the recommendations generated through these approaches lack personalisation and accuracy. This thesis presents our approach to generating personalised movie recommendations using multi-model data integration to improve personalisation and accuracy. Different data sources are integrated as inputs in the design of this research. A content-based filtering technique combined with genetic algorithm-based optimization was used in the implementation. A precision value of 0.65 was obtained while evaluating the multi-model data integration-based movie recommender system with genetic algorithm-based optimization.
- item: Thesis-Abstract: Summarization of large-scale videos to text format using supervised simple rule-based machine learning models (2022) Sugathadasa UKHA; Fernando S. Video summarization has been one of the most active research and development fields since the late 2000s, thanks to the evolution of social media and the internet and the resulting need to provide concise and meaningful summaries of large-scale video. Even though video summarization has been advanced through several traditional (non-ML) techniques as well as ML-based techniques, generating correct and relevant summaries from video remains a limitation. To overcome this concern, different techniques have been attempted, including vision-based approaches and NLP-related approaches. Inspired by NLP-related Transformer networks, researchers are looking to integrate such sequence-based learning algorithms into the video domain to apply spatio-temporal feature extraction. Beyond video summarization (VS) implementations, another extension of VS has been increasingly emphasized, namely text-based video summarization (TVS), which generates summaries of a video in text form. The evolution of VS towards TVS has not been a straightforward journey; many blockers have been eliminated using unsupervised (UL), reinforcement (RL), and supervised learning (SL) based frameworks. Among the state-of-the-art methods in TVS, Transformer-based methods stand out, along with T5-based NLP frameworks. Since this area is still at an early stage, many unknown facts and issues remain to be explored. In particular, the attention-based sequence modelling of the learning algorithm should be carefully designed to achieve the best accuracy improvements. Ultimately, all improvements are intended to be applied in a real-time application. To deliver such improvements, a novel standalone method should be introduced with the simplest network layout that is applicable to embedded devices. This is where the Simple Rule-based Machine Learning Network for Text-based Video Summarization (SiRuML-TVS) is unveiled. Although the network takes a single input of large-scale video and produces a single output of a meaningful description for the given video, the high-level network layout comprises three ML modules for video recognition, object detection, and finally text generation. Each module is subject to different evaluation criteria; however, the end-to-end network is evaluated on a single metric. Different combinations of modules affect the performance of the entire pipeline; however, the combination of Transformers and CNNs provides the better trade-off between accuracy and computational cost at inference. This offers hope of deploying the proposed method on an edge device, thus closing the gap between theoretical explanation and practical implementation.
- item: Thesis-Abstract: Ontology-driven personalized expert recommender system for IT service management (2022) Chamalka KSWKBL; Silva ATP. Finding experts related to a given query in an industrial environment is a time-consuming manual task. Much research has been conducted in this area using multiple intelligent techniques, but there are still research gaps in personalizing the recommendation accurately. In this context, an expert recommender system should consider the expert's preference, experience, and other factors, as well as the complex organizational processes involved in the recommendation task. Achieving high accuracy while satisfying other conflicting conditions simultaneously is also a popular topic in recent recommender systems research. This thesis presents our hybrid approach to the personalized expert recommendation problem in an enterprise context. We integrate a semantic ontology with a TOPSIS-based Artificial Bee Colony algorithm to achieve high accuracy in this problem domain. The ontology is used for knowledge modeling of the expert profiles, and the TOPSIS-ABC algorithm is used to rank the profiles for a given query based on their distance to the ideal solution.
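The TOPSIS ranking step itself is a standard procedure, so a small NumPy sketch is shown below; the criteria matrix, weights, and benefit/cost flags are illustrative, and the Artificial Bee Colony search that the thesis couples with TOPSIS (for example to tune the weights) is omitted.

```python
# Illustrative TOPSIS ranking of expert profiles against a query.
import numpy as np

def topsis_rank(decision_matrix, weights, benefit=None):
    X = np.asarray(decision_matrix, dtype=float)
    benefit = np.ones(X.shape[1], dtype=bool) if benefit is None else np.asarray(benefit)
    norm = X / np.linalg.norm(X, axis=0)                 # vector normalisation per criterion
    v = norm * np.asarray(weights)                       # weighted normalised matrix
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
    anti = np.where(benefit, v.min(axis=0), v.max(axis=0))
    d_pos = np.linalg.norm(v - ideal, axis=1)
    d_neg = np.linalg.norm(v - anti, axis=1)
    closeness = d_neg / (d_pos + d_neg)                  # higher = closer to the ideal expert
    return np.argsort(-closeness), closeness

# Experts scored on (skill match, experience years, current workload); workload is a cost.
ranking, scores = topsis_rank([[0.9, 5, 3], [0.7, 8, 1], [0.8, 2, 4]],
                              weights=[0.5, 0.3, 0.2],
                              benefit=[True, True, False])
print(ranking, scores.round(3))
```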
- item: Thesis-Abstract: Thinking like human approach to AI for game playing (2021) Sumanapala SH; Karunananda ASP. Playing games helps humans relax their minds. Games have a very long history; they were used as a leisure activity even by early humans and involve both mind and body. With the development of computers, games became more complex and entertaining, and computers are now extensively used to model, simulate, and develop games. Games require us to use our cognitive power to plan and make sudden decisions in order to win. Chess and Go are highly complex games in terms of the options a player must consider when making a move, and there are many different paths a player can take to win a game. Therefore, many researchers try to use various technologies to develop applications that can play games. The development of these technologies also helps to solve complex problems in completely different areas such as finance, economics, and warfare. According to other researchers' work, Multi-Agent Systems (MAS) have the ability to model complex problems, so we developed a system using MAS to play games. The suggested solution is modelled on human thinking behaviour: when humans face a problem, they think about it in several ways, and in their minds they compare, contrast, and argue to find the best possible solution. Similarly, in multi-agent systems, intelligent agents solve complex problems through communication, coordination, and negotiation. Game playing is a problem-solving activity that involves the thinking, decision-making, and negotiation skills of the human mind. Even though this is very intuitive to the human mind, proper design is needed to model this way of problem solving in an AI system. Here we designed a multi-agent game-playing system to play the N-puzzle game. It contains two main types of agents: a coordinator agent and decision agents. For the N-puzzle game there are four decision agents, namely the up, down, left, and right decision agents. They analyze the game state and negotiate with each other to determine the best move to make. For the development of the solution, we used a SPADE-based multi-agent framework. The results show that the proposed solution achieves a notable improvement (~42%) compared to human players.
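The following plain-Python sketch captures the coordinator/decision-agent idea on a 3x3 puzzle, with the SPADE messaging layer omitted; the Manhattan-distance bidding rule is an illustrative assumption, not necessarily the thesis's exact negotiation protocol.

```python
# Each decision agent proposes sliding the blank one step in its own direction and bids
# the resulting heuristic; the coordinator accepts the lowest bid.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def manhattan(state):
    """Sum of Manhattan distances of tiles from their goal positions (0 = blank)."""
    n = len(state)
    return sum(abs(r - (v - 1) // n) + abs(c - (v - 1) % n)
               for r, row in enumerate(state) for c, v in enumerate(row) if v)

def decision_agent(name, state):
    """One agent evaluates only its own move and returns a bid, or None if illegal."""
    n = len(state)
    br, bc = next((r, c) for r in range(n) for c in range(n) if state[r][c] == 0)
    dr, dc = MOVES[name]
    tr, tc = br + dr, bc + dc
    if not (0 <= tr < n and 0 <= tc < n):
        return None
    new = [list(row) for row in state]
    new[br][bc], new[tr][tc] = new[tr][tc], new[br][bc]
    return manhattan(new), name, new

def coordinator(state):
    bids = [b for b in (decision_agent(m, state) for m in MOVES) if b]
    return min(bids)                                     # lowest heuristic wins the round

state = [[1, 2, 3], [4, 0, 6], [7, 5, 8]]
cost, move, state = coordinator(state)
print(move, cost)     # "down" pulls tile 5 into place, leaving only tile 8 misplaced
```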
- item: Thesis-Abstract: Neural machine translation approach for Singlish to English translation (2021) Sandaruwan HGD; Fernando S; Sumathipala S. This dissertation reports research aimed at proposing a language model to translate texts written in Singlish to English. Singlish is an alternative writing system for the Sinhala language that uses Latin script (the English alphabet) instead of the native Sinhala alphabet. Such a translator has been a long-standing requirement, since many Sri Lankans use this writing method to write product reviews, social media posts, comments, and so on. The task has been attempted for a couple of years by many research students, but the main challenge has been finding a proper dataset to evaluate deep learning models for this Natural Language Processing (NLP) task; hence, traditional statistical and rule-based models have been proposed using limited data. This research addresses the challenge of preparing a dataset to evaluate a deep learning approach for this machine translation task and also evaluates a seq2seq Neural Machine Translation (NMT) model. The proposed seq2seq model is based purely on the attention mechanism, which has been used to improve NMT by selectively focusing on parts of the source sentence during translation. The proposed approach achieves a BLEU score of 24.13 on Singlish-English translation after seeing ~0.15 M parallel sentence pairs with a ~50 K word vocabulary.
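For concreteness, here is a small PyTorch sketch of additive (Bahdanau-style) attention, the kind of scoring an attention-based seq2seq NMT model relies on; the module name, hidden sizes, and tensor shapes are illustrative, not taken from the thesis.

```python
# Additive attention: score each encoder time step against the current decoder state,
# then form a context vector as the softmax-weighted sum of encoder outputs.
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    def __init__(self, hidden_dim=256):
        super().__init__()
        self.W_dec = nn.Linear(hidden_dim, hidden_dim)
        self.W_enc = nn.Linear(hidden_dim, hidden_dim)
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, decoder_state, encoder_outputs):
        # decoder_state: (B, H); encoder_outputs: (B, T, H)
        scores = self.v(torch.tanh(self.W_dec(decoder_state).unsqueeze(1)
                                   + self.W_enc(encoder_outputs))).squeeze(-1)   # (B, T)
        weights = torch.softmax(scores, dim=-1)
        context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)    # (B, H)
        return context, weights

attn = AdditiveAttention()
context, weights = attn(torch.randn(2, 256), torch.randn(2, 7, 256))
print(context.shape, weights.shape)    # torch.Size([2, 256]) torch.Size([2, 7])
```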
- item: Thesis-Abstract: Generic information extraction framework for document processing (2021) Silva AKG; Silva ATP. Information extraction from documents has become of great use in novel natural language processing areas. Most entity extraction methodologies vary with context, such as the medical or financial domain, and are often limited to a given language. Rather than tackling the problem in this manner, it is better to have one generic approach applicable to any such document type that extracts entity information regardless of language, context, and structure. A major barrier in such research is exploring the structure while preserving hierarchical, semantic, and heuristic features. Another problem identified is that a massive training corpus is usually required. Therefore, this research focuses on mitigating these problems. Over the research timeline, several approaches have been identified for building document information extractors focusing on different disciplines. From optical character recognition of document images to data mining of large corpora of documents, this research area has contributed to the development of natural language processing, semantic analysis, information extraction, and conceptual modelling. Although in separate ways these approaches attempt to achieve the generic ability to process any kind of document, this has unfortunately not been achieved successfully due to approach and technical limitations. The approach in this research can process any kind of document in any domain by simply adhering to the conceptual relations, without trying to extract components and map them onto known structures. Just as a human looks at an unknown document, goes through its relations, and makes best guesses when answering queries, this system mimics the same behaviour. The output can be either the document's concept-relation structure or an answer to a given query. The experimental strategy involved several datasets originating from SQuAD 2.0, DocVQA, and Kaggle-based datasets. Based on the F1 evaluation metric, the system achieves an overall score of 87.01 on the SQuAD 2.0 dataset, showing that it is capable of the question-answering task with high accuracy. In the experimental design, several experiments were carried out starting from dataset evaluation. Datasets such as SQuAD 2.0 and DocVQA were used to evaluate overall performance on metrics such as F1 score, accuracy, and ANLS, yielding scores of 87.01, 52.78, and 0.583 respectively. The F1 score of 87.01 shows that the proposed solution achieves the expected objective of deriving a generic model suited to any document-based question-answering task.
- item: Thesis-Full-text: Optimizing robotic swarm based construction tasks (2020) Jaliya DLTS; Fernando KSD. Construction is a field that grows with technological advancements by the day. The field has always adopted novel and innovative technologies to create marvels of engineering that were thought impossible before their time. Because physical construction tasks pose many challenges, researchers all over the globe are trying to innovate and improve construction-related technologies. With the advancements in technology, some construction projects have already adopted robotics in some aspects of the construction process. In this research, we introduce a novel approach to construction using swarm robotics. The behaviour of a robotic swarm is collective and aimed at solving a problem through that collective behaviour, similar to the natural swarm behaviour of bees, ants, and termites. Even though much research and development has been done in the field of swarm robotics, the concept has not yet made its way into industrial environments. Many researchers in the field of construction using swarm robots have come up with successful algorithmic approaches for constructing simple shapes. However, many of them lack practicality due to the use of pheromone trails, building blocks with communication capabilities, or robots having a real-time global view of the state of the construction, all of which are difficult to achieve in the real world with existing technologies. Furthermore, most similar research shows serial behaviour during construction instead of the parallel behaviour seen in nature. In this research we propose a novel and practical approach for swarm robots to optimize construction tasks in 2D using swarm robotics concepts. The swarm consists of a set of robots that are practical to implement, with limited visibility and limited communication abilities. Having only a local view of the terrain, robots in the swarm construct a given 2D shape in collaboration with the other robots in the swarm. With the application of swarm concepts in an improved manner, the swarm is able to construct the given shape displaying true parallelism, which in turn improves construction time. Construction using swarm robots is proposed as one of the most practical methods of building structures and shelters, especially for colonizing space, where sending skilled workers is too expensive. The implications of this research can be an initiative for such applications.
- item: Thesis-Full-text: A hybrid approach for dynamic task scheduling in unforeseen environments using multi agent reinforcement learning and enhanced Q-learning (2020) Shayalika JKC; Silva ATP. The process of assigning the most appropriate resources to workstations or agents at the right time is termed scheduling. The term is applied separately to tasks and resources in task scheduling and resource allocation respectively. Scheduling is a universal theme discussed in technological areas like computing and strategic areas like operational management. The core idea behind scheduling is the distribution of shared resources across time among competing tasks. Optimization, efficiency, productivity, and performance are the major metrics evaluated in scheduling. Effective scheduling under uncertainty is tricky and unpredictable, and it is an interesting area to study. Environmental uncertainty is a challenging factor that affects scheduling-based decision making in work environments whose dynamics are subject to frequent fluctuations. Reinforcement Learning is an emerging field that has been extensively researched for environmental modelling under uncertainty, and optimization in dynamic scheduling can be handled effectively using it. This thesis first reports a study of Reinforcement Learning techniques that have been used for dynamic task scheduling, presenting the state of the art and a comparative review of those techniques. It then reports our research on a hybrid approach for dynamic task scheduling in unforeseen environments using Multi-Agent Reinforcement Learning and Enhanced Q-Learning. The proposed solution follows online and offline reinforcement learning approaches that work on real-time inputs of heuristics such as the number of agents involved, the current state of the environment, the backlog of tasks and sub-tasks, and the rewarding criteria. The outputs are the set of scheduled tasks for the work environment. The solution provides an approach for priority-based dynamic task scheduling using Multi-Agent Reinforcement Learning and Enhanced Q-Learning. Enhanced Q-Learning includes the developed algorithmic approaches Q-Learning, Dyna-Q+ Learning, and Deep Dyna-Q+ Learning, which are proposed as an effective methodology for the scheduling problem. The novelty of the solution resides in the implementation of model-based reinforcement learning and its integration with a model-free reinforcement learning approach, by means of Dyna-Q+ Learning and Deep Dyna-Q+ Learning, for dynamic task scheduling in an unforeseen environment. The project also concentrates on how dynamic task scheduling is managed within a constantly updating environment, for which Deep Dyna-Q+ provides a ground solution. The final solution was comparatively evaluated using evaluation metrics for each of the three Q-Learning variations developed. The evaluation results revealed that the Deep Dyna-Q+ implementation handles the problem of dynamic task scheduling in an unforeseen environment well.
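The Dyna-Q+ learning rule at the core of the enhanced Q-learning variants is well defined, so a tabular sketch is shown below: a direct Q-learning update from real experience plus planning updates from a learned model, with an exploration bonus for transitions not tried recently. The scheduling state/action encoding and reward design from the thesis are abstracted away.

```python
# Tabular Dyna-Q+ sketch; states and actions stand in for scheduling decisions.
import random
from collections import defaultdict
from math import sqrt

class DynaQPlus:
    def __init__(self, actions, alpha=0.1, gamma=0.95, kappa=1e-3, planning_steps=20):
        self.Q = defaultdict(float)          # (state, action) -> value
        self.model = {}                      # (state, action) -> (reward, next_state)
        self.last_visit = {}                 # (state, action) -> time step last tried
        self.t = 0
        self.actions, self.alpha, self.gamma = actions, alpha, gamma
        self.kappa, self.planning_steps = kappa, planning_steps

    def update(self, s, a, r, s_next):
        self.t += 1
        self.model[(s, a)] = (r, s_next)
        self.last_visit[(s, a)] = self.t
        self._q_update(s, a, r, s_next)                  # learn from real experience
        for _ in range(self.planning_steps):             # planning from the learned model
            (ps, pa), (pr, pnext) = random.choice(list(self.model.items()))
            bonus = self.kappa * sqrt(self.t - self.last_visit[(ps, pa)])
            self._q_update(ps, pa, pr + bonus, pnext)    # reward long-untried transitions

    def _q_update(self, s, a, r, s_next):
        best_next = max(self.Q[(s_next, a2)] for a2 in self.actions)
        self.Q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.Q[(s, a)])
```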
- item: Thesis-Full-text: Capsule network based super resolution method for medical image enhancement (2020) Munasingha SC; Fernando S. Medical imaging has been one of the most active research and development areas since the 1950s, particularly due to its contribution to disease diagnosis. Although imaging technologies have advanced in multiple ways, resolution limitations can still be observed. To overcome these limitations, various image enhancement techniques have been used. Image Super-Resolution (SR) is the latest technique in the list, achieving higher resolution from much lower-resolution images. Earlier, frequency-based and interpolation-based techniques were used for SR. Subsequent achievements in SR were obtained via convolutional neural network based methods (SRCNN), which have several flaws. The capsule network (CapsNet) is a state-of-the-art alternative methodology for problems previously solved by CNNs, and one recent attempt was made to assess CapsNet for the SR task. This new area has a lot to be explored; in particular, the time inefficiencies of this approach should be addressed along with accuracy improvements. In this research, several capsule network routing mechanisms have been investigated for a super-resolution pipeline with a medical image dataset. The standard Dynamic Routing and Expectation-Maximization Routing methods are re-configured to improve accuracy. Above all, a novel integration of the state-of-the-art routing mechanism, Inverted Dot-Product Attention Routing, is introduced for the super-resolution task. Every model was evaluated with 300,000 medical image training pairs and 2,500 evaluation pairs. Across different image quality indexes, the Dynamic Routing based method outperformed all other methods, and the newest Attention Routing based approach showed image quality similar to the state-of-the-art method FSRCNN with lower time complexity than the existing CapsNet-based approaches. This implies that clinicians can use this system effectively in a clinical setting.
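For reference, here is a minimal NumPy sketch of the squash non-linearity and the routing-by-agreement iteration (Sabour et al., 2017) that Dynamic Routing based capsule models build on; the capsule counts and dimensions are illustrative, and the re-configurations made in the thesis are not reflected here.

```python
# Squash non-linearity and one pass of dynamic routing-by-agreement.
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    norm2 = (v ** 2).sum(axis=axis, keepdims=True)
    return (norm2 / (1.0 + norm2)) * v / np.sqrt(norm2 + eps)

def dynamic_routing(u_hat, iterations=3):
    # u_hat: predictions from lower capsules, shape (n_lower, n_upper, dim_upper)
    n_lower, n_upper, _ = u_hat.shape
    b = np.zeros((n_lower, n_upper))                          # routing logits
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)                # weighted sum per upper capsule
        v = squash(s)                                         # upper capsule outputs
        b += (u_hat * v[None, ...]).sum(axis=-1)              # agreement update
    return v

v = dynamic_routing(np.random.randn(32, 10, 16))
print(v.shape)    # (10, 16)
```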
- item: Thesis-Full-text: Stacked capsule autoencoder based generative adversarial network (2020) Madhusanka GAC; Fernando KSD. Convolutional neural network based generative adversarial networks have become the dominant generative model in the field of generative deep learning. However, the limitations of convolutional neural networks also affect generative adversarial networks, since most current generative adversarial networks are based on convolutional neural networks. The main limitation of convolutional neural networks is that they are invariant; in other words, convolutional neural networks cannot preserve the spatial information of features in an image. In contrast, capsule networks have gained attention in recent years due to their equivariant architecture, which preserves spatial information. The stacked capsule autoencoder is a type of capsule network that is able to overcome the limitations that convolutional neural networks suffer from: it is an equivariant model that preserves the spatial, relational, and geometrical information between parts and objects in an image. In this research we implemented a generative adversarial network that uses a stacked capsule autoencoder as its discriminator, replacing the conventional convolutional neural network discriminator. We then evaluated the implementation of the stacked capsule autoencoder based generative adversarial network using MNIST images. As a qualitative evaluation, we observed the visual quality of the generated images; their quality and diversity are acceptable. We then evaluated our model quantitatively using the inception score for MNIST. The findings of this research show that a stacked capsule autoencoder can be used as the discriminator of a generative adversarial network instead of a convolutional neural network, and its performance is plausible.
- item: Thesis-Full-text: An intelligent hardware system for real-time infant cry detection and classification (2020) Pathirana UPPD; Sumathipala S. Cry, the universal communication language of infants, encodes vital information about the physiological and psychological health of the infant. Experienced caregivers can understand the cause of a cry based on its pitch, tone, intensity, and duration. Similarly, pediatricians can diagnose hearing impairments, brain damage, and asphyxia by analyzing cry signals, providing a non-invasive mechanism for early diagnosis in the first few months. Hence, automated cry classification has gained great importance in the fields of medicine and baby care. With the emergence of the Internet of Things coupled with Artificial Intelligence, baby monitors have recently gained huge popularity due to features like sleep analysis, cry detection, and motion analysis through multiple sensors. Since cry classification involves real-time audio processing, most existing solutions rely on either complex and costly designs or distributed computing, which leads to privacy concerns for users. This research presents a low-cost intelligent hardware system for real-time infant cry detection and classification. The proposed solution covers the selection of hardware to suit the requirements of audio processing while adhering to financial constraints, and the firmware design, which includes voice activity detection, cry detection, and classification. As its novelty, it proposes the use of a multi-agent system as a resource-management concept while demonstrating that AI concepts can be extended to resource-limited hardware platforms. The firmware and algorithms are designed to maintain accuracy above 90% while processing the audio signal at a higher rate than it is produced, to maintain stability. A voice activity detector was designed to filter the human voice using temporal features, while cry detection and classification were respectively based on an Artificial Neural Network and the K-Nearest Neighbor algorithm, trained with a spectral-domain feature vector of Mel Frequency Cepstral Coefficients (MFCC). Evaluations under diverse conditions showed accuracy figures of 96.76% and 77.45% for cry detection and classification, respectively.
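A desktop-Python sketch of the MFCC + K-Nearest Neighbour pairing is shown below to illustrate the idea; the thesis implements this on resource-limited embedded firmware, and the file names, labels, and parameter choices here are placeholders.

```python
# Illustrative MFCC feature extraction and KNN cry classification on audio clips.
import librosa
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def mfcc_vector(path, sr=16000, n_mfcc=13):
    """Load an audio clip and summarise it as mean MFCCs over time."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # shape (n_mfcc, frames)
    return mfcc.mean(axis=1)

train_files = ["hungry_01.wav", "pain_01.wav", "discomfort_01.wav"]   # placeholder paths
train_labels = ["hungry", "pain", "discomfort"]

X = np.stack([mfcc_vector(f) for f in train_files])
clf = KNeighborsClassifier(n_neighbors=3).fit(X, train_labels)
print(clf.predict([mfcc_vector("unknown_cry.wav")]))
```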
- item: Thesis-Full-text: Theme park crowd simulation using multi agent system (2020) Gunawardena DMLNP; Fernando KSD. Computer-based crowd simulation has become a prominent research topic today. Computer-based simulation applications are used in education, entertainment, training, theme park design, and building evacuation. Among them, virtual crowd simulation has become a dominant topic in the theme park industry, yet limited research has been conducted on theme park crowd simulation using multi-agent systems. Virtual simulations can be run under different configurations to decide the best-suited locations for stalls on the premises; otherwise, it costs a great deal to relocate physically installed items based on experience and feedback. In this research, multi-agent technology has been used to simulate crowd behaviour in a theme park when an emergency is caused by fire. NetLogo, a multi-agent simulation tool, is used to build the model. The crowd in the park is modelled as agents. Different agents (children, parents, individuals, and couples) are programmed to behave according to social norms defined in social science. The basic goal of every agent is to stay away from the fire and evacuate through the closest exit as quickly as possible, but there are exceptional scenarios unique to different agents; for instance, parents try to find their children before exiting the environment. We have defined coordinator agents to manage crowded areas and to help parent agents who get lost while looking for their children. The logic that governs each type of agent's behaviour is programmed in NetLogo. The simulation was tested with different numbers of agents, and an increase in evacuation time was observed as the number of agents increased. A few emergent phenomena were also observed in the simulation: some areas become crowded while agents are evacuating the theme park, exits far from the fire location become crowded, and parent agents get lost in the theme park while looking for their children.