Comparison Between Performance of Various Database Systems for Implementing a Language Corpus

dc.contributor.author: Upeksha, D
dc.contributor.author: Wijayarathna, C
dc.contributor.author: Siriwardena, M
dc.contributor.author: Lasandun, L
dc.contributor.author: Wimalasuriya, C
dc.contributor.author: de Silva, NHND
dc.contributor.author: Dias, G
dc.date.accessioned: 2017-01-16T04:02:10Z
dc.date.available: 2017-01-16T04:02:10Z
dc.description.abstract: Data storage and information retrieval are some of the most important aspects when it comes to the development of a language corpus. Currently most corpora use either relational databases or indexed file systems. When selecting a data storage system, most important facts to consider are the speeds of data insertion and information retrieval. Other than the aforementioned two approaches, currently there are various database systems which have different strengths that can be more useful. This paper compares the performance of data storage and retrieval mechanisms which use relational databases, graph databases, column store databases and indexed file systems for various steps such as inserting data into corpus and retrieving information from it, and tries to suggest an optimal storage architecture for a language corpus. [en_US]
dc.identifier.email: gihan@uom.lk [en_US]
dc.identifier.journal: International Conference: Beyond Databases, Architectures and Structures [en_US]
dc.identifier.pgnos: 82-91 [en_US]
dc.identifier.uri: http://dl.lib.mrt.ac.lk/handle/123/12226
dc.identifier.year: 2015 [en_US]
dc.relation.uri: http://link.springer.com/chapter/10.1007/978-3-319-18422-7_7 [en_US]
dc.source.uri: http://link.springer.com/chapter/10.1007/978-3-319-18422-7_7 [en_US]
dc.subject: Corpus, Relational Databases [en_US]
dc.subject: NoSQL
dc.subject: Graph Database
dc.subject: Neo4j
dc.subject: Index file system, Apache Solr
dc.subject: Column stores
dc.subject: Cassandra
dc.title: Comparison Between Performance of Various Database Systems for Implementing a Language Corpus [en_US]
