Browsing by Author "Udantha, M"
- item: Conference-Full-text. An Episode-based approach to Identify Website user access patterns (2016-03-08). Udantha, M; Ranathunga, S; Dias, G. Mining web access log data is a popular technique for identifying frequent access patterns of website users. Mining techniques such as clustering, sequential pattern mining, and association rule mining are commonly used to identify these frequent access patterns. Each can find interesting access patterns and group users, but none can identify the slight differences between the access patterns within an individual cluster. In reality, these differences could carry important information about attacks. This paper introduces a methodology to identify these access patterns at a much finer level than traditional clustering techniques, such as nearest-neighbour-based techniques and classification techniques, provide. The technique uses the concept of episodes to represent web sessions. These episodes are expressed in the form of regular expressions. To the best of our knowledge, this is the first application of regular expressions to identifying user access patterns in web server log data. In addition to identifying frequent patterns, we demonstrate that this technique can identify access patterns that occur rarely, which traditional clustering mechanisms would simply have treated as noise.
- item: Conference-Full-text. Modelling Website User Behaviors by Combining the EM and DBSCAN Algorithms (IEEE, 2016-04). Udantha, M; Ranathunga, S; Dias, G; Jayasekara, AGBP; Bandara, HMND; Amarasinghe, YWR. Web logs can provide a wealth of information on the user access patterns of the corresponding website when they are properly analyzed. However, finding interesting patterns hidden in the low-level log data is non-trivial due to large log volumes and the distribution of the log files in cluster environments. This paper presents a novel technique: the application of the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Expectation Maximization (EM) algorithms in an iterative manner for clustering web user sessions. Each cluster corresponds to one or more web user activities. The unique user access pattern of each cluster is identified by frequent pattern mining and sequential pattern mining techniques. When compared with the clustering output of the EM, DBSCAN, and k-means algorithms, this technique shows better accuracy in web session mining, and it is more effective in identifying cluster changes over time. We demonstrate that the implemented system is capable not only of identifying common user behaviors, but also of identifying cyber-attacks.
- item: Conference-Abstract. Modelling Website User Behaviors by Combining the EM and DBSCAN Algorithms. Udantha, M; Ranathunga, S; Dias, G. Web logs can provide a wealth of information on user access patterns of a corresponding website, when they are properly analyzed. However, finding interesting patterns hidden in the low-level log data is non-trivial due to large log volumes, and the distribution of the log files in cluster environments. This paper presents a novel technique, the application of Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Expectation Maximization (EM) algorithms in an iterative manner for clustering web user sessions. Each cluster corresponds to one or more web user activities. The unique user access pattern of each cluster is identified by frequent pattern mining and sequential pattern mining techniques. When compared with the clustering output of EM, DBSCAN, and k-means algorithms, this technique shows better accuracy in web session mining, and it is more effective in identifying cluster changes with time. We demonstrate that the implemented system is capable of not only identifying common user behaviors, but also of identifying cyber-attacks.
- item: Article-Abstract. REST-based Offline e-Mail System. Dias, G; Karunarathna, M; Udantha, M; Gunathilake, I; Pathirathna, S; Rathnayake, T. Over the years the Internet has grown from a research tool into a worldwide communication medium. One of the applications that has grown up with the Internet is e-mail. E-mail has become an indispensable tool for both corporations and individuals, and web-based e-mail systems have become very popular. However, a major problem with web-based e-mail is that it cannot be accessed without an Internet connection. We have built an off-line web-based e-mail system to overcome this issue and to provide fast responses even over slow connections. This system is based on Representational State Transfer (REST) and uses HTML5 local storage to store mail and meta-data in the browser without installing any plug-ins. The system records all user actions locally and synchronizes with the server when connected to the Internet.
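
The episode-based item above describes representing web sessions as episodes expressed as regular expressions. The abstract does not give the encoding, so the following is only a minimal sketch under stated assumptions: the one-letter page categories, the `CATEGORY` mapping, and the two patterns in `EPISODES` are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical mapping from URL path prefixes to one-letter page categories.
CATEGORY = {
    "/login": "L",
    "/search": "S",
    "/product": "P",
    "/cart": "C",
    "/checkout": "O",
}


def encode_session(paths):
    """Turn a session (ordered request paths) into a string of category symbols."""
    symbols = []
    for path in paths:
        for prefix, symbol in CATEGORY.items():
            if path.startswith(prefix):
                symbols.append(symbol)
                break
        else:
            symbols.append("X")  # page outside the known categories
    return "".join(symbols)


# Hypothetical episodes, each a regular expression over the category symbols.
EPISODES = {
    "purchase": re.compile(r"S+P+C+O"),      # search, view, add to cart, check out
    "repeated_login": re.compile(r"L{3,}"),  # three or more consecutive login requests
}


def label_session(paths):
    """Return the sorted names of all episodes that occur somewhere in the session."""
    encoded = encode_session(paths)
    return sorted(name for name, pattern in EPISODES.items() if pattern.search(encoded))
```

Under these assumptions, a rare session such as four consecutive `/login` requests is labelled `repeated_login` rather than being absorbed into a larger cluster, which is the kind of low-frequency pattern the abstract says the approach can surface.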
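
The offline e-mail item above says the system records user actions locally and synchronizes with the server once a connection is available. The sketch below illustrates that record-then-replay idea only. It is written in Python rather than browser JavaScript, the `OfflineMailbox` class stands in for HTML5 local storage, and the REST paths and the `send` callback are hypothetical, not the authors' API.

```python
from dataclasses import dataclass, field


@dataclass
class OfflineMailbox:
    """Stand-in for browser-side storage: local mail plus a log of unsent actions."""

    messages: dict = field(default_factory=dict)  # message id -> message fields
    pending: list = field(default_factory=list)   # actions recorded while offline

    def mark_read(self, message_id):
        # Apply the change locally first, then record it for later replay.
        self.messages[message_id]["read"] = True
        self.pending.append(("PUT", f"/messages/{message_id}", {"read": True}))

    def delete(self, message_id):
        del self.messages[message_id]
        self.pending.append(("DELETE", f"/messages/{message_id}", None))

    def sync(self, send):
        """Replay pending actions in order via send(method, path, body) -> bool.

        Stops at the first failure so that the remaining actions keep their order.
        Returns the number of actions successfully sent.
        """
        sent = 0
        while self.pending:
            method, path, body = self.pending[0]
            if not send(method, path, body):
                break
            self.pending.pop(0)
            sent += 1
        return sent
```

Applying each action to the local copy first is what lets the interface respond immediately on a slow or absent connection, and replaying the log in order keeps the server consistent with what the user saw.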