Identification and characterization of crawlers through analysis of web logs

dc.contributor.author: Algiriyage, N
dc.contributor.author: Jayasena, VSD
dc.contributor.author: Dias, G
dc.contributor.author: Perera, A
dc.contributor.author: Dayananda, K
dc.date.accessioned: 2017-01-16T05:24:00Z
dc.date.available: 2017-01-16T05:24:00Z
dc.description.abstract: Web crawlers are software programs that automatically traverse the hyperlink structure of the world-wide web in order to locate and retrieve information. In addition to crawlers from search engines, we observed many other crawlers which may gather business intelligence, confidential information or even execute attacks based on gathered information while camouflaging their identity. Therefore, it is important for a website owner to know who has crawled his site, and what they have done. In this study we have analyzed crawler patterns in web server logs, developed a methodology to identify crawlers and classified them into three categories. To evaluate our methodology we used seven test crawler scenarios. We found that approximately 53.25% of web crawler sessions were from “known” crawlers and 34.16% exhibit suspicious behavior. [en_US]
dc.identifier.conference: IEEE 8th International Conference on Industrial and Information Systems [en_US]
dc.identifier.email: gihan@uom.lk [en_US]
dc.identifier.pgnos: 150-155 [en_US]
dc.identifier.uri: http://dl.lib.mrt.ac.lk/handle/123/12238
dc.identifier.year: 2013 [en_US]
dc.relation.uri: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6731972 [en_US]
dc.source.uri: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6731972 [en_US]
dc.title: Identification and characterization of crawlers through analysis of web logs [en_US]
dc.type: Conference-Abstract [en_US]
