MERCon - 2020
Permanent URI for this collection: http://192.248.9.226/handle/123/16315
Browsing MERCon - 2020 by Subject "affinity propagation clustering"
- item: Conference-Full-text
  Tamil news clustering using word embeddings (IEEE, 2020-07)
  Fayaza, MSF; Ranathunga, S; Weeraddana, C; Edussooriya, CUS; Abeysooriya, RP
  News aggregators let readers view news from multiple news providers in a single place. At the moment, the only news aggregator that supports Tamil news is Google News, which has some noticeable shortcomings. In this study, two document representation techniques, Term Frequency–Inverse Document Frequency and word embeddings (fastText), were each evaluated with the one-pass and affinity propagation clustering algorithms, applied to the news title alone as well as to the title and body together, in order to implement a news aggregator for the Tamil language. For this study we collected data from nine different news providers. When fastText was applied with the one-pass algorithm to the news title and body, it outperformed the other approaches, achieving an average pairwise F-score of 81% with respect to manual clustering. We also created a Tamil fastText word embedding model using more than 21 million words.
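
The abstract names one-pass clustering and the pairwise F-score but does not spell either out. The following is a minimal sketch of both, not the authors' implementation. It assumes each article is already a fixed-length vector (for example, an averaged fastText embedding), that similarity is cosine similarity against a running-mean centroid, and that the threshold of 0.8 is purely illustrative. The function names `one_pass_cluster` and `pairwise_f_score` are hypothetical.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def one_pass_cluster(vectors, threshold=0.8):
    """Assign each vector, in arrival order, to the most similar existing
    cluster if the similarity reaches the threshold; otherwise open a new one.
    The threshold value is an assumption, not a figure from the paper."""
    centroids, sizes, labels = [], [], []
    for v in vectors:
        best, best_sim = -1, threshold
        for i, c in enumerate(centroids):
            sim = cosine(v, c)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best == -1:
            # No cluster is similar enough: start a new cluster with this vector.
            centroids.append(list(v))
            sizes.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Update the chosen centroid as a running mean of its members.
            n = sizes[best]
            centroids[best] = [(c * n + x) / (n + 1) for c, x in zip(centroids[best], v)]
            sizes[best] = n + 1
            labels.append(best)
    return labels


def pairwise_f_score(predicted, gold):
    """F-score over all document pairs: a pair counts as a true positive when
    both the predicted and the manual clustering put the two documents together."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(gold)), 2):
        same_pred = predicted[i] == predicted[j]
        same_gold = gold[i] == gold[j]
        if same_pred and same_gold:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_gold:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy two-dimensional vectors standing in for document embeddings.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = one_pass_cluster(toy_vectors)
```

Because each article is compared only with the clusters seen so far, a one-pass scheme of this kind suits a stream of incoming news, at the cost of the result depending on arrival order.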