Institutional-Repository, University of Moratuwa.  

Classification of cyberbullying Sinhala language comments on social media

dc.contributor.author Amali, HMAI
dc.contributor.author Jayalal, S
dc.contributor.editor Weeraddana, C
dc.contributor.editor Edussooriya, CUS
dc.contributor.editor Abeysooriya, RP
dc.date.accessioned 2022-08-09T09:32:36Z
dc.date.available 2022-08-09T09:32:36Z
dc.date.issued 2020-07
dc.identifier.citation H. M. A. Ishara Amali and S. Jayalal, "Classification of Cyberbullying Sinhala Language Comments on Social Media," 2020 Moratuwa Engineering Research Conference (MERCon), 2020, pp. 266-271, doi: 10.1109/MERCon50084.2020.9185209. en_US
dc.identifier.uri http://dl.lib.uom.lk/handle/123/18582
dc.description.abstract Due to the technological revolution over the years, bullying, which was confined to physical boundaries, has now moved online. Denigration, or insult, is one form of cyberbullying. According to the Sri Lanka Computer Emergency Readiness Team, social media cyberbullying incidents are escalating. Insulting words are dynamic, and the same word can have several meanings depending on the context. A comment cannot be classified as bullying simply because it contains such a word. Hence, simple keyword-spotting techniques are inadequate for labeling comments. Work on other languages has addressed this issue using lexical databases such as WordNet, which provides synonyms and homonyms of words. Since no proper lexical database has been developed for the Sinhala language, detecting a word as bullying is a challenge. Therefore, we used rules to overcome this issue. Twitter comments with profane words were collected, outliers were removed, and the remaining tweets were pre-processed. To determine insult in the text, five rules were used for feature extraction. Afterward, we applied Support Vector Machine (SVM), K-nearest neighbor (KNN), and Naïve Bayes algorithms. The results show that SVM with an RBF kernel performs best, with an F1-score of 91%. The novelty of this research is its focus on Sinhala-language cyberbullying detection, which has not been addressed before. en_US
dc.language.iso en en_US
dc.publisher IEEE en_US
dc.relation.uri https://ieeexplore.ieee.org/document/9185209 en_US
dc.subject cyberbullying en_US
dc.subject social media en_US
dc.subject text mining en_US
dc.subject sentiment analysis en_US
dc.subject machine learning en_US
dc.title Classification of cyberbullying Sinhala language comments on social media en_US
dc.type Conference-Full-text en_US
dc.identifier.faculty Engineering en_US
dc.identifier.department Engineering Research Unit, University of Moratuwa en_US
dc.identifier.conference Moratuwa Engineering Research Conference 2020 en_US
dc.identifier.place Moratuwa, Sri Lanka en_US
dc.identifier.pgnos pp. 266-271 en_US
dc.identifier.proceeding Proceedings of Moratuwa Engineering Research Conference 2020 en_US
dc.identifier.email amalihma_im14002@stu.kln.ac.lk en_US
dc.identifier.email shantha@kln.ac.lk en_US
dc.identifier.doi 10.1109/MERCon50084.2020.9185209 en_US

