dc.contributor.advisor Thayasivam U
dc.contributor.author Karunarathne WI
dc.date.accessioned 2020
dc.date.available 2020
dc.date.issued 2020
dc.identifier.uri http://dl.lib.uom.lk/handle/123/16487
dc.description.abstract Sentiment analysis has become a popular research topic over the last decade. Growing internet use has led to an increase in user-generated content, which has made sentiment analysis more attractive to researchers. User-generated content can give governments and industries valuable insight into public opinion. This research focuses on sentiment analysis of the Sinhala language. Sinhala is the most widely spoken language in Sri Lanka, and with the increased use of the internet and social media, a considerable amount of information is communicated in Sinhala. This presents a good opportunity to mine the information available in the Sinhala language. Sentiment analysis of Sinhala poses some difficulties, as Sinhala is morphologically rich and has a freer word order than English. The lack of Sinhala language resources creates challenges ranging from gathering and generating datasets to stemming and lemmatizing algorithms. This research addresses these challenges by developing a Sinhala dataset suitable for sentiment analysis and a stemming algorithm for Sinhala. The dataset was built by collecting tweets from Twitter and annotating them manually. In addition to resource creation, sentiment analysis of Sinhala is performed using word embeddings as features. Several sentiment analysis experiments are carried out using several machine learning techniques. Accuracy, precision, and recall are used to identify the best-performing model. The research discusses the problems faced when conducting sentiment analysis for Sinhala, as well as the differences between user-generated content in English and in Sinhala. en_US
dc.language.iso en en_US
dc.subject COMPUTER SCIENCE - Dissertation en_US
dc.subject COMPUTER SCIENCE & ENGINEERING - Dissertation en_US
dc.subject SENTIMENT ANALYSIS - Sinhala Language en_US
dc.subject SINHALA LANGUAGE SENTIMENT ANALYSIS en_US
dc.subject MACHINE LEARNING TECHNIQUES en_US
dc.title Sentiment analysis of Sinhala tweets en_US
dc.type Thesis-Full-text en_US
dc.identifier.faculty Engineering en_US
dc.identifier.degree MSc in Computer Science and Engineering en_US
dc.identifier.department Department of Computer Science and Engineering en_US
dc.date.accept 2020
dc.identifier.accno TH4288 en_US

