
dc.contributor.author Jenarthanan, R
dc.contributor.author Senarath, Y
dc.contributor.author Thayasivam, U
dc.date.accessioned 2019-10-22T05:41:43Z
dc.date.available 2019-10-22T05:41:43Z
dc.identifier.uri http://dl.lib.mrt.ac.lk/handle/123/15169
dc.description.abstract The purpose of text emotion analysis is to detect and classify the emotions expressed in text. In recent years, text emotion analysis studies for the English language have increased because data are abundant. Owing to the growth of social media, large amounts of data are now available for regional languages such as Tamil and Sinhala as well. However, these languages lack the annotated corpora needed for many NLP tasks, including emotion analysis. In this paper, we present our scalable semi-automatic approach to creating an annotated corpus, named ACTSEA, for Tamil and Sinhala to support emotion analysis. We also present our analysis of a sample of the produced data, along with useful findings, for the benefit of the low-resource NLP community. For ACTSEA, data were gathered from Twitter and annotated manually after cleaning. We collected 600280 (Tamil) and 318308 (Sinhala) tweets in total, which makes our corpus the largest data collection currently available for these languages. en_US
dc.language.iso en en_US
dc.subject NLP en_US
dc.subject Emotion Analysis en_US
dc.subject Sentiment Analysis en_US
dc.subject Emotion Corpus en_US
dc.subject Morphological Generator en_US
dc.subject Corpus Generator en_US
dc.title ACTSEA : annotated corpus for Tamil & Sinhala emotion analysis en_US
dc.type Conference-Abstract en_US
dc.identifier.faculty Engineering en_US
dc.identifier.department Department of Computer Science and Engineering en_US
dc.identifier.year 2019 en_US
dc.identifier.conference Moratuwa Engineering Research Conference - MERCon 2019 en_US
dc.identifier.place Moratuwa, Sri Lanka en_US

