dc.contributor.author |
Nandathilaka, M |
|
dc.contributor.author |
Ahangama, S |
|
dc.contributor.author |
Weerasuriya, GT |
|
dc.contributor.editor |
Wijesiriwardana, CP |
|
dc.date.accessioned |
2022-12-05T05:39:43Z |
|
dc.date.available |
2022-12-05T05:39:43Z |
|
dc.date.issued |
2018 |
|
dc.identifier.citation |
M. Nandathilaka, S. Ahangama and G. T. Weerasuriya, "A Rule-based Lemmatizing Approach for Sinhala Language," 2018 3rd International Conference on Information Technology Research (ICITR), 2018, pp. 1-5, doi: 10.1109/ICITR.2018.8736134. |
en_US |
dc.identifier.uri |
http://dl.lib.uom.lk/handle/123/19643 |
|
dc.description.abstract |
Research in speech recognition, natural language processing, language translation and deep learning is bridging the communication gap between humans as well as between humans and machines. Sinhala is a native language of Sri Lanka, used by approximately 19 million people. Natural language processing tools for Sinhala have developed more slowly than those for European and other Asian languages. A lemmatizer for Sinhala can be used for morphological analysis and is an essential module in Sinhala language processing mechanisms. Lemmatizing is a complex process in morphological analysis in which the base (root) form of a word is derived. Little published work focuses on lemmatizer approaches for Sinhala. This paper presents a rule-based lemmatizing approach which can be used to determine the base form of Sinhala words with an accuracy of 77.3%. It differs from similar works because the data used in the research are extracted from social media. |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
Information Technology Research Unit, Faculty of Information Technology, University of Moratuwa, Sri Lanka |
en_US |
dc.relation.uri |
https://ieeexplore.ieee.org/document/8736134 |
en_US |
dc.subject |
Sinhala Morphology |
en_US |
dc.subject |
Lemmatization |
en_US |
dc.subject |
Inflection |
en_US |
dc.subject |
Rule-based |
en_US |
dc.subject |
Social media data |
en_US |
dc.title |
A rule-based lemmatizing approach for Sinhala language |
en_US |
dc.type |
Conference-Full-text |
en_US |
dc.identifier.faculty |
IT |
en_US |
dc.identifier.department |
Information Technology Research Unit, Faculty of Information Technology, University of Moratuwa. |
en_US |
dc.identifier.year |
2018 |
en_US |
dc.identifier.conference |
3rd International Conference on Information Technology Research 2018 |
en_US |
dc.identifier.proceeding |
Proceedings of the 3rd International Conference on Information Technology Research 2018 |
en_US |
dc.identifier.email |
praba.14@itfac.mrt.ac.lk |
en_US |
dc.identifier.email |
supunmali@uom.lk |
en_US |
dc.identifier.email |
thiliniw@uom.lk |
en_US |
dc.identifier.doi |
10.1109/ICITR.2018.8736134 |
en_US |