dc.contributor.author |
Sarveswaran, K |
|
dc.contributor.author |
Dias, G |
|
dc.contributor.author |
Butt, M |
|
dc.contributor.editor |
Wijesiriwardana, CP |
|
dc.date.accessioned |
2022-12-05T05:40:51Z |
|
dc.date.available |
2022-12-05T05:40:51Z |
|
dc.date.issued |
2018 |
|
dc.identifier.citation |
K. Sarveswaran, G. Dias and M. Butt, "ThamizhiFST: A Morphological Analyser and Generator for Tamil Verbs," 2018 3rd International Conference on Information Technology Research (ICITR), 2018, pp. 1-6, doi: 10.1109/ICITR.2018.8736139. |
en_US |
dc.identifier.uri |
http://dl.lib.uom.lk/handle/123/19645 |
|
dc.description.abstract |
ThamizhiFST is a Morphological Analyser and Generator (MAG) for Tamil. It was developed to extend the coverage of the computational Tamil grammar being developed using Lexical Functional Grammar (LFG). ThamizhiFST covers the simple verbs in Tamil as an initial step. A Finite State Transducer (FST) approach was used to develop the MAG, and it was implemented using the FOMA open-source software. Since morphological rules are of a finite nature and represent a known quantity, a rule-based approach like FST is more appropriate than possible machine learning alternatives, especially with respect to achieving the reliably good accuracy that is required for computational grammar development. A set of 3250 Tamil verb lemmas from 13 paradigms together with their 260 conjugation forms was used in the construction of ThamizhiFST. Further, a set of 27 labels was used to mark the morphosyntactic information of the verbs. The whole system was developed as a three-layer web-based system to tackle the issues arising when processing an agglutinative language like Tamil and to ensure its extensibility. Unlike other existing MAGs, ThamizhiFST also provides the morpheme corresponding to each morphosyntactic label and marks morpheme boundaries. An evaluation shows that ThamizhiFST has an f-measure of 0.97 for simple verbs. Current and future work includes extending the system to cover more verbs and nouns and making it generally available. |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
Information Technology Research Unit, Faculty of Information Technology, University of Moratuwa, Sri Lanka |
en_US |
dc.relation.uri |
https://ieeexplore.ieee.org/document/8736139 |
en_US |
dc.subject |
Morphological analyser |
en_US |
dc.subject |
Morphological generator |
en_US |
dc.subject |
Finite state transducer |
en_US |
dc.subject |
Tamil |
en_US |
dc.title |
ThamizhiFST: a morphological analyser and generator for Tamil verbs |
en_US |
dc.type |
Conference-Full-text |
en_US |
dc.identifier.faculty |
IT |
en_US |
dc.identifier.department |
Information Technology Research Unit, Faculty of Information Technology, University of Moratuwa |
en_US |
dc.identifier.year |
2018 |
en_US |
dc.identifier.conference |
3rd International Conference on Information Technology Research 2018 |
en_US |
dc.identifier.proceeding |
Proceedings of the 3rd International Conference on Information Technology Research 2018 |
en_US |
dc.identifier.email |
sarvesk@uom.lk |
en_US |
dc.identifier.email |
gihan@uom.lk |
en_US |
dc.identifier.email |
miriam.butt@uni-konstanz.de |
en_US |
dc.identifier.doi |
doi: 10.1109/ICITR.2018.8736139 |
en_US |