Optical character recognition for Sinhala text using feature analysis

Watugala, GK; Kumara, WMP

Faculty of Engineering Research Unit (ERU & MERCon), ERU - 1999

dc.contributor.author	Watugala, GK
dc.contributor.author	Kumara, WMP
dc.date.accessioned	1999T15:21:02Z
dc.date.available	1999T15:21:02Z
dc.date.issued	1999
dc.identifier.uri	http://dl.lib.mrt.ac.lk/handle/123/9329
dc.description.abstract	Optical Character Recognition (OCR) is now a reality for documents printed in English. The present study lays the groundwork for recognizing Sinhala characters. Matrix Matching and Feature Analysis are the two methods commonly used to recognize English letters. This study investigates the Feature Analysis method for recognizing Sinhala characters. The Matrix Matching method is found to be suitable for recognizing documents whose text is in a known font and typeface. It should also be used to identify and extract the modifiers placed above, below, or after the base character, which helps in identifying the base character by feature analysis. Several features of Sinhala characters can be extracted by running simple programs on the pixel array of the character. These features include the aspect ratio, the inscribing octagon, and the number of pixel curves crossed when the character is sliced at different angles. Running these programs on standard Sinhala characters yields a reference set of feature values. The features of an unknown character can then be compared with this reference data for recognition. The programs for Sinhala character recognition are written in MathCAD, an application package for complex mathematical calculations. Because the algorithms are written in pseudocode, they are easy to convert to a C++ program.	en_US
dc.language.iso	en	en_US
dc.title	Optical character recognition for Sinhala text using feature analysis	en_US
dc.type	Conference-Full-text	en_US
dc.identifier.year	1999	en_US
dc.identifier.pgnos	319-331	en_US
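
The feature-analysis pipeline summarized in the abstract (extract features from the pixel array, tabulate them for standard characters, then compare an unknown character against that table) can be illustrated with a minimal Python sketch. This is not the authors' MathCAD code, which this record does not reproduce. The function names, the choice of three slices (horizontal, vertical, one diagonal), the unweighted squared-distance comparison, and the toy "ring" and "bar" bitmaps are all assumptions made for illustration. The inscribing-octagon feature is left out because the abstract does not define it.

```python
def parse(text):
    """Turn a '#'/'.' drawing into a pixel array of 1s (ink) and 0s."""
    return [[1 if ch == "#" else 0 for ch in line] for line in text.splitlines()]

def bounding_box(bitmap):
    """Return (top, bottom, left, right) of the ink pixels, inclusive."""
    rows = [r for r, row in enumerate(bitmap) if any(row)]
    if not rows:
        raise ValueError("bitmap contains no ink pixels")
    cols = [c for c in range(len(bitmap[0])) if any(row[c] for row in bitmap)]
    return rows[0], rows[-1], cols[0], cols[-1]

def aspect_ratio(bitmap):
    """Width divided by height of the ink bounding box."""
    top, bottom, left, right = bounding_box(bitmap)
    return (right - left + 1) / (bottom - top + 1)

def count_runs(pixels):
    """Count runs of consecutive ink pixels along one slice."""
    runs, previous = 0, 0
    for pixel in pixels:
        if pixel and not previous:
            runs += 1
        previous = pixel
    return runs

def slice_crossings(bitmap):
    """Strokes crossed by a horizontal, a vertical and a diagonal slice."""
    top, bottom, left, right = bounding_box(bitmap)
    mid_row, mid_col = (top + bottom) // 2, (left + right) // 2
    horizontal = count_runs(bitmap[mid_row][left:right + 1])
    vertical = count_runs([bitmap[r][mid_col] for r in range(top, bottom + 1)])
    # Diagonal starting at the top-left corner of the bounding box.
    steps = min(bottom - top, right - left) + 1
    diagonal = count_runs([bitmap[top + i][left + i] for i in range(steps)])
    return horizontal, vertical, diagonal

def features(bitmap):
    """Feature vector: (aspect ratio, horizontal, vertical, diagonal)."""
    return (aspect_ratio(bitmap),) + slice_crossings(bitmap)

def recognise(bitmap, references):
    """Return the label whose reference features are closest (assumed metric)."""
    unknown = features(bitmap)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(unknown, references[label]))

    return min(references, key=distance)

# Toy shapes standing in for standard characters (not real Sinhala glyphs).
RING = parse("#####\n#...#\n#...#\n#...#\n#####")
BAR = parse("..#..\n..#..\n..#..\n..#..\n..#..")
REFERENCES = {"ring": features(RING), "bar": features(BAR)}

# A ring with one corner pixel missing plays the unknown character.
NOISY_RING = parse("#####\n#...#\n#...#\n#...#\n####.")
print(recognise(NOISY_RING, REFERENCES))
```

In this sketch the damaged ring differs from the reference ring only in its diagonal crossing count, so it is still matched to "ring". A practical recognizer would likely need more slices and some scaling of the features, since the aspect ratio and the crossing counts are on different scales.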