Texture for Script Identification
The problem of determining the script and language of a document image has a number of important applications in the field of document analysis, such as indexing and sorting of large collections of such images, or as a precursor to optical character recognition (OCR). In this paper, we investigate the use of texture as a tool for determining the script of a document image, based on the observation that text has a distinct visual texture. An experimental evaluation of a number of commonly used texture features is conducted on a newly created script database, providing a qualitative measure of which features are most appropriate for this task. Strategies for improving classification results in situations with limited training data and multiple font types are also proposed.
|Item Type:||Journal Article|
|Keywords:||Script Identification, Wavelets and Fractals, Texture, Document Analysis, Clustering, Classification and Association Rules|
|Subjects:||Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Pattern Recognition and Data Mining (080109); Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Artificial Intelligence and Image Processing not elsewhere classified (080199)|
|Divisions:||Past > QUT Faculties & Divisions > Faculty of Built Environment and Engineering|
|Deposited On:||17 Jun 2009 13:43|
|Last Modified:||25 Feb 2013 05:21|