FIELD: preliminary processing of a vector-raster image in a graphic file that contains an image of text.
SUBSTANCE: according to the invention, processing of text objects includes dividing them into separate symbols and groups of symbols based on the presumed locations of spaces or other non-displayed symbols, and analyzing or combining the symbol groups into words. Processing of vector objects includes detecting separators and background. Processing of raster objects includes analysis to detect the presence of a text image in non-text objects, and/or analysis for the presence of vector objects other than separators, including those that extend beyond the boundaries of objects. Additionally, the correctness of the encoding can be analyzed and, when necessary, corrected; to that end, separate symbols are checked for membership in a given alphabet, and text words are checked for membership in a given vocabulary.
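The word grouping and encoding check described above can be illustrated with a minimal Python sketch. It is not the patented method: the `Glyph` record, the `gap_factor` threshold, the `min_known` ratio, and both function names are assumptions introduced only for illustration. The sketch splits a line of positioned symbols into words wherever the horizontal gap suggests a space, then tests the result against a given alphabet and vocabulary.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """Hypothetical record for one text symbol placed on a line."""
    char: str
    x: float      # left edge of the symbol
    width: float  # horizontal extent of the symbol


def group_into_words(glyphs, gap_factor=0.5):
    """Split symbols into words where the gap between neighbours
    exceeds gap_factor times the mean symbol width (assumed heuristic)."""
    if not glyphs:
        return []
    glyphs = sorted(glyphs, key=lambda g: g.x)
    mean_width = sum(g.width for g in glyphs) / len(glyphs)
    words, current = [], [glyphs[0]]
    for prev, cur in zip(glyphs, glyphs[1:]):
        gap = cur.x - (prev.x + prev.width)
        if gap > gap_factor * mean_width:
            # A wide gap is treated as a presumed space: close the word.
            words.append("".join(g.char for g in current))
            current = [cur]
        else:
            current.append(cur)
    words.append("".join(g.char for g in current))
    return words


def encoding_looks_correct(words, alphabet, vocabulary, min_known=0.5):
    """Return True if every symbol belongs to the alphabet and at least
    min_known of the words are found in the vocabulary."""
    if not words:
        return True
    if any(c.lower() not in alphabet for w in words for c in w):
        return False
    known = sum(w.lower() in vocabulary for w in words)
    return known / len(words) >= min_known
```

For example, four symbols of width 10 placed at x = 0, 11, 30 and 41 are split into two words at the 9-unit gap. A line that fails `encoding_looks_correct` would be a candidate for re-decoding with another encoding, a step this sketch does not cover.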
EFFECT: increased reliability of recognition of text, raster and vector objects, extraction of information about document formatting, and faster processing.
3 cl
Authors
Dates
2007-10-27—Published
2005-12-08—Filed