FIELD: information technology.
SUBSTANCE: in a character string comparison method, a candidate character string is compared with a set of character string records stored in a database. A set of reference character strings is extracted from the set of character string records by means of principal component factor analysis (PCFA). A binary index key containing several bits of binary information is generated for each character string in the set of records and for the candidate character string. Each bit indicates the degree of similarity between the character string and one of the reference character strings. The subset of character string records whose binary index key exactly matches the binary index key of the candidate character string is then identified. The candidate character string is indexed in the database based on the match.
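The key generation and exact-match lookup described above can be illustrated with a minimal Python sketch. It rests on several assumptions not stated in the abstract: the reference strings are hand-picked rather than derived by PCFA, the similarity measure is `difflib.SequenceMatcher.ratio()`, and each bit is set when that similarity reaches a hypothetical threshold of 0.5. The names `REFERENCES`, `THRESHOLD`, `index_key`, `build_index` and `candidates` are illustrative only.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical reference strings. The abstract obtains these via PCFA,
# which this sketch does not implement.
REFERENCES = ["smith", "johnson", "garcia", "miller"]
THRESHOLD = 0.5  # assumed similarity cut-off for setting a bit


def similarity(a: str, b: str) -> float:
    """Assumed similarity measure in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def index_key(s: str, references=REFERENCES, threshold=THRESHOLD) -> int:
    """Build a binary key with one bit per reference string."""
    key = 0
    for ref in references:
        bit = int(similarity(s, ref) >= threshold)
        key = (key << 1) | bit
    return key


def build_index(records):
    """Group stored records by their binary index key."""
    index = defaultdict(list)
    for record in records:
        index[index_key(record)].append(record)
    return index


def candidates(s: str, index):
    """Return only the records whose key exactly matches the key of s."""
    return index.get(index_key(s), [])


if __name__ == "__main__":
    idx = build_index(["smith", "smyth", "johnson", "garcia"])
    print(candidates("smith", idx))
```

Under these assumptions, a full similarity computation is needed only for the records returned by `candidates`, not for every record in the database.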
EFFECT: increasing the speed and efficiency of obtaining an approximate match for a character string in the database, without having to calculate the similarity metric across the entire database.
18 cl, 10 dwg, 14 tbl
Dates
2017-06-29—Published
2013-04-29—Filed