FIELD: physics, computer engineering.
SUBSTANCE: invention relates to digital data processing using computer systems, particularly to data processing methods intended for special-purpose functions and mobile applications. A method of extracting useful content from setup files of mobile applications for further computer data processing comprises the steps of: downloading, from the Internet onto a server, an application setup file, which is always a zip file; selecting an archive extraction utility for it; in case of successful selection of an archive extraction utility, unzipping the downloaded setup file into a directory with files; analysing the obtained directory and making a list of the files contained therein; selecting a file from the list for further analysis; selecting software for reading the file by searching through all known formats; in case of successful selection of software for reading the file, analysing the selected file to search for primary content; creating a list of internal addresses of the primary content in the form of a set of lines; proceeding to analysis of the next file for as long as unanalysed files remain in the directory; analysing the text content of the list of internal addresses of the primary content and dividing the text of each line into a set of characters which identifies the method of storing the corresponding unit of content, a set of characters which identifies the document to which said unit of content relates, and a set of characters which identifies the type of said unit of content; dividing the lines of internal addresses of the units of content, based on the storage method, into secondary content and useful content; deleting the secondary content; selecting, from the remaining list, groups of lines with internal addresses of units of content having groups of characters with completely matching position and text, which reflect the content storage method; performing statistical filtering of the selected groups; analysing the text content of the lines of the list of addresses with respect to the set of characters identifying the document, and selecting groups of addresses of units of content relating to each document of the useful content of the application; downloading, from the application, the useful content relating to each document into a separate file, thereby creating application documents; indexing the obtained application document files for association therewith, thereby creating a description of the content thereof; storing, in a database, the name of the application, a link to the application and the description of the application; downloading the setup file of a new application and repeating the entire described sequence; performing computer processing of the obtained database; storing the created indexable database array on a server; and using said array to serve search queries of users received via the Internet.
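The unzipping, file-listing and address-splitting steps can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the function names, the `SECONDARY_STORAGE` set and the address layout `storage/document.type` are hypothetical, archive member names stand in for the internal addresses of primary content, and the statistical filtering step is omitted.

```python
import io
import zipfile
from collections import defaultdict

# Hypothetical storage locations treated as secondary content (assumption).
SECONDARY_STORAGE = {"META-INF", "res"}


def list_archive_files(setup_bytes: bytes) -> list[str]:
    """List files in a zip-format setup file; return [] if it is not a zip."""
    buf = io.BytesIO(setup_bytes)
    if not zipfile.is_zipfile(buf):
        return []  # no suitable archive extraction utility was found
    with zipfile.ZipFile(buf) as archive:
        # Skip directory entries, which end with "/".
        return [name for name in archive.namelist() if not name.endswith("/")]


def split_address(line: str) -> tuple[str, str, str]:
    """Split an address into (storage method, document, content type).

    Assumes the hypothetical layout 'storage/document.type'.
    """
    storage, _, name = line.rpartition("/")
    document, dot, kind = name.rpartition(".")
    if not dot:  # no extension: the whole name identifies the document
        document, kind = name, ""
    return storage, document, kind


def group_useful_by_document(addresses: list[str]) -> dict[str, list[str]]:
    """Drop secondary content, then group remaining addresses by document."""
    groups: dict[str, list[str]] = defaultdict(list)
    for line in addresses:
        storage, document, _kind = split_address(line)
        if storage.split("/")[0] in SECONDARY_STORAGE:
            continue  # secondary content is deleted from the list
        groups[document].append(line)
    return dict(groups)
```

Under these assumptions, the addresses `assets/docs/guide.html` and `assets/docs/guide.png` fall into one group for the document `guide`, while `META-INF/MANIFEST.MF` is discarded as secondary content.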
EFFECT: automatic extraction of useful content from setup files of mobile applications for further indexing, computer data processing and storage of the useful content of mobile applications in a database on a server for further search.
13 cl, 2 dwg
Dates
2015-11-20—Published
2014-01-24—Filed