algorithm - Is it indexing or tagging?


I have two classes, Claim and Index. The Claim class has a string field called topic. I'm trying to index the topic column without using the database's index column features, so I would be coding the following method myself. Suppose I have claim 1, and claim 1's topic field is "I love muffins muffins". I'll apply the following treatment:

#1. Create an empty dictionary mapping "word" => occurrences.
#2. Create a list of stopwords, for example stopwords = ("for", "this", ... etc.).
#3. Create a list of delimiters, for example delimiter_chars = ",.;:!?".
#4. Split the text (the topic field) into words delimited by whitespace.
#5. Remove unwanted delimiter characters adjoining the words.
#6. Remove the stopwords.
#7. Remove duplicates.
#8. Create multiple Index objects: (word="love", occurrences=1, looked=0, reference to claim 1), (word="muffins", occurrences=2, looked=0, reference to claim 1).
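The steps above could be sketched roughly as follows. The question does not name a language, so Python is an assumption here, as are the names IndexEntry and build_index, the lowercasing of words, and the choice to treat "i" as a stopword so that the example yields only "love" and "muffins".

```python
from collections import Counter
from dataclasses import dataclass

# Assumed stopword list; the question only gives "for" and "this" as examples.
STOPWORDS = {"for", "this", "i"}
DELIMITER_CHARS = ",.;:!?"


@dataclass
class IndexEntry:
    """Hypothetical stand-in for the Index class from the question."""
    word: str
    occurrences: int
    looked: int
    claim_id: int


def build_index(claim_id, topic):
    # Steps 4-5: split on whitespace, strip adjoining delimiters.
    # Lowercasing is an assumption so that "I" matches the stopword "i".
    words = (w.strip(DELIMITER_CHARS).lower() for w in topic.split())
    # Steps 1, 6, 7: drop stopwords and count each remaining word once.
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    # Step 8: one entry per distinct word, with looked starting at 0.
    return [IndexEntry(word, n, 0, claim_id) for word, n in counts.items()]


entries = build_index(1, "I love muffins muffins")
for entry in entries:
    print(entry)
```

Counting with a Counter covers both the dictionary of occurrences and the duplicate removal in one pass, since each distinct word ends up as a single key.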

Now, whenever the word "muffins", for example, is looked up, I increase looked by 1 and move the record in the database. My question: is the method above a good one? Is it better than the database's index features? Are there some ways to improve things?

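I think what you are looking for is called a B-tree. In your case, you would use 26 (or 52 if you need case sensitivity) branches per node in the tree; a tree with one branch per letter is more commonly called a trie, or prefix tree. This makes finding objects fast. I think the lookup time is logarithmic in the number of entries for a B-tree, and proportional to the length of the word for a per-letter tree. In each node, you have a pointer to the actual data in an array, list, file, or something else.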
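As a rough illustration of a tree with one branch per letter (usually called a trie rather than a B-tree), here is a minimal Python sketch. The class names TrieNode and Trie, and the dictionaries used as the "actual data", are hypothetical and not taken from the question.

```python
class TrieNode:
    def __init__(self):
        # letter -> child node; at most 26 children for lowercase a-z
        self.children = {}
        # "pointers" to the actual data for the word ending at this node
        self.entries = []


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word, entry):
        # Walk down one letter at a time, creating nodes as needed.
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.entries.append(entry)

    def find(self, word):
        # Cost is proportional to len(word), not to the number of words stored.
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return []
        return node.entries


trie = Trie()
trie.insert("love", {"claim": 1, "occurrences": 1})
trie.insert("muffins", {"claim": 1, "occurrences": 2})
print(trie.find("muffins"))
```

A dictionary per node is used instead of a fixed 26-slot array, so the same sketch works for case-sensitive words without changing the node layout.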

However, unless you are willing to put time into code specific to your application, you might be better off using a database such as Oracle, Microsoft SQL Server, or MySQL, because these are professionally developed and profiled for the maximum performance possible.

