Institute For Information Industry v. Google Inc.

Filing 1

COMPLAINT For Patent Infringement against Google Inc. ( Filing fee $ 400 receipt number 0540-4133094.), filed by Institute For Information Industry. (Attachments: # 1 Civil Cover Sheet, # 2 Exhibit A)(Huff, Winston)

Download PDF
      EXHIBIT  A   1 1111111114)1111 11 1111111111111111 11111111 ! 1 11 11 1 1 1 ( 2 United States Patent 1) (to) Patent No.: U S 6,845,354 B 1 (45) Date of Patent: J a n . 18, 2005 Kuo et al. (54) I N F O R M AT I O N R E T R I E VA L SYSTEM W I T H A N E M O - F U Z Z Y STRUCTURE (75) Inventors: You-Hwang Kuo, Tainan (11V); Jang-Pong Hsu, Tainan (TIV) (73) Assignee: Institute for Information Industry, Taipei (TIV) Notice: 6,101,491 A * 8 / 2 0 0 0 Wo o d s (57) Int. C I . 7 U.S. Cl. 7 0 4 / 9 ; 707/6 G Field o f Search 7 0 4 / 1 , 9, 10; 707/2-3, 707/4, 5-6, 104, 102; 715/531, 532-534 O 6 References Cited (56) F U.S. PATENT DOCUMENTS 1 7 5,642,502 A * 6 / 1 9 9 7 D r i s c o l l 7 0 7 / 5 5,724,571 A * 3 / 1 9 9 8 Wo o d s 7 0 7 / 5 / 5,933,822 A * 8/1999 Braden-Harder et al. 7 0 7 / 5 2 5,956,711 A * 9 / 1 9 9 9 Sullivan et al. 7 0 7 / 6 7 (51) (52) (58) (-101 ENCODING SECTION QUERY WORDS (11) B S T R A C ( SIMILARITY CALCULATION SECTION 1 0 7 3 T 22 Claims, 7 Drawing Sheets 0 3 DETERMINING SECTION 1 L. A ( CUSTOMIZABLE SYNONYM ADJUSTING BLOCK 9 ( / An intelligent information retrieval system for finding information components corresponding to an input query word. The system includes a synonym block for finding synonymous indexed keywords of the query word; an information indexing block for finding corresponding information components based o n t h e indexed keyword; a component ranking-filtering block for ranking and filtering the found information components and outputting the desired information components being selected; a synonym adjusting block for adjusting the fuzzy mechanism o f the synonym block based on the found information components; and a filtering adjusting block for adjusting the fuzzy mechanism of the ranking-filtering block based on the found information components. T h e aforementioned synonym b l o c k and information-indexing block are implemented b y neurofuzzy networks f o r accelerating parallel processing and automatic learning. Further, the synonym block can tolerate input errors by way o f query word encoding and position shift compensation. Filed: S e p . 9, 1999 RETRIEVED COMPONENT KEYWORDS (19) 7 * cited by examiner Appl. No.: 09/392,572 (22) 0 Primary Examiner—David D. Knepper (74) Attorney, Agent, o r Firm—Birch, Stewart, Kolasch & Birch, LLP Subject to any disclaimer, the term of this patent is extended or adjusted under 35 154(b) by 0 days. (21) 7 ( CORRELATION I N D E X E D SECTION K E Y W O R D (13) 1 0 5 U.S. P a t e n t Sheet 1 of 7 J a n . 18, 2005 U S 6,845,354 B1 I — ril M •c: N 0 If) L glr'" 6 '61 7_ ( - - - c L ' U.S. P a t e n t J a n . 18, 2005 . CO T Sheet 2 of 7 C O C U S O . t C D C O T 1 - a) c o = I 2 / ›< >< o i C) co c ) cD > > 0 c D c o >< > < _ J (--) CO C..) CD > > h; CL >N F- E a) -1-, c o >, (/) 14' E Ja m cn 6,845,354 B1 U.S. P a t e n t Sheet 3 of 7 J a n . 18, 2005 U S 6,845,354 B1 1) c4 4 1 x c) n ' OIMEMEMINIMILEMMIMM , 4 1 t•••,.,._, r z i 1—' r A • • • • • • • • • • , L i r Z7., a L 1-CD L 0 zz 10 6 415 0 E' 0 0 7 41 4 1 : L . Z C/) A A L. i U . S . P a t e n t J a n . 18, 2005 CD CD C:) C:) CD CD CD CD Cb CD 1 CD CD C:) CD CD C) C:) CD u:) 1 T - co 1 CD n--- CD CD co I CNI CNI CD Sheet 4 of 7 ci N 1 CD C:) C:) CD C) C) C:) C:) C) o C:) CD CD C) CD . 3 , - C:) CD 1 0 2 2 (NJ 1 1 .%• r s • 1 to 1 >.% re) , I N N Co olf• — •• , I l 1 .1 • • • Pr) • T 1 1 N••••• CD 1 — '.1• •I••••• CD Co IC) re) I CD C:) 6,845,354 B1 ol••• mr-- CD co 1 CNI 1 . • CD CD I 1••• CD CD 0) S — • 1 (NI C:) S CZ) o l • '1' . • U 1 4 — •I . • U.S. P a t e n t Sheet 5 of 7 J a n . 18, 2005 U , S L, 2 -z H ,, i 6,845,354 B1 , 1.0 6 L2 7 a L .— LL, L L. L _ I ri ._. 4 t • __ , U.S. P a t e n t J a n . 18, 2005 re) CD re) L Sheet 6 o f 7 U s 6,845,354 B 1 U.S. Patent Jan. 18, 2005 Sheet 7 of 7 US 6 9 845 9 354 B 1 -, .., 1 US 6,845,354 B1 I N F O R M AT I O N R E T R I E VA L SYSTEM W I T H A NEURO-FUZZY STRUCTURE BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to an information retrieval system. M o r e particularly, i t relates t o a n information retrieval system with a neuro-fuzzy approach for building a customizable thesaurus, tolerating an operator's input errors, performing fuzzy queries, and providing a quick select option by prioritizing the information retrieved by the system according to the operator's preference. 2. Description of the Related Art The rapid growth in the Internet and information industry has diversified the w a y information i s retrieved. I t has become crucial for an operator to effectively find the information o f his or her interest. For example, a software programmer who develops software programs with objectoriented technology may hope to quickly assemble a prototype software application directly from the appropriate retrieved and developed software components and test i t after acquiring the formal specification upon receipt o f demands f o r large scale software development. Such demands also hold true in public library query system, World Wide Web search tools and other information services (such as the management and query of massive news archives and legal documents) where operators hope to get the desired information that meets their professional needs. However, the exponential growth in information loads or software components included i n software component libraries has led t o uncertainty i n the entire process o f retrieving information and software components. Efficiency is normally contingent upon an operator's experience and expertise, that is, whether the query words the operator inputs precisely describe the features of the desired information. Consequently, information retrieval systems have become a critical issue i n information management and information recycling technology. Conventional information retrieval systems are constructed based on Boolean logic. That is, the query words entered by an operator are matched with specific keywords in the information o r software components to determine whether they meet operator's needs. Such conventional information retrieval systems offer very limited flexibility and have the following drawbacks: 2 should be accurate, i.e., conventional systems tolerate no erroneous input. Perfect accuracy cannot be guaranteed for routine input, and synonymous words with a different part of speech may be entered. For example, 5 a n operator may skip a character in typing a query word, or the operator may enter a noun form o f the query word instead of the verb form as found in the thesaurus. Similar situations will prohibit the operator from finding a corresponding keyword. 10 T o tackle this problem, prior art has created a replace table to make up the deficiency. The replace table lists occurrences of possible errors, declensions and inflections of the query words in the thesaurus to revise potential input errors the operator may make. However, this kind of replace table 15 i s quite rigid and basically unable to compensate f o r all possible errors. 3. Information retrieval systems o f the prior art fail to process indefinite information, that is, fail to provide fuzzy query processing. In other words, the operator 20 h a s to precisely specify various descriptors or features of the query f o r information archives o r software components to obtain the desired information. That is to say, the operator has to have certain knowledge o f the information document or software components to 25 b e queried; otherwise, the queried result from erroneous information will not fulfill expectations. 4. Lastly, a significant number of documents or software components can normally be retrieved for a specific keyword in conventional information retrieval systems. 30 G e n e r a l l y , advanced information retrieval systems may find the correspondence between documents or software components and query words and output the ranking. Since the ranking-filtering functions are rigid, every operator has to follow the same ranking-filtering 35 p r o c e d u r e s . Consequently, no flexibility is provided in using the systems. Current information retrieval systems, as specified above, can perform basic query and retrieval tasks; however, they are not conveniently accessible and have difficulty finding 40 appropriate files or software components. And this problem is what this invention intends to resolve. SUMMARY OF THE INVENTION Therefore, the primary object o f this invention i s t o provide an intelligent information retrieval system, where the thesaurus can be customizable according to the operator's preference and be queried according to the keyword the operator enters after performing the customizable conversion of synonyms. 50 Another object of this invention is to provide an intelligent information retrieval system that tolerates erroneous input. That is, the system can still proceed with a query even the operator has to some extent made mistakes. 45 1. I n a typical information retrieval system, the system normally expects the query words an operator enters upon accessing query libraries to fully match the keywords provided by the system. In other words, those keywords are formulated to describe the features of an information archive or software component when the system i s built. However, i n practice, i t i s nearly 55 S t i l l another object o f this invention is to provide an intelligent information retrieval system, which performs and impossible to demand that operators precisely use the processes indefinite data, that is, performs fuzzy query, to keywords stored in the system. enhance the flexibility in usage. To tackle this problem, a thesaurus is created for keywords in the systems of the prior art. That is, each query And another object o f this invention is to provide an word in the system is defined by a group of synonyms in the 60 intelligent information retrieval system capable of performthesaurus. It can therefore be inferred that an input word is ing ranking of the retrieved software components according a query w o r d when one o f the synonyms i s entered. to operator's preference and performing filtering according However, a thesaurus as such generally adopts a rigid to operator's settings, thereby enabling easy access and formula to list the correspondence between query words and enhancing the system's function. synonyms. 65 I n accordance w i t h the aforementioned objects, this 2. I n conventional information retrieval systems, i t i s invention provides an intelligent information retrieval sysassumed that query words entered b y the operator tem for querying the corresponding information components US 6,845,354 B1 4 with an inputted query word. The system includes a synonym block for finding the correspondence o f at least an indexed keyword according to the query word based on a fuzzy synonym mechanism; an information indexing block for finding at least an information component corresponding to the indexed keyword based on the aforementioned query word; a component ranking-filtering block for filtering and ranking the found information components based on a fuzzy ranking-filtering mechanism and for outputting the desired information components for selection; a synonym adjusting block for adjusting the fuzzy mechanism o f the synonym block based on the selected information components; and a filtering adjusting block for adjusting the fuzzy mechanism of the component ranking-filtering block based o n the selected information components. This invention consists of information components independently packed f o r a specific task as aforementioned information components. I n addition, the above synonym block and information indexing block are implemented by fuzzy neural networks for accelerating parallel processing. Also, the synonym block can tolerate input errors due t o query word encoding and position shift compensation. The aforementioned synonym block can perform in two phases, namely: recall phase and learning phase. In the recall phase, the synonym block encodes a query word with an encoding section to generate an encoded pattern of the query word t o correspond w i t h a feature o f the query word. Further, a similarity section is used to match the encoded pattern of the query word with the encoded pattern of system synonyms to generate a corresponding similarity value representing the degree o f similarity between the query word and those synonyms in the system. Those system synonyms correspond respectively to a plurality of keywords set in the system. Comparisons are made with the similarity value and the correlation value between the system synonyms and system keywords through a correlation section and a default threshold so as to select the output as the above-mentioned indexed keywords from those system keywords. In practice, however, the similarity calculation section can be composed of the nodes o f a fuzzy neural network with each node corresponding to a synonym set in the system. The correlation section can be composed o f nodes to correspond to each individual keyword set in the system. In the learning phase, the synonym b l o c k generates respectively an encoded pattern o f a query word and an encoded pattern of a retrieved component keyword for the query word and the retrieved component keyword corresponding to the selected information component through an encoding section. Next, the similarity calculation section matches respectively the encoded pattern of the query word and the encoded pattern of the retrieved component keyword with the encoded pattern of the system synonym to generate a first set of corresponding similarity values and a second set of corresponding similarity values. The correlation section calculates the corresponding output values based on the first and second sets of similarity values and the original correlation value between the system synonyms and the system keywords. Finally, the determining section is used to determine whether it should correct the aforementioned similarity calculation section and the correlation section according to the first set and the second set of similarity values and the above output value. The aforementioned information indexing block includes an indexed keyword register section for sequentially storing the indexed keywords; and a plurality of component indexing units for corresponding to various information components and for storing the component keywords correspond- ing to the information components and the identification data of the corresponding information components. When an indexed keyword i s received, its similarity to the saved component keyword is calculated, and it is judged whether 5 a predetermined threshold is exceeded to determine whether the identification data o f the corresponding information component should be outputted. BRIEF DESCRIPTION OF THE DRAWINGS io T h e above objects, features and advantages of this invention will become apparent from the detailed description of a preferred embodiment w i t h reference t o t h e following drawings, wherein FIG. 1 i s a block diagram illustrating the functional 15 structure o f an intelligent information retrieval system o f this invention; FIG. 2 is a schematic diagram illustrating an exemplary query screen of the intelligent information retrieval system of this embodiment; 2 FIG. 3 i s a block diagram illustrating the functional structure of the synonym block and customizable synonym adjusting section of the embodiment of this invention; FIG. 4 is a schematic diagram illustrating an example of 25 the encoding procedure o f the embodiment; FIG. 5 is a diagram illustrating the detailed structure of a synonym block realized with the structure o f fuzzy neural network of the embodiment of this invention; FIG. 6 i s a block diagram illustrating the functional 30 structure o f the information indexing block of the embodiment of this invention; and FIG. 7 is a diagram illustrating the detailed structure of an actualized information indexing block with the structure of the fuzzy neural network of the embodiment of this inven35 t i o n DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS The information retrieval system as disclosed i n this 40 embodiment is a system for retrieving software components. The software component mentioned here refers to source codes for a specific functions as well as private data used by the source codes, w h i c h a r e packed a s independent components, such as the software components used i n 45 object-oriented technology f o r performing a function. However, application of this invention should not be limited to the environments as described in this embodiment since modifications o f this embodiment involving, for example, environments i n which multimedia data such as various 50 documents, images, animated pictures a n d songs are retrieved, can still be made by those who are skillful in the art. According to the above description o f the prior art, the information retrieval system of this embodiment is provided 55 t o achieve the following objects by: Firstly, enabling the operator to customize the thesaurus for easy query according to personal habit and preference; secondly, tolerating input errors; thirdly, providing representation and processing o f indefinite information; and fourthly, ranking the retrieved 60 software components and filtering out those with low correlation according to operator's personal retrieval preference and habit for convenience in selection. The above objects of this invention are realized by making use of fuzzy information retrieval, knowledge engineering and machine learning 65 technologies. The embodiment of this invention is described in further detail with reference to the accompanying drawings. US 6,845,354 B1 5 FIG. 1 i s a block diagram illustrating the functional structure o f an intelligent software component retrieval system o f this invention. A s shown i n FIG. 1, the entire software component retrieval system is composed o f synonym block 1, information-indexing block 3, component ranking-filtering b l o c k 5 , customizable ranking-filtering adjusting block 7 , and customizable synonym adjusting block 9. Functionally, the system is divided into two parts: the query retrieval mechanism comprising synonym block 1, information indexing b l o c k 3 a n d component rankingfiltering block 5 as shown in the upper section of FIG. 1, and the adjusting mechanism comprising customizable rankingfiltering adjusting b l o c k 7 a n d customizable synonym adjusting block 9 in the bottom section of FIG. 1. In the process o f query retrieval, the operator 10 first specifies a query word 11 to the synonym block 1 for finding corresponding keywords. The keywords are set in the system for specifying the characteristics of software components to be retrieved. I n this embodiment, the query word 11 specified b y the operator 1 0 includes f o u r query facets f o r defining the software components to be retrieved by the operator. The four facets are software component function, software component file type, associated medium and system type, respectively. The operator can define the four facets for retrieving the desired software components. The query item o f software component function is for defining the function and capability of the software components. I n this embodiment, the query item for software component function accepts multiple query w o r d input, which are divided by slashes. The query item for software component type is for defining the file type of the software components, such as OCX, VBX (visual basic), VCL, DLL (dynamic link library), C++, and delphi among others. The query item for associated medium i s f o r defining a device used b y a software component, such as disk, keyboard, input/output port, network, screen and interface. The query item f o r system type is for defining an appropriate operating platform for a software component, such as Windows 3.x/951NT, DOS, OS/2, and UNIX, among others. In this embodiment, only the query item for the software component function is entered by the operator, other query items, which are in a fixed keyword table provided with the system, are for the operator to select. FIG. 2 is a schematic diagram illustrating an exemplary query screen o f the information retrieval system o f this embodiment. A s shown in the figure, the operator 10 can define the desired software components with settings for various query items, wherein 62 denotes a software component function, 64 an associated medium, 66 a file type and 68 a system type. It is clear from the figure that the operator can freely key in query words to define the function of the software components; other query items can be selected directly from the system keywords. Consequently, only the query words the operator enters in the function item will be processed for synonyms. After a query word is entered into the synonym block 1 for synonymy analysis and processing, its corresponding keywords predefined b y the system are called "indexed keywords" 1 3 i n t h i s embodiment. Indexed keywords belong to keywords predefined in the system and have a certain degree o f synonymy with the entered query word. Such synonymy is defined by the synonym block 1. In this embodiment, the synonym block 1 comprises a fuzzy neural network and can tolerate input errors by virtue of query word encoding and position shift compensation. T h e detailed construction of the synonym block 1 will be explained later. Also, as described above, since the synonym block 1 of this 6 embodiment only processes the synonyms f o r the query words in the function item of the software components, other query words entered i n the query item w i l l be directly selected from the keywords preset i n the system, hence, 5 problems associated with synonyms and input errors are eliminated. The information-indexing block 3 performs the retrieval of software components based on the indexed keywords found i n the synonym block 1. I n this embodiment, the to information-indexing block 3 w i l l output an identification data 15 for a retrieved software component. Moreover, the information-indexing block 3 also comprises a fuzzy neural network and can enhance parallel processing efficiency. Detailed description follows. Finally, the identification data of software component 15 5 outputted from the information indexing block 3 is supplied to the component ranking-filtering block 5 for ranking and filtering processing and outputted to the display device for the operator 10 to select. Generally, a significant number of software components that match the conditions set for the 20 query word by the operator will be found for a query word after the processing in the synonym block 1 and the information indexing block 3. The number of indexed words that match the query word w i l l increase with the burgeoning growth i n the volume and field of software components. It 25 w i l l be deemed improbable for the a operator to easily find the desired software component from such a massive number of messages. Therefore, it is necessary to categorize and filter out the retrieved software components. The purpose of ranking processing is to sequentially display the retrieved 30 software components according to the correlation with the query word and the operator's previous preference in selecting software components so the operator may find and select the desired software component as soon as possible. The purpose o f filtering processing i s t o filter o u t software 35 components with lower correlation so the operator may find the desired software components more easily. Consequently, a threshold may be modified and set by the operator in the component ranking and filtering block 5. The operator may then select the software components 21 he really needs from 40 the displayed software components. Another feature of the software component retrieval system of this embodiment lies in that an automatic feedback adjusting mechanism is provided for adapting the component ranking-filtering b l o c k 5 a n d synonym b l o c k 1 , 45 respectively, according to the retrieved components 21 and the keywords corresponding to the software components desired and selected by the operator 10, and the operating habit and retrieval preference the operator has demonstrated. Customizable ranking-filtering adjusting block 7 adapts 50 according t o t h e retrieved components 2 1 t h e ranking mechanism and filtering mechanism i n the component ranking-filtering block 5 to make them learn to suit the needs of practical operations. Customizable synonym adjusting section 9 adapts according to software component keywords 55 the synonymy in the synonym block 1 to make it learn to suit the needs of practical operations. The synonym block 1, the information-indexing block 3 and component ranking-filtering block 5 are described in the following, respectively, with reference t o the associative 60 drawings. Synonym Block (1) FIG. 3 i s a block diagram illustrating the functional structure of the synonym block 1 and customizable synonym adjusting section 9 of the embodiment of this invention. As 65 shown i n the figure, the synonym block 1 comprises an encoding section 101, similarity calculation section 103, correlation section 105 and a determining section 107. US 6,845,354 B1 7 8 The synonym block 1 undergoes its operation i n two The correlation value between system-defined synonyms phases: the recall phase and the learn phase. and system-defined keywords is saved i n the correlation Recall Phase section 105. For example, i f the keyword for describing the In the recall phase, the query word 11 the operator inputs function of a system-defined software component is "about is first encoded i n t h e encoding section 101. I n t h i s 5 d i a l o g box," the synonyms may be "interface," "dialog," and embodiment, the encoding format displays in parallel the "help" among others. From user experience, the above three number of letters and the location o f their first appearance synonyms differ in the degree of correlation with the actual with the word to be encoded (at this time, the query word). keyword. For example, "dialog" may have a high correlation The encoding process of this embodiment is as follows: of 0.7 with "about the dialog box," the synonymy of "help (a) Convert first all letters in capital (upper) case into to b o x " with a lower correlation of 0.5, and "interface" with an lower case, and all other characters other than a through even lower correlation 0.3 due to its broad meaning. z and space into spaces. Consequently, the correlation between a query word and (b) Next, set 27 registers corresponding to the English each keyword can be determined based on the correlation alphabet a to z and the space. value between each synonym and an actual keyword after (c) Further, calculate the number o f each letter o f the 15 the similarity value being calculated by the similarity calconverted word to be encoded and save the number of culation section 103 is supplied to the correlation section these letters in their corresponding registers. 105. I n this embodiment, the lesser o f the corresponding (d) By now, there are still some registers with empty data. similarity value and correlation value is taken to determine Therefore, those empty registers can be used for saving the correlation between the query word and each keyword the location o f the first occurrence o f each letter 20 and matched w i t h a pre-determined threshold t o further (including the space) being converted but remaining to determine whether the keyword is to be outputted as an be encoded. Saving is realized by calculating the corindexed keyword 13. responding location o f each letter to be encoded and In the recall mode, the synonym block 1 will use neither assigning it a negative value and then saving it in an the determining section 107 nor the customizable synonym empty register adjacent t o the corresponding letter 25 adjusting 9. Also, the synonym block 1 further includes a register, preferably, i n the l e f t register. I f the l e f t position shift calculation section (not shown in FIG. 3, its register is already filled with data, then the data will be function is realized in L2 node of FIG. 5) for providing the saved in the right register. I f both registers are filled capability o f tolerating input errors based on the encoding with data, then the location value is abandoned. format o f the query words 11 to be described later. (e) Though most registers are filled with location data, 30 Learning Phase there are still some among these 27 registers that w i l l In the learning phase, the synonym block 1 may make use remain empty. At this time, all letters being converted of the functions provided b y the customizable synonym adjusting section 9 to adjust the similarity value calculation but not yet encoded (that is, letters a to z and the space) are sequentially inserted i n t o a l l empty registers. parameters (to be described later) of the similarity calculaHence, the encoding o f the words to be encoded is 35 t i o n section 103 and the correlation value o f correlation realized. section 105. The adjustment is determined based on the FIG. 4 is a schematic diagram illustrating the encoding software component chosen by the operator and its corresponding keyword 19 o f the retrieved component. procedure o f "fuzzy control" o f this embodiment. The (a), First, the query word 11 and the retrieved component (b), (c) and (d) i n FIG. 4 represent the content o f each register of the above steps (b), (c), (d) and (e), respectively. 40 keyword 19 are inputted to the encoding section 101 for Therein, " 0 " represents empty data status, and the first 26 encoding so as to obtain the encoding format corresponding registers correspond respectively to letters "j," "e," "q," "i," to the query word and the encoding format corresponding to "r," "w," "t," "k," "o," "v," the retrieved component keyword. The encoding format "1," "g," "c," " y, " " u , " " h , " " p , " " d , " and ''in''. The 2 7 corresponding to the query word and the encoding format register corresponds to the space. Such a distribution o f 45 corresponding t o the retrieved component keyword are th letter locations follows the rule that most frequently used further simultaneously supplied to the similarity calculation letter w i l l be assigned to a location adjacent to the least section 103 for calculating respectively the degree of simifrequently used letter based on the encoding features of the larity between the two and each system synonym. A t this encoding section to increase the data recording capacity in time, two sets of similarity value are generated to correspond the coding after the encoding. 50 the query word and the system synonym, respectively. The The encoding format generated by the encoding section two sets o f similarity values w i l l be supplied to both the 101 will be supplied to the similarity calculation section 103 correlation section 105 and the determining section 107. However, at this time, the correlation section 105 executes for calculating its similarity with system-defined synonyms and then forwarded to the correlation section 105 for deterthe operation different from that in the recall phase, further mining what system keyword is relevant to the actual query 55 generates another set o f output values to the determining word, the determined result being outputted as an indexed section 107. The determining section 107 w i l l determine keyword. whether it should adjust and how it should adjust based on The encoding format o f system-defined synonyms are the similarity value and the output value o f the t w o sets, pre-saved i n the similarity calculation section 103. I t s further the adjustment is actually performed by the customiencoding is in the same manner as described above. System- 60 zable synonym adjusting section 9. defined synonyms correspond to a specific system-defined The encoding of the encoding section 101 and the calculation o f the similarity of the similarity calculation section keyword, the keyword is the word for directly describing the 103 are performed in the same manner as those specified in function o f a software component (in this embodiment). Consequently, the query word 11 and system keyword are the recall phase. matched in the similarity calculation section 103 to generate 65 F I G . 5 i s a diagram illustrating the detailed structure o f a similarity value representing the degree o f similarity the synonym block realized with the structure of the fuzzy between them. neural network o f the embodiment o f this invention. The US 6,845,354 B1 9 fuzzy neural network provides advantages of parallel processing and learning capabilities. As shown in FIG. 5, the fuzzy neural network comprises nodes in five layers, represented as L i , L2, L3, L4 and L5, respectively. The function of the node o f each layer is described in the following. Layer L i There are two sublayers for nodes on layer L i : L l a and L i b , each sublayer has n nodes. So sublayer H a comprises nodes q k , „ i , Sublayers H a and Lib are for storing query words 11 and the encoding format of the retrieved component keyword 19. q , Therefore, n in this embodiment is set to 27, and the function , nodes q of q words 11 and the total number of each letter, the location of i u qappearance of each letter and actual alphabetical data in first , , the , retrieved component keyword 17. Encoded results of the „ q encoding section 101 o f FIG. 3 are saved in nodes o f L i a u layer, except f o r those actual performing modules (not n a n shown). d d Moreover, in the recall phase, only the data in every node s sublayer H a are used, whereas in the learning phase, the of o n u b l e nodes of both sublayers L i a and L i b are used. data of d Layer a y L2 s e The L2 layer comprises nodes P k r P, L i f o r corresponding t o each system-defined synonym, i respectively, and hence, also for storing the encoding format i , P 2 , P , the system synonym. Each node receives from layer L i of b 3encoding formats o f query word 11 and the retrieved the k P c , component keyword 19. I n the recall phase, the similarity , o 4 value between both the encoding format of the query word „ P , 11 m ,and the encoding format o f the retrieved component k keyword 19 is calculated; whereas in the learning phase, the a n p „ , similarity values between the encoding format o f system d r , synonym and the encoding format of the query word 11 and i encoding format of the retrieved component keyword 19 the i s calculated, respectively. The similarity value represents are s e degree or extent o f similarity between the input word the t (query word 11 or the retrieved component keyword 19) and s o the n system synonym. Moreover, input errors can be detected s through the position shift amount calculated with the nodes o t layer L 2 t o compensate the encoding format o f the of d o inputted query word H . e In this embodiment, each node o f layer L2 performs the r s following calculations: e k q (1) i u s(q, P , X1}] i e r y ) = d(q, Pi) = II q— Pill = 10 Please take note that equation (1) only calculates the negative value o f the encoding format, i.e., calculates only when both q, and P if a r e u q, is negative, then q,' equals 1 or 0; i f p 5 pn e sg a te i gv ae t i v e , u i n q„-Ph u . t e n calculated to have a value equal to or greater than 1 or u ' T h e r e smaller than or equal to —1, all position information with a o e f ro r value f o r the encoding format q needs t o b e negative e , q iO lo compensated with (by deducting or adding) this shift value. . u n The position shift amount s (q, P,), calculated with the W a e encoding format and equation (1), can check for missing q u h character in inputting the query word. For example, i f the l a t i e operator intends to key in "fuzzy" as the query word but s o n misses the first letter by keying in "uzzy" instead. It will be 15 n 1 ( 1 found that the position shift amount is greater than or equal t o ) , to 1 when the shift i s calculated and this shift amount r h denotes that there is a letter missing in the front, so the shift 0 e amount is compensated to the position amount of the format ; p 20 after encoding and the problem associated with the missing i o letter in front of the inputted query word is then resolved. f s Therefore, input errors will be reduced by making use of the i q fuzzy mechanism and encoding format of this embodiment. , t Equation ( 2 ) denotes the average Hamming distance a i 25 between a query word q and a synonym P symbols d in o the three sets are defined first: S,, is a set of possible values n u .d T h e t h r e e of , various letters recorded in an encoded pattern of a word, p s P a u u h set with a positive integer value smaller than 27; S 30 o f valuesa with a negative integer value larger than —27 for s e t , a ip i s recording the location where each letter displays for the first f n a r time indan encoded pattern of a word; and S, is a set for q „ e t recording each letter in an encoded pattern o f a word, a set n e n a ASCII values larger or equal to 32. So there are four of e e m d 35 possible values for d o t g u absolute value of P, and finally, the absolute value of q„-P as u v a l o a ; shownuinethe following: ii n b t 1 , , i f (q E S t e h i t e p 40 s (q = 0 and ( P E S e x v a b s o l a n d p o r P ( P S, and (1 (q1 c ( tl e p e u i / E = = 3 0 q i a a S E ,S ) (q , v l 0 r o P ) p , n t u ee o iE S, and P i ' e S, and q i (q o n r a r E S d h P id (q E S, and ( P f 45 o P r iP E j S ) h e f u (P i)f ) p j j i e E io j n q „ I qj — /Ail otherwise / o Sr E S r E s r b , 3 p e S 6 a e , t h ) , E S n ) 50 Moreover, P a . , e e d o n u H following: ) ) q e the (2) u c t u ( ,u(g, Pi) = (1 + (d(q, P l 3 "q i i ) ) F d ) u h i F. Equation ( 1 ) denotes the position shift amount s(q, P ) ta ( ' ) 55 P between the synonym (P,) and the inputted query word 11 l 1 e ' (q) saved in a node (P s l represents the encoding format o f query word q and the p i j encoding format of synonym P, j is a number between 1 and s q ) o f t h e 27. W i e l a L2 and r j y e the 60 ) layer u / Lhe and can be considered as the importance o f the j t v . L2 d2a t a t r 2 e h Wrt h data r e o f the query word U . Generally, such u i ncoded m item e e 1 p importance can be represented with the probability o f the o i sn f , e occurrence of a specific letter. For example, letter "z" only t e q n th appears in query words; therefore, i t w i l l become e seldom i s i more important i f it shows up, and should be more heavily t a weighted. n n o w d P d e e i P i , , r d o ( r q E S i / / 0 o t h e r w ps e o r; 3 o e ,= 6 o s1 r i q Also, q, lmay have three possible values, as shown in the i bi f E following: P f S ( i a Pl j J i f q ,) E i E e ij ) S p E s , (11 = {„ 1Si tfh e(qwes e 0o r i S S p o r q j a p 65 E S o s c Equation (3) calculates the similarity between the two r s according to the average Hamming distance, wherein Fd and ) P i h j o E S w , e m ar y e vt e a o rf P e h i US 6,845,354 B1 12 11 Fe represent t w o respective fuzzifiers, which define the Niv, represents the correlation with the ith synonym of layer range covered in the membership function o f the average L4. R Hamming distance between patterns after encoding and the represent a respective synonym with the node o f layer LA, i sensitivity to the average Hamming distance. Therefore, the that is, with the i d e similarity value /..t, (q,P,) of a query word to a corresponding 5 p „ . Q is an encoded pattern o f the query word 11, k an t L n o 4 synonym can be generated. encoded pattern o f the retrieved component keyword 19. n t eo d e The calculation of the similarity value for query words 11 The n n e c o correlation between the query word and the related as described above can also be applied to the calculation for s keywordncan be obtained with equation (5). c t i g retrieving component keywords 19. Moreover, the function t In equation (6), L , represents the identification data o f of nodes o f the L 2 layer corresponds t o the similarity t h 10 system keyword to indicate an actual word rather than its calculation section 103 o f FIG. 3. o e encoding format. Further, threshold 0 Layer L3 L i 1 operator for n 2 c a processing in a broader term, narrower term or b e Layer L3 includes a node SA. Node SA is used for finding 2 normal t means, the smaller the threshold 0 t s e term. Thatb y a similarity value that is most similar to the input query word n value,h o broader the term used for processing; the greater the e 2 n t 11 among all values supplied from all nodes of layer U . In 15 the threshold 0 d e 1 o this embodiment, node S A functions principally i n the 2 1 processing. In equation (7), 0 ( 1 s d learning phase, its operation is represented as follows: respective v a3 l u e , output value o f nodes o f L 2 layer relevant to p e nodes of L4 layer. t 1 ,h e o P, ) = M K , P i )a Finally, O 0 ( P n f ar n r d o a 20 o f R nodes of layer LA in the recall phase and learning phase ) w 2e r n l ( supplied respectively to the information indexing block 3 ( h Q I d layer L 5 . Layer L A corresponds t o the correlation a where K4 denotes a query word 11 or a retrieved compo- t ) and n a p r e s e n t ) section nent keyword 19. X denotes an aggregation function, or the e r e 105 o f FIG. 3. y d max function i n this embodiment. ,a(K,P,) denotes the t a e L5 e Layer r r ye similarity output for node of layer L2 corresponding to the 25 L a0 m r , L5 comprises only a node RA and receives input ( sI synonym found most similar to K. Also in this embodiment, u from every node o f layer L 4 i n the learning phase. The L Q d the output SA of node S A , is a vector comprising another e function o f node R A is to select the maximal output R A , 4 r e pair of nodes w of layer L2 most similar to the query word f from correlation (co-dependency procedure) outputs of each , 11 and the retrieved component keyword 19 in addition to a p o node orf layer L4. Its action can be illustrated as follows: a pair with a value most similar to the query word 11 and the 30 r n e sA R retrieved component keyword 19, as mentioned above. e n 0 d Layer L 4 t = p In this embodiment, layer L 5 and layer L 3 are i n the Layer L 4 comprises nodes R a m phase, and determine whether any adjustment is learning a respectively correspond t o system-defined keywords. A s , x ( i r ye( necessary or how the adjustment will be made. As described described above, a system keyword m a y correspond t o 35 a , R s above,Rip, this embodiment, layer L 3 outputs at least two n several system-defined synonyms, each o f which having a n 2 ) maximal ,similarity values corresponding to the query word e )c different correlation to the system keyword. Nodes of layer d (with encoded pattern q) and the retrieved component , R 11 t ii = L4 determine the correlation between the input query word 1 , p 3 keyword 19 (with encoded pattern k), respectively. Layer L5 v ,e 11 and system-defined synonyms according to the similarity , , R value calculated in aforementioned layer L2 (corresponding 40 then outputs an output with the greatest value. Customizable o e ( synonym adjusting section 109 can perform adjustment with 4 to each individual system synonym) and the correlation u 5 these three values and two independently set thresholds 03 , corresponding to ,each individual node of layer U . In R value t ) and O t o this embodiment, the correlation between the query word 11 p (Case 1) r and the keywords can be determined by the maximal simiu When the output o f node R A ( L 5 ) i s greater than the larity value (that is, most similar synonym) and its minimal 45 t threshold 0,, or both of the two output similarity values of corresponding correlation value. Next, each node of layer L4 node SA(L3) are greater than the threshold 0,, then execute is set with a threshold 0 the following learning procedures: of ( 2 L4i node) to determine whether the query word 11 and its (a) Modify system encoded patterns P correspondinge n t s keywords meet the requirement. I f the system r e p r e s n d P 50 thewt lw oacorresponding nodes (set as W , and W requirement is met, then the L4 node can output the correa n w 2 s a values o sponding keyword as a n indexed keyword 1 3 t o t h e 2 maximal similarity v e d f layer L 2 . Modification i s i n d e performed ) w ii t h n as follows, where N information-indexing block 3. x respective d w l a n number of variations being made. N Each L4 node has different operative characteristics in the o r recall phase and the learning phase. In the recall phase, the w 2 p i d c min n operationi is executed once, whereas the min operation 55 r e v r e s e n t p a executed twice i n the learning phase. The operation t o r is (9) a p 1r 2 „ S r , , performed by each L4 node is represented as follows: = 2 / ( k v 2 A p v 1 + = 1 ) X p v 1 • Ar2_XTK + ,p 1 i 2 w + 2 + 1 1 0 2 + 1 ) • k , X p 60 v (6) LR(Ri) i f li(Ri) 0 OR(R) = ' where 1 i=10. Nutt, o t h e r w i s e / 2 (b) Create a new L4 node w3 when the output value of the ( node RA(L5) is smaller than the threshold 03, wherein the X correlation values being saved and the initial value o f p 0L(1,)=min(min(it(q,P2,),*'),min(y(1cP2),V1P)) ( 7 ) 65 variations are set respectively as: v Therein, equations (5) and (6) are performed in the recall 1 (10) phase and equation (7) in the learning phase. In equation (5), + 1 ) T , ,u(R,)=min(y(q,P2,),TP) ( 5 ) US 6,845,354 B1 13 14 Xr' 3 Otherwise, the correlation value I V .1 layer L4 with the v e d output from min operation of L4 W3 3 s a maximal layernwill be modified ase positive learning. Modifications are i t h W3 performed as follows, where N represents the number of n o d e variations. o f tion 301 and software component indexing unit 303 f o r generating corresponding I D 15 for software components that meet some requirements based on the indexed keyword 13 supplied by the synonym block 1. As described above in this embodiment, there are four query items f o r operators t o enter, namely: function o f software components, associated medium, file type of software components and system type, as shown i n FIG. 2. Therein, only the input for function of software components can be freely set to generate several relevant indexed keywords 13 after the processing in the synonym block 1; other index items can be selected with one of keywords provided by the system. The indexed keywords 13 (only those corresponding to the function o f software components) found by the synonym block 1 are sequentially selected and the-keywords from the four query items are combined to form a set o f actual query information, which are then supplied to and saved in the indexed keyword register 301, respectively. Next, each keyword stored i n the indexed keyword register 301 is supplied to the software component indexing unit 303 for matching so as to find an appropriate software component, then the identification data for those software components are outputted to complete the function of software component retrieval. The above process continues until all indexed keywords are completely searched. FIG. 7 is a diagram illustrating the detailed structure of an actualized information indexing block with the structure of the fuzzy neural network of the embodiment of this invention. The fuzzy neural network for actualizing this information indexing block 3 consists o f two layers, L 6 and L7, corresponding t o the indexed keyword register 301 and software component indexing unit 303, respectively. The function of each node of layers L 6 and L7 is described as follows. l ' i r 1 3 = 1 3 0 ' 1 3 + 1 1 . 1 ) V 1 3 + 1 ( ( ' ' 1 3 1 ) . 1 . 5 10 3 = g " ' 3 + 1 ( 1 1 ) (c) Output in the recall phase a node o f layer L4 with a value other than the maximal value in later L4. Assume the correlation value saved i n the node to be Wr, and T4'' to represent the number of variations, the modification, known as negative learning, is performed as follows: 15 Tfr=(k+b-F1)•Tir-ill(k+b-F1)•0,(R,) k=k+1 (12) 20 where b represents a bias f o r preventing the correlation value from being greatly reduced in negative learning in the initial several learning processes. (Case 2) 25 When the output o f node RA(L5) is smaller than the threshold 0 dures: 3 , (a) Create a new L2 node w l when the output with the maximal similarity value o f the node RA(L2) to the 30 t h e n e x query word 11 (with encoded pattern q) is smaller than e c u the threshold 01, wherein the encoded patterns being t saved and the initial, value of variations are set respece tively as: Layer L6 t 35 Layer L6 is composed of m sublayers designated respech , tively as L61, L 6 2 „ L6m, and each sublayer has n nodes. e For example, sublayer L61 comprises n nodes from Q (13) f o l iQ,„ l o and o l1 t so on. Each sublayer is for storing a keyword for each (b) Create a new L2 node w2 when the output with the w i 40 query keyword set, hence, m represents t h e keyword , maximal similarity value o f the node RA(L2) to the n g included i n each set o f query words ( m i s 4 i n this a retrieved component keyword 19 (with encoded pattern l e embodiment), and whereas n represents the maximal numk) is smaller than the threshold 0 n ber of letters for the query word (n is 21) in this embodiment. a patterns being saved and the initial value of variations 1r d In other words, in this embodiment, the nodes Q n are set e r e i n , i w h respectively as: s 45 of sublayer L61 are for storing the keywords for the function 1 n g h t e u software of 1 t o components and the keywords are sequentially Pr Q p e2 n c o d e d b set2by the indexed keywords found with the synonym block 1 r =w g l The nodes Q 1. 1 k 2 o a keywords for associated medium. The nodes Q 1 i 1 = c (c) Create a new L 4 node w3, wherein the correlation 50 sublayer L63 are for storing the keywords for software file , y (i 1 t o Q 2 e = value being saved and the initial value of variations are 1 type. o nodes Q e t 3 1 The Q 2 1 4 set ,respectively as: the 1 r 2 1keywords for system type. In this embodiment, the latter , ) three o keywords are selected from the keywords predefined 4 iW w L 3 t f o Q f o by 1 system. i 2 the b l 6 u 3 s 55 Layer L 7 IS (15) 4 y e r f o m a Layer L7 consists of u nodes, designated as C '= 1 . 1 s u b l a c respectively. Each node of layer L7 corresponds to each o L i 6 C„, 1 It has become evident from the above description of the y e r 3 o 2 , C synonym block 1 and the customizable synonym adjusting individual software component set in the system. Software = 1 6 1 m component keywords describing the corresponding software a z section 9 t h a t the information (software component) 60 4 p retrieval system o f this embodiment is able not only t o components and the identification data corresponding to the r , C a software rcomponents are saved respectively i n nodes o f dynamically adjust the structure of the synonym procedure, r e 3 e but also tolerate input errors. layer L7. Therein, software component keywords containing i f , the Information Indexing Block (3) f keywords for the four query items with their values o s o inputted by nodes of layer L6 when the nodes of layer L7 are FIG. 6 i s a block diagram illustrating the functional r e r structure o f the information-indexing block of the embodi- 65 created. Theo I D number indicates the location o f the softs t s t ware component in the software libraries or the location of ment of this invention. As shown in the figure, the informar ir n n o retrieval in the Internet. tion procedure 3 comprises an index keyword register secg n i n o g d t e h US 6,845,354 B1 15 16 Each node o f layer L7 receives a keyword supplied by each individual node o f layer L 6 f o r matching with the software c o m p o n e n t k e y w o r d s , i f t h e y m e e t t h e requirements, their corresponding identification data are outputted. The action o f each node o f layer L 7 can be represented in the following equation: The setting of the threshold 0 described. L 7 4 o f When startingsto use the information retrieval system o f n o d e this invention, the operator will be prompted to enter various w i l l 5 q u en y items as displayed on the query screen shown in FIG. r o w 2 f o r indexing software component libraries. The query b input methods as adopted in this embodiment can be e word (11Q, CA = A „ , (16) if K m if K * m 0 otherwise 114 (1 — pk) = a 0 n d K 11(Q,C,)=2* (1+(cl(Q,C i)1Far') m * (17) one to be freely entered by the operator as in function o f software components 62 or another to be selected by the to operator from the pop-up menu, such as associated medium 64, software file type 66 and system type 68. I f software component function 6 2 i s selected f o r direct input, the = 0(C,)= 0 LC(Ci) i f il(Q, Ci) operator can key in multiple query words with each of the (18) (19 ) query words being separated by a slash. Query words in the 15 query items w i l l be processed one after another b y the system. Each individual result w i l l finally be combined and processed. I f query words are selected from the pop-up menu, only one query item will be selected each time, since options are individually independent and mutually exclu- 20 sive. Besides, only software component function 62 w i l l undergo processing in the synonym block 1 to obtain numerous keywords. In all other query items that do not require Equation (16) is used for calculating the distance between synonymy processing, the query word selected by the operathe keyword set Q (composed o f four keywords) being tor is the default keyword. Each time i n the information entered and the software component keyword set C, of the 25 index block 3, a keyword is selected from each query item i to form a set of keywords for component retrieval processIn t this embodiment, Q includes four vectors (1—m) and each ing. vector includes 21 elements (1—n. Similarly, C, also includes h There are three possible outcomes when all the software four vectors and each vector includes 2 1 elements. I n components found b y the information index block 3 are s addition, the value A represents the distance of a vector (one 30 presented to the operator for selection. First, the operator o of the 1—m) and is calculated with equation (17). may find the appropriate software component immediately f In equation (17), the distance between a vector Q o f the from a fair number o f the retrieved software component. t input keywords and the corresponding vectors A„, o f the Second, the operator may find it difficult to select a software w encoded pattern C, o f software component is calculated. component from a massive number of the retrieved software a Particularly, only portions o f query items (vectors) w i t h 35 components. Third, the operator may not find the approprir actual input keywords in the input keywords Q are calcuate software component from a limited number of software e lated; the distance A components. The problems with the last two outcomes can to O. c . fI no this manner, the processing can be smoothly perm r be resolved by resetting the threshold 0 formed r y o u e i n this embodiment even when the operator i s the component e q 4 o f t h ranking-filtering block 5. When the number L 7 uncertainr o f d certain keyword for querying (for example, 40 o f software components becomes excessive, the threshold 0 a m o w n o d e s unsure of system type). That is, fuzzy input retrieval can be 4 is reset higher. That is, the threshold 0 p s i n realized. Therein, W"' represents the weight of the M output r c o n t r o l l i can 4 f o of software components n g be reset higher to reduce o w i t (that e c t o r the number e f software components for output when the o t h vis, query item) of the input keyword set Q and indicates t h n h the importance of the vector among all vectors in the input number o f the outputted software components becomes e n o keyword set. 45 excessive. Conversely, the threshold 0 n Equation 18 is used for calculating the similarity value i n p 4 when the number of the e s e components becomes too c a n b e r software t t u t between the input keyword set Q and the software compo- l limited so e that rthe operator may have more software como w d i nent keyword set C ponents to choose from. e o f Fd' h e Fe' are fuzzifiers for describing the range s Therein, t and From the above description of information index block 3, r coveredgby the membership function and the sensitivity in 50 i t becomes evident that the information (software s o r i i n a relation component) retrieval system o f this embodiment can perc l y to distance between the input keyword set Q and the e l form retrieval in a fuzzy retrieval manner in response to the software component keyword set C, saved in the node. The r t s a v similarity value calculated with equation 18 can be supplied query words the operator has entered. That is, even when the i e d operator can not describe the exact content for a query item, not only to the L7 node for determining whether the idenb i tification data of the corresponding software component can 55 this information retrieval system can still find all possible e n software components in a most approximate manner. be outputted, but also to the component ranking-filtering d n o Component ranking-filtering block (5) block 5 for determining the order of ranking for the retrieved a d softwaree components. The component ranking-filtering b l o c k 5 filters t h e n Equation 19 is used for determining whether the software amount of the retrieved software component based on the C d component corresponding t o L 7 node i s retrieved. That 60 threshold 0 . s means, i f the similarity value kt(Q,C,) between the input 4 software components based on the corresponding similarity o r value r keyword set Q and the software keyword set C a d e t e supplied by the information index block 3. I n other words, s ithe software i component has exceeded a predetermined m i n e the greater the similarity value, the more approxiv d e s c r b i n g mate the input query word to the keyword describing the threshold 0 e t h software component is determined to be outputted, its cor- 65 software component. 4 d e Customizable ranking-filtering adjusting section 7 can responding similarity value is also outputted to the compo, b r a n nent e n adjust the ranking mechanism and the filtering mechanism t h ranking-filtering block 5 for further processing. y k i n t t h e g i h d e n o r t i f i e d e Null, o t h e r w i s e US 6,845,354 B1 17 18 of the component ranking-filtering block 5 according to the software component the operator actually selects to adjust to suit personal habit and retrieval preference. According to the above description, the software component retrieval system o f this embodiment provides the following advantages of: Firstly, enabling the operator to customize the thesaurus for easy querying according to personal habit and preference; Secondly, tolerating input errors; T h i r d l y, p e r f o r m i n g a n d p r o c e s s i n g i n d e f i n i t e information, that is, providing methods of fuzzy querying; and Fourthly, ranking the retrieved software component and filtering out those low in correlation according to operator's personal retrieval preference and habit for convenience in selection. It should be understood that this invention is not limited to the preferred embodiment as disclosed above. Variations and modifications can be made by those who are skillful in the art without departing from the spirit and scope o f the present invention as defined in the appended claims. What is claimed is: 4. The information retrieval system as claimed in claim 1, wherein the correlation means comprises: a plurality o f correlation units, corresponding t o the system keywords, respectively, for receiving the simi5 l a r i t y values generated b y the similarity means and comparing the predetermined threshold value with the combination of the correlation values and the similarity values, and determining whether the corresponding system keywords serve as the indexed keywords. lo 5 . The information retrieval system as claimed in claim 1, wherein the encoding means encodes the query word into the corresponding encoded pattern o f the query word by recording the occurrences of the alphabetical letters in the query word, the location of the first occurrence for each of 15 the alphabetical letters and a portion of the query word. 6. The information retrieval system as claimed in claim 5, wherein the synonym block further comprises a shifting means for calculating the position shift amount o f each alphabetical letter by the location of the first occurrence of 20 each alphabetical letter to determine whether there is any input error, and compensating the query word when there is an input error. 7. An information retrieval system for finding correspond1. An information retrieval system for finding corresponding information components according to an entered query ing information components according to an entered query 25 word, comprising: word, comprising: a synonym block for finding indexed keywords correa synonym block for finding indexed keywords according sponding to the query word; to the query word; an information indexing block for finding a corresponding an information indexing block for finding a corresponding information component based o n the indexed keywords; and 30 a component ranking-filtering block for ranking and filtering the found information components and for outputting the information components for selection; 35 wherein the synonym block comprises: encoding means f o r encoding the query word and generating an encoded pattern of the query word, the encoded pattern of the query word corresponding to a feature of the query word; 40 similarity means for comparing the encoded pattern of the query word with the encoded patterns of system synonyms, the latter generated by encoding a plurality o f system synonyms, and generating corresponding similarity values, the similarity values rep- 45 resenting the similarity between the query word and the system synonyms, each of the system synonyms corresponding t o a t least o n e o f t h e system keywords, respectively; and correlation means for selecting the indexed keywords 50 from the system keywords according to correlation values between the system synonyms and the system keywords, the similarity values generated b y the similarity means and a predetermined threshold value for comparison. 55 2. The information retrieval system as claimed in claim 1, wherein the information components are software components classified by types, functions, associated storage media and operating platforms. 3. The information retrieval system as claimed in claim 1, 60 wherein the similarity means comprises: information component based o n the indexed keywords; and a component ranking-filtering block for ranking and filtering the found information components and for outputting the information components for selection; w h e r e i n the synonym block comprises: encoding means for encoding the query word and a keyword for the found information components and generating an encoded pattern of the query word and an encoded pattern o f the keyword for the found i n f o r m a t i o n components, the encoded pattern of the query word corresponding to a feature o f the query word, the encoded pattern o f the keyword for the found information components corresponding to a feature o f the keyword for the found information components; similarity means f o r respectively comparing t h e encoded pattern of the query word and the encoded pattern o f the keyword for the found information components w i t h encoded patterns o f system s y n o n y m s , the latter generated by encoding a plurality of system synonyms, and generating a first set of similarity values and a second set o f similarity values, the first set of the similarity values representing the similarity between the query word and the s y s t e m synonyms, the second set o f the similarity values representing the similarity between the keyword for the found information components and the system synonyms, each o f the system synonyms corresponding t o a t least o n e o f t h e system k e y w o r d s , respectively; correlation means f o r calculating a n output value a plurality of similarity-calculating units for respectively according to the first set of the similarity values, the storing the encoded patterns o f the system synonyms second set o f the similarity values and original corresponding to the system synonyms, receiving the correlation values between the system synonyms and encoded pattern of the query word, and calculating the 65 t h e system keywords; and corresponding s i m i l a r i t y values according t o a determining means for determining whether the simisimilarity-calculating function. larity means and the correlation means need adjust- US 6,845,354 B1 19 20 ment according t o the first set o f the similarity a plurality o f indexing units, respectively corresponding values, the second set of the similarity values and the to the information components and storing a compooutput value. nent keyword set and identification data pertaining to 8. The information retrieval system as claimed in claim 7, the corresponding information components, for calcufurther comprising an adjusting means f o r adjusting the 5 l a t i n g a similarity value between the component keysimilarity means and the correlation means according to the word set and the received indexed keywords and outoutput value generated by the determining means. putting t h e identification data pertaining t o t h e 9. The information retrieval system as claimed in claim 7, corresponding information component when the simiwherein the information components are software compolarity value exceeds a predetermined threshold value. nents classified by types, functions, associated storage media lo 1 6 . The information retrieval system as claimed in claim and operating platforms. 15, wherein the information components are software com10. The information retrieval system as claimed in claim ponents classified b y types, functions, associated storage 7, wherein the similarity means comprises: media and operating platforms. a plurality of similarity-calculating units for respectively 17. The information retrieval system as claimed in claim storing the encoded patterns of the system synonyms 15 1 5 , wherein the predetermined threshold value is adjustable corresponding to the system synonyms and receiving and the adjusted threshold value i s used t o decide the the encoded pattern of the query word and the encoded number of the outputted identification data pertaining to the pattern o f the keyword f o r the found information corresponding information components. components, to calculate the first set of the similarity 18. A n information retrieval system for finding conevalues and the second set of the similarity values. 20 sponding information components according to an entered 11. The information retrieval system as claimed in claim query word, comprising: 8, wherein the determining means comprises a first selection a synonym block for finding indexed keywords responmeans for comparing a first threshold value with the maxisive to the query word according to a fuzzy synonym mal of the first set of the similarity values and the maximal procedure; of the second set to decide an adjustment procedure per- 2 5 an information-indexing block for finding a correspondtaining to the similarity means and the correlation means. ing information component based on the indexed key12. The information retrieval system as claimed in claim words; 8, wherein the determining means comprises a second a component ranking-filtering block for ranking and filselection means f o r comparing a second threshold value 30 t e r i n g the found information components and for outwith the maximal of the output values to decide an adjustputting t h e information components f o r selection ment procedure pertaining to the similarity means and the according to a fuzzy ranking-filtering procedure; correlation means. a synonym adjusting block for adjusting the fuzzy syn13. The information retrieval system as claimed in claim onym procedure of the synonym block according to the 7, wherein the encoding means encodes the query word and the keyword for the found information components into the 35 f o u n d information components; and corresponding encoded patterns b y recording the occurrences of the alphabetical letters in the query word and the keyword for the found information components, the location of the first occurrence for each of the alphabetical letters and a portion of the query word and the keyword for the found information components. 14. The information retrieval system as claimed in claim 13, wherein the synonym block further comprises a shifting means f o r calculating the position shift amount o f each alphabetical letter by the location of the first occurrence o f each alphabetical letter to determine whether there is any input error, and compensating the query word when there is an input error. 15. A n information retrieval system f o r finding corresponding information components according to an entered query word, comprising: a synonym block for finding indexed keywords according to the query word; an information indexing block for finding a corresponding information component based o n the indexed keywords; and a component ranking-filtering block for ranking and filtering the found information components and for outputting the information components for selection; wherein the information indexing block comprises: storing means for sequentially storing the indexed keywords; and a ranking-filtering adjusting block for adjusting the fuzzy ranking-filtering procedure of the component rankingfiltering block according to the found information components. 40 1 9 . The information retrieval system as claimed in claim 18, wherein the information components are software components classified b y types, functions, associated storage media and operating platforms. 20. The information retrieval system as claimed in claim 45 1 8 , wherein the synonym block is implemented by a fuzzy neural network for accelerating parallel processing. 21. The information retrieval system as claimed in claim 18, wherein the information-indexing block is implemented by a fuzzy neural network for accelerating parallel process50 i n g 22. The information retrieval system as claimed in claim . 18, wherein the synonym block comprises encoding means for encoding the query word into an encoded pattern by recording the occurrences of the alphabetical letters in the 55 query word, the location of the first occurrence for each of the alphabetical letters and a portion of the query word; and shifting means for calculating the position shift amount of each alphabetical letter by the location of the first occurrence of each alphabetical letter to determine whether there is any 60 input error and compensating the query word when there is an input error.

Disclaimer: Justia Dockets & Filings provides public litigation records from the federal appellate and district courts. These filings and docket sheets should not be considered findings of fact or liability, nor do they necessarily reflect the view of Justia.


Why Is My Information Online?