I/P Engine, Inc. v. AOL, Inc. et al

Filing 1

COMPLAINT against AOL, Inc., Gannett Company, Inc., Google, Inc., IAC Search & Media, Inc., Target Corporation ( Filing fee $ 350 receipt number 14683024186.), filed by I/P Engine, Inc. (Attachments: # 1 Exhibit A, # 2 Exhibit B, # 3 Receipt, # 4 Civil Cover Sheet, # 5 Letter)(ecav, )

Download PDF
EXHIBIT A US006314420B1 United States Patent (io) Patent No.: US 6,314,420 Bl (45) Date of Patent: *Nov. 6,2001 Lang et al. (54) (75) COLLABORATIVE/ADAPTIVE SEARCH 5,608,447 • ENGINE Inventors: Andrew K. Lang; Donald M. Kosak, Notice: 6,014,665 1/2000 Culliss 6,029,161 • 2/2000 Lang et al patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days. retrieval, Jul. 6, 1994, pp. 339-348." Appl. No.: 09/204,149 * cited by examiner Dec. 3, 1998 Primary Examiner—Thomas Black Assistant Examiner—Frantz Coby Related U.S. Application Data (63) I.LP Apr. 4, 1996, now Pat. No. 5,867,799. Int. Cl.7 (52) U.S. Cl (58) (74) Attorney, Agent, or Finn—Testa, Hurwitz & Thibeault Continuation-in-pait of application No. 08/627 436 filed on (51) (57) G06F 17/30 internet. The search engine system employs a regular search engine to make one-shot or demand searches for information entities which provide at least threshold matches to user 707/5 707/1, 10, 102, 707/3, 2, 5 References Cited U.S. PATENT DOCUMENTS 5,019,961 5,117,349 5,249,262 5,471,610 5,537,586 5,544,049 5,563,998 5,563,999 5/1991 Addusso et al 5/1992 Tirfing ct al 9/1993 11/1995 7/1996 8/1996 10/1996 10/1996 Baule Kawaguchi el al Amram ct al Henderson et al Yaksichclal Yaksich et al 364/192 707/5 395/66 707/4 707/3 364/419.19 395/149 395/149 ABSTRACT A search engine system is provided for a portal site on the 707/3; 707/10; 707/2; Field of Search (56) 707/5 707/5 Michael Persin, Document Filtering for Fast Ranking, Pro ceeding of the seventeenth annual international ACM-SIGIR conference on research and development in information claimer. Filed: 6/2000 Culliss 1/2001 Culliss 707/1 707/5 OTHER PUBLICATIONS This patent is subject to a terminal dis (22) '" 707/5 707/1 6,078,916 6,182,068 Subject to any disclaimer, the term of this (21) " 707/1 5,983,214 ♦ 11/1999 Lang et al 6,006,222 12/1999 Cullisi Assignee: Lycos, Inc., Waltham, MA (US) (*) 343/7 707/10 707/2 5,867,799 • 2/1999 Lang el al both of Pittsburgh, PA (US) (73) 3/1997 Fany el al 5,649,186 • 7/1997 Ferguson 5,842,199 ♦ 11/1998 Miller et al queries. The search engine system also employs a collaborative/content-based filter to make continuing searches for information entities which match existing wire queries and are ranked and stored over time in useraccessible, system wires corresponding to the respective queries. A user feedback system provides collaborative feedback data for integration with content profile data in the operation of the collaborative/contcnt-based filter. A query processor determines whether a demand search or a wire search is made for an input query. 36 Claims, 10 Drawing Sheets KCVCIC cnuu: IfQMOi CHAHtCTiailtlCH TZI USER 30' r_ 29 33- TJ INFORMATION FILTER APPARATUS J 32 SECOND FIRST PREDICTION ADAPTATION ADAPTATION MEANS MEANS MEANS DISTRIBUTED NETWORK RESOURCE |I 28e lib COMMUNITY FILTER USER COMMUNICATION MEANS NETWORK 31 COMPUTER STORAGE MEANS RESOURCE §2 COMPUTER SYSTEM -25 35 27q s 17 USER USER USER USER USER FILTER FILTER FILTER FILTER FILTER 28a CREDIBILITY FILTER COMMUNITY FILTER ADAPTIVE FILTER MEANS r-19 ADAPTIVE CONTENT FILTER EXTRACTION MEANS 1.5 #3 DISTRIBUTED FIG. I 16 OS © © to o i 7" 64b \ ' 48a if MC MC ~62o PROFILE MC LPROFILE 65a PROFILE W 65a LPROFILEJ FEEDBACK 59cr MC 54br \ 66a- r 53o SECOND PROCESSOR FEEDBACK COMMUNITY PROFILE NOTE: AF=ADAPTIVE FILTER THIRD PROCESSOR WC iUSER 61a.. 54o 51 I 66b' AF MC I MC FIG. 2 MC ' "user" V AF 65d ]' MC ^ MC NOTE: MC=MEMBER CLIENT 54d -59b COMMUNITY PROFILE PROFILE ADAPTIVE CONTENT PROFILE ADAPTIVE THIRD PROCESSOR T61d -57 SECOND PROCESSOR _N 53b 49 FEEDBACK PROFILE MC PROFILE PROCESSOR PROFILE FEEDBACK PROFILE 65c MC PROriLE 54c r55 52 PROCESSOR THIRD v -66c AF FOURTH PROCESSOR ..64a I THIRD PROCESSOR V AF 58^ FIRST PROCESSOR -56 ON la o © f o U.S. Patent Nov. 6,2001 100- Sheet 3 of 10 PROVIDING DYNAMIC INFORMON CHARACTERIZATION COMMUNITY FILTER 135- US 6,314,420 Bl -105 ■110 140—1 CLIENT FILTER ADAPTIVELY FILTER RAW INFORMONS PRESENTING PROPOSEO INFORMON TO USER -115 RECEIVING FEEDBACK PROFILE -120 ADAPTING -125 CONTENT AND/OR COLLABORATIVE PROFILE UPDATING DYNAMIC INFORMON CHARACTERIZATION IPREDICTING MC RESPONSESV -150 r170 GROUP INTO PREFERENCE COHORT 175 TARGETED INFORMON -165 -155 CREATE CONSUMER PROFILE CREDIBILITY FILTER RECOMMENDATION FILTER FIG. 3 -160 UPDATE CREDIBILITY PROFILE CONSULTATION FILTER U.S. Patent Nov. 6,2001 Sheet 4 of 10 US 6,314,420 Bl ■205 PREFILTER PARTITION EACH USER INTO MULTIPLE MC's 200- GROUP MC's TO FORM MULTIPLE COMMUNITIES -210 PREDICTING COMMUNITY PROFILE -215 PREDICT MC PROFILE -220 EXTRACT -225 RAW INFORMONS r230 SELECT PROPOSED INFORMON 255 CONTENT FILTER 260- COLLABORATION FILTER 265 MC FILTER PROVIOE PROPOSED \*i -235 INFORMONS TO USER FIG. 4 UPDATE PREDICTION CRITERIA -245 RECEIVE USER FEEDBACK ■240 U.S. Patent 345- 340- FIG. 5 INPUT r 420 425b L. 425c 425d LEARNING FUNCTION r LEARNING FUNCTION COLUBORATIVE 415- r 425a LEARNING FUNCTION UNSTRUCTURED FEATURE INFORMATION 410- STRUCTURED FEATURE INFORMATION 405- CORRELATED-FEATURE, LEARNING ERROR-CORRECTION FUNCTION UNIT 400 428d- 428c- 428b- 428a- -426d r 426c r 426b r 426a 427a 427c FIG. 6 FUNCTION WEIGHTED IRP r 427d WEIGHTED IRP FUNCTION r WEIGHTED IRP FUNCTION WEIGHTED IRP FUNCTION r 429d 430 COMPLETE RATING PREDICTOR -432 UP=UNCERTAINTY PREDICTOR NOTE: IRP=INDEPENDENT RATING PREDICTOR CERTAINTY WEIGHTING FUNCTION r o I—» o ON o 13 to in U.S. Patent Nov. 6,2001 Sheet 7 of 10 US 6,314,420 Bl U.S. Patent Nov. 6,2001 USER Sheet 8 of 10 US 6,314,420 Bl 20C PROVIDES QUERY TERMS LOOKUP QUERY 22C IN TABLE DOES A WIRE ALREADY EXIST ON THE QUERY DISPLAY RESULTS FROM PREVIOUSLY CREATED WIRE TOPIC? PERFORM REGULAR SEARCH ON QUERY ENGINE FIG. 8 28C 26C -> OTHER USER -36CI INDIVIDUAL USER 30C FEEDBACK PROCESSOR- 43C — 32C SEARCH RETURN PROCESSOR 48C RATING DEMAND MODE PROCESSING/ RATING INFORMON WIRE MODE CONTENT-BASED PROFILE DEMAND PROFILES 50C 48C2 ■40C 38C 42C WIRE PROFILES MOLD AGENT MIND FILTER STRUCTURE BASED CONTENT- MODE SELECTOR I TO CURRENT INFORMON FEEDBACK DATA RELEVANT ADAPT! VF FFFDBACK DATA FEEDBACK DATA ADAPTIVE QUERY PROCESSOR FEEDBACK DATA FEEDBACK DATA ■ MEMORY SPIDER 46C FIG. 9 M WEB SITE A WEB SITE W OS o SO o C/5 a 62C-=, 64C STATIONS USER QUERY ► PROCESSOR 60C—-I 74C r CONSIDERED INFORMONS OTHER USER STATION £ 80FD WIRE 66C MEMORY MEMORY -78CM MEMORY ACQUISITION SPIDERCONTINOUS INFORMON 72C 80C r RETURN PROCESSOR FIG. 10 MEMORY DEMAND SEARCH -82C ON DEMAND ACQUISITION FILTER STRUCTURE BASED CONTENT- ENGINE (DEMAND) SPIDERINFORMON SEARCH DEMAND SEARCH 76C — r COLLABORATIVE COLLABORATIVE/ CONTENT-BASED FEEDBACK FILTER FEEDBACK PROCESSOR- DATA STRUCTURE FEEDBACK DATA FOR VE FEEDBACK DATA FOR USER CONTENT PROFILE INDIVIDUAL ADAP ADAPTIVE FEEDBACK DATA FOR OTHER USER CONTENT PROFILES , WIRE RETURNS TO USERS RELEVANT INFORMONS LIST -78C -68C «-► WEB SITE M «-► WEB SITE A td O GO C5 US 6,314,420 Bl found to be relevant. The system maintains the ranked informons in a stored list from which the individual user can select any listed informon for consideration. COLLABORATIVE/ADAPTIVE SEARCH ENGINE This application is a continuation-in-part of copending application Ser. No. 08/627,436 filed on Apr. 4, 1996 now 5 U.S Pat. No. 5,867,799, the entire contents of which are hereby incorporated by reference. As the system continues to feed (he individual user's "wire", the stored relevant informon list typically changes due to factors including a return of new and more relevant informons, adjustments in the user's query, feedback evalu ations by the user for considered informons, and updatings BACKGROUND OF THE INVENTION in collaborative feedback data. Received informons are The present invention relates to information processing 10 similarly processed for other users' wires established in the systems for large or massive information networks, such as information filter system. Thus, the integrated information the internet, and more particularly to such information filter system performs continued long-term searching, i.e., it systems especially adapted for operation in portal and other compares network informons to multiple users' queries to web sites wherein a search engine operates with collabora find matching informons for various users' wires over the tive and content-based filtering to provide better search 15 course of time, whereas conventional search engines initiate responses to user queries. a search in response to an individual user's query and use content-based filtering to compare the query to accessed In the operation of the internet, a countless number of network informons typically to find matching informons information are available for downloading from any of at during a limited, short-term search time period. least thousands of sites for consideration by a user at the user's location. A user typically connects to a portal or other 20 The present invention is directed lo an information pro cessing system especially adapted for use at internet portal or other web sites lo make network searches for information entities relevant to user queries, with collaborative feedback data and content-based data and adaptive filler structuring, based filter in a search engine to search the internet and find 25 being used in filtering operations to produce significantly information which match the query. This process is basically improved search resulls. web site having a search capability, and thereafter enters a particular query, i.e., a request for information relevant to a topic, a field of interest, etc. Thereafter, the search site typically employs a "spider" scanning system and a content- a pre-search process in which matching informons are found, at the time of initiating a search for the user's query, SUMMARY OF THE INVENTION by comparing informons in an "informon data base" to the user's query. In essence, the pre-search process is a short ^ term search for quickly rinding and quickly identifying information entities which are content matched to the user's query. The return list of matching informons can be very exten sive according to the subject of the query and the breadth of the query. More specific queries typically result in shorter return lists. In some cases, the search site may also be structured to find web sites which probably have stored informons matching the entered query. Collaborative data can be made available to assist in informon rating when a user actually downloads an informon, considers and evaluates it, and returns data to the search site as a representation of the value of the considered informon to the user. A search engine system employs a content-based filtering system for receiving informons from a network on a con tinuing basis and for tillering the informons for relevancy to a wire or demand query from an individual user. A feedback system provides feedback dala from other users. Another system controls the operation of the filtering system to filler for one of a wire response and a demand response and 10 return the one response to the user. The filtering system combines pertaining feedback data from the feedback system with conlent profile dala in determining the 40 relevancy of the informons for inclusion in al least a wire response to the query. BRIEF' DESCRIF11ON OF THE DRAWINGS 45 FIG. 1 is an diagrammatic representation of an embodi ment of an information filtering apparatus according lo the In the patent application which is parent to this present invention. continuation-in-part application, i.e. Ser. No. 08/627,436, filed by the present inventors on Apr. 4,1996, now U.S. Pat. FIG. 2 is an diagrammatic representation of another No. 5,867,799 and hereby incorporated by reference, an embodiment of an information filtering apparatus according advanced collaborative/content-bascd information filter sys- 50 to the present invention. tem is employed to provide superior filtering in the process FIG. 3 is a flow diagram for an embodiment of an of finding and rating informons which match a user's query. informalion filtering mclhod according to the present inven The information filter structure in this system integrates tion. content-based filtering and collaborative filtering to deter FIG. 4 is a flow diagram for another embodiment of an mine relevancy of informons received from various sites in 55 information filtering method according to the present inven the Internet or other network. In operation, a user enters a tion. query and a corresponding "wire" is established, i.e., the FIG. 5 is a flow diagram for yei another embodiment of query is profiled in storage on a content basis and adaplively an information filtering method according lo the present updated over time, and informons obtained from the net work are compared to the profile for relevancy and ranking. (,0 invention. FIG. 6 is an illustration of a threc-componcnl-inpul model A continuously operating "spider" scans the network to find and profile with associated predictors. informons which are received and processed to determine relevancy to the individual user's wire or to wires estab lished by numerous other users. FIG. 7 is an illustration of a mind pool hierarchy. FIG. 8 is a logic diagram illustrating a search selection The integrated filter system compares received informons 65 feature of Hie invenlion; to the individual user's query profile data, combined with FIG. 9 is a functional block diagram of an embodiment of collaborative da la, and ranks, in order of value, informons the invenlion in which an intcgralcd informalion processing US 6,314,420 Bl system employs a search engine and operates with combined collaborative filtering and content-based tillering, which is preferably adaptive, to develop responses to user queries. i.e., that share a subset of attributes or interests. In general, ihc subset of shared attributes forms the community profile for a given community and is representative of the commu nity norms, or common client attributes. The "relevance" of a particular informon broadly describes how well it satisfies the user's information need. The more relevant an informon is to a user, the higher the "signal" content. The less relevant the informon, the higher the "noise" content. Clearly, the notion of what is relevant to a particular user can vary over time and with context, and the user can find the relevance of a particular informon limited queries to the appropriate structure for responses. lo only a few of the user's potentially vast interest areas. Because a user's interests typically change slowly, relative DETAILED DESCRIPTION OF THE lo the data stream, it is preferred lo use adaptive procedures EMBODIMENTS 15 lo track the user's current interests and follow them over The invention herein is preferably configured with an lime. Provision, too, is preferred to be made for sudden apparatus and method for information filtering in a computer changes in inlerest, e.g., taking up antiquarian sword col system receiving a data stream from a computer network, in lecting and discontinuing stamp collecting, so that the which entities of information relevant to the user, or method and apparatus track the evolution of "relevance" to "informons," are extracted from the data stream using 2n a user and Ihe communities of which the user is a member. content-based and collaborative filtering. The information In general, information filtering is the process of selecting filtering is long term in the sense that it operates on a the information that a users wishes to see, i.e., informons, continuing basis, and is both interactive and distributed in from a large amount of data. Content-based filtering is a structure and method. It is interactive in that communication process of filtering by extracting features from the informon, is substantially big-directional at each level of the filter. It is 25 e.g., the text of a document, to determine the informon's distributed in that all or part of the information filter can relevance. Collaborative filtering, on the other hand, is the include a purely hierarchical (up-and-down/parent-child) process of filtering informons, e.g., documents, by deter structure or method, a purely parallel (peer-to-peer) structure mining what informons other users wilh similar interests or or method, or a combination of hierachical and parallel needs found lo be rclcvanl. FIG. 10 shows another and presently preferred embodi ment of the invention in which an information processing 5 system includes an integrated filter structure providing collaborativc/adaplivc-conlcnt-bascd filtering to develop longer term, continuing responses to user queries, and a search engine structure which provides short term, demand responses to user queries, with the system directing user io structures and method. 30 The system apparatus includes a filler structure having As used herein, the term "informon" comprehends an adaptive content based filters and adaptive collaborative information entity of potential or actual interest lo a par fillers, which respectively include, and respond to, an adap ticular user. In general, informons can be hetcrogenous in tive content profile and an adaptive collaboration profile. As nature and can be all or part of a textual, a visual, or an audio used herein, the term "content-based filter" means a filter in entity. Also, informons can be composed of a combination of 35 which content data, such as key words, is used in performing the aforementioned entities, thereby being a multimedia ihe filtering process. In a collaborative filter, other user data entity. Furthermore, an informon can be an entity of pat is used in performing Ihe filtering process. A collaborative terned data, such as, a data file containing a digital repre filter is also sometimes referred to as a "content" filter since sentation of signals and can be a combination of any of the it ultimately performs the lask of finding an object or previously-mentioned entities. Although some of the data in 40 document having content relevant to the content desired by a data stream, including informons, may be included in an a user. If there are some instances herein where the term informon, not all data is relevant to a user, and is not within "contenl filter" is used as distinguished from a collaborative the definition of an informon. By analogy, an informon may filler, it is intended thai the term "content filter" mean be considered to be a "signal," and the total data stream may "conteni-bascd filler." The adaptive filters each are preferred be considered to be "signal +noise." Therefore, an informa 45 lo include at lcasl a portion of a community filter for each tion filtering apparatus is analogous to other types of signal community serviced by the apparatus, and a portion of a filters in that it is designed to separate the "signal" from the member client filler for each member client of Ihe serviced "noise." communities. For this reason, ihe adaptive filtering is dis Also as used herein, the term "user" is an individual in tributed in that each of the community fillers perform communication with the network. Because an individual so adaptive collaborative filtering and adaptive content user can be interested in multiple categories of information, filtering, even if on different levels, and even if many filters the user can be considered to be multiple clients each having exisl on a given level. The integrated filtering permits an a unique profile, or set of attributes. Each member client individual user to be a unique member client of multiple profile, then, is representative of a particular group of user communities, wilh each community including multiple preferences. Collectively, the member client profiles asso 55 member clicnls sharing similar interests. The adaptive fea ciated with each user is the user profile. The present inven tures permit the interests of member clients and entire tion can apply the learned knowledge of one of a user's communities lo change gradually over time. Also a member member clients to others of the user's member clients, so client has the ability to indicate a sudden change in that the importance of the learned knowledge, e.g., the user's preference, e.g., the member client remains a collector but is preference for a particular author in one interest area as 60 no longer interested in coin collecting. represented by the member client, can increase the impor tance of that particular factor, A's authorship, for others of Ihc user's member clients. Each of Ihc clients of one user can be associated with Ihc individual clicnls of olhcr users insofar as the profiles of the respective clients have similar 65 attributes. A "community" is a group of clicnls, called member clients, that have similar member client profiles, The filter structure also implements adaptive credibility filtering, providing member clicnls wilh a measure of infor mon credibility, as judged by other member clients in the community. For example, a new member client in a first community, having no credibility, can inject an informon into Ihe data flow, thereby providing other member clients in other communities with the proposed informon, based on the US 6,314,420 Bl 5 6 respective community profile and member client profiles. If the other member clients believe the content of the informon topic, or the quality of the recommendation. As before, the responders can accrue quality points, value tokens, or "info to be credible, the adaptive credibility profile will reflect a growing credibility. Conversely, feedback profiles from bucks," as apportioned by the requester, in return for the useful recommendation informon recipients that indicate a lack of credibility cause 5 Furthermore, certain' embodiments are preferred to be the adaptive credibility profile, for the informon author to , IP , • • . ,. , „ f,. . \. ... . reflect untrustworlhinessHowever, the growth and dcclina«K*P«»n«»g " thaI *>™ °r a»°f *e adaptive filters used lion of credibility are not "purely democratic," in the sense f the syslem ^"^.cally seek optimal values for the that one's credibility is susceptible to the bias of others1 functlon lnIended bv Ihe filler. «**. content analysis, perceptions, so the growth or declination of one's credibility collaboration, credibility, reliability, etc. member client is viewed by other member clients. preferences of individual member clients and communities. nities may be used to establish those member clients as sutncr nccds in a timcly man"cr is generally proportional to how the credibility of the new T"6 filter structure herein is capable of identifying, the Member clients can put their respective reputations "on providing direct and inferential consumer preference the line," and engage in spirited discussions which can be information, and tracking shifts in Ihe preferences whether refcreed by other interested member clients. The credibility lhe shifts be gradual or sudden. The consumer preference profile further can be partitioned to permit separate credinformation can be used to target particular consumer prefibility sub-profiles for the credibility of the content of the erence S^F5' or c°h°f|s. ™d provide members of the informon, the author, the author's community, the reviewers, cohort wlth lar8eled informons relevant to their consumer and the like, and can be fed back to discussion participants, preferences. This information also may be used to follow reviewers, and observers to monitor the responses of others M dcmographical shifts so that activities relying on accurate to the debate. The adaptive credibility profiles for those demographical data, such as retail marketing, can use the member clients with top credibility ratings in their commuconsumer preference information to anticipate evolving con"experts" in their respective communities. To provide a basis for adaptation, it is preferred that each With this functionality, additional features can be ,s raw informon be processed into a standardized vector, which implemented, including, for example, "instant polling" ona " may be on lne order of 20,000 to 100,000 tokens long. The matter of political or consumer interest. In conjunction with learning and optimization methods that ultimately are choboth content and collaborative filtering, credibility filtering, sen are preferred to be substantially robust to Ihe problems and the resulting adaptive credibility profiles, also may be which can be presented by such high-dimensional input used to produce other features, such as on-line consultation 30 sPaces- Dimensionality reduction using methods such as the and recommendation services. Although the "experts" in the singular value decomposition (SVD), or auto-encoding neucommunities most closely related to the topic can be ra| networks attempt to reduce the size of the space while afforded special status as such, member clients from other initially retaining the information contained in the original communities also can participate in the consultation or representation. However, the SVD can lose information recommendation process. In one embodiment of the consultation service, credibility filtering can be augmented to include consultation filtering. 3J during the transformation and may give inferior results. Two adaptation/learning methods that are presently preferred include the TF-IDF technique and lhe MDL technique. With this feature, a member client can transmit an informon to the network with a request for guidance on an issue, for FIG. 1 illustrates one embodiment of an information filtering apparatus 1 structured for search engine implemcn- example, caring for a sick tropical fish. Other member 40 Iat'on m accordance with the invention as described subsc- clients can respond to the requester with informons related quently herein in connection with FIGS. 8 and 9. In general, lo the topic, e.g., suggestions for water temperature and a dala stream is conveyed through network 3, which can be antibiotics. The informons of the responders can include a global internet work. A skilled artisan would recognize that apparatus 1 can be used with other types of networks, their respective credibility profiles, community membership, and professional or avocational affiliations. The requester 45 including, for example, an enterprise-wide network, or can provide feedback to each of the responders, including a "intranet." Using network 3, User #1 (5) can communicate rating of the credibility of the respondcr on the particular wiIn olner users> for example, User #2 (7) and User #3 (9), topic. Additionally, the responders can accrue quality points, and a'so w'tn distributed network resources such as resource value tokens, or "info bucks," as apportioned by the #1 (U) and resource #2 (13). requester, in return for useful guidance. Similarly, one embodiment of an on-line recommendation 50 Apparatus 1 is preferred lo be part of computer system 16, although User #1 (5) is not required to be the sole user of service uses recommendation filtering and adaptive recomcomputer system 16. In one present embodiment, it is mendation profiles to give member clients recommendations preferred that computer system 16 having information filter on matters as diverse as local auto mechanics and worldapparatus 1 therein filters information for a plurality of class medieval armor refurbishers. In this embodiment, Ihe 55 users- One application for apparatus 1, for example, could be requester can transmit the informon to Ihe network bearing that user 5 and similar users may be subscribers to a the request for recommendation. Olher member clients can commercial information filtering service, which can be respond 10 Ihe requester with informons having specific provided by Ihe owner of computer sysiem 16. recommendations or dis-recommcndalions, advice, etc. As Extraction means 17 can be coupled wilh, and receives with lhe consultation service, the informons of lhe respond- 60 dala stream IS from, network 3. Extraction means 17 can ers can be augmented to include their respective credibility profiles, community membership, and professional or avocational affiliations. A rating of each recommendation provided by a respondcr, rclalivc to olher responders' identify and cxiraci raw informons 19 from dala stream 15. Each of the raw informons 19 has an information content, Extraction means 17 uses the adaptive content filter, and at least pan of lhe adaptive content profile, lo analyze the data recommendations, also can be supplied. The requester can 65 stream for Ihe presence of raw informons. Raw informons provide feedback lo each of the rcspondcrs, including a arc those dala entities whose contcnl identifies them as being raling of lhe credibility of the respondcr on the particular "in the ballpark," or of potential interest 10 a community US 6,314,420 Bl 8 coupled to apparatus 1. Extraction means 17 can remove duplicate informons, even if the informons arrive from different sources, so that user resources are not wasted by handling and viewing repetitive and cumulative information. Extraction means 17 also can use at least part of a community profile and a user profile for User #1 (5) to determine whether the informon content is relevant to the community 5 spaced feedback responses, to update the adaptive collaboration profile, the adaptive content profile, or both, with an adapted future prediction 34, in order to minimize subsequent prediclion errors by the respective adaptive collaboration filter and adaptive content filter. in one embodiment of the invention herein, it is preferred ,hat prediction means 33 be a self-optimizing prediction means using a preselected learning technique. of which User #1 is a part. Filter means 21 adaptively filters raw informons 19 and Such techniques can include, for example, one or more of produces proposed informons 23 which are conveyed to " a op-key-word-selection learning technique, a nearestUser #1 (5) by communication means 25. A proposed neighbor learning technique, a term-weighting learning informon is a selected raw informon that, based upon the technique, and a probabilistic learning technique. First adaprespective member client and community profiles, is pretation means 30 also can include a neural nelwork therein dieted to be of particular inierest to a member client of User and employ a neural network learning technique for adap5. Filter means 21 can include a plurality of community « tation and prediction. In one present embodiment of ihe fillers Zla.b and a plurality of member client fillers 28«-e, invention, the lerm-weighting learning technique is preeach respectively having community and member client ferred to be a TF-IDF technique and the probabilistic leamprofiles. When raw informons 19 are filtered by filter means ing technique is preferred to be a MDL learning technique 21 those informons that are predicted to be suitable for a Firsl adaptalion means 30 further can include ^^ pariicuhr member client of a particular community, e.g., 20 adaplation ^eans 32 for adaplin al least one of Ihe ad ive User #1 (5), response to the respective community and coiiaboration profiles, the adaptive content profiles! the member client profiles, are conveyed thereto. Where such .s dfi «w JT ?r h P, r,!-r lllty « lC raw informons 19 according .0 a credulity profile. communitv proglc> and the J, fl, responsive o at least one of ihe other profiles. In this ma^er, trends a.irib- „ ulable lo individual member clienls. individual users, and 25 individual communilics in one domajn of syslem 16 can be It is preferred that the adaptive filtering performed within recognized by, and influence, similar entities in other domains (melding agent "minds"), contained within system 16 to the extern lhal the respective entities share common filter means 21 by the plurality of filters 27o,fc, 2&a-e, and 35, use a self-optimizing adaptive filtering so that each of ihe parameters processed by filters 21a,b, 28«-e, and 35, is driven continually .0 respective values corresponding .0 a minimal error for each individual parameter. Self- » error, etc., are favored to prevail. Apparatus 1 also can include a computer slorage means 31 for slori the fil inch|di me ad profile and |ne Jtivc coUaboratfon opumizationencouragesadynamic,marketplace-likeopera- tion of the system, ,n that those entities having the most desirable value e.g. h.ghes. cred.b.hty, lowest pred.c.ed attributes ;v* fileP Addiun™ trend-lracking information can be stored for later retrieval in storagc mcans 31> or ^ cd ,0 nc,work 3> remolc analysiS( fof cxamp]e> by ^ #, (?) Self-optimization can be effected according to respective preselected self-optimizing adaption techniques including, 3 f FIG. 2 illustrates another preferred embodiment of infor- maIion filIering apparams s0, in computer system 51. Appa- for example, one or more of a lop-key-word-selection adap- lation technique, and a neural network learning technique. In ratus 50 can include firsl processor 52, second processor 53a,b, third processor 64a-d, and a fourth processor 55, .0 effect the desired information filtering. First processor 52 can be coupled lo, and receive a data stream 56 from weighting adaptation technique is preferred to be a TF-IDF by extracling raw fnformons 58 from data streaPm & anon technique, a nearest-neighbor adaptation technique, a lerm-weighting adaptation technique, a probab.listic adapone present embodiment of the invention, the term- SlTV Sn?'? *UC PtaU°n tCChmqUe 'S PR!" ferred to be a MDL technique. nelwork 57. Firsl processor 52 can serve as a pre-processor . « sive «° ^processing profile 49 and conveying informons 58 to second processor 53fl,6. When user 5 receives proposed informon 23 from appa- Because of the inconsistencies presented by Ihe ncarlyinfinite individual differences in the modes of along with the proposed informon. By answering, user 5 conceptualization, expression, and vocabulary among users creates a feedback profile that corresponds to feedback 50 even wiihin a communily of coinciding interests, similar response 29. User feedback response 29 can be active notions can be described with vastly different terms and feedback, passive feedback, or a combination. Active Teedconnotations, greatly complicating informon characierizaback can include the user s numerical rating for an Iion. Mode variations can be even greater between disparate informon, hmts, and indices. Hints can include like or dislike communilies, discouraging interaction and knowlcdgeof an author, and informon source and timeliness. Indices 55 sharing among communities. Therefore, it is particularly can include credibility agreement with consent or author, preferred thai processor 52 create a mode-invariant reprehumor, or value. Feedback response 29 provides an actual station for each raw informon, thus allowing fast accurate ratus 1, user 5 is provided with multiple feedback queries response to proposed informon 23, which is a measure of the relevance of the proposed informon to the information need of user 5. Such relevance feedback attempts ,0 .rnprove ,he performance lor a particular profile by modifying the profiles, based on feedback response 29. informon characterization and collaborative filtering Mode- 60 invariant representations lend 10 facilitate relevant informon selection and distribulion wilhin and coramuniticSj thereby promoting knowledge-sharing, thereby benefiting ,he group of inler,inked communilies, i.e., a society, as welL A predicted response anticipated by adaptive filtering First processor 52 also can be used to prevent duplicate means 21 can be compared to the actual feedback response informons, e.g., the same information from different 29 of user 5 by firsl adaplation means 30, which derives a 65 sources, from further pcnelraling, and thus consuming the prediction error. First adaptation mcans 30 also can include resources of, the filtering process. Other processors 53 a b prediction mcans 33, which collects a number of temporally54<7-rf, also may be used 10 perform the duplicate informa- US 6,314,420 Bl 10 tion elimination function, but additionally may measure the differences between the existing informon and new infor- ferred to be a TF-IDI-' technique and the probabilistic adap tation technique Ls preferred to be a MDL technique. mons. That difference between the content of the informon the previous time the user reviewed it and the content of the informon in its present form is the "delta" of interest. processors 52-55 could be combined functionally so that the learning" can be accomplished by second processor SSa,b, alone or in concert with third processor 54a-d. Indeed, third suitable processors. As described in the context of FIG. 1, An artisan would recognize that one or more of the s actual number of processors used in the apparatus 50 could Processors 53a,b, S4a-d may eliminate the informon from be less than, or greater than, that illustrated in FIG. 2. For further processing, or direct the new, altered informon to the example, in one embodiment of the present invention, first member client, in the event that nature or extent of the processor 52 can be in a single microcomputer workstation, change exceeds a "delta" threshold. In general, from the with processors 53-55 being implemented in additional notion of exceeding a preselected delta threshold, one may ]0 respective microcomputer systems. Suitable microcomputer infer that the informon has changed to the extent that the systems can include those based upon the Intel® Pentiumchange is interesting to the user. The nature of this change Pro™ microprocessor. In fact, the flexibility of design can be shared among all of a user's member clients. This presented by the invention allows for extensive scalability of delta threshold can be preselected by the user, or by the apparatus 50, in which the number of users, and the compreselected learning technique. Such processing, or "delta 15 munities supported may be easily expanded by adding the interrelation of the several adaptive profiles and respec tive filters allow trends attributable to individual member processor 54a-d adapts a delta learning profile for each clients, individual users, and individual communities in one member client of the community, i.e. user, thus anticipating 2q domain of system 51 to be recognized by, and influence, those changes in existing informons that the user may find similar entities in other domains, of system 51 to the extent "interesting." that the respective entities in the different domains share Second processor S3a,b can filter raw informons 58 and common attributes. extract proposed community informons 59a,b therefrom. The above described system operates in accordance with Informons 59a,b are those predicted by processor S3a,b to 25 100 for information filtering in a computer system, as be relevant to the respective communities, in response to a illustrated in FIG. 3, which includes providing a dynamic community profiles 48a,b that are unique to the communi informon characterization (step 105) having a plurality of ties. Although only two second processors 53a,b are shown profiles encoded therein, including an adaptive content in FIG. 2, system SI can be scaled to support many more profile and an adaptive collaboration profile; and adaptivcly processors, and communities. It is presently preferred that 3U filtering the raw informons (step 110) responsive to the second processor S3a,b extract community informons 59a,b dynamic informon characterization, thereby producing a using a two-step process. Where processor 52 has generated proposed informon. The method continues by presenting the mode-invariant concept representations of the raw proposed informon to the user (step 115) and receiving a processor 54a-d can be the locus for delta learning, where informons, processor 53a,b can perform concept-based feedback profile from the user (step 120), responsive to the indexing, and then provide detailed community filtering of 35 proposed informon. Also, the method includes adapting at each informon. least one of the adaptive content profile (step 125) and the Third processors 54a-d can receive community infor adaptive collaboration profile responsive to the feedback mons 59a,b from processors Sia,b, and extract proposed profile; and updating the dynamic informon characterization member client informons 6la~d therefrom, responsive to (step 130) responsive thereto. unique member client profiles 62a-d for respective ones of 40 The adaptive filtering (step 110) in method 100 can be member clients 63a-d. Each user can be represented by machine distributed adaptive filtering that includes commu multiple member clients in multiple communities. For example, each of users 64a,b can maintain interests in each of the communities serviced by respective second processors nity filtering (sub-step 135), using a community profile for each community, and client filtering (sub-step 140), simi Fourth processor 55 can include a plurality of adaptive fillers 66a-d for each of the aforementioned profiles and computer storage therefor. It is preferred that the plurality of adaptive filters 66a-d be self-optimizing adaptive fillers. 155) of Ihe raw informons responsive lo an adaptive cred ibility profile and updaling the credibility profile (step 160) responsive to the user feedback profile. Method 100 further can include creating a consumer profile (slcp 165) respon- larly using a member client profile for each member client S3a,b, and each receive separate member client informons 45 of each community. It is preferred that the filtering in 61b,c and 61a,d, respectively. sub-steps 135 and 140 be responsive to the adaptive content Each member client 63a-d provides respective member profile and the adaptive collaboration profile. Method 100 client feedback 6Sa-d to fourth processor 55, responsive to comprehends servicing multiple communities and multiple the proposed member client informons 61a-d. Rased upon of users. In turn, each user may be represented by multiple the member client feedback 6Sa-d, processor 55 updates at 50 member clients, with each client having a unique member least one of the preprocessing profile 49, community profiles client profile and being a member of a selected community. 48a, fc and member client profiles 62a-d. Also, processor 55 It is preferred that updating the dynamic informon charac adapts at least one of the adaptive content profile 68 and the terization (step 130) further include predicting selected adaptive collaboration profile 69, responsive to profiles 49, subsequent member client responses (step 150). 48a, b, and 62a-d. 55 Method 100 can also include credibility tillering (step Sclf-optimizalion can be effected according to a preselected 60 sive to the user feedback profile. In general, Ihe consumer self-optimizing adaptation technique including, for example, profile is representative of predetermined consumer prefer one or more of a top-key-word-seleclion adaptation ence criteria relative to Ihe communities of which the user is technique, a nearest-neighbor adaptation technique, a terma member client. Furthermore, grouping selected ones (step weighting adaptation technique, and a probabilistic adapta 170) of the users into a preference cohort, rcspoasive to the tion technique. Any of the adaptive fillers 66a-d may 65 preselected consumer preference criteria, can facilitate pro include a neural network In one present embodiment of the viding a targeted informon (slcp 175), such as an invention, the term-weighting adaptation technique is pre advertisement, to the preference cohort. US 6,314,420 Bl 11 12 FIG. 4 illustrates yet another preferred method 200. In structure found in the article headers. For example, in general, method 200 includes partitioning (step 20S) each user into multiple member clients, each having a unique addition to typical words such as "seminar" counting as tokens, the punctuation mark "$" and the symbol "News- member client profile with multiple client attributes and grouping member clients (step 210) to form a multiple 5 communities with each member client in a particular community sharing selected client attributes with other member clients, thereby providing each community with a unique community profile having common client attributes. group:comp.ai" are also tokens. Using noun phrases as tokens also can be useful. Next a vector of token counts for the document is created, This vector is the size of the total vocabulary, with zeros for tokens not occurring in the document. Using this type of vector is sometimes called Ihe bag-of-words model. While Method 200 continues by predicting a community profile 10 the bag-of-words model does not capture the order of the (step 215) for each community using first prediction criteria, tokens in the document, which may be needed for linguistic and predicting a member client profile (step 220) for a or syntactic analysis, it captures most of the information member client in a particular community using second needed for filtering purposes. prediclion criteria. Method 200 also includes the steps of Although, it is common in information retrieval systems extraciing raw informons (step 225) from a data stream and 15 io group the tokens together by their common linguistic selecting proposed informons (step 230) from raw inforroots, called stemming, as a next step it is preferred in the mons. The proposed informons generally are correlated with one or more of the common client attributes of a community, and of the member client attributes of the particular member present invention that the tokens be left in their unstemmed form. In this form, the tokens are amenable to being classifted into mode-invariant concept components, posed informons permits the updating of Ihe first and second vocabulary and conceptualization. Each community can client to whom the proposed informon is offered. After 2" Creating a mode-invariant profile (slep 305), C, includes providing the proposed informons to the user (step 235), creating a conceptual representalion for each informon, A, receiving user feedback (step 240) in response to Ihe pro,hat is invariani wilh respecl ,0 lhe form-of-expression, e.g., prediction criteria (step 245) responsive to the user feed- ^ kack- Method 200 further may include prefiltering the data stream (step 250) using the predicted community profile, with the predicted community profile identifying the raw informons in the data stream. consisl o[ a «Meta-U-Zine" collection, M, of informons. 25 Based upon profile C, the appropriate communities, if any. for each informon in the data stream are selected by conceptbased indexing (step 310) into each M. That is, for each concept C that describes A, put A into a queue QM, for each M which is related to C. It is preferred that there is a list of Step 230 of selecting proposed informons can include filtering the raw informons using an adaptive content filter (step 255) responsive to the informon content; filtering the raw informons using an adaptive collaboration filter (step 260) responsive to the common client attributes for the Ms that is stored for each concept and that can be easily index-searched. Each A that is determined to be a poor fit for a particular M is eliminated from further processing. Once A has been matched with a particular M, a more complex community profile PM is developed and maintained for each an adaptive member client filler (step 265) responsive to the unique member client profile. determine whether it matches l'M strongly enough to be retained or "weeded" out (step 325) at this stage. pertaining community; and filtering the raw informons using 35 M (step 315). If A has fallen into QM, then A is analyzed to Il is preferred that updating the first and second prediction Each A for a particular M is sent to each user's personal criteria (step 245) employ a self-optimizing adaptation 40 age"t, or member client U of M, for additional analysis technique,, including, for example, one or more of a topbased on the member client's profile (step 325). Each A that key-word-selection adaptation technique, a nearest-neighbor ?lls U's interests sufficiently is selected for U's personal adaptation technique, a term-weighting adaptation informon, or "U-Zine," collection, Z. Poor-filling informons technique, and a probabilistic adaptation technique. It is *& eliminated from placement in Z (step 330). This userfurther preferred that the term-weighting adaptation tech- 45 leve' sla8e of analysis and selection may be performed on a nique be a TF-IDF technique and the probabilistic adaptacentralized server site or on the user's computer, lion technique be a minimum description length technique. Next, the proposed informons are presented to user U The information filtering method shown in FIG. 5 pro(sleP 33S) for review. User U reads and rates each selected vides rapid, efficient data reduction and routing, or filtering, A found in Z (step 340). The feedback from U can consist to the appropriate member client. The method 300 includes 50 °^ a rating for how "interesting" U found A to be, as well as parsing the data stream into tokens (step 301); creating a one or more °f tne following: mode-invariant (MI) profile of the informon (step 305); selecting the most appropriate communities for each Opinion feedback: Did U agree, disagree, or have no opinion regarding the position of A? informon with regard to its fit within each community; Informon Qualities: How does lhe user rate the informons informon, based on the Ml profile, using concept-based indexing (slep 310); detailed analysis (step 315) of each 55 eliminating poor-fitting informons (step 320); detailed filtcring of each informon relalive to fit for each member client (step 325); eliminating poor-fitting informons (step 330); presenting the informon Io lhe member client/user (step 60 335); and obtaining the member client/user response, includ- ing multiple ratings for different facets of the user's response to the informon (slep 340). II is preferred that coherent portions of the dala stream. i.e., potential raw informons, be first parsed (slep 301) into 65 generalized words, called tokens. Tokens include punctua- lion and other specialized symbols thai may be part of lhe Credibility Feedback: Did U find the facts, logic, sources, and quotes in A to be truthful and credible or nol? qualities, for example, "interestingness," credibility, funniness, content value, writing quality, violence content, sexual content, profanity level, business imporiance, scientific merit, surprise/unexpectedness of information content, artistic quality, dramatic appeal, cnlertainmenl value, trendiness/importance to future directions, and opinion agreement. Specific Reason Feedback: Why did the user like or dislike A? Because of the authority? Because of the source? US 6,314,420 Bl 14 13 Ms, that are related to any concept C, may be Iooked-up. Furthermore, when a Z is created by U, the concept clues Because A is out-of-date (e.g. weather report from 3 weeks ago)? Because the information contained in A has been seen already? (I.e., the problem of duplicate information delivery) Categorization Feedback: Did U liked A? Was it placed within the correct M and Z? Such multi-faceted feedback queries can produce rich feed back profiles from U that can be used to adapt each of the 5 given by U to the information filter can be used to determine a set of likely concepts C that describe what U is seeking. For example, if U types in "basketball" as a likely word in the associated Z, then all concepts that have a high positive weight for the word "basketball" are associated with the new z. If no such concepts C seem to pre-exist, an entirely new concept C is created that is endowed with the clues U has profiles used in the filtering process to some optimal oper- 10 given as the starting profile. To augment the effectiveness of concept-based indexing, ating point. it is preferred to provide continual optimization learning. In One embodiment of creating a MI profile (step 305) for general, when a concept C no longer uniquely triggers any each concept can include concept profiling, creation, and documents that have been classified and liked by member optimization. Broad descriptors can be used to create a substantially-invariant concept profile, ideally without the 15 clients U in a particular community M, then that M is removed from the list of M indexed into by C. Also, when word choice used to express concept C. A concept profile there appears to be significant overlap between articles can include positive concept clues (PCC) and negative fitting concept C, and articles that have been classified by concept clues (NCQ. The PCC and NCC can be combined users as belonging to M, and if C does not currently index by a processor to create a measure-of-fit that can be com pared to a predetermined threshold. If Ihe combined effect of 20 into M, then M can be added to the list of M indexed into by C. The foregoing heuristic for expanding the concepts C the PCC and NCC exceeds the predetermined threshold, that are covered by M, can potentially make M too broad then informon A can be assumed to be related to concept C; and, thus, accept loo many articles. Therefore, it further is otherwise it is eliminated from further processing. PCC is a preferred that a reasonable but arbitrary limit is set on the set of words, phrases, and other features, such as the source or the author, each with an associated weight, that tend to be 25 conceptual size covered by M. With regard to the detailed analysis of each informon A in A which contains C. In contrast, NCC is a set of words, with respect to the community profile for each M, each A phrases, and other features, such as the source or the author, must pass through this analysis for each U subscribing to a each with an associated weight that tend to make it more particular M, i.e., for each member client in a particular unlikely that A is contained in C. For example, if the term "car" is in A, then it is likely to be about automobiles. 30 community. After A has passed that stage, it is then filtered at a more personal, member client level for each of those However, if the phrase "bumper car" also is in A, then it is users. The profile and filtering process arc very similar for more likely that A related to amusement parks. Therefore, both the community level and the member client level, "bumper car" would fall into Ihe profile of negative concept except that at the community level, the empirical data clues for the concept "automobile." Typically, concept profile C can be created by one or more 35 obtained is for all U who subscribed to M, and not merely an individual U. Other information about the individual U means. First, C can be explicitly created by user U. can be used to help the filter, such as what U thinks of what Second, C can be created by an electronic thesaurus or a particular author writes in other Zs that the user reads, and similar device that can catalog and select from a set of articles that can't be used for the group-level M processing. concepts and the words that can be associated with that FIG. 6 illustrates the development of a profile, and its concept. Third, C can be created by using co-occurrence 40 associated predictors. Typically, regarding the structure of a information that can be generated by analyzing the content profile 400, the information input into the structure can be of an informon. This means uses the fact that related features of a concept lend to occur more often within Ihe same divided into three broad calegories: (I) Structured Feature Information (SFI) 405; (2) Unstructured Feature Informa analysis of collections, H, of A that have been rated by one 45 tion (UFI) 410; and (3) Collaborative Input (CI) 415. Fea tures derived from combinations of these three types acl as or more U. Combinations of features that tend to occur repeatedly in II can be grouped together as PCC for the additional peer-level inputs for the next level of the rating analysis of a new concept. Also, an A that one or more U prediction function, called (4) Correlated-Feature, Errorhave rated and determined not to be within a particular Z can Correction Units (CFECU) 420. From inputs 405,410,415, be used for the extraction of NCC. 50 420, learning functions 425a-d can be applied to get two document than in general. Fourth, C can be created by the Concept profiles can be optimized or learned continually after their creation, with the objective that nearly all As that computed functions 426a-d,428a-d of the inputs. These two functions are the Independent Rating Predictors (IRP) Us have found interesting, and belonging in M, should pass 426a-d, and the associated Uncertainty Predictors (UP) Ihe predetermined threshold of at least one C that can serve 428a-d. IRPs 426a-d can be weighted by dividing them by as an index into M. Another objective of concept profile 55 their respective UPs 428a-d, so that the more certain an IRP management is that, for each A that docs not fall into any of 426a-d is, the higher its weight. Each weighted IRP429<7-</ the one or more M that are indexed by C, the breadth of C is brought together with other IRPs 429a-d in a combination is adjusted to preserve the first objective, insofar as possible. function 427<?-d. This combination function 427a-d can be For example, if C's threshold is exceeded for a given A, C's from a simple, weighted, additive function to a far more breadth can be narrowed by reducing PCC, increasing NCC, 60 complex neural network function. The results from this are normalized by the total uncertainty across all UPs, from In the next stage of filtering, one embodiment of contentCertain=zero to Uncertain°infinity, and combined using the based indexing takes an A thai has been processed into the Certainty Weighting Function (CWF) 430. Once the CWF set of C that describe it, and determine which M should 430 has combined the IRPs 426a-d, it is preferred that result accept the article for subsequent filtering, for example, 65 432 be shaped via a monolonically increasing function, to detailed indexing of incoming A. It is preferred that a data map to the range and distribution of the actual ratings. This structure including a database be used, so that the vector of function is called the Complete Rating Predictor (CRP) 432. or both, or by increasing the threshold for C. US 6,314,420 Bl 16 15 SF1405 can include vectors of authors, sources, and other features of informon A that may be influential in determining the degree to which A falls into the categories in a given M. UFI 410 can include vectors of important words, phrases, and concepts that help to determine the degree to which A falls into a given M. Vectors can exist for different canonical parts of A. For example, individual vectors may be provided for subject/headings, content body, related information in other referenced informons, and the like. It is preferred that a positive and negative vector exists for each canonical part. Q 41S is received from other Us who already have seen A and have rated it. The input used for CI 415 can include, for example, "interestingness," credibility, funniness, con tent value, writing quality, violence content, sexual content, profanity level, business importance, scientific merit, surprise/unexpectedness of information content, artistic quality, dramatic appeal, entertainment value, trendincss/ "significance" is used is in a statistical sense, and frame works such as the Minimum Description Length (MDL) Principle can be used to determine when to store or use a more "local" component of the IRP. As a simple example, the following IRP employs only two of the above terms: lRP(aulhor)=>wcighled sum of 10 15 importance to future directions, and opinion agreement. Each CFECU 420 is a unit that can detect sets of specific feature combinations which are exceptions in combination. For example, author X's articles are generally disliked in the Z for woodworking, except when X writes about lathes. average (ratings given this author so far in this M) average (ratings given all authors so far in this M) Table 2 gives the values attained for the four new articles. It is preferred that an estimate of the uncertainty resulting from a positive or negative IRP be made, and a complex neural net approach could be used. However, a simpler method, useful for this example, is simply to repeal the same process that was used for the IRP but, instead of predicting the rating, it is preferred to predict the squared-error, given the feature vector. The exact square-error values can be used as the informon weights, instead of using a rating-weight lookup table. A more optimal mapping function could also be computed, if indicated by the application. 20 Token 1 Token 2 Token 3 Token 4 IRP pas. vector 16.68 8.73 12.89 11.27 IRP ncg. vector 15.20 8.87 4.27 5.04 When an informon authored by X contains the concept of "lathes," then the appropriate CFECU 420 is triggered to signal that this is an exception, and accordingly a signal is sent to offset the general negative signal otherwise triggered 25 because of the general dislike for X's informons in the The UPs then can be computed in a manner similar to the IRP's: comparisons with the actual document vectors can be woodworking Z. As an example the form of Structured Feature Information made to gel a similarity measure, and then a mapping (SF1) 405 can include fields such as Author, Source, 30 function can be used to gel an UP. Information-Type, and other fields previously identified to Making effective use of collaborative input (Cl) from be of particular value in the analysis. For simplicity, the other users U is a difficult problem because of the following exemplary SFI, below, accounts only for the Author field. seven issues. First, there generally is no a priori knowledge For this example, assume three authors A, B, and C, have regarding which users already will have rated an informon collectively submitted 10 articles that have been read, and 35 A, before making a prediction for a user U, who hasn't yet have been rated as in TABLE 1 (following the text of this read informon A. Therefore, a model for prediction must be specification. In the accompanying rating scheme, a rating operational no matter which subset of the inputs happen to can vary between 1 and 5, with 5 indicating a "most be available, if any, at a given time. Second, computational interesting" article. If four new articles (11,12,13,14) arrive efficiency must be maintained in light of a potentially verythat have not yet been rated, and, in addition to authors A, 40 large set of users and informons. Third, incremental updates B, C, and a new author D has contributed, a simple IRP for of rating predictions often are desired, as more feedback is the Author field, that just lakes sums of the averages, would reported from users regarding an informon. Fourth, in learn be as follows: ing good models for making rating predictions, only very lRP(author)=weighted sum of sparse data typically is available for each users rating of each 45 average(ratings given the author so far) document. Thus, a large "missing data" problem must be averagc(ratings given the author so far in this M) dealt with effectively. averagc( ratings given all authors so far in this M) Fifth, most potential solutions to the CI problem require average(ratings given all authors) independence assumptions that, when, grossly violated, give averagc(ratings given the author so far by a particular very poor results. As an example of an independence 50 user U)* assumption violation, assume that ten users of a collabora average(ratings given the author so far in this M by a tive filtering system, called the "B-Team," always rate all particular user U)* articles exactly in the same way, for example, because they average(ratings given all authors so far in this M by a think very much alike. Further assume that user A's ratings particular user U)* are correlated with the B-Team at the 0.5 level, and are avcragc(ratings given all authors by a particular user)* 55 correlated with user C at the 0.9 level. Now, suppose user C * (if for a personal Z) reads an article and rates it a "5". Based on that C's rating, The purpose of the weighted sum is to make use of it is reasonable to predict that A's rating also might be a "5". broader, more general statistics, when strong statistics for a particular user reading an informon by a particular author, within a particular Z may not yet be available. When stronger statistics arc available, the broader terms can be eliminated by using smaller weights. This weighting scheme is similar to that used for creating CWFs 430, for the profiles as a whole. Some of the averages may be left out in the actual storage of the profile if, for example, an author's average rating for a particular M is not "significantly" different from the average for the author across all Ms. Here, 60 Further, suppose that a member of the B-Team reads the article, and rates it a "2". Existing collaborative filtering methods are likely to predict thai A's rating RA would be: R,,-(0.9x5h-0.5x2)/(0.9+0.5)=3.93 In principle, if other members of the B-Tcam then read and 65 rale the article, it should not affect the prediction of A's rating, RA, because it is known that other B-Tcam members always rate the article with the same value as the first US 6,314,420 Bl 17 member of the B-Team. However, the prediction for A by existing collaborative filtering schemes would tend to give 10 times the weight to the "2" rating, and would be: R,-(0.9x5+10x0.5x2)/(0.9+10x0.5)-2.46 18 comparing the similarity between the profiles. Next, the similarity of the member client profiles and informon content profiles can be compared. A neural network could be used to learn how to compare profiles so that the error in predicled ralings is minimized. However, the invention can be embodied with use of a simple cosine similarity metric, Existing collaborative filtering schemes do not work well in like that previously considered in connection with Unstructhis case because B-Team's ratings are not independent, and tured Feature Information (UFI) can be used. have a correlation among one another of 1. The information filter according to the present invention can recognize and yhe method used is preferred to be able to include more 10 Il]an jusl ,ne tokens> such ^ the author and other SFI; and, compensate for such inter-user correlation. Sixth, information about the community of people is jt is preferrej tnal ,he three vectors for component also are known, other than each user's ratings of informons. This ab|e ,0 ^ C0ll)paret]. sFIs may be handled by transforming information can include the present topics the users like, mem m(o an cntjly lnat cim bc lrta,cli ;n a comparable way what authors the users like, etc. This information can make ,0 ,oken frequencjes ttiat can be multiplied in the standard the system more effective when it is used for learning 15 Ioke[) frequency comparison method, which would be recstronger associations between community members. For ognized by a skilled artisan. example, because Users A and B in a particular community ConIinuing in lhe ongoing eXiimplc> lhe Author field may M have never yet read and rated an informon in common, no correlation between their likes and dislikes can be made, based on common ratings alone. However, users A and B 2U have both read and liked several informons authored by the same author, X, although Users A and B each read a distinctly different Zs. Such information can be used to make the inference that there is a possible relationship between user A's interests and user B's interests. For the most part, -5 existing collaborative filtering systems can not take advan tage of this knowledge. Seventh, information about the informon under consider ation also is known, in addition to the ratings given it so far. For example, from knowing that informon A is about the *> concept of "gardening", better use can be made of which users' ratings are more relevant in the context of the infor mation in the informon. If user B's rating agrees with user D's rating of articles when the subject is about "politics", but B's ratings agree more with user D when informon A is 3S about "gardening", then the relationship between User B's ratings and User D's ratings are preferred to be emphasized to a greater extent than the relationship between User B and User C when making predictions about informon A. ioken for t known, information about the community and the informon, it is possible to determine the influence of user A's rating of an informon on the predicted rating of the informon for a common "mindset," Sw between user A and user B and 50 informon D, that may be expressed as: M^profilefA) X profile(B) X DocumentProfile(D). Second, a correlation may be taken between user A's past ratings and user B's past ratings with respect to informons that are similar to D. This correlation can be taken by weighting all informons E that A and B have rated in common by the similarity of E to D, SED: ^^ a correlation of each author under comparife funclion p f ^ |e ^ £ an'd usef B* q(^ auth ^fa the monotonically-increasing function of the J \ and fB M a sj ^ an * ^P * ^ ^ read P e constraints according to the * J when lhere has ^ nQ q{ ^ zefo user soffie amhor ^ h raicd ^ ^ Jn thjs cag J [ P intentionally avoiding reading inforsubsequent article that has been read but is not prepared bv the author. In general, the exact weighting function and parameters can bc empirically derived rather ,,)an ,heoretically derived, and so Ls chosen by (he limizalion of lh<; ovcran ralinj, prediction functions. Continuing in the present example, a correlation can be com led wilh the rollowin& W(.ighls for Ihe aulhors K, L a[)d ^ S£D=Weighled_Correlation(ratings(A),ratings(B)) Author Weight W^ = weight for rating pair (rating (A. £)). rating (B. DU • togpl ♦ 1) x log(l «l|x Gfnot author) -004 F(5.7. author or D) " log(5 + 1) x log(7 + 1) x G(Milhoi) Note that the "X" in the above equation may not bc a mere multiplication or cross-product, but rather bc a method for (he exact value is an increasing function H of the total articles read by a particular user so far, because it becomes more Each of the examples can be weighted by = DocumemProfile (E)x DocummtProfiletD) fc of whether the informon D is by the author or not can be; ^ second user, B. For example, where user A and user B have 45 read and rated in common a certain number of informons, the influence of user A's rating of informon D on the predicted rating of informon D for user B can be defined by a relationship that has two components. First, there can be a as follows: component of the member client and ^ £ for each f With regard to the aforementioned fourth, sixth and 40 seventh issues namely, making effective use of sparse, but vcctor m fi)cs f £ ^ P f - 0.70 US 6,314,420 131 19 20 FIO. 7 illustrates a preferred embodiment of a mind pool -continued Auihor M Weighi ; svstem 50°- U fe Preferred that aU users be members of |he T I(°K26)°x C(2 + i)« G(not amho,) " . o.O2 5 uppermost portion of the hierarchy, namely, the top mind pool 501. Mind pool 501 can be broken into sub-mindpools 502a-e, which separate users into those having at least some common interests. Furthermore, each sub-mind pool 502«-c can be respectively broken into sub-sub-mindpools 503a-o, I, is preferred that, ,1,,, irwiriihrn he used as the the logarithm be used as the m olonfi^^ used are H=log(sample_Ji2,.0.1) and an assumed rating, for 503c-<*, SO5e,f,g to which users 504a-g are respective members. As used herein, mind pool 501 is h p ^^s. As used_h«ein, mind pool SO^fethe^arent^ode those authors who arc unrated by a user, to the value of "2." respechve parent nodes to sub-sub-mindpools S03«-g Subpools 502a-c arc the child nodes to mind pool 501 and one-neuron neural network, or a linear function such as, (correlation* 1)*PO. Where the Po is an optimized parameter 502a, 502c, if such more closely matches their interests than would membership in a sub-sub-mind pool 503a-g. In nize that there are numerous methods that can be used to effect informon comparisons, particularly document com- hy autjjOr A l0 he interesting but nevertheless found other ^^^ by author A on "gardening" to be uninteresting may ing technique in conjunction with the cosine similarity A processing means or mind pool manager may be used The correlation for the author' SFI can be mapped to a sub-pools 503a-g are child nodes to respective mindpools non-zero range, so that it can be included in the cosine 503a-c. Sub-pools 503a-g can be considered to be end similarity metric. This mapping can be provided by a simple 15 nodes. Users 505a,i> can be members of sub-mind pool used to produce the predicted ratings with the lowest error general, the objective is to break down the entire population in the given domain for filtering. of users into subsets that are optimally similar. For example. An artisan skilled in information retrieval would recog- 2Q meseiofuserswho find the same articles about "gardening" parisons. One preferred method is to use a TF-1DF weight- metric. SH including author, can be handled by including be . [ned fa one subse, P management of each of the mindpools 501, ponent of the relationship between user A's and user B's can child-node mind pool managers and from those users simple additive function, a product function, or a complex manager's parent node, if such exists; (3) receiving estima- straints encountered in the application. Optimization of the exists; and (4) making estimations of the mind pool con- be combined to produce the function to predict the rating of coupled directly to the manager; (2) passing rating informainformon D for user B. The combination function can be a 30 lion or compiled statistics of the rating information up to the function, including, for example, a neural network mapping function, depending upon computational efficiency con- tions of the mind pool consensus on the rating for an informon from the manager's parent mind pool, if such combination function can be achieved by minimizing the 35 sensus on the rating for a specific informon for the users that predicted rating error as an objective. In addition to determining the relationship between two user's ratings, a relationship that can be used and combined across a large population of users can be developed. This ibl h fid fi laborative input. Specifically, the difficuly pyg user rating relationship across a large population of users is compounded by the lack of a priori knowledge regarding a come under the manager's domain; and (5) passing the estimations from function 4 down to either a child-node mind pool or, if the manager is an end node in the hierarchy, to the respective user's CWF, for producing the user's ditd ti Function 4 also can include combining the y size, standard deviation, etc. Furthermore, as alluded to above, users can be allowed to belong to more than one mind large volume of dynamically changing information that may 45 pool if they don't fit precisely into one mind pool but have have unexpected correlation and therefore grossly violate multiple views regarding the conceptual domain of the to the aforementioned "community" or may instead be one managers) first decide whether (he rating will effect its independence assumptions. informon. Also, it is preferred that lateral communication be In one embodiment of the present invention, it is preferred procided between peer managers who have similar users that users be broken inlo distributed groups called "mindbeneath them to share estimation information. When a rating pools." Mindpools can be purely hierarchical, purely so comes in from a user, it can be passed to the immediate parallel, or a combination of both. Mindpools can be similar manager(s) node above that user. It is preferred that the of many subcommunities. These multiple hierarchies can be current estimation or whether the statistics should be passed used to represent different qualities of an article. Some upward 10 a parent-node. If the manager estimation would qualities that can be maintained in separate hierarchies 55 change by an amount above an empirically-derived miniinclude: interestingness; credibility; funniness; valuablemum threshold, then the manager should pass that estimaness; writing quality; violence content; sexual content; protion down to all of its child-nodes. In the event that the fanity level; business importance; scientific merit; ariislic compiled statistics arc changed by more than another miniquality; dramatic appeal; entertainment value; surprise or mum threshold amount, then the compiled statistics should unexpectedness of information content; Irendiness or impor- 50 be passed to the manager's parent-node, if any ,and the lance to future directions; and opinion agreement. Each of process recurses upward and downward in the hierarchy, these qualities can be optionally addressed by users with a Because no mind pool manager is required to have rating feedback mechanism and, therefore, these qua lilies accurate information, bin just an estimation of the rating and can be used to drive separate mind pool hierarchies. Also, an unccriainly level, any manager may respond with a the qualities can be used in combinations, if appropriate, to 65 simple average of all previous documents, and with a higher develop more complex composite informon qualities, and degree of uncertainty, if none of its child-nodes has any more sublime mindpools. rating information yet. The preferred distributed strategy US 6,314,420 Bl 22 21 methods described above. Approximations can be made by pre-computing all terms that do not change significantly, based on the particular informon, or the subset of actual tends lo reduce the communication needed between processors, and the computation tends to be pooled, thereby eliminating a substantial degree of redundancy. Using this distributed strategy, Ihe estimations tend to settle to the ratings given so far to the mind pool manager. As stated previously, the correlated-feature error- extent that the updating of other nodes, and the other users correction units (CFECUs) are intended to detect irregulari ties or statistical exceptions. Indeed, two objectives of the CFECU units are to (1) find non-linear exceptions to the general structure of the three aforementioned types of inputs (SFI, UFI, and CI); and (2) find particular combinations of informon sub-features thai statistically stand out as having special structure which is not captured by the rest of the general model; and (3) trigger an additional signal to the CFECU's conditions are met, in order to reduce prediction error. The following exemplifies the CFECU operation. predictions are minimized. Therefore, as the number of informons and users becomes large, the computation and prediction updates grow as the sum of the number of informoas and the number of users, rather than the product of the number of informons and the number of users. In addition, incremental updates can be accomplished by the passing of estimations up and down the hierarchy. Incre mental updates of rating predictions continue to move until the prediction becomes stable due to the large sample size. 10 The distributed division of users can reduce the effects of 15 independent assumption violations. In the previous example with the B-Team often users, the B-Team can be organized as a particular mind pool. With the additional ratings from each of the B-Team members, the estimation from the B-Team mind pool typically does not change significantly because of the exact correlation between the members of that mind pool. This single estimation then can be combined with other estimations to achieve the desired result, regard less of how many B-Team members have read the article at 20 any given time. 25 The mind pool hierarchies can be created by either computer- or human-guided methods. If the hierarchy cre ation is human-guided, there often is a natural breakdown of people based on information such as job position, common interests, or any other information that is known about them. 30 Where the mind pool hierarchy is created automatically, because the previously described measure of the collabora tive input relationship between users can be employed in a standard hierarchical clustering algorithm to produce each group of users or nodes in the mind pool hierarchy. Such 35 In this example, it is desired that author A's informon D standard hierarchical clustering algorithms can include, for about gardening have a high predicted rating for user B. example, the agglomerative method, or the divide-andHowever, because the average rating for author A by user B conquer method. A skilled artisan would recognize that is only 1.69, and the average rating for the gardening many other techniques also are available for incrementallyadjusting the clusters as new information is collected. 40 concept is only 1.68, a three-part model (SFI-UFI-C1) that does not evaluate the informon features in combination Typically, clustering is intended to (1) bring together users would tend to noi rank informon D very highly. In this case, the first CFECU would firsl find sources of error in past examples. This could include using the three-part model pendent among one another. Estimations are made in a manner similar to other esti 45 against the known examples that user B has rated so far. In this example, seven articles that user B has rated, have an mations described herein. For example, for each user or average rating of 4.5, though even the three-part model only sub-mind pool (sub-informant), a similarity between the predicts a rating of about 1.68. When such a large error sub-informant and the centroid of the mind pool can be appears, and has statistical strength due to the number of computed in order to determine how relevant the sub- whose rating information is clearly not independent; and (2) produce mind pool estimations that are substantially inde informant is in computing the estimation. Uncertainty esti- 50 examples with the common characteristics of, for example, mators also are associated with these sub-informants, so that the same author and topic, a CFECU is created to identify that this exception lo ihe three-part model has been triggered and that a correction signal is needed. Second, it is preferred they can be weighted with respect to their reliability in providing the most accurate estimation. Optionally, the informon under evaluation can be used lo modulate the relevancy of a sub-informant. 'Iliis type of evaluation also can take advantage of the two previously-determined col laborative information relationship components, thereby tending to magnify relationships that are stronger for par ticular types of informons than for others. Once a suitable set of weights are established for each user within a mind pool for a particular informon, a simple weighled-avcrage can be used to make the estimation. It is preferred that the "simple" weighted average used is more conservative regarding input information that a simple independent linear regression. Also, the overall Uncertainty can be derived from Ihe Uncertainty Prcdiclions of the sub-informants, in a manner similar lo the produclion of other uncertainty combination 55 to index the new CI-'LCU into a database so thai, when triggering features appear in an informon, for example, author and topic, the correction signal is sent into the appropriate CWF. One method which can be used to effect the first step is a cascade correlation neural network, in which the neural net finds new connection neural net units 65 to progressively reduce the prediction error. Another method is to search through each informon lhai has been rated but whose predicted raling has a high error, and storing Ihe informons profile. When "enough" informons have been found with high error and common characteristics, ihe common characteris tics can be joined together as a candidate for a new CFECU. Ncxi, the candidate can be tested on all Ihe samples, whether US 6,314,420 Bl 23 24 they have a high prediction or a low prediction error regular search engine results. As shown in the logic diagram of Fl G. 7, a user provides a query as indicated by block 20C. The query is applied to a Lookup Table, as indicated by block 22C, block 24C applies a test to determine from the users'profiles, they also can be added to a database of engine. mation of the predicted rating from a particular CFECU can be made by taking the average of those informons for which the CFECU responds. Also, the Uncertainty can be chosen such that the CFECU signal optimally outweighs the other tion of previous users' feedback data. By analyzing docu associated with them. Then, the overall error change (reduction or increase) for all of the examples can be computed to determine if the CFECU should be added to the informon profile. If the estimated error reduction is greater 5 table whether a wire already exists for the new query. If so, block 26C returns results from the existing wire. Otherwise, than a minimum threshold level, the CFECU can be added block 28C commands a demand search by a regular query to the profile. As successful CFECU are discovered for CFECU's that may be useful for analyzing other profiles. If With the use of wire search returns, each user can review a particular CFECU has a sufficiently broad application, it 10 the returned results and provide feedback data about can be moved up in the filtering process, so that it is reviewed documents. Such feedback data is incorporated in computed for every entity once. Also, the particular CFECU the filter profiles used in processing informons for the wire. can be included in the representation that is computed in the Therefore, when a future user makes substantially the same pre-processing stage as a new feature. In general, the esti query, the wire will have been improved by the incorpora ments which users rate as meeting a particular quality such as interestingness, the system can find common document features which can be used to return more like documents to signals being sent to the CWF. One method of selffuture users who make substantially the same query. optimization that can be employed is, for example, the Alternatively, all queries applied to a search engine sys gradient descent method, although a skilled artisan would 20 tem of the invention can set up new wires. After a search recognize that other appropriate optimization methods may query is presented to the search engine system, a wire is be used. created on the basis of the query terms, and all new The invention of this continuation-in-part application, as documents subsequently received from the network are shown in FIGS. 8 and 9, provides a collaborative and preferably adaptive search engine system in which elements 25 filtered by the new wire. A push-model may be used to send all passed, new documents to the user. of the structure and principles of operation of the apparatus of FIGS. 1-7 are applied. Accordingly, a search engine system of the invention, as preferably embodied, integrates collaborative filtering with adaptive content-based filtering to provide improved search engine performance. The acro nym "CASE" refers to a search engine system of the invention, i.e., a collaborative, adaptive search engine. In the operation of conventional search engines at portal web sites., user queries are searched on demand to find relevant informons across the web. Content-based filtering is typically used in measuring the relevacy of informons, and the search results are resented in the form of a list of Among other basic search engine system structures, an integrated system can be employed in which collaborative and content-based filtering is structured to provide demand 30 searches with or without collaborative filtering, or wire searches. In the operation of the preferred basic structure and other basic structures, a query processor can be employed, if needed, to make search-type assignments for user queries. Generally, basic search engine system structures of the 35 invention are preferably embodied with the use of a pro grammed computer system. Collaborative fillering employs additional data from other users to improve search results for an individual user for whom a search is being conducted. The collaborative data informons ranked by relevancy. The present invention combines collaborative filtering with content-based filtering in measuring informons for relevancy, and further preferably applies adaptive updating 40 of the content-based filtering operation. In providing these INTEGRATED COLLABORATIVE/CONTENT-BASED results, the invention can be embodied as a search engine system in accordance wilh different basic structures. In the presently preferred basic structure, an integrated collaborative/conlent-based filter (FIGS. 1-7) is operated to provide ongoing or continuous searching for selected user queries, with a "wire" being established for each query. On the other hand, a regular search engine is operated to make can be feedback informon rating data, and/or it can be content-profile data for agent mind melding which is more fully disclosed in Ser. No. 09/195,708 now pending, entitled FILTER STRUCTURE EMPLOYING SELECTIVELY 45 SHARED, CONTENT-BASED PROFILE DATA TO EVALUATE INFORMATION ENTITIES IN A MASSIVE INFORMATION NETWORK, filed by the current inventors on Nov. 19, 1998, and hereby incorporated by reference. Many types of user rating information can be used. For immediate or short-term "demand" searches for other user 50 example, users can sort documents which they have read from best to worst. Alternatively, users can select on a scale (numeric, such as 1 to 10, or worded, such as good, medium, poor) how much they enjoyed reading a document. Further, user monitoring can measure time spent by users on each for an input query. Otherwise, wire search results are 55 document, thereby indicating user interest (normalized by document length). Among other possibilities, the choices of returned if a wire does exist, or collaborative ranking data documents for reading by other users can be simply used as can be applied from the wire filler structure to improve the an indication of interesting documents. In all cases, the results of the demand search from the regular search engine. feedback rating data can be based on inlerestingness or any In the currently preferred embodiment, wires are created queries on the basis of content-based filtering. This basic structure of the invention is especially beneficial for use in applying the invention to existing search engine structure. Demand search results can be returned if no wire exists for the most common queries received by the search engine 60 of a variety of other document qualities, as described in connection with FIGS. 1-7. system. A suitable analysis is applied to the search engine Feedback ranking information can be used in a number of ways, and the invention is not limited by the method of and respective wires arc then created for each of these feedback information use. Use methods range in spectrum queries. An analysis update can be made from time to time 65 from weighting relative ranks by a set amount (possibly to make wire additions or deletions as warranted. equally, possibly heavy weighting one above the other) to When a user makes a query for which a wire already dynamically adjusting the weight by measuring how statisexists, wire search results are preferably returned instead of operations to determine which queries are most common, US 6,314,420 Bl 25 26 tically significant the user feedback is. For example, if only character 48C2. Collaborative rating data is used in the However, if many people have consistently ranked an article applied in the demand search mode, to the extent that one person has ranked an article, it may not be significant. informon rating process in the wire search mode, and if the same, more credibility may be placed on the user's collaborative data is available for the informons in the weighting. 5 search return. Search results are returned to the users 34C FIG. 9 shows a generalized embodiment of the invention and 3fic from ,he ^^^ return processor 48C as shown in in which system elements in a CASE system 30C are ^ 9 integrally configured to provide wire and/or demand ' . searches A query processor 32C receives queries from an The invention .s preferably embodied as shown in FIG. 38C responds to the currently processed query to set a content-based filter structure 40C for wire search operation vidual user 62C and other users 64C and determines whether a wire already exists for each entered query. If a wire exists, individual user 34C and other users 36C. A mode selector 10 10. A query processor 60C receives queries from an indior demand search operation. In the preferred application of ,he query is routed to a collaborative/conlenl-based filter the invention, the wire mode is selected only if a wire struciure 66C like that of FIGS. 1-7. A spider system 68C already exists, and wires exist only for those queries found 15 continllousiy scaas a nctWOrk 70C for informons providing to be commonly entered as previously described In the ^ ,hreshold.level match for conlen, based profiles (i.e., demand search mode, the filter structure 40C can funct.on cessi files „ Ihe , lcvel o[ lhe preferred "e^vruiXme^beusedfordetennining mul.i-ieve, filter structure, a, leas, one of which reflects the demand search being made the first time a particular query memory 72C according to the wire or wires to which they is entered and with wire searches being made for subsequent entries of the same query. As another example, the user may belong. A feedback processor 74C is structured like the mind pool is desired, the user may select a wire search. integration with the content-based data in the measurement mode or demand search mode, and employs content-based profiles 42C in content-based filtering preferably multt- slmcme like ,hal of FIG. 6 is employed for this purpose, AdUve feetlback dala ^ appiied from thc USers to the filter whether a wire search or a demand search is made. For 20 content profile of a current wire query). Informons which are example, every query can call for a wire search, with a passed by the filter 66C for existing wires are stored in a select a demand search, or, if continuing network searching 25 SyStcm of pjg 7 t0 provide collaborative feedback data for The filter structure 40C operates in its set wire search Sies iTrj\^^;^r^2i: of infonnon K{cvAncy by ,hc fi|,cr 66C. An informon rating evaluation, feedback data from users respectively associated ouslv in demand profile data agent mind A spider by a spider system 78C in a demand search of the network 70c. The spider system 78C can have its own memory system 78CM as considered in connection with the spider 45c of FIG. 9. therewith. These profiles are used by the filter structure 40C If no wire exists for a currently input query, the query is in wire searches in the wire mode. sent to a regular search engine where a content profile is Demand profiles 42C2 are used by the filler structure 40C 35 established for content based filtering of informons returned searches in the demand mode. Collaborative can be integrated with the wire profiles through melding 43C as previously explained. system 46C scans a network 44C to find infor- mons for a current demand search, and to find Mormons 40 with continued network scanning for existing wires. In selecting available mfonnons for return, the sp.der system 46C uses a content threshold derived from the content-based Once fi,(eri which is performed on returned informons, those e a satisfacl ' ^ H.. , . . match ,0 the query ^ h a ^^ fmm re for profile for which an informon search is being conducted. prutcssoi »^' . it. 46C have a memory system 46CM which holds an informon data base wherein index information is stored from infor- demand search memory 82C ind,cates that the current query has been made over time with sufhcienl frequency to qualify P In many instances, it s preferable that the spider system 45 current query for which a demand search, was made fa mons previously collected from the network. In this manner, as a "common" query for which a wire is justified. As demand searches can be quickly made from the spider indicated by dashed connector line 801D, collaborative memory 46CM as opposed to making a time consuming 50 feedback data can be, and preferably is, integrated into the search and downloading in response to a search demand demand search processing by the processor 80C. query from the search engine. A search return processor 48C receives either demand search informons or wire search informoas passed by the Many alterations and modifications may be made by those having ordinary skill in the art without departing from the spiri( and xope of ,he jnvcnljon. Therefore, it must be content-based filter structure 40C according to the operating 55 undel5|ood ihal the illustrated embodiments have been set mode of the latter, and includes an informon rating system which is like lhat of FIG. 6. The informon rating system combines content-based filtering data with collaborative forth Q. for |he be (akc[) as ,„ oses of exampie> and that it should not |hc invemion as^^ b the following lhefeforC) ,0 ^ rea(] ,0 feedback rating data from users through a feedback proces,he combination of elements which are sor SOC a. leas, ,n the wire search mode and, ,f desxred, ■„ « ^^ ^ ^^ ^ ^.^ ^^ fw ^^ in rwirflearch mode, the processor 48C rates infor- mons on a continuing basis as they arc received from the -"stantiaHy the same function in substantially the same way to obtain substantially the same result, llie claims are network 44C through the spider sysicm 46C as indicated by ihus to be understood to include whai is specifically illusthc reference character 48C1. In the demand search mode, 65 traled and described above, whai is conceptually equivalent, the processor 48C rates informons returned by the spider and also what incorporates the essential idea of the mvensyslcm 46C in a demand search as indicalcd by the reference lion. US 6,314,420 Bl 28 27 10. A search engine system comprising: a system for scanning a network to make a demand search TABLE 1 Article Author for informons relevant to a query from an individual Rating given user; a content-based filter system for receiving the informons from the scanning system and for filtering the infor mons on the basis of applicable content profile data for relevance to the query; and a feedback system for receiving collaborative feedback data from system users relative to informons consid 10 ered by such users; the filter system combining pertaining feedback data from the feedback system with the content profile data in filtering each informon for relevance to the query. 11. The system of claim 10 wherein adaptive user feed What is claimed is: 1. A search engine system comprising: back data is applied to the content-based filter to provide a learning component for content profile data employed a first system for receiving informons from a network on a continuing search basis, for filtering such informons 30 therein. 12. The system of claim 10 wherein: for relevancy to a query from an individual user, and for the scanning system further scans the network on a storing a ranked list of relevant informons as a wire; continuing basis to make a wire search for informons a second system for receiving informons from a network relevant to wire queries from system users; and on a current demand search basis and for filtering such informons for relevancy to the query from the indi 35 vidual user; and data in filtering each wire informon for relevance to a third system for selecting at least one of the first and second systems to make a search for the query and to applicable wire query. 13. The system of claim 10 wherein the collaborative feedback data comprises active feedback data. 14. The system of claim 10 wherein the collaborative feedback data comprises passive feedback data. 15. The system of claim 14 wherein the passive feedback data is obtained by passively monitoring the actual response return the wire or demand search results to the indi vidual user. 2. The system of claim 1 wherein the third system selects the first system to make a wire search only if a wire already exists for the query. 3. The system of claim 1 wherein: a feedback system is provided for receiving collaborative feedback data from system users relative to informons to a proposed informon. 16. The system of claim 10 wherein the collaborative feedback data comprises a combination of active feedback considered by such users; and at least the first system combines pertaining data from the feedback system with content profile data of the first system in filtering each informon for relevance to the query and inclusion in the wire. 4. The system of claim 3 wherein the first system includes a multi-level, content-based filter having descending levels including at least an upper preprocessing level, a middle user community level, and a bottom user level. 5. The method of claim 3 wherein the collaborative data and passive feedback data. 50 from an individual user; 55 6. The method of claim 3 wherein the collaborative feedback data comprises passive feedback data. 7. The method of claim 6 wherein the passive feedback data is obtained by passively moniloring the actual response to a proposed informon. users; a system for controlling the operation of the filtering system to filter for one of a wire response and a demand response and to return the one response to the user; and the filtering system combining pertaining feedback data from the feedback system with content profile data in determining the relevancy of the informons for inclu sion in at least a wire response to the query. 18. The system of claim 17 wherein: the content-based filtering system includes 8. The method of claim 3 wherein the collaborative feedback data comprises a combination of active feedback data and passive feedback data. of content profile data employed therein. 17. A search engine system comprising: a content-based filtering system for receiving informons from a network on a continuing basis and for filtering the informons for relevancy to a wire or demand query a feedback system providing feedback data from other feedback data comprises active feedback data. 9. The system of claim 1 wherein adaptive user feedback data is applied at least to the first system to provide updating the filter system combines pertaining feedback data from the feedback system with applicable content profile 65 a collaborative/content based filter for filtering infor mons for relevancy to a wire query on a continuing basis; and US 6,314,420 Bl 30 29 30. A method for operating a search engine system the content-based filtering system includes a regular comprising: search engine for filtering informons for relevancy to a receiving informons in a content-based filtering system demand query. from a network on a continuing basis and filtering the informons for relevancy to a wire or demand query 19 The system of claim 18 wherein adaptive user feed back data is applied at least to the collaborative/content- based filter to provide learning for content profile data employed therein. from an individual user; . 20. The search engine system of claim 17 wherein the feedback system provides active feedback data. 21. The search engine system of claim 17 wherein the providing feedback data from other users; controlling the operation of the filtering system to filter for one of a wire response and a demand response and 10 to return the one response to the user; and feedback system provides passive feedback data. combining pertaining feedback data with content profile 22. The search engine system of claim 21 wherein the passive feedback data is obtained by passively monitoring data in the filtering system in determining the relevancy of the informons for inclusion in at least a wire the actual response to a proposed informon. 23 The system of claim 17 wherein the feedback system 15 response to the query. provides a combination of active feedback data and passive feedback data. 24. A method for operating a search engine system comprising: receiving informons in a first system from a network on 20 31. The method of claim 30 wherein the step of providing feedback data comprises providing active feedback data. 32. The method of claim 30 wherein the step of providing feedback data comprises providing passive feedback data. a continuing search basis, for filtering such informons 33. The method of claim 32 wherein the passive feedback data is obtained by passively monitoring the actual response from at least one of the other users to a proposed informon. for relevancy to a query from an individual user and for storing a ranked list of relevant informons as a wire; 34. The method of claim 30 wherein the step of providing receiving informons in a second system from a network ^ feedback data comprises providing a combination of active on a current demand search basis for filtering such feedback data and passive feedback data. informons for relevancy to the query from the indi 35. A search engine system comprising: vidual user; and selecting at least one of the first and second systems to make a search for the query and to return the wire or ^ demand search results to the individual user. 25. A method for operating a search engine system comprising: scanning a network to make a demand search for infor mons relevant to a query from an individual user, 35 receiving the inforraons in a content-based filter system means for receiving informons from a network on a continuing search basis, for filtering such informons for relevancy to a query from an individual user, and for storing a ranked list of relevant informons as a wire; means for receiving informons from a network on a current demand search basis and for filtering such informons for relevancy to the query from the indi vidual user; and means for selecting at least one of the first and second systems to make a search for the query and to return the from the scanning system and filtering the informons on the basis of applicable content profile data for relevance to the query; wire or demand search results to the individual user. 36. A search engine system comprising: relative to informons considered by such users; and means for content-based filtering informons received receiving collaborative feedback data from system users 40 from a network on a continuing basis for relevancy to a wire or demand query from an individual user; combining pertaining feedback data with the content profile data in filtering each informon for relevance to the query. . 45 26. The method of claim 25 wherein the collaborative feedback data comprises active feedback data. 27. The method of claim 25 wherein the collaborative feedback data provides passive feedback data. 28. The method of claim 27 wherein the passive feedback _q data is obtained by passively monitoring the actual response to a proposed informon. 29. The method of claim 25 wherein the collaborative feedback data comprises a combination of active feedback data and passive feedback data. means for collecting feedback data from other users; means for controlling the operation of the filtering means to filter for one of a wire response and a demand response and to return the one response to the user; and the filtering means combining pertaining feedback data from the feedback system with content profile data in determining the relevancy of the informons for inclu sion in at least a wire response to the query.

Disclaimer: Justia Dockets & Filings provides public litigation records from the federal appellate and district courts. These filings and docket sheets should not be considered findings of fact or liability, nor do they necessarily reflect the view of Justia.


Why Is My Information Online?