Selene Communication Technologies LLC v. Google Inc.

Filing 1

COMPLAINT filed with Jury Demand against Google Inc. - Magistrate Consent Notice to Pltf. ( Filing fee $ 400, receipt number 0311-1503856.) - filed by Selene Communication Technologies LLC. (Attachments: # 1 Exhibit A, part 1, # 2 Exhibit A, part 2, # 3 Exhibit A, part 3, # 4 Exhibit A, part 4, # 5 Civil Cover Sheet)(cla, )

Download PDF
US 6,363,377 Bl 7 8 Content lens 520 first eliminates stop words, such as "a", search engine returns a large set of URLs, one may want to "an", "the", "to", etc., and then tries to partition the docurestrict it to pages that were visited only last week, or to ments by common sentences, phrases or words. As an pages that have been bookmarked, or to a smaller set of example, content lens 520 finds 40 documents that contain URLs that are relevant, based upon some specific criteria not the term "Home allendale bayonne belleville bergenfield 5 previously captured in the search engine index. Sometimes bloomfield butler" corresponding to a cluster in cell 522. In it is desirable to filter the output to exclude pages to which cell 524, content lens 520 finds 20 documents that contain a user should not have access. Referring to FIG. 2B, there is the term "top business and economy companies restaurants shown the details of the dynamic filter processor in step 24 organizations". In cell 526, content lens 520 finds 20 docuof FIG. lA. The dynamic filtering, based on a dynamic set ments that contain the term "cape may county". Contact lens 10 of URLs, is used to restrict the results of a search query. In 520 finds 20 documents that have no patterns in cell 528. the exemplary embodiment of the invention, the dynamic set Edges 554 indicate that documents clustered in cell 512 are of URLs can be determined explicitly by the user from a user exactly the same documents corresponding to cell 522. profile as shown in step 57 of FIG. 2B, or, in general, from Edges 556 indicate that the documents found in cell 514 are information stored in other information management sysexactly the same as the documents clustered in cell 524. FIG. 5 also shows Age legs 530. Age lens 530 clusters on 15 terns. Once the profile is accessed, then in step 59 the URL's may be filtered and the results are displayed in step 61. the documents' date of last update. Age lens 530 partitions FIG. 7 is a Venn diagram which illustrates a dynamic the 500 documents found into 4 clusters corresponding to filtering operation. In FIG. 7, area 710 defines Sl as the set cells 532, 534, 536, and 538. Cell 532 shows 40 documents of URLs of pages and their summaries, returned by the which were updated on Mar. 4, 1997. Cell 536 finds 30 documents updated on 1997. Cell 536 finds 20 documents 20 search engine, in response to a query for text in its own updated on 1996. Cell 538 finds 10 documents updated on information management system. Let n be the number of 1995. URLs in Sl. There are also n summaries corresponding to FIG. 5 shows the window of clusters results from which each URL of Sl. Area 712 defines S2 as the set of URLs that the user may select any number of cells from Lenses 500, is dynamically generated by the user or by a query external 510, 520 and 530. The system, then, displays the documents 25 to the search engine. Let m be the number of URLs in S2. that satisfy the conditions in all the selected cells. For There are no summaries associated with these URLs. Sl and example, as shown in FIG. 6 which is a graphical display of S2 are likely to have URLs in common. The filter returns the a selected search result of FIG. 5. In FIG. 5, the user chooses URLs in the intersection of Sl and S2 and the respective cell 526 corresponding to "Cape May County", cell 534 summaries corresponding to area 714. corresponding to the year 1997 and cell 536 corresponding 30 The dynamic filtering of the present invention may to year 1996. As a result, the system displays 15 documents improve upon the poor performance of other filtering techthat contain "Cape May County" and were last updated in niques which typically involve multiple disk fetch and store 1996 or 1997. The documents found are clustered and operations and several sorting steps. The performance of a fetch and store filtering technique of this type would be displayed in cell 600. Cell 602 indicates that 10 of the documents found were updated in 1997. Cell 604 indicates 35 O((m+n)log(m+n)). In contrast, the dynamic filtering of the that 5 out of the 15 documents found were last updated in present invention uses hashing and associated arrays in an 1995. intelligent fashion and has a performance of O(m+n). First, for set S2, an associative array is setup as shown in In another embodiment of the invention, the set of keyTable 1. words from the search query are used to rank the documents returned by the search engine. The more keywords that 40 Table 1 appear in a document, the higher the document is ranked. The results organizer outputs results from the highest ranked Flag[url_l]=l; to the lowest ranked. Flag[url_2]=1; ... The benefit of clusters to the user is that the clusters may contain all items of interest or duplicate items. In the first 45 Flag[url_m ]=1; In table 1, "url_n" represents a hash address generated from case, only items in the cluster need be reviewed by the user. While in the second case, only one item from the cluster a particular URL. This process takes m steps. The Flag array needs to be reviewed by the user prior to rejecting all the indicates the URLs to be included from Sl. other documents in the cluster. The "Vote" column in the As the search engine starts returning URLs of set Sl, a display allows the user to indicate the relevancy of a cluster 50 hash index into the associative array is generated from the to his informational needs. If the user votes positively on a URL and a check is performed to see if the corresponding cluster, the system can use the documents in the desired Flag is set. Only the URLs with the corresponding Flag set and their summaries are provided as the output URL's of the cluster (referred to hereafter as good documents) to recluster dynamic filter. Everything else is ignored. After the URL is the remaining documents, giving a higher weight to documents that are similar to the good documents. If the user 55 provided as an output URL, its Flag is reset to 0, to ensure votes negatively on a cluster, the system can use the docuthat the same URL is not presented again. This is performed ments in the undesired cluster (referred to hereafter as bad sequentially and hence takes n steps. The entire algorithm is therefore O(m+n) since lookup using a hash table in an documents) to recluster the remaining documents, giving a lower weight to documents that are similar to the bad associative array is 0(1). The result of this process is the documents. 60 intersection of Sl and S2, with the associated summaries All clusters which receive a yes vote are saved along with from Sl. Duplicate URLs, if returned by the search engine the query in a search context folder. A user has the ability to are eliminated. find a query and its results by either browsing the search FIG. 8 shows the architecture of an exemplary implemencontext folders or doing a keyword based search for them tation according to the third embodiment of the present among all the search context folders. 65 invention. The implementation uses commercially available tools such as a Harvest search engine, an Informix database Often times, it is desirable to filter the output of search and a dynamic filter implemented using JavaScript and Perl. engines, to prevent information from being displayed. If the US 6,363,377 Bl 9 10 As an example, a particular implementation is the one of a of the query is precisely the process that a user currently performs, with no help from the search engine, in order to student wanting to search pages of an Internet course that he refine the user's search. A query tuner according to the is taking, but limiting it to the pages he visited last week. present invention offers the user a helpful guide in the user's The Student is presented with a query page 810. Query page 810 is a form where the student user inserts the text to 5 quest by taking the user's query and generating a small number of additional queries according to a query hierarchy. search and specifies to limit the search only to pages he The user's query and the additional queries are forwarded to visited last week. When the form is submitted, the query is the search engine. The query tuner, then, evaluates the split in two other queries 812 and 814. Query 814 goes to the results of all the queries and suggests possible query reforInformix database 816 which has tracked the student's 10 mulations to the user, together with the expected number of navigation in the course. The Informix database 816 genermatching documents each reformulation would yield. ates a dynamic set 822 of URLs (S2) which represents the Since each search engine has its own query language, the pages he visited last week. Dynamic set 822 of URLs S2 is query tuner, as shown in FIG. 2A, is defined for an abstract then passed to dynamic filter 824, which sets up an assoquery language that can easily be mapped to any particular ciative array of flags as explained above, with reference to 15 engine's language. In most search engines, there are two FIG. 7. types of data that the user can input: content that is to be Query 812 goes to the Harvest search engine 816 which matched to the text of documents, and structure, or metais configured to index all the pages of the course. Upon data, related to each document that represents conditions to receiving query 812, the Harvest search engine 816 starts to be satisfied in order for a document to be considered a return pages from its index. The pages returned are all of the pages for the course no matter when the student visited the 20 match. For example, in the AltaVista query shown in Table 2. course pages, since student access information is not stored in the search index. The Harvest search engine 816 also Table 2 returns pages that the student has not yet visited. The output title: "cryptographic protocols" set 818 from Harvest Sl, is processed against the Flag array English language by dynamic filter 824 and the intersection is returned. The 25 dated after Jan. 1, 1997 algorithm also filters out duplicates. Dynamic filtering according to the present invention, can For this example, content is "cryptographic protocols" and be implemented anywhere one desires to restrict/filter one meta-data is all the other information, consisting of the set of URLs by another set of URLs. This may be desirable, requirements that the content appear in the title of the for example, for security reasons, when a company wants to 30 document, that the document's language be English, and that restrict access to its resources based on the employee's the document be dated after Jan. 1, 1997. The query requests identity. Different employees may have associated with all pages that have the keyword "cryptographic protocols" in different lists of URLs that they have permission to read. the title, that are written in the English, and that are dated When a search is performed, the company can insure that after Jan. 1, 1997. only those URLs that are not restricted for the particular user 35 In addition, to the types of data, there is an implicit or are presented. This addresses a common problem that search explicit Boolean operation to be performed on the different engines have: returning summaries of pages to which a user parts of the query. In the above query, there is an implicit "AND" operator among all of the query parts. In other does not have access. Usually, search engines return sumwords, implicitly the query specifies that the phrase "crypmaries of protected pages, even if the user is restricted access to those pages via web server password protection or 40 tographic protocols" appear in the content of the document AND that this match be in the title AND that the document's other mechanisms. Adult related material can also be filtered language be English AND that the document be dated after the same way. Internet service providers could use dynamic Jan. 1, 1997. A query may be formally defined to be a filtering to prevent children from searching adult web sites Boolean expression Q=(q;) op (q), where op is a Boolean and newsgroups. Personalizing search engines in an educational environment can be taken to higher levels. A student 45 operation from the set {AND, OR AND NOT, BEFOREw, can associate various topics with the visited pages and then, NEARw}, and each of q;, qj, is either a Boolean expression, search the database based on URL's that are associated with or of the form (m:k), where m is a (possibly empty) meta-data quantifier and k is a (possibly empty) keyword. a particular topic. A teacher may also mark certain pages as The Boolean operators AND, OR and NOT have the stanbeing essential for a final exam, or as potential topics for independent study. Then searching could be restricted to this 50 dard meanings; x NEARw y is TRUE if an only if x appears within w words of y; x BEFOREw, y is TRUE if x appears set of pages only. For each component of the user's query, there are a few at most w words before y. A document satisfies or matches a query if it satisfies the natural ways in which the query may be either restricted Boolean expression Q, where satisfying (m:k) means that further or relaxed further to potentially produce either fewer results or a greater number of results. This concept is best 55 the keyword k satisfies (or appears in) meta data m. A illustrated by an example. Suppose the user's query specifies keyword k can be a single word, a phrase, or a word with some wildcard characters. Meta-data can be any structural that the phrase "cryptographic protocols" appear as a headinformation such as title, heading, URL, domain, filename, ing in the document. A more restrictive, or stricter, query would require that "cryptographic protocols" be part of the file extension, date; it can also be a specification to the title, which would typically yield fewer results in compari- 60 quality of the match required. For example, an approximate son to the user's query. A less restrictive, broader query match meta-data operator (approxa y) evaluates to TRUE for would specify that "cryptographic protocols" appear anya phrase x if and only if x can be transformed into y (or vice where in the text, which would typically yield a greater versa) with at most d single-letter deletes, insertions or number of results in comparison to the user's query. Another substitutions; a synonyms operator (syn y) evaluates to way to relax the user's query is to require that the words 65 TRUE for a phrase x if and only if x is a synonym of y. "cryptographic" and "protocols" appear near each other in a A hierarchy forest may be defined on all the meta-data of heading, rather than as a phrase. Relaxation and restriction a query, where each tree contains meta-data that has certain US 6,363,377 Bl 11 12 relationship to other meta data. As shown in FIG. 9(a), for time to receive the results of just the user's query. Referring example, the structural meta-data of an H1ML document to FIG. 2A step 55, the next step performed by the query forms a natural hierarchical list from the tags that specify the tuner is to formulate the related queries. This process as well most prominent information in the document, i.e. title, to as how the results of the query aid the user is described tags that specify the least prominent information in the 5 below. document, i.e. text. In FIG. 9(a), requiring the keywords to The formulation of related queries according the query hierarchies is illustrated based on a sample query appear in the title 900 is more restrictive than requiring the keywords to appear in the title or level 1 heading 902. Cell Q=(( title;cryptographic) BEFO RE 1 ( title;protocols)) AND 902, in turn is more restrictive than requiring the keywords ((English language) AND (dated after Jan. 1, 1997)). The to appear in the title or level 1 or level 2 heading 904. An 10 term item is used to refer to any atomic part of the query: a even broader search query can be done with respect to cell meta-datum, a keyword or a Boolean operator. For example, Q contains the following set of items {title, cryptographic, 904, which adds the level 2 headings to the query in cell 902. The broadest most general query can be done with respect to BEFORE1 , title, protocols, AND, English language, AND, cell 906, which allows any text to be searched. dated after Jan. 1, 1997}. For each query item t, define h(t) In general, each hierarchy tree is ordered top to bottom 15 to be the node in the hierarchy forest corresponding to the from meta-data values that most restrict the query to values item t. Related queries consists of a set of queries, each of that least restrict the query. For example, numerous hierarwhich takes the original user query and modifies some items chies are appropriate for the date field, depending on the in it by either restricting or broadening them according to the hierarchy forest. The act of broadening (restricting) a query desired granularity. In the context of recent news stories, a daily granularity is appropriate as shown in FIG. 9(b). Cell 20 item t corresponds to using a descendant (an ancestor) ofh(t) in place oft within Q. 910 restricts the search to documents dated after Jan. 2, 1997 while cell 912 restricts the search to documents dated after For example, one set of related queries for our sample query Q is shown in Table 3 Jan. 1, 1997. A search with a cell 912 restriction is broader and encompasses the search with a cell 910 restriction. An Table 3 even broader search can be done with a cell 914 restriction 25 ((title:cryptographic) BEFORE 1 (title: protocols)) which includes documents dated after Dec. 31, 1996. On the other hand, in the context of general web pages, a yearly or (( <hl> or title: cryptographic) BEFORE 1 ( <hl> or title: biannual granularity is more relevant as shown in FIG. 9(c). protocols)) AND((English language) AND (dated after Cell 920 restricts the search to documents dated after Jul. 1, Jan. 1, 1997)) 1997 while cell 927 restricts the search to documents dated 30 ((title:cryptographic) BEFORE 2 (title: protocols)) AND after Jan. 1, 1997. A broader search can be done with respect ((English language) AND (dated after Jan. 1, 1997)) ((title:cryptographic) NEAR 1 (title: protocols) AND to documents dated after Jul. 1, 1996 in cell 924. As with the meta-data, there is a hierarchy for the key((English language) AND (dated after Jan. 1, 1997)) words. For example, as shown in FIG. lO(a), the top of the ((title:cryptographic) BEFORE 1 (title: protocols)) AND hierarchy is represented by cell 1010 and "keyword" corre- 35 (dated after Jan. 1, 1997)) sponds to the most restrictive search query. Second on the ((title:cryptographic) BEFORE 1 (title: protocols)) AND hierarchy is cell 1012 corresponding to a broader search that (English language)) can be done with the "all the English stemmings of key((title:cryptographic) BEFORE 1 (title: protocols)) AND word". Cell 1014 is at the bottom of the hierarchy and (English language) AND (dated after Jan. 1, 1997) corresponds to the broadest search query related to "key- 40 Where <hl> represents the main index level or the highest level heading in the HTML of the page. word or any of its synonyms". Finally, a hierarchy on the Boolean operators that form the The exemplary tree shown in FIG. lO(b) indicates that the query Q is defined as follows. For a single-word keyword search can be contracted or restricted by moving up the tree. with or without wildcards, the hierarchy is shown in FIG. In addition, it indicates that the search can be expanded by lO(a). When the keyword is a phrase, it is converted into a 45 moving down the tree. For example, a search limited to x Boolean expression to which the Boolean hierarchy applies. AND y according to cell 1040 can be restricted by moving up the tree and searching according to cell 1050 where the More specifically, ifk=w1 , w 2 . . . w,, and W; is the i-th word in the keyword phrase, m:k becomes (m:w1 ) BEFORE 1 search is restricted to x NEARn y. In contrast, the search can be expanded by moving down the tree and searching for only (m:w2 ) BEFORE 1 . . . BEFORE 1 (m:w,). Although not shown, the bottom-most node in each hierarchy is the NULL 50 x according to cell 1030 or searching for only y according expression. to cell 1035. The search can be further expanded by searchThese query hierarchies may be used to help the user ing for x OR y according to cell 1020. The generation of a set of related queries may be refine a given query more effectively. In the Internet's accomplished, for example, by holding all but one items of current state, the slowest operation for a user performing a search is the network delays in communicating with the 55 Q constant, while broadening or restricting the chosen item search engine. In a typical search session, the user formut of Q. The broadening (restricting) may be accomplished by lates a query, sends it to the search engine, waits some time, traversing any number of edges up (down) the hierarchy tree receives an answer, then reformulates the query and repeats from h(t). Since different edges in the hierarchy forest have the process. Some of the user's frustration comes from different restrictive/broadening effects on the query, it is having to pay for the network delay during each query 60 more efficient to traverse different number of edges in the tree for different items in the query. In the exemplary reformulation. The present invention cuts down the number of reformulation iterations used to find the relevant inforembodiment of the invention, the number of edges traversed is a fraction of the height of the hierarchy tree. Formally, an mation. When the user poses a query, the browser generates a number of related queries and sends all the queries to the f-family is defined as a set of queries, each of which takes search engine in parallel. The time to receive the complete 65 the original user query and modifies a set {t1 t2 . . . ts} of results for the users query and just the number of matches for items in the query, where each modified item tj is replaced each of the related queries is asymptotically the same as the by a node that is exactly min{l, f*H} edges away from h(t)

Disclaimer: Justia Dockets & Filings provides public litigation records from the federal appellate and district courts. These filings and docket sheets should not be considered findings of fact or liability, nor do they necessarily reflect the view of Justia.


Why Is My Information Online?