Masterobjects, Inc. v. Microsoft Corp.
Filing 15
MICROSOFT CORPORATION'S CORRECTED ANSWER to Complaint with Jury Demand, COUNTERCLAIM against Masterobjects, Inc. by Microsoft Corp. (Attachments: # 1 Exhibit A) (Kalay, Leeron) (Filed on 8/1/2011)
US005805911A

United States Patent [19]
Miller

[11] Patent Number: 5,805,911
[45] Date of Patent: Sep. 8, 1998

[54] WORD PREDICTION SYSTEM

[75] Inventor: John W. Miller, Kirkland, Wash.

[73] Assignee: Microsoft Corporation, Redmond, Wash.

[21] Appl. No.: 382,074

[22] Filed: Feb. 1, 1995

[51] Int. Cl.6 ...................................................... G06F 15/00
[52] U.S. Cl. .......................................................... 395/796
[58] Field of Search ......................... 364/419.15, 419.09; 395/161, 155, 796; 345/792, 795, 797

[56] References Cited

U.S. PATENT DOCUMENTS

4,558,302 12/1985 Welch ..................................... 340/347
4,814,746 3/1989 Miller et al. .............................. 341/95
4,955,066 9/1990 Notenboom ............................... 382/56
4,969,097 11/1990 Levin ...................................... 364/419
5,109,433 4/1992 Notenboom ............................... 382/40
5,261,091 11/1993 Yuyama .................................. 395/600
5,367,453 11/1994 Capps et al. ....................... 364/419.13

OTHER PUBLICATIONS

The Reactive Keyboard (Table of Contents, Chapters 2, 3, 5, and 6 submitted), by John J. Darragh, Dept. of Computer Science, University of Calgary, Alberta, Canada, and Ian H. Witten, Dept. of Computer Science, University of Waikato, Hamilton, New Zealand, Cambridge University Press, 1992.

A Space-Economical Suffix Tree Construction Algorithm, by Edward M. McCreight, Xerox Palo Alto Research Center, Palo Alto, California, Journal of the Association for Computing Machinery, Vol. 23, No. 2, Apr. 1976, pp. 262-272.

Implementation of the Substring Test by Hashing, by Malcolm C. Harrison, New York University, Communications of the ACM, Vol. 14, No. 12, Dec. 1971.

Primary Examiner: Phu K. Nguyen
Attorney, Agent, or Firm: Jones & Askew, LLP
[57] ABSTRACT

A computer-implemented method of providing and selecting multiple text predictions from text entered in multiple windows of a computer system, using an application-independent text prediction system. Application-independence is the ability to work with many different applications without being customized for each application. Because different prediction methods are appropriate for different applications, multiple prediction components called "prediction modules" provide text predictions based on each prediction module's prediction method, from which the best prediction may be displayed. In accordance with providing multiple predictions from various modules, a method is provided for choosing which modules' predictions should be used from the set of possibly contradictory predictions. Each prediction module produces a weighted list of predictions. The prediction module weights are based on the prediction modules' estimation of the probability that the completion text will be entered by the user. The resulting best prediction or predictions with the greatest weights may then be presented to the user.
22 Claims, 15 Drawing Sheets
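The selection scheme the abstract describes, with each module producing a weighted prediction list and the highest-weighted candidate surfacing, can be sketched as follows. This is an illustrative sketch only, not code from the patent; every name and value here is invented for the example.

```python
# Illustrative sketch of merging weighted prediction lists (all names and
# weights are hypothetical; the patent describes this scheme only in prose).

def best_prediction(module_outputs):
    """Merge each module's (prediction, weight) list and return the
    prediction whose estimated probability of being entered is highest."""
    merged = [pair for output in module_outputs for pair in output]
    if not merged:
        return None
    return max(merged, key=lambda pair: pair[1])[0]

mru_list = [("JMILLER", 0.8)]                  # a most-recently-used module's list
dictionary = [("JAMES", 0.5), ("JOHN", 0.3)]   # a dictionary module's list
print(best_prediction([mru_list, dictionary]))  # → JMILLER
```

Because each module reports a weight on a common probability-like scale, adding a new prediction module does not change the selection logic, which is the point of the application-independent design.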
[U.S. Patent, Sep. 8, 1998, Sheet 1 of 15, 5,805,911 — FIG. 1: the word prediction system 10, showing a computer 12 connected to a display monitor 14; input data 15 entered from a keypad 16 over input line 17; predicted text 18 displayed in a lighter shade; and the SHIFT and CONTROL keys (shift key 19) used for text selection.]
[Sheet 2 of 15 — FIG. 2: components of the computer 12, including a central processing unit, a bus, a memory unit, and input/output ports (elements 20, 22, 24, 26).]
[Sheet 3 of 15 — FIG. 3: subsystems of a multi-window computer system: several applications running over a graphical windowing system, a forms package, and a windows-to-forms package; beneath them, the predictor system, in which a prediction module manager coordinates the most frequently used, most recently used, information retrieval, attribute correlator, best match, and dictionary prediction modules.]
[Sheet 4 of 15 — FIG. 4: a form 35 with multiple fields of entry (FIELD 1 through FIELD 10), for example TO: JMILLER, FROM: JOHNDOE, RE: JMILLER, CC: JANEDOE.]
[Sheet 5 of 15 — FIG. 5: tables used in providing text predictions: a prediction list recording, per field, the number of correct predictions against the number of prediction attempts; a list of recently used field ID numbers; and a table of best-prediction field pairs mapping each selected field ID to the field ID whose entries best predict it.]
[Sheet 6 of 15 — FIG. 6: a table of records (0 through N) addressed by a hashed input character sequence. FIG. 7: a database 52 and a text history array 54 holding a sample character sequence at numbered positions 1 through 22.]
[Sheet 7 of 15 — FIG. 8: general flow diagram (steps 822-854): update the auxiliary database for the information retrieval module; get text predictions from the information retrieval, dictionary, most frequently used, and attribute correlator modules; determine the best predictions from the modules using weighted belief factors; display the prediction in a lighter shade than the input text; and, on shift-key selection, display the characters in black text and move the cursor to the end of the selection.]
[Sheet 8 of 15 — FIG. 9A: flow diagram (steps 902-918) for a prediction module: when the field of entry changes, record the new field ID # in a list of recently used fields; look in the list of field ID pairs for the most recent pair beginning with the current field ID and query the best match module with the second ID in the pair, returning its best match as the prediction (querying other pairs in the list until a prediction is found); otherwise, access the list of most recently used fields, query the best match module for each field listed, determine the field with the highest belief factor returned, and store in the field pair list the current field ID # with the field having the highest belief factor.]
[Sheet 9 of 15 — FIG. 9B: an alternate embodiment of the flow of FIG. 9A in which the pair chosen from the field ID pair list is the most frequently occurring pair that begins with the current field ID, rather than the most recent pair.]
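The field-pair lookup of FIGS. 9A and 9B can be sketched roughly as follows. The data layout and every name here are assumptions for illustration, not structures taken from the patent: a pair table remembers which other field's entries have best predicted the current field, and the prediction is drawn from that field's history.

```python
# Rough sketch of the FIG. 9A/9B field-pair idea (data structures assumed):
# field_pairs maps a field ID to the field whose past entries best predict
# it; the prediction is the most recent entry of that field matching the
# prefix typed so far.

def field_pair_prediction(field_pairs, field_histories, current_field, prefix):
    """Complete the current field's prefix from the paired field's history."""
    source_field = field_pairs.get(current_field, current_field)
    for entry in reversed(field_histories.get(source_field, [])):
        if entry.startswith(prefix):
            return entry          # most recent matching entry wins
    return None

pairs = {4: 1}                                    # field 4 best predicted by field 1
histories = {1: ["JANEDOE", "JMILLER"], 4: []}
print(field_pair_prediction(pairs, histories, 4, "JM"))  # → JMILLER
```

When no pair is recorded for the current field, the sketch falls back to the field's own history, standing in for the figure's fallback of querying other recently used fields.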
[Sheet 10 of 15 — FIG. 10: flow diagram (steps 1002-1010) for selecting a text prediction: receive a prediction belief factor from each prediction module for the current prediction; access the list of successful predictions and number of prediction attempts for the current field of entry; reweigh each belief factor based on the ratio of successful predictions to prediction attempts; return the reweighed belief factors in order of decreasing belief; determine which predictions have the highest belief factors; and provide the best predictions for display.]
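The reweighting step of FIG. 10 amounts to scaling each module's raw belief factor by that module's past success ratio in the current field of entry. A minimal sketch, with the data layout and the zero-score fallback for untried modules both assumed:

```python
# Minimal sketch of the FIG. 10 reweighting step (layout assumed): each
# belief factor is scaled by the module's ratio of successful predictions
# to prediction attempts in the current field, then predictions are
# returned in order of decreasing reweighed belief.

def reweigh(predictions, success_stats, field_id):
    """predictions: {module: (text, belief)};
    success_stats: {(module, field): (correct, attempts)}."""
    reweighed = []
    for module, (text, belief) in predictions.items():
        correct, attempts = success_stats.get((module, field_id), (0, 0))
        ratio = correct / attempts if attempts else 0.0  # untried modules score zero (an assumption)
        reweighed.append((text, belief * ratio))
    return sorted(reweighed, key=lambda pair: pair[1], reverse=True)

stats = {("mru", 1): (4, 8), ("dictionary", 1): (3, 7)}
preds = {"mru": ("JMILLER", 0.9), "dictionary": ("JAMES", 0.8)}
print(reweigh(preds, stats, 1)[0])  # → ('JMILLER', 0.45)
```

The effect is that a module with a strong track record in this particular field can outrank a module whose raw estimate is higher but whose predictions have rarely been accepted here.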
[Sheet 11 of 15 — FIG. 11: flow diagram (steps 1110-1132) of the hashing functions used to update the database of character positions: while the match length is less than the history length, calculate hash values for the current character sequence and for the most recent occurrence of the character sequence; determine whether the position stored in the database represents the first occurrence of the current character sequence or the position for a new updated sequence; store the position of the match for the most recent occurrence hash value; increment the match length; and, when a longer match exists at the most recent occurrence position, update the current character hash value for the longer match.]
[Sheet 12 of 15 — FIG. 12: flow diagram (steps 1210-1232) of the hashing functions used to locate a sequence of characters in the text history array: starting from a match length of one, while the match length is less than the input text length, calculate the hash value for the current character sequence; determine whether the position stored in the database represents the first occurrence of the current character sequence or the position for a new updated sequence; return the stored position for the previous update of the current search input; increment the match length while a longer match exists at the most recent occurrence position; and provide the characters following the stored position of step 822 as the prediction.]
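Taken together, FIGS. 11 and 12 describe storing hashed positions of character sequences while text is entered, then walking progressively longer input suffixes through the database to find the text that followed the longest, most recent match. The following self-contained sketch illustrates that round trip; the hash function, multiplier, and table size are assumptions for the example, not the patent's values, and the character-comparison check stands in for the figures' occurrence tests.

```python
# Self-contained sketch of the store-then-locate idea of FIGS. 11 and 12
# (hash function and table size are assumptions, not the patent's).

TABLE_SIZE = 8192  # assumed database size

def hash_address(sequence):
    """Assumed stand-in for the patent's hash coding technique."""
    value = 0
    for ch in sequence:
        value = (value * 31 + ord(ch)) % TABLE_SIZE
    return value

def build_database(history, max_len=8):
    """Store the end position of every sequence of up to max_len characters;
    later occurrences overwrite earlier ones, so the most recent wins."""
    database = {}
    for end in range(1, len(history) + 1):
        for length in range(1, min(max_len, end) + 1):
            database[hash_address(history[end - length:end])] = end
    return database

def predict(database, history, input_text, max_len=8):
    """Grow the matched suffix while the database holds a verified position
    for it, then return the characters that followed the longest match."""
    best_end = None
    for length in range(1, min(max_len, len(input_text)) + 1):
        suffix = input_text[-length:]
        end = database.get(hash_address(suffix))
        # Verify the stored position really holds this sequence, guarding
        # against hash collisions as the flow diagrams' checks suggest.
        if end is None or history[end - length:end] != suffix:
            break
        best_end = end
    if best_end is None:
        return ""
    return history[best_end:].split(" ")[0]  # complete up to the next word

history = "sue sells sea shells "
db = build_database(history)
print(predict(db, history, "she"))  # → lls
```

Because each lookup is a direct hash access, the search cost depends on the match length, not on how long the history array has grown, which is the property the specification claims for this scheme.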
[Sheet 13 of 15 — FIG. 13: a table of hash values calculated for selected character sequences, listing for each entry the character C, the SEED, the resulting HASH(C, SEED) database address, and the corresponding history string (HSTRING).]
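FIG. 13 tabulates HASH(C, SEED) values: the address for a character sequence is derived from its final character C and the address (the seed) of the sequence without that character. The incremental structure is what makes growing a match one character at a time cheap. A sketch of that structure follows; the multiplier and table size are illustrative assumptions and do not reproduce the figure's numeric values.

```python
# Sketch of the incremental HASH(C, SEED) structure of FIG. 13 (the
# constants are assumptions; they do not reproduce the figure's values).

TABLE_SIZE = 8192  # assumed

def hash_step(char, seed):
    """One HASH(C, SEED) step: extend a sequence address by one character."""
    return (seed * 31 + ord(char)) % TABLE_SIZE

def sequence_address(sequence):
    """Address of a whole sequence, built one hash step at a time."""
    seed = 0
    for char in sequence:
        seed = hash_step(char, seed)
    return seed

# Extending "SE" to "SEL" costs exactly one hash step:
print(sequence_address("SEL") == hash_step("L", sequence_address("SE")))  # → True
```

The order-sensitivity of the seed chain also means distinct sequences over the same characters map to distinct addresses in general, so each entry in the table addresses one particular history string.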
[Sheet 14 of 15 — FIG. 14: a table comparing database addresses and contents at step 1116 and at step 1118 of FIG. 11 over successive iterations of those steps, showing stored history positions (initially marked INVALID) being filled in and replaced as the database is updated.]
[Sheet 15 of 15 — FIG. 15: flow diagram (steps 1510-1516) of the preferred process of calculating addresses for multi-window applications: depending on whether input arrives in window 1 or window 2, calculate database addresses specific to that window for the character sequence positions in the history array, and provide a window-specific prediction based on that window's database addresses.]
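One way to realize the window-specific addressing that FIG. 15 describes is to fold a window identifier into the hash seed, so the same character sequence maps to different database addresses in different windows. The seeding scheme below is an assumption for illustration, not the patent's disclosed mechanism:

```python
# Sketch of window-specific database addressing per FIG. 15 (the seeding
# scheme and constants are assumptions, not the patented mechanism).

TABLE_SIZE = 8192  # assumed

def window_address(window_id, sequence):
    seed = window_id  # window-specific starting seed (an assumption)
    for char in sequence:
        seed = (seed * 31 + ord(char)) % TABLE_SIZE
    return seed

# The same sequence gets distinct addresses in distinct windows, so each
# window's predictions are drawn from its own portion of the database:
print(window_address(1, "HE") != window_address(2, "HE"))  # → True
```

This keeps a single shared database while still letting predictions in one window reflect only the text history of that window.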
WORD PREDICTION SYSTEM

TECHNICAL FIELD

The present invention relates generally to the field of communication aids and more particularly to the field of computer implemented word prediction systems.

BACKGROUND OF THE INVENTION

General purpose digital computers are widely used for text based communications including word processing and electronic mail. Text is generally entered into the computer by means of a keyboard. Other text input devices include a touch-sensitive screen overlaid on top of a graphical image of a keyboard, or a system which detects the motion of a pen in combination with handwriting recognition software. The text is then sent to a particular software application running on the computer.

Typically, a computer system effectively allows for multiple applications to be running simultaneously. These applications appear in different portions of the screen (called windows), with usually only one window or field of entry at a time receiving text input. Regardless of the input method used, the speed at which text can be entered into a computer is a factor in effective and efficient use of the computer.

Because there are a limited number of words available in any given language, many of the words forming the vocabulary of the given language are used frequently. Recognizing that various patterns of language or words are repeated, computer systems have been devised which complete text based on the prior history of text entered.

Default text completion is a feature of a user interface which offers simple shortcuts for entering very common input. These shortcuts speed text entry and also serve as a memory aid. For example, most recently used (MRU) lists are utilized in text completion applications. The MRU list gives a menu of recently used names or files so that they can be quickly opened without retyping the name. The list of possible completions can be viewed as predictions of likely text input. Additionally, some computer programs complete input text based on a record of prior words entered. For example, some personal finance programs maintain a record of a person's bank deposits and checks written. In order to speed entry of the information on a bank check, the program keeps a list of check payees. This list is used to complete the payee name automatically after the first few letters of the payee have been typed by the user. Such word prediction systems have been particularly used in computer systems in which there are a limited number of selected terms that may be predicted.

In addition to MRU prediction mechanisms, other text completion mechanisms may include English dictionaries and custom dictionaries such as a list of city names. With a dictionary prediction mechanism, the information contained within the dictionary does not generally change based on text that has been entered within the computer system. The dictionary prediction mechanism provides predictions based on the originally supplied information or on information that may be provided during periodic updates of the dictionary.

In word prediction systems, an input character may be analyzed, with respect to the prior history of text entered, to predict the text likely to follow the input character or substring of characters entered. Because word prediction systems are based upon a prior history of text entered, the search time and amount of storage required for the systems are important parameters. For some of these word prediction systems, either a linear or a binary search is used to scan the text history in order to provide a text prediction. A linear search operates by sequentially examining each element in a list until the target element is found or the list has been completely processed. Linear searches are primarily used with very short lists. A binary search locates an item by repeatedly dividing an ordered list in half and searching the half that is known to contain the item. A binary search requires a value for the input that can be compared against values in a list of items arranged in a known sequence, such as ascending order. The binary search begins by comparing the input value against the value in the middle of the list. If the input value is greater than the middle value, the lower half of the list is discarded and the search concentrates on the upper half. The input value is again compared with a value in the middle of the new list and again half of the list is discarded. The process continues, with the input value being compared against the middle of each succeeding smaller list, until the desired item is found.

In using either of these search methods, the time required for a prediction can become substantial. Also, without special provisions for storage of various strings of the text, the search time required can also become substantial, as searches of repeated substrings may be duplicated.

The computer programs discussed above are generally application-dependent programs, in that the programs have been customized to work only with a particular application. Application-dependent prediction can cause wasteful duplications. Duplication of software occurs when similar prediction programs are implemented by several different applications. Duplication of memory can also be a problem. For example, it can be wasteful to have several different applications keep separate histories of commonly used words. Another problem with application-dependent modeling is that, since the prediction processes do not take advantage of text input into other applications, there are common repetitive patterns of text input which are not used to improve prediction.

Thus, there is a need in the art for a text prediction system that may operate with multiple applications with little or no application-specific programming and that provides a fast search method for text in a text history array.

SUMMARY OF THE INVENTION

Generally, the present invention is an application-independent text prediction system. Generally, application-independence is the ability to work with many different applications without being customized for each application. Because different prediction methods are appropriate for different applications (such as MRU lists, English dictionaries, etc.), the present invention allows for multiple prediction components called "prediction modules" to provide text predictions based on each prediction module's prediction method, from which the best prediction may be displayed. For example, a "dictionary prediction module" may provide a prediction method which finds the most common word or words which complete the currently entered prefix, whereas an MRU list may provide a prediction list based on the most recently entered text that matches the input sequence. Thus, the best prediction from the results of the two methods may be displayed. Thus, the present invention allows multiple prediction modules to work together to efficiently produce predictions for presentation to the computer user.

One of the multiple prediction modules provided by the present invention enables text provided with a field of entry
of one application to be utilized for text prediction in a different field of entry of another application. For example, an electronic mail (email) application might maintain a list of commonly used email names. This list can be installed into the system as a prediction module so that whenever the prefix to an email name is entered, the prediction module is able to predict the full name. Although this email prediction module would be primarily of use within the mail application, if another application has a text window where email names are commonly entered (such as an application for scheduling meetings), the prediction module manager will find that the email name prediction module has been successful at predicting text for these window scheduling programs. Thus, the prediction module manager will begin to use text predictions from the email name prediction module within that particular window of the scheduling program. Therefore, a prediction module designed for primary use with one application is made available to other applications and may provide useful predictions.

In accordance with providing multiple predictions from various modules, the present invention provides a method for choosing which modules' predictions should be used from the set of possibly contradictory predictions. In the present invention, a "prediction module manager" keeps a record of the prediction success of each prediction module within each different application window where text is entered. Each prediction module produces a weighted list of predictions. The prediction module weights are based on the prediction modules' estimation of the probability that the completion text will be entered by the user. The prediction module manager then adjusts these weights based on the prior success of each prediction module in the current text input window of the particular application. The resulting best prediction or predictions with the greatest weights may then be presented to the user as default inputs.

A general problem in implementing an application-independent prediction system with a standard keyboard occurs because different applications use the keyboard and screen in different ways. For example, using the tab key to indicate that a prediction should be accepted may not work for all applications because existing applications may already use the tab key for another operation. However, in the preferred embodiment, the shift key is utilized as a text selection mechanism. The shift key is depressed to indicate that the proposed text should be entered as if it were typed in by the user. Utilizing the standard shift key alone as a text selection method provides a key that typically is not used alone by an application to produce a visible result on the display monitor. The shift key selection method is differentiated from normal use of the shift key for capitalization because no other key is depressed simultaneously with the shift key. Because applications, in general, do not use the shift key alone as a control signal, the shift key selection mechanism may be used for multiple applications without control signal conflicts with the multiple applications. Also, because of the conventional location of the shift key, the shift key may be typed easily without interfering with normal touch typing. Depressing the conventional "control" key without pressing another key simultaneously may also be used to select a new prediction in the same manner as the shift key.

Displaying predicted text on a monitor with input text can be a distraction to a computer user. In the preferred embodiment, the method for displaying predicted text is to show the prediction in light gray text following the characters already entered by the user. By using gray or a lighter shade of the color of the input text for displaying the text prediction, a user may not be overly distracted by the text predictions during typing.

Another aspect of the present invention provides for efficient use of memory among the various prediction modules. The different prediction modules are able to efficiently access a common history of user input so that there is not wasteful duplication of history information between different prediction modules. A prediction module termed "the information retrieval prediction module" is used to help efficiently store text in and retrieve text from a common input text history array. Rather than have each prediction module keep a separate history, each prediction module can make queries to the information retrieval prediction module to find and read locations in the history array. The information retrieval prediction module implements a "best match prediction" module that locates the longest sequence of characters duplicated in the history array that matches the input sequence. The information retrieval prediction module also implements a "most recently used" prediction module that finds the most recently entered match for the input text.

Generally described as part of the best match prediction module of the information retrieval module, the present invention also provides a method of locating text within a history array that has a search time that is independent of the length of the text history array. The present invention provides a text storing method that enables the present invention to quickly locate a match for the input character sequence, in the history array, for which a prediction is desired and displays, as the prediction, the characters following the most recent occurrence of the longest matched input character sequence up to the end of the word following the matched sequence.

More particularly, the "best match prediction" module aids in storing text in a text history array in the order in which the characters are input by a computer user and stores, in a database, values that indicate the positions of various character sequences in the history array. As characters are being input and added to the history array, an address is calculated, preferably by a hash coding technique, that is representative of the current input character sequence, and the position of the character sequence in the history array is stored at the database address calculated for the input character sequence.

Each time a new input character is entered, the database is queried, using the calculated address, to determine if a valid position for the history array has been stored at the calculated address. If a valid position for the text history array has been stored at the calculated database address, this indicates that the character sequence has previously occurred in the history array. If the character sequence has not previously occurred in the history array, then the position of the character sequence is stored in the database as discussed above. If the character sequence has previously occurred, the position of the current character sequence is stored in the database at the calculated address to replace the position that was previously stored in the database for the previous occurrence of the input sequence in the text history array.

Additionally, after the present invention locates the position in the text history array at which the current character sequence has most recently occurred, the adjacent preceding characters from the most recent occurrence position in the text history array are compared against the adjacent preceding characters of the current input characters to locate the character and the position at which the sequences do not match. A hash database address location is calculated for the
5,805,911
5
6
non-matching sequence, and the position in the text history
FIG. 4 illustrates multiple fields of entry for a computer
array of the non-matching sequence, which begins with the
application.
originally located sequence, is stored to the database at the
FIG. 5 illustrates tables used in providing text predictions.
address calculated for that sequence. By additionally updatFIG. 6 illustrates a table used in providing a text predicing the database to point to the longer character sequence 5 tion.
which did not match all of the characters of the input
FIG. 7 illustrates a database and text history array utilized
sequence at the most recent occurrence position for the input
in the preferred embodiment of the present invention.
sequence, the database is continually updated to point to
FIG. 8 illustrates a general flow diagram of the preferred
multiple character sequences that begin with the same
character or characters which enables direct addressing to 10 steps implemented in the preferred embodiment of the
present invention.
quickly locate the different character sequences. By providFIG. 9a illustrates a flow diagram of the steps impleing direct addressing to locate matching character
sequences, the search time for providing a prediction based
mented in a prediction module of the present invention.
on matching characters in a history array is fast.
FIG. 9b illustrates an alternate embodiment of the flow
In order to provide a prediction for input characters based 15 diagram of FIG. 9a.
upon the history of text entered, the present invention
FIG. 10 illustrates a flow diagram of the steps utilized in
calculates an address that is representative of the character
selecting a text prediction.
sequence, preferably by a hash coding technique as disFIG. 11 illustrates a flow diagram of the hashing functions
cussed above. The database is queried using the calculated
used in the preferred embodiment of the present invention to
hash database address, and if the character sequence has 20
update a database of character positions.
previously occurred, the most recent position in the text
FIG. 12 illustrates a flow diagram of the hashing functions
history array of the character sequence will be returned. The
used in the preferred embodiment of the present invention to
word prediction system may then display the longest match
locate a sequence of characters in a text history array.
of characters for the most recent occurrence up to the next
25
FIG. 13 illustrates a table of hash values calculated for
word.
selected character sequences.
Thus, it is an object of the present invention to provide an
efficient word prediction system.

It is also an object of the present invention to use a database to aid in searches through a text history array.

It is also an object of the present invention to quickly locate an address in the database that corresponds to the input character sequence.

It is also an object of the present invention to determine the longest match of input characters using a hash addressing scheme.

It is also an object of the present invention to provide a text prediction system that operates across multiple windows or fields of entry of an operating system.

It is also an object of the present invention to provide multiple prediction methods or modules from which a displayed prediction may be selected.

It is also an object of the present invention to provide a method to weight the multiple predictions from the various prediction modules in order to aid in prediction selection.

It is also an object of the present invention to enable text to be predicted for a field of entry based on text entered in another field of entry.

It is also an object of the present invention to provide a method of displaying a text prediction.

It is also an object of the present invention to provide an easily operated text selection mechanism.

These and other objects, features, and advantages of the present invention will become apparent from reading the following description in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates the preferred embodiment of the word prediction system of the present invention.

FIG. 2 illustrates some of the components associated with the computer utilized in the preferred embodiment of the present invention.

FIG. 3 illustrates the subsystems of a computer system that implements multiple windows and illustrates subsystems that may be implemented in the present invention.

FIG. 14 illustrates a table of values of changes to the database as the database is being updated.

FIG. 15 illustrates a flow diagram of the preferred process of calculating addresses for multi-window applications.

DETAILED DESCRIPTION

Referring to the figures, in which like numerals refer to like parts throughout the several views, a word prediction system made according to the preferred embodiment of the present invention is shown. Referring to FIG. 1, a word prediction system 10 is implemented with a computer 12 connected to a display monitor 14. The computer 12 receives input data 15 from a conventional keypad 16 via an input line 17. It should be appreciated by those skilled in the art that text entry also may be accomplished by using a touchpad or input selection device in conjunction with software for recognizing input signals as a character selection. Shown in gray or a lighter shade of color than the input data 15 is the predicted text 18.

In the preferred embodiment, the predicted text 18 may be accepted when a user presses and releases the shift key 19 of the keypad. Because the pressing and releasing of a shift key without simultaneously pressing another key normally has no effect on an application, the shift key text selection method does not interfere with normal keypad input by the user.

With respect first to the nomenclature of the specification, the detailed description that follows is represented largely in terms of processes and symbolic representations of operations by conventional computer components, including a central processing unit (CPU) associated with a general purpose computer system, memory storage devices for the CPU, and connected pixel-oriented display devices. These operations include the manipulation of data bits by the CPU and the maintenance of these bits within data structures supplied by one or more of the memory storage devices. Such data structures impose a physical organization upon the collection of data bits stored within computer memory and represent specific electrical or magnetic elements. These
process descriptions and symbolic representations are the means used by those skilled in the art of computer programming and computer construction to most effectively convey teachings and discoveries to others skilled in the art.

For the purposes of this discussion, a module or process is generally conceived to be a sequence of computer-executed steps leading to a desired result. These steps generally require physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical, magnetic, or optical signals capable of being stored, transferred, combined, compared, or otherwise manipulated. It is conventional for those skilled in the art to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, records, files or the like. It is also conventional for such signals to be manipulated through the use of conventional computer programs as designed through conventional programming techniques. It should be kept in mind, however, that these and similar terms should be associated with appropriate physical quantities for computer operations, and that these terms are merely conventional labels applied to physical quantities that exist within and during operation of the computer.

It should also be understood that manipulations within the computer are often referred to in terms such as adding, comparing, moving, etc., which are often associated with manual operations performed by a human operator. The operations described herein are machine operations performed in conjunction with a human operator or user that interacts with the computer. The machines used for performing the operation of the present invention, as will be understood, include general purpose digital computers or other similar computing devices.

Furthermore, it should be kept in mind that there is a distinction between the methods, steps, or operations completed by a computer, and the method of computation itself. The present invention relates to methods, processes, steps, or operations for a computer and the processing of electrical or other physical signals to generate desired physical signals and to display results and interactions.

In addition, it should be understood that the programs, processes, methods, etc. described herein are not related or limited to any particular computer or apparatus. Rather, various types of general purpose machines may be used with programs constructed in accordance with the teachings described herein. Similarly, it may prove advantageous to construct specialized apparatus to perform the method steps described herein by way of dedicated computer systems with hard-wired logic or programs stored in nonvolatile memory, such as read only memory.

Referring to FIG. 2, the hardware components of the computer 12 that are utilized by the word prediction system 10 in accordance with the present invention are illustrated. For simplicity of the drawings, many components of a standard computer system have not been illustrated, such as address buffers, memory buffers, and other standard control circuits, because these elements are well known and illustrated in the prior art and are not necessary for the understanding of the present invention. Computer programs used to implement the various steps of the present invention are generally located in the memory unit 20, and the processes of the present invention are carried out through the use of a central processing unit 22. The memory unit 20 and the central processing unit 22 are interconnected by a computer system bus 24 on which digital information is exchanged between computer system components. The data signals resulting from the processes implemented at the central processing unit 22 may be communicated along the computer system bus 24 to input/output ports 26, for transfer to the display monitor 14 (FIG. 1). The display monitor 14 provides a visual display of input data 15 and the predicted text 18 generated from the processes implemented at the central processing unit 22.

The following discussion of text entry in a windowing environment is given with reference to FIG. 1, FIG. 3, and FIG. 4, which shows a block diagram of the operative subsystems of a text entry system in a windowing environment. At any given time in the computer system, multiple applications 30, 31, and 32 may be running simultaneously. Generally, applications send data to a graphical windowing system 34 to display text and graphics on the monitor 14. When the user produces mouse or keyboard input, the graphical windowing system 34 sends this information to one of the applications. This application is said to have focus.

Rather than directly communicating with the windowing system 34, an application can instead communicate with a subsystem 36 generally known as a forms package or a dialog package. The forms package 36 is a set of procedures which handle the most common input-output interaction with the windowing system 34. For example, the application can call a procedure in the forms package that causes a window or field of entry 35 to be created which prompts the user for text input as shown in FIG. 4. Because the forms package 36 is customized for the most common types of input tasks, the application 30 can use a less complicated, higher level interface than is available to an application 32 calling the windowing system directly. The forms package 36 handles the low level details. For example, when a user types a "}" into a field of entry 35 within an application using the forms package 36, the forms package 36 receives notification of the keypad input from the windowing system 34 and signals to the windowing system to display the letter "}" in a specific location.

In addition to the subsystems of a general windowing system, multiple predictor subsystems 40 may be utilized by the word prediction system 10 to produce the predicted text. The predictor subsystems 40 may be available to a forms package 36 to help provide text completion. Each of the subsystem modules 43, 44, 44a, 44b, 45 and 47 may provide text completion based on a different set of prediction rules. Within the limits of the memory resources on the computer system, a plurality of prediction modules can be installed into the prediction system. When a new prediction module is installed, a prediction module manager 46 is given an identifier by conventional programming techniques, as known by those skilled in the art, which specifies how a new prediction module is to be called. When a user inputs data, a forms package 36 may create a query to the prediction module manager 46 to present the current state of the user input.

Upon receiving the user input text, each prediction module provides a prediction based on its prediction technique. Along with each prediction, each prediction module may provide a "belief factor" that is an estimate of the likelihood that its prediction is correct. The belief factor is a number that increases with the increasing probability that a predicted word or the predicted text is correct. The manner in which the belief factor may be calculated for the various modules is discussed in more detail below.

The prediction module manager 46 provides the best prediction or predictions from the modules 43, 44, 44a, 44b,
45, and 47 to the windowing system for display. The forms package 36 sends instructions to the windowing system 34 to display the predicted text in light gray type 18 in the location that the text would be written if entered by the computer user. If the computer user indicates that the prediction is correct, the forms package 36 darkens the predicted text and updates the state of the forms package 36 as if the user had typed in the full predicted text.

For applications which do not use the forms package 36, text completion is accomplished using the lower level information available from the graphical windowing system 34. A windows-to-forms package 42, as referred to herein, is utilized for low level communications with the graphical windowing system 34 to provide text completion for applications that do not interact with a forms package 36. To the predictor subsystems 40, programming calls from the windows-to-forms package 42 are handled in the same manner as programming calls from the forms package 36. The windows-to-forms package 42 takes predictions and instructs the windowing system to display the prediction in the gray text 18. The windows-to-forms package 42 then monitors the keypad signals to determine if the user has indicated that the prediction is correct. When a user indicates that a prediction is correct, the windows-to-forms package 42 instructs the windowing system 34 to indicate to the application that the predicted text has been entered.

When a user inputs text data, the forms package seeks a prediction from the multiple prediction modules through the prediction module manager 46. The prediction module manager 46 receives the results of the predictions from each prediction module and returns the best predictions to the forms package 36. As discussed in more detail below, the forms package 36 maintains a list that indicates the number of correct predictions received from each prediction module within the current field of entry, such as fields 1-10 illustrated in FIG. 4. The correct prediction list information is used with the "belief factor" returned by the prediction modules to create an "adjusted belief factor." The adjusted belief factor provides an adjusted calculation of the probability that a prediction from a particular prediction module is correct for the current field of entry. The predictions are then returned to the forms package 36 in order of decreasing adjusted belief factor. A list of the predictions may be displayed to the user, or only the best prediction may be displayed.

As noted above, multiple prediction modules may be provided. For example, prediction modules that may be provided include: a most frequently used prediction module 43 which may provide a list of the most frequently used words beginning with the input text; an information retrieval prediction module 44 that includes an array of input text that has been entered, a best match prediction module 44a that locates the longest match in the text history array, and a most recently used prediction module 44b that locates the most recently used word that matches the input text; a dictionary prediction module 45 that contains an ordered and generally unchanging set of words; and an attribute correlator module 47 that searches other fields of entry with the current input text and determines which of the other fields contain words that can provide the best predictions for the current field. It should be appreciated by those skilled in the art that other types of prediction modules may be provided and that the prediction modules discussed herein are illustrative of the types that may be provided. The belief factors for each of the prediction modules may be calculated in different ways depending on the nature and basis of the prediction method utilized by the module. The operation of the various prediction modules is discussed below.

The information retrieval module 44 maintains a text history array and accesses the best match prediction module 44a to provide predicted text based on the longest match for the current input text in the text history array. Additionally, the information retrieval module 44 accesses the most recently used prediction module 44b to provide the most recently used word that matches the input sequence. For example, if a field of entry prompts the user for a filename, the user has entered "re", and the most recent file names were (in order) "rope report2 real rhymes", then the prediction is "report2" for the best match prediction module 44a and the prediction is "real" for the most recently used prediction module 44b. Similarly, if the previous entry was "this is a test" and the current input is "another te", then the best match prediction module 44a of the information retrieval module 44 will predict "test" as the word being entered. Preferably, successive predictions will give the best prediction available excluding any predictions previously generated by the same query to the information retrieval module 44.

Because the information retrieval module 44 keeps a history of the user text input, the other prediction modules do not need to keep duplicate copies of the history of user input. Instead, the other prediction modules can query the information retrieval module 44 directly to determine what has been typed in the past. To make the information retrieval module 44 more useful for other modules, the information retrieval module 44 can be queried for matches within any field of entry from any form, not just the field of entry which is the subject of the current query. By providing a commonly accessible text history array, each module does not have to store a separate text history array, and therefore memory can be saved.

As noted above, each prediction module provides a belief factor with each prediction. For the best match prediction module 44a, the belief factor returned with the prediction is the number of characters matched, as long as a full word is matched. For the most recently used prediction module 44b, the belief factor returned with the prediction equals 1 + 1/(the distance back in the text history array the match occurred). The distance back in the text history array that the match occurs can be determined by counting the number of characters (or bytes) or words back from the last character input in the text history array.

Unlike the information retrieval module 44, the dictionary prediction module 45 does not change over time with user input. The dictionary prediction module 45 simply looks up the current input text in a dictionary and finds all the words for which the current input is a prefix. The dictionary prediction module 45 then orders these words by a frequency (or probability) stored with each word. The resulting word list is returned to the prediction module manager 46 in sorted order with belief factors for the set of words returned. The dictionary and the belief factors for the dictionary module 45 can be created by taking a large corpus of documents of text related to the type of dictionary being provided, counting the number of times each word appeared in the text, and then storing the frequency count as the belief factor with each corresponding word. It should be appreciated that words not in the corpus of documents can be additionally included and certain words that appeared in the corpus of documents may be removed. Preferably, the dictionary is compressed along with the log-base two of the number of times that each word appeared in the corpus. For example, if a word occurred 700 times, the number nine is stored as the frequency count for that word since 700 occurs between 2^9 = 512 and 2^10 = 1024. Thus, the result is a set of words with a general estimate of their frequency.
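The dictionary module's scheme described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function names are invented, and storing the floored log-base-2 count directly as the belief factor is one reading of the compression described in the text.

```python
import math
from collections import Counter

def build_dictionary(corpus_words):
    """Store floor(log2(count)) for each word as its rough frequency estimate."""
    counts = Counter(corpus_words)
    # A word seen 700 times stores 9, since 2**9 = 512 <= 700 < 1024 = 2**10.
    return {word: int(math.log2(n)) for word, n in counts.items()}

def dictionary_predict(dictionary, prefix):
    """Return all words having the given prefix, highest stored frequency first."""
    matches = [(w, bf) for w, bf in dictionary.items() if w.startswith(prefix)]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)
```

Because only the order of magnitude of each count is kept, the stored numbers stay small while still preserving the relative ordering the module needs for ranking.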
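The most-recently-used lookup of module 44b described earlier, with its belief factor of 1 + 1/(distance back), can be sketched as follows. The function name is illustrative, and distance is measured here in words, although the text also allows counting characters or bytes.

```python
def most_recently_used(history_words, prefix):
    """Scan the text history backwards for the last word beginning with the
    input prefix; return it with belief factor 1 + 1/(distance back in words)."""
    for distance, word in enumerate(reversed(history_words), start=1):
        if word.startswith(prefix):
            return word, 1 + 1 / distance
    return None  # no word in the history starts with the prefix
```

With the history "rope report2 real rhymes" and the input "re", the match is "real", two words back, giving a belief factor of 1 + 1/2 = 1.5.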
As noted above, the "most frequently used" prediction module 43 tracks the number of times the most common text values have been typed into the current field of entry or window. The probabilities or belief factors for these words may be calculated by dividing the number of times that the predicted word appeared in a given context by the total number of words that occurred in the given context.

Also, as noted above, the present invention contains an attribute correlator prediction module 47. The purpose of the attribute correlator prediction module 47 is to find other fields of entry 35 where the current input text might have been recently entered. The attribute correlator prediction module 47 accesses or calls the best match prediction module 44a to search other fields of entry 35 with the current input text. Within the attribute correlator prediction module 47, the best match prediction module 44a is used to return a prediction as if the current text had been typed in a field of entry other than the current field of entry. The attribute correlator prediction module 47 is useful because often it is the case that the current input text or word has never been typed in the current field, although it has been typed in another field. For example, referring to FIG. 4, if "JMILLER" is entered in the "RE:" field of a mailing program, and subsequently "JM" is entered in the "CC:" field of the same program, the attribute correlator will be able to predict "JMILLER" in the "CC:" field even though "JMILLER" has not previously been entered in the "CC:" field.

Referring to FIG. 5, the attribute correlator prediction module 47 uses two lists of information in association with its method of prediction. A "recent field use" list 61 of a preselected number of the most recently used field identification numbers (i.d.'s) is accessed and maintained by the attribute correlator 47. When the field of entry 35 (FIG. 4) is changed by a user, the new field of entry 35 moved into is recorded in the recent field use list 61. Every time a word is entered in a field of entry 35, the most recently used field i.d. numbers are accessed from the recent field use list 61. For each field i.d. in the recent field use list 61, the best match prediction module 44a provides a prediction. The field returning the highest belief factor is stored and associated with the current field i.d. as the second field i.d. in a "best prediction field pair" list 63. The second field i.d. of the pair represents the i.d. of the field that generally provides the best prediction for the current field. When the attribute correlator prediction module 47 is queried for a prediction, the "best prediction field pair" list 63 is accessed to find the most recent pair that has the current field i.d. as the first field i.d. of the pair. The best match prediction module 44a is then queried with the second i.d. of the located pair. The prediction returned using the second i.d. of the pair is returned as the prediction for the attribute correlator prediction module 47. In an alternative embodiment, rather than using the most recent pair, the attribute correlator prediction module 47 may locate, from the best prediction field pairs in the list 63 with the selected field i.d., the second i.d. which occurred most frequently in the list and may query the best match prediction module 44a with that second i.d. Additionally, if there is no prediction returned by the best match prediction module 44a, the second most recent pair or second most frequent pair could be searched with the current first i.d., depending on the embodiment implemented.

In addition to the modules discussed above, additional modules may evaluate information available on the computer system. For example, a filename module may evaluate the names of the files and directories available on the computer system even if the filenames had never been typed into a field of entry of a form.

Because multiple predictions are returned for a given set of text or characters, the word prediction system uses a reweighting process to determine which prediction from the multiple prediction modules provides the best prediction for the current field of entry. The reweighting process of returned predictions from the prediction modules is based on prior learning, or a prior record of which prediction modules returned predictions that were accepted by a user in the current field of entry. The acceptance of a prediction by a user is considered a successful prediction.

The preferred reweighting process, which determines which prediction to use from the multiple predictions, implements the following calculation:

W = (N/(N+1)) * (S/N) + (1/(N+1)) * BF

where W is the corrected or reweighted belief factor; BF is the belief factor returned by the expert; S is the number of successful predictions that have occurred within a predetermined range that the returned belief factor for the current field of entry falls within; and N is the total number of times that returned belief factors for the current field of entry have fallen within the same range. Preferably, the predetermined ranges of values used in determining the number of successful predictions, S, and the total number of attempts, N, are separated by the powers of two from -R to +R. For example, the range boundaries using R=5 would be: 1/32, 1/16, 1/8, 1/4, 1/2, 1, 2, 4, 8, 16, 32. An example calculation of the corrected weight is as follows. If the BF returned by the prediction module is 5 (BF=5) and there were S=2 prior successful attempts and 6 failures, out of N=8 tries with belief factors between 4 and 8 (the range that BF=5 falls in), then the adjusted weight or belief factor is W = (8/9) * (2/8) + (1/9) * 5 = 0.777. Note that as N increases, W approaches S/N. FIG. 6 illustrates a table or list of prediction successes and prediction attempts within a given range.

As noted above, the information retrieval module 44 maintains a history array of characters that have been previously entered. Because many characters and combinations of characters may be repeated in a text history array, searches through a history array using conventional search methods, as discussed above, may be unacceptable for certain applications. To help reduce the time required for searching through a text history array, the present invention provides an indexing or addressing scheme to provide for fast searches.

Depending on the application for which the prediction system is used, the method of displaying a text prediction may vary. As noted above, a text prediction may be displayed as the predicted completion immediately following the input text 15. When the predicted text immediately follows the input text, the forms package 36 displays, as the prediction, the text that has the highest prediction weight among the prediction weights returned with each text prediction from the prediction modules.

Alternatively, or in addition to displaying text at the end of the input text, the predictions may be stored to a prediction list. The prediction list may be displayed to the user in the form of a graphical pull-down menu. The predictions are preferably arranged in order with the most likely text prediction appearing first in the displayed list, followed by the second most likely prediction, and so on to the least likely text prediction. The ordering of the prediction list is preferably based on the calculated weights for each text prediction.
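The reweighting formula described above, including its worked example with BF=5, S=2, and N=8, can be checked with a short sketch; the function name is illustrative.

```python
def reweight(bf, s, n):
    """Adjusted belief factor W = (N/(N+1))*(S/N) + (1/(N+1))*BF."""
    if n == 0:
        return bf  # no history in this range yet; use the raw belief factor
    return (n / (n + 1)) * (s / n) + (1 / (n + 1)) * bf

# Worked example from the text: BF=5, with S=2 successes out of N=8 tries
# in the range containing 5, gives W = (8/9)*(2/8) + (1/9)*5 = 0.777...
```

As N grows, the weight on the raw belief factor shrinks and W approaches the observed success rate S/N, which is the behavior the text notes.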
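The attribute correlator's two lists described above (the "recent field use" list 61 and the "best prediction field pair" list 63) might be organized as in the sketch below. The class and the `best_match_query` callback are hypothetical stand-ins for the patent's modules, shown only to make the list bookkeeping concrete.

```python
class AttributeCorrelator:
    """Sketch of the attribute correlator (module 47). best_match_query stands
    in for the best match prediction module 44a: given a field i.d. and input
    text, it returns a (prediction, belief_factor) pair."""

    def __init__(self, best_match_query):
        self.best_match_query = best_match_query
        self.recent_fields = []   # "recent field use" list 61
        self.field_pairs = []     # "best prediction field pair" list 63

    def field_changed(self, field_id):
        # Record each field the user moves into.
        self.recent_fields.append(field_id)

    def word_entered(self, current_field, text):
        # Pair the current field with the recent field whose history yields
        # the highest-belief prediction for this text.
        if self.recent_fields:
            best = max(self.recent_fields,
                       key=lambda f: self.best_match_query(f, text)[1])
            self.field_pairs.append((current_field, best))

    def predict(self, current_field, text):
        # Use the most recent pair whose first i.d. is the current field.
        for first, second in reversed(self.field_pairs):
            if first == current_field:
                return self.best_match_query(second, text)[0]
        return None
```

Echoing the example in the text: once "JMILLER" has been entered in the "RE:" field and the fields have been paired, typing "JM" in the "CC:" field yields "JMILLER" even though it was never typed in "CC:".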
Referring to FIG. 7, a general block diagram of a database indexing or addressing process is shown which is used to implement fast searching of a text history array 54 in the information retrieval module 44. As part of the process when text is input, the text history array 54 is updated in sequential order to indicate the appropriate text history as shown. A database 50 contains address locations 52, for example, 0 through n, that have values that point to the positions of various character sequences found in the text history array 54. It should be appreciated that the term character sequence, as used herein, may refer to a single character or multiple characters. When the text history array 54 becomes full (i.e., the last byte of the text history array is filled), new input characters are stored starting at the beginning of the text history array 54 and thus overwrite any previous input characters in the order that the characters were input. This type of array may be termed circular.

As noted above, a hashing technique is used to aid in searching for an appropriate match. In database management, hashing may be described as an indexing technique in which the value of a key (a character sequence as used herein) operates as a record identifier, and the key is numerically manipulated to directly calculate either the location of its associated record in a file or the location of the starting point for a search for the associated record. If the key value is a character string, each possible character is assigned a numeric code to permit the numerical manipulation. The manipulation performed on the key is known as the hashing function. For example, assume two keys, CAT and MOUSE. If the characters in these words are assigned numeric values by totaling the ASCII values of the letters, a formula, or hashing function, could be created that might calculate a value of 1000 for CAT and a value of 1800 for MOUSE. Based on these values, record 1000 would contain the key value CAT and record 1800 would contain the key value MOUSE.

By utilizing hashing, the input character sequence 56 does not have to be stored along with the record. Instead, when the hash is performed on the character sequence, the numeric value determined is used as the address into the database 50. The address value provided from the hashing function on the character sequence is preferably a pseudo random number. Generally, the pseudo random number may be generated by multiplying the ASCII value of the first letter of the key by a large number to obtain a product that is divided by another large number to obtain a remainder. The remainder is the pseudo random number of the hashing function.

If the character sequence contains more than one character, the value of the pseudo random number may be determined as follows. The pseudo random number from the first hash operation is multiplied by the ASCII value of the second character to obtain a product that is divided by the same large divisor from the previous hash operation to obtain a remainder. The remainder from the second hash operation is the pseudo random number for the two character sequence. The process may be repeated for each subsequent character of the character sequence.

The preferable hash function of the preferred embodiment of the present invention accepts two variables: a character c and a "seed" value. The hash function utilized to obtain an address location in the database 50 in the preferred embodiment of the present invention is:

hash(c, seed) = ((c + constant X) * (seed + constant Y)) modulo constant Z

where the seed value equals zero for the last character of a character sequence and where the seed for each of the other characters of the string equals the value that the hash function returned from the hash operation on the subsequent character of the string. Modulo is a function that performs a division operation and returns the remainder of the division operation. For example, the value for the hash string "sel" equals hash('s', hash('e', hash('l', 0))). The hash process is repeated for each subsequent character of the sequence until all the characters of the character sequence have been operated upon to provide an appropriate pseudo random number for the entire character sequence. Generally, different character sequences will produce different pseudo random numbers.

The resulting hash number is not truly a random number because the process will produce the same pseudo random number given the same input character sequence 56. Therefore, the same input character sequence 56 will always access the same address. The deterministic random nature of the hashing function helps to save storage space because the input character sequence 56 does not have to be stored with an associated value. By utilizing a hashing function with respect to various input character sequences 56, longer input character sequences in the text history array 54 beginning with the input character sequences can be found in an efficient manner.

In the preferred embodiment of the present invention, a hash function is performed on the input character sequence 56 to obtain an address location 52 of the database 50. The address location 52 contains a record that is a pointer to a position in the text history array 54 of the most recent occurrence of the current input character sequence. If the character or character sequence has not occurred in the text history array 54, the position of that input character or input character sequence 56 is stored as the record or pointer to the character sequence of the history array 54. If a character sequence has been found within the text history array 54, additional hash operations are performed to locate the longest match of characters corresponding to the characters preceding the input character sequence 56. The position of the longest match of characters that correspond to the input character sequence 56 is returned.

Because the address space available for the database to occupy is finite and the number of potential character sequences or keys is infinite, a problem may arise where multiple keys of the hashed character sequence map to the same address. The mapping of input character sequences 56 to the same address is termed a collision problem. In order to accommodate or lessen the problem of collisions at a particular address location, a bin, as known to those skilled in the art, is utilized at the address locations 52. A bin stores multiple records at that one address.

Because it is possible that different text values yield the same calculated database address, the input characters are compared to the text pointed to by the first record in the bin. If the input characters do not match the characters associated with the first record, then the text value pointed to by the next record in the bin is compared to determine if the characters match, and so on until all records within the bin have been checked.

Another way to distinguish text sequences which generate the same address is to calculate "signatures". With this method, in addition to the pseudo random number generated to identify the address, another pseudo random number called a "signature" can also be calculated for each key. As known to those skilled in the art, a signature is usually selected or calculated to contain fewer bytes than the actual bit representation of the pseudo random number generated as the key or address for the character sequence. Although the addresses may be the same for a particular key, the
5,805,911
15
16
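The bin-and-signature scheme described above can be sketched as follows. This is an illustrative reconstruction, not the patent's code: the record layout, the bin size of 4, and all names are assumptions made for the sketch.

```python
# Illustrative sketch: each database address holds a small bin of records;
# a second hash value (the "signature") distinguishes character sequences
# that collide at the same address. BIN_SIZE and the record layout are
# assumptions, not taken from the patent.
BIN_SIZE = 4

database = {}  # address -> bin: a list of (signature, position) records


def store(address, signature, position):
    """Store a record in the bin at this address, replacing a record with
    the same signature, or evicting the oldest record when the bin is full."""
    bin_ = database.setdefault(address, [])
    for i, (sig, _) in enumerate(bin_):
        if sig == signature:
            bin_[i] = (signature, position)  # replace the prior occurrence
            return
    if len(bin_) >= BIN_SIZE:
        bin_.pop(0)  # the oldest record is replaced by the current record
    bin_.append((signature, position))


def lookup(address, signature):
    """Return the stored history-array position, or None when no record in
    the bin carries the matching signature."""
    for sig, position in database.get(address, []):
        if sig == signature:
            return position
    return None
```

Searching a bin for a matching signature stands in here for the text comparison the patent also describes; a fuller sketch would verify the pointed-to text as well.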
Although the addresses may be the same for a particular key, the signature for those keys that provide the same address may be calculated to be different. By providing signatures for input character sequences, four different input character sequences may map to the same address location 52 and still be distinguished. Thus, when a hash function is performed on an input character sequence, the address corresponding to the input character sequence is located, then the bin is searched for the corresponding calculated signature. As noted above, the record points to a position in the text history array 54 that corresponds to the input character or input character sequence being evaluated.

In summary, to determine whether or not a character sequence has occurred within the text history array 54, a hash operation is performed on an input character sequence which yields an address. At the address location, a determination is made whether characters have a match in the bin. If there is a match, the record points to the location in the text history array 54 of the corresponding input character sequence. If no match is found in the bin, then there is no match for this input character sequence, and the record is set to point to the new input characters in the array. If a bin is full and the current record pointing to the character sequence is to be written in the bin, the oldest record is replaced by the current record.

In connection with the text prediction, the present invention utilizes a selection device that is not likely to be used by the various windows in which the word predictor system operates. Referring to FIG. 1, the shift key 19 is illustrated and operates according to the present invention to designate selection of a word prediction. The shift key is unlike many other keys because the shift key 19 must normally be pressed simultaneously with another keypad input to produce a visible effect on the display monitor 14. Also, the "control" key of a conventional keyboard may be used as the text selection method. Therefore, by utilizing the shift key 19 or control key as the text selector for predictions, it is less likely that a window or application would have an operation that responds only to the pressing and releasing of the shift key 19. In this manner, the text selection mechanism of the word predictor is easy to access and is not likely to interfere with designated input operations allocated to specific keys on the keypad.

Referring now to FIG. 8, a flow diagram illustrating the preferred steps of the method of the present invention is illustrated. FIG. 8 will be discussed in connection with FIGS. 1 and 3. At step 802, input data is received from keypad input from a computer user. At step 822, the information retrieval module 44 is called by the prediction module manager 45 and the input data is added to the history array 54. The information retrieval module 44 updates, as discussed in connection with FIGS. 7 and 11, the database 50 to point to the most recent occurrence of the input character sequence. The process then proceeds to step 824 where the information retrieval module 44 provides its text prediction based on the best match for the input character sequence or the most recently used match. The process then proceeds to step 826 where the dictionary prediction module 46 provides its text prediction, as discussed above, for the input data. At step 828, the prediction module manager 45 calls the most frequently used prediction module 43 to provide its text prediction for the current input data. At step 830, the attribute correlator is accessed for its text prediction. At step 832, after each module has provided its prediction, the prediction module manager 45 determines the best prediction as discussed above.

The process then proceeds to step 841 where the character prediction is displayed in gray text following the black text of the input character sequence. At step 850, the user of the system has the option of pressing and releasing the shift key 19, without pressing another input key simultaneously, to accept the word prediction, or of inputting another character. If the user presses and releases the shift key 19 at step 850, the process proceeds to step 852 where the characters that were previously displayed in gray text are now displayed in black as though the characters were input directly by the user. At step 854, the cursor is moved to the end of the characters accepted by the user. The process may then proceed to step 802 where the prediction process of the preferred embodiment of the present invention is repeated as described above.

Referring to FIG. 9a, the processes and steps implemented in the present invention for providing a text prediction from the attribute correlator are illustrated. FIG. 9a is discussed in conjunction with FIGS. 3 and 5. At step 902, when the field of entry is changed by a user, the field of entry moved into is recorded in the most recently used fields list 61. At step 904, the most recently used field i.d. numbers are accessed from the "recent field use" list 61. At step 906, for each field i.d. in the "recent field use" list 61, the best match prediction module 44a provides a prediction. At step 908, the field returning the highest belief factor is returned. The process then proceeds to step 910. At step 910, the field returning the highest belief factor is stored and associated with the current field i.d. as the second field i.d. in a "best prediction field pair" list 63 of field i.d. pairs. The second field i.d. of the pair represents the i.d. of the field that generally provides the best prediction for the current field. The process then proceeds to step 912.

At step 912, when the attribute correlator prediction module 47 is queried for a prediction, the "best prediction field pair" list 63 is accessed to find the most recent pair that has the current field i.d. as the first field i.d. of the pair. At step 914, the best match prediction module 44a is then queried with the second i.d. of the located pair. At step 916, if no match is found with the initial query, another pair that begins with the current i.d. is queried for prediction until the list is exhausted. At step 918, the prediction returned using the second i.d. of the pair is returned as the prediction of the attribute correlator prediction module 47.

Referring to FIG. 9b, an alternate process for the attribute correlator 47 of the present invention is shown. The steps implemented in FIG. 9b correspond to the steps implemented in FIG. 9a except that, instead of searching for the i.d. pair that most recently occurs with the current i.d., the i.d. pair that occurs most often beginning with the current field i.d. is searched. In the process illustrated in FIG. 9b, step 912b replaces step 912 of FIG. 9a.

As noted above, the present invention provides a reweighting process for the belief factors returned by the prediction modules. The reweighting process is an adjusted calculation of the probability that a prediction from a particular prediction module is correct for the current field of entry. Referring to FIG. 10, the reweighting process of the present invention is illustrated. At step 1002, the predictions are received from each prediction module. At step 1004, the list 67 (FIG. 6) of the number of successful predictions and the number of prediction attempts for the current field of entry is accessed. The process then proceeds to step 1006 where the process reweights the belief factors based on the ratio of successful predictions to the number of prediction attempts for the current field of entry. At step 1008, the predictions having the highest belief factors are determined and, at step 1010, the predictions having the highest reweighted belief factors are returned in order of decreasing belief. At step 1012, the best prediction or predictions are provided for display.
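The reweighting step can be sketched as follows. The patent specifies only that belief factors are reweighted by the ratio of successful predictions to prediction attempts for the current field; the data shapes, module names, and the simple multiplicative scaling below are illustrative assumptions.

```python
# Illustrative sketch (assumed data shapes): scale each module's belief
# factor by its success ratio for the current field of entry, then return
# the predictions in order of decreasing reweighted belief.
def reweight(predictions, field_stats):
    """predictions: {module: (text, belief)}
    field_stats:   {module: (successes, attempts)} for the current field."""
    reweighted = []
    for module, (text, belief) in predictions.items():
        successes, attempts = field_stats.get(module, (0, 0))
        ratio = successes / attempts if attempts else 0.0
        reweighted.append((belief * ratio, text, module))
    reweighted.sort(reverse=True)  # decreasing reweighted belief
    return reweighted


# A module with a weaker raw belief but a better track record in this
# field wins after reweighting (0.8 * 3/4 beats 0.9 * 1/4).
predictions = {"dictionary": ("seashore", 0.9),
               "most_frequently_used": ("seashells", 0.8)}
field_stats = {"dictionary": (1, 4), "most_frequently_used": (3, 4)}
print(reweight(predictions, field_stats)[0][1])  # seashells
```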
Referring to FIG. 11, the processes implemented in the preferred embodiment of the present invention for updating the database 50 to point to character sequences in the text history array 54 are shown. Before any text processing has occurred, the database is initialized to values (invalid positions) that are not contained in the history array 54, such as a negative position value. At step 1102, a variable match length, mlength, is set to 1. The match length variable provides a count for the number of characters that match the current character sequence as compared to the most recent occurrence of the character sequence in the history array 54. At step 1104, the input character sequence is provided by a computer user. At step 1110, a determination is made as to whether the match length is less than the history length. If the match length is not less than the history length, then at step 1110 the process returns to step 1102. If, however, at step 1110, the match length is less than the length of the history array, then the process proceeds to step 1112 where the hash value is calculated for the most recent occurrence of the character sequence. As noted above, the hash value is determined according to the hash equation (1). The character c of the hash function (1) may be represented by h[x], where h is the history array 54 and x represents a position variable for a character in the history array 54, so that h[x] represents the character in the history array 54 at position x. The hash function implemented to calculate the most recent occurrence hash value (MROHV) is:
MROHV=Hash(h[posMRO-mlength], CCHV)    (2)

where posMRO is the position number of the most recent occurrence of the character sequence and the seed, CCHV, is the current character sequence hash value. When the input character is first received at step 1104, the CCHV value is set to zero and posMRO is set to equal the length of the history array. Also, if the character sequence has not previously occurred, the MROHV value is also the hash value for the current sequence. After the hash value is calculated for the most recent occurrence of the character sequence, the hash value for the current character sequence is calculated at step 1114. The hash function implemented to calculate the hash value for the current character sequence is:

CCHV=Hash(h[hlength-mlength], CCHV)    (3)

where hlength is the length of the history array (i.e., the number of characters in the history array). The process then proceeds to step 1116.

At step 1116, the database is accessed to get the history array position value for the current character sequence hash value CCHV. An invalid position will be returned at step 1116 if the character sequence has not occurred (i.e., the first occurrence of the character sequence in the history array), or a valid position, if the sequence is located in the history array, will be returned at step 1116. The process then proceeds to step 1118 where the position of the match for the most recent occurrence of the character sequence is stored in the database at the MROHV location of the database. By storing this occurrence of the character sequence, subsequent accesses of the database using the hash value for the sequence will point to the most recent occurrence position in the history array. At step 1120, a determination is made as to whether the stored database position obtained from step 1116 is the first occurrence of the character sequence (i.e., an invalid position). If the position is valid, then the character sequence has been previously input in the history array and thus a valid position pointer for the history array is stored at the CCHV location in the database.

If, at step 1120, the position is not valid and thus is the first occurrence of the character sequence, then the process returns to step 1102 where processing may begin for an additional input character as discussed above. If, however, the position stored for the current character sequence has occurred in the history array, then the process proceeds to step 1124 where further processing of the current sequence will yield an updated database value to point to a longer non-matching sequence that occurred prior to the current sequence but which begins with the currently processed character sequence.

At step 1124, the match length, mlength, is incremented by 1. At step 1130, the next character, looking back mlength characters from the current character sequence in the history array, is compared to the character at the most recent occurrence position, looking back in the history array mlength characters. If the character at the most recent occurrence position matches looking back in the history array mlength characters, then the current character hash value CCHV is updated according to the hash function (3) above. The process then proceeds to step 1124. If, at step 1130, the character at the most recent occurrence position did not match looking back in the history array mlength characters, the process proceeds to step 1110.

Referring to FIG. 12, the processes implemented in the preferred embodiment of the present invention for finding a character sequence in the history array are shown. The processes implemented for finding are similar to the processes implemented in the database updating process discussed in connection with FIG. 11; however, steps 1104, 1112 and 1118 are omitted and the text history length is replaced with a short string input from a particular window. At step 1201, a character string is received from a window or field of entry. At step 1202, a variable match length, mlength, is set to 1. The match length variable provides a count for the number of characters that match the character sequence as compared to the most recent occurrence of the character sequence in the text history array 54. At step 1210, a determination is made as to whether the match length is less than the history length. If the match length is not less than the history length, then at step 1210 the process returns to step 1202.

If, however, at step 1210, the match length is less than the history length, the hash value for the current character sequence is calculated at step 1214. The hash function implemented to calculate the hash value for the current character sequence is:

CCHV=Hash(h[alength-mlength], CCHV)    (4)

where alength is the length of the character string. The process then proceeds to step 1216. At step 1216, the database is accessed to get the history array position value for the current character sequence hash value CCHV. An invalid position will be returned at step 1216 if the character sequence has not occurred, or a valid position, where the sequence is located in the history array, will be returned at step 1216. The process then proceeds to step 1220. At step 1220, a determination is made as to whether the stored database position obtained from step 1216 is the first occurrence of the character sequence (i.e., an invalid position). That is, has the character sequence been previously input in the history array, and thus a valid position pointer for the history array is stored at the CCHV location in the database. If, at step 1220, the position is not valid and thus is the first occurrence of the character sequence, then the process proceeds to step 1222 where the stored position from the previous update for the current search process is returned.
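The seed chaining behind equations (2) through (4) can be sketched as follows, using the constants that the patent substitutes in its worked example (3297, 43985, 20945); the function names are illustrative assumptions.

```python
# Sketch of the seed chaining in equations (2)-(4): the signature of the
# last mlength characters of a string is built by feeding each prior hash
# value back in as the seed, starting from zero for the last character.
# Constants are those of the patent's worked example; names are assumed.
def hash_fn(c, seed):
    return ((ord(c) + 3297) * (seed + 43985)) % 20945  # equation (1)


def cchv(s, mlength):
    """CCHV for the sequence of the last mlength characters of s, built per
    CCHV = Hash(s[alength - mlength], CCHV) with CCHV seeded to zero."""
    seed = 0
    for m in range(1, mlength + 1):
        seed = hash_fn(s[len(s) - m], seed)
    return seed
```

For instance, `cchv("ie", 2)` equals `hash_fn('i', hash_fn('e', 0))`, the nested form the worked example writes as HASH(i, HASH(e, 0)).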
If there has been a previous update step, then the position for that update step represents the last matching position and thus will be returned as the prediction. If, however, the position stored for the current character sequence has occurred in the history array, then the process proceeds to step 1224 where further processing of the current sequence will yield an updated database value to point to a longer sequence beginning with the currently processed character sequence.

At step 1224, the match length, mlength, is incremented by 1. At step 1230, the next character, looking back mlength characters from the current character sequence in the history array 54, is compared to the character at the most recent occurrence position, looking back in the history array mlength characters. If the character at the most recent occurrence position matches looking back in the history array mlength characters, then the current character hash value CCHV is updated according to the hash function (4) above. The process then proceeds to step 1224. If, at step 1230, the character at the most recent occurrence position did not match looking back in the history array mlength characters, the process proceeds to step 1210.

The processes of the present invention are discussed below in connection with specific examples. As noted above, the information retrieval module 44 updates a database 50 and locates character sequences in a history array 54. The information retrieval module 44 finds the position of the longest ending match of a short string A from a long history H. An ending match is a sequence of adjacent characters which occurs in both H and A (ignoring differences in case) and which includes the last character in A. For example, H="Suzie sells seashells by the seashore" and A="what does Suzie s." In this example, H and A have a match sequence of: "s" at positions 1, 7, 11, 13, 16, 21, 30, and 33; " s" at positions 7, 13, and 30; "e s" at positions 7 and 30; and "ie s" at position 7 only. The longest match is thus at position 7.

An example implementing the method of the present invention for processing a text history array so that longest matches can be determined is discussed below. As noted above, the processes of the present invention maintain a database of values which are used to aid in searches through a history array of text. The time required to find the longest match is proportional to the length of the match. In contrast to other methods for finding the longest match which have been used for prediction, the time required to update the database when H is appended to is very fast. If a character is added to the end of H to create a string H', the time required to update the database for efficient lookup of H' is proportional to the match length, mlength, of the longest ending match of H with H'. A method of processing a history array so that longest matches can be determined is discussed below in connection with FIGS. 3, 6, and 8.

Referring to FIGS. 3, 7 and 11, as noted above, the database is initialized to all invalid values. The discussion below shows how the input string "#Suzie sells seashells" is processed. The '#' character is added to the history array to serve as a history array bounds check. In this example, the first character processed is a unique character "#" which is not repeated, in order to insure that no match sequences overlap the beginning of the history array. It should be appreciated by those skilled in the art that other history array bounds checking techniques could be used.

The preferred method of bounds checking is to allocate memory "before" the start of the history buffer to ensure that there is no access error if the data before the beginning of the text history is overrun. It is preferable to allocate as much memory as the maximum allowed match length. When examining the characters in the history array, a check should be made to ensure that the match of characters is not longer than the maximum match length. Because the history array is a circular buffer, when the end of the history is reached, the maximum match length number of characters should be copied to the beginning of the history array. As known to those skilled in the art, by using this technique no special case is required during matching for patterns which start near the end of the history array and continue at the start of the history array.

Referring to FIG. 11, at step 1102, mlength (the match length) is initialized to 1. At step 1104, the input character '#' is added to the text history array 54 at position 1. At this point in the process, the current length of the text history array 54 is 1 and the mlength is also 1, so at step 1110, the process returns to step 1102. As before, at step 1102, the mlength is initialized to 1. The new input character 'S' is added to the text history array 54 at position 2 and hlength is incremented to 2. At this point in the process, mlength=1 is less than the history array length hlength=2. The process then proceeds to step 1112.

At step 1112, the hash value for the most recent occurrence (MROHV) of the current character sequence is generated, which is a signature of the character sequence of length 1 starting at position 2 in the text history array 54: "S". Because the position of the most recent occurrence (posMRO) is initially set to equal the hlength, at step 1114 the characters hashed in step 1112 are the same as the characters hashed in step 1114, and thus the hash value for the current character sequence (CCHV) has the same signature as generated in step 1112. In this example, the signatures are calculated according to the hash function (1) above with specific values substituted for the constants as follows:

hash(c, seed)=((c+3297)×(seed+43985)) modulo 20945

where the character c is the ASCII representation of the character c as an integer.

Referring to FIG. 13, a table of hash values generated by the processes of the present invention for the character sequences processed in this example is shown. Because all non-zero seeds are generated by the result of another calculation of "hash(c, previously calculated seed)", a relationship exists between the different seeds which is shown by the hash string (hstring). For example, the third row of the table in FIG. 13 shows hash('_', 13685)=11285. The character '_' is used here instead of the space character for illustration purposes. The hstring of "_se" indicates 11285=hash('_', hash('s', hash('e', 0))).

Referring again to step 1112, the signature for "S" is given to be 1690. MROHV and CCHV are now both 1690. The CCHV signature is used to query the database 50 to find the position where the character sequence "S" has previously occurred in the history array. In this case, this character sequence "S" has not occurred previously, so the position returned at step 1116 is an invalid position. At step 1118, the current position 2 of this sequence "S" is stored using the MROHV value of 1690. At step 1120, since the position that was originally stored in the database 50 was an invalid position at step 1116, the processing of this character "S" is complete. The process then returns to step 1102. Referring additionally to FIG. 14, a table shows the changes to the database 50 as the characters of this example are processed.

The next character of the example, 'u', is processed. Again, mlength is initialized to 1 at step 1102. The process completes as previously discussed with the processing of "S" except this time the position is 3 for sequence "u", which has signature 10085.
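With the substituted constants, the signatures quoted in this example can be reproduced directly. A minimal check (the function name is an assumption; the constants and expected values are the document's own):

```python
# Equation (1) with the example's substituted constants; c is taken as the
# character's ASCII code.
def hash_fn(c, seed):
    return ((ord(c) + 3297) * (seed + 43985)) % 20945


print(hash_fn('S', 0))  # 1690  -> the signature for "S"
print(hash_fn('u', 0))  # 10085 -> the signature for "u"
print(hash_fn('e', 0))  # 18455 -> the signature for "e"
print(hash_fn('_', hash_fn('s', hash_fn('e', 0))))  # 11285 -> hstring "_se"
```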
"u" is used to store position 3 for later querying of the
be readily understood by one skilled in the art in light of the
database 50. As with the processing of "S", "u" has not
above figures and related discussion.
previously occurred. so the value returned at step 1116 from
After setting up the database 50, the process searches for
the database is an invalid position. The current position 3 of
text to be predicted from the history array 54. The process
this sequence "u" is stored using the MROHV value of 5 for searching the processed text for the longest matching
10085. Because step 1116 yielded an invalid position value,
character sequence is similar to the text updating processed
the processing of "u" is complete and the process returns to
described above, the primary difference is that the database
step 1102.
50 is not modified. In the text updating process, the two
So far 3 characters '#' "s", and 'u' have been processed
signatures, MROHV and CCHV refer to two character
from the full example "#Suzie sells seashells." The follow- 10 sequences in the history array. The MROHV represents the
ing characters 'z', 'i', 'e', and's' all follow a similar pattern
sequence with the last character at the position of the longest
of processing as the processing of'S' and 'u.'
match, the CCHV represents the sequence ending at the end
Now, when the process encounters 'e' in "sells" the
of the history array 54.
process takes additional steps to process this second occurIn the find match position process, the sequence ending at
rence of 'e' in the history array. As discussed above, mlength 15 the end of the history array is replaced with a new character
is initialized to 1 at step 602. The character 'e' is added to
string which is passed into the process for the new search.
The find match position process shown in FIG. 13 carries out
the history array 54 which now begins "#Suzie se".
MROHV is calculated at step 1112 for the sequence 'e' to be
the same steps as the update text process shown in FIG. 12
18445. Again at step 1114, CCHV is the same as MROHV
except that steps 1204, 1212, and 1218 are omitted and the
at this point in the at step 1114. At step 1116, the database 20 history length is replaced with the length of searched for
string.
50 is queried with signature 18445 to determine the last
position that the sequence "e" has occurred. For the first time
The following example for finding a match uses the input
in this example an invalid position is not returned by the
search string "Who sells s". The process produces the output
"Who sells seashells." The database 50 has been set to find
database 50. The position 11 is returned at step 1116
indicating that 'e' has previously occurred. Now, at step 1118 25 sequences from the string "#Suzie sells seashells" as disthe current position 9, representing the 'e' in "sells", is
cussed above. Referring to FIG. 12, at step 1202, mlength is
stored using the new signature MROHV (which is also
initialized to 1. Because mlength is less than the input search
18455). At this point, the process has determined that the
string length, the process proceeds to step 1214. At step
1214, the signature CCHV, for "S" generated from the last
character at position 9 in the history array matches the
character at position 11 in the history array. Because 'e' has 30 character of "Who sells s" and is calculated to be 5895. The
signature 5895 is used to query the database 50 at step 1216
previously occurred and a pointer has been set to point to the
for the most recent occurrence of a "s" which is found to be
current character 'e', the process proceeds to steps 1124,
position 22. Position 22 is the final's' in "seashells." Since
1130, and 1132. At steps 1124, 1130, and 1132, the process
position 22 is a valid position, the process proceeds to step
is enabled to update the database to point to a longer
non-matching sequence that occurred prior to the current 35 1224, where mlength is incremented.
sequence but which begins with currently processed charAt step 1230, the characters of the input search string
acter sequence. At steps 1124, 1130, and 1132, the process
counting back from the end of the input search string are
determines how many more characters match starting with
compared with the characters of the text history array
preceding the located position 22. At step 1230, the last
positions 8 and 5, and counting backwards in the history
array. At step 1130, since position 8 is a's' and position 5 40 space ' , in "Who sells s" is compared with the last '1' in
is an 'i', there are no further matches and the process
"seashells." These two characters do not match so the
proceeds to step 1110 with mlength=2.
process proceeds to step 1210 with mlength=2. The process
Now, at step 1110, the processing of 'e' continues with
proceeds to step 1214 where CCHV is calculated by hashing
the' , character with the 5895 to get 20195. The signature
mlength=2 because mlength is less than the history length
hlength=9. At step 1112, the letter at position 5 (' i') is hashed 45 5895 is used to query the database 50 to get 14. Since 14 is
with 18445 to get MROHV=17635. MROHV is the signaa valid position, the match length is incremented to 3 at step
ture which represents the sequence "ie" since HASH
1224 and the characters in the input string ending with the
final's' in "Who sells" are compared at step 1230 with the
(i,HASH(e,0))=17635. Similarly CCHV is calculated by
characters from the history ending with "#Suzie sells." At
hashing the letter at position 8 (,s') with 18445 to get
CCHV=13685. CCHV is a signature representation of the 50 step 1230, because the ending "s" of "sells" in "Who sells"
matches the ending "s" of "# Suzie sells", the current input
sequence "se" since HASH(s,HASH(e,0))=13685. At step
1116, the database 50 is queried to determine if "se" has ever
hash value is updated for the longer match by the hash
function (4). These comparisons continue at each iteration
occurred in the history array 54. The position returned at step
1116 is an invalid position indicating that "se" has not
through the steps 1224, 1230, and 1232 until mlength is 8
previously occurred. The position 11 is stored at step 1118 55 (i.e. when the 'e' in "#Suzie" is found to be different than the
'0' in "Who"). The process then continues to step 1210 with
in the database using MROHV=17635. Because the database position returned is an invalid position, we are through
mlength equal to 8. The process proceeds to step 1214 where
processing the 'e' in "sells".
the character '0' is hashed with the current signature to get
19525 which represents the sequence "0 sell s". The signaReferring to FIG. 14, examination of the changes to the
database 50 shows that position 6 had been stored earlier at 60 ture 19525 is used to query the database 50 and an invalid
position is returned indicating that this sequence has not
signature 18455 for sequence "e." Position 6 was replaced
previously occurred. So the process ends with the position
with position '9,' while position 6 was stored at signature
17635 for sequence "ie". Thus, the changes to the database
14 returned. The word prediction system takes the match
position and writes out the characters from this position to
occurred so that the signature of each sequence stored in the
database can be used to look up the most recent occurrence 65 the end of the word as the predicted text. This causes the text
"seashells" to be written out as the completion for the partial
of the character sequence. The effect of processing the
remainder of the characters in "#Suzie sells seashells" can
text "Who sells s."
As noted above, prior art word prediction systems are based upon text entered from within only a single window or single application. Referring to FIGS. 1, 3, and 15, the word prediction system 10 may be utilized across multiple windows to provide an individual and collective method to track the input text history across the multiple windows. A window or field of entry is a portion of a screen that contains its own document or message. In window-based programs, the screen can be divided into several windows, each of which has its own boundaries. Each window may contain its own menu or other controls.

The preferred embodiment of the present invention may provide a dynamically linked library, as known to those skilled in the art, to make the text history available across the system of windows. In a multi-window environment, the word prediction system 10 attempts to find the best match within the current window 62 in which the user is operating. The prediction system then may look for a better match available from the entire system of windows. In order to differentiate text input in a specific window, each window has a number constant associated with the window. Thus, different addresses are calculated for the database 50 for the same character sequences in the different windows.

By using window-specific constants for calculating database addresses and an overall word prediction system constant for database address calculation, two different matches are returned: one is the best match in any context and the other is the best match within the current window. The priority of prediction between the two matches returned is a matter of user design. For example, the longest match of the current window may constitute the prediction for the prediction system 10. If there is a tie between the lengths of the matches between the system text history and the current window text history, the word prediction may be taken from the current window. Another design choice may require that the length of the match from any context be a certain length greater than the length of the match found from the current window.

Referring to FIG. 15, a flow diagram illustrating the preferred process of calculating addresses for multi-window applications is shown. The process of FIG. 15 continues from C of FIG. 5 as indicated. At step 1510, the computer determines which of the multiple windows or fields of entry is currently operating. At step 1512, if window 1 is the current window, step 1122 is repeated using different number constants in the hash operations to provide database address locations specific to window 1 for the character sequence positions in the text history array 54. At step 1513, step 1124 is repeated with the window 1 database address values to provide a word prediction specific to window 1. At step 1514, if window 2 is the current window, steps 1122 and 1124 are repeated using different number constants in the hash operations to provide database address locations specific to window 2 for the character sequence positions in the text history array 54. At step 1515, step 1124 is repeated with the window 2 database address values to provide a word prediction specific to window 2.

In summary, in order to aid in making text predictions, the present invention builds a text history array of text as characters are input by a computer user and builds a database of values that are the positions of various character sequences in the history array. As characters are being input and added to the history array, an address is calculated, preferably by a hash coding technique, for the current input character sequence, and the position of the character sequence in the history array is stored at the database address calculated for the input character sequence.

Each time a new input character is entered, the database is queried, using the calculated address, to determine if a valid position for the text history array has been stored at the calculated address, which indicates that the character sequence has previously occurred in the history array. If the character sequence has not previously occurred in the history array, then the position of the character sequence is stored in the database as discussed above. If the character sequence has previously occurred, the position of the current character sequence is stored in the database at the calculated address to replace the position for the previous occurrence of the input sequence.

Additionally, after the present invention locates the position in the history array for which the current character sequence has most recently occurred, the adjacent preceding characters from the most recent occurrence position in the history array are compared against the adjacent preceding characters of the input characters of the history array to locate the position at which the sequence does not match. An appropriate hash address location is calculated for the character that was determined not to match, and the position of the non-matching sequence, which begins with the originally located sequence, is stored to the database. By additionally updating the database to point to the longer character string at the most recent occurrence position, the database is continually updated to point to multiple character sequences that may begin with the same characters, which permits direct addressing to locate different character sequences.

In order to provide a prediction for input characters based upon the history of text entered, the present invention calculates an address for the input character sequence, preferably by a hash coding technique as discussed above. The database is queried using the calculated hash address, and if the character sequence has previously occurred, the most recent position of the character sequence will be returned. The word prediction system may then display the longest match of characters for the most recent occurrence up to the next word.

The foregoing relates to the preferred embodiment of the present invention, and many changes may be made therein without departing from the scope of the invention as defined by the following claims.

I claim:

1. In a computer system, a computer-implemented method for predicting text for presentation on a display monitor, comprising the steps of:

receiving input text from a text input device;

providing said input text to a text prediction system within said computer system, said text prediction system comprising:

  a first text prediction module for predicting text based on a first text prediction method, and

  a second text prediction module for predicting text based on a second text prediction method;

analyzing said input text by said first text prediction method of said first text prediction module and providing a first output text prediction from said first text prediction module to said text prediction system;

analyzing said input text by said second text prediction method of said second text prediction module and providing a second output text prediction from said second text prediction module to said text prediction system, said second output text prediction capable of being contradictory when compared to said first output text prediction;
in response to receiving said first and second output text predictions, determining which of said output text predictions to display; and

displaying said output text prediction, from said step of determining, to said display monitor.

2. The method of claim 1 further comprising the step of providing a first belief factor from said first text prediction module and providing a second belief factor from said second prediction module, said first belief factor indicative of the likelihood that said first output text prediction is correct and said second belief factor indicative of the likelihood that said second output text prediction is correct.

3. The method of claim 2 further comprising the step of determining whether said first output text prediction should be displayed or whether said second output text prediction should be displayed based on said first and second belief factors.

4. The method of claim 3 further comprising the step of providing a first reweighted belief factor to more accurately reflect the likelihood that said first output text prediction is correct and providing a second reweighted belief factor to more accurately reflect the likelihood that said second output text prediction is correct.

5. The method of claim 4 wherein said step of providing a first reweighted belief factor comprises determining said first reweighted belief factor based on the number of successful predictions provided by said first text prediction module within the text field of entry from which said first output text prediction is provided.

6. The method of claim 5 wherein said step of providing a second reweighted belief factor comprises determining said second reweighted belief factor based on the number of successful predictions provided by said second text prediction module within the text field of entry from which said second output text prediction is provided.

7. The method of claim 6 further comprising the step of determining whether said first output text prediction should be displayed or whether said second output text prediction should be displayed based on said first and second reweighted belief factors.

8. The method of claim 1 wherein said first text prediction method comprises providing said first output text prediction in the field of entry that said input text was entered, and providing said first output text prediction based on text that was entered in a second field of entry.

9. The method of claim 8 wherein said step of providing said first output text prediction based on text that was entered in a second field of entry comprises utilizing a second field of entry that is most likely to provide a correct prediction.

10. The method of claim 8 wherein said first text prediction method comprises:

accessing a list of pairs of identification numbers, each of said identification numbers representing an input text field of entry, the first of said identification numbers representing the field of entry from which the input text was entered and the second of said identification numbers representing said second field of entry; and

providing said first output text prediction utilizing the text previously entered into said second field of entry represented by said second identification number.

11. The method of claim 10 wherein said step of providing said first output text prediction utilizing the text previously entered into said second field of entry represented by said second identification number comprises utilizing said second field of entry that is most likely to provide a correct prediction for the field of entry from which the input text was entered.

12. In a computer system, a computer-implemented method for predicting text for presentation on a display monitor, said text being entered within a plurality of fields of entry, comprising the steps of:

receiving input text within a first field of entry; and

providing a first output text prediction in said first field of entry based on text that was entered in one of a plurality of second fields of entry.

13. The method of claim 12 wherein said step of providing said first output text prediction based on text that was entered in one of a plurality of second fields of entry comprises utilizing a second field of entry that is most likely to provide a correct prediction.

14. The method of claim 12 wherein said step of providing a first output text prediction in said first field of entry based on text that was entered in one of a plurality of second fields of entry, comprises:

accessing a list of pairs of identification numbers, each of said identification numbers representing an input text field of entry, the first of said identification numbers of one of said pairs representing the field of entry from which the input text was entered and the second of said identification numbers of said one of said pairs representing said one of said plurality of second fields of entry; and

providing said first output text prediction utilizing the text previously entered into said one of said second fields of entry represented by said second identification number.

15. The method of claim 14 wherein said step of providing said first output text prediction utilizing the text previously entered into said second field of entry represented by said second identification number comprises utilizing said second field of entry that is most likely to provide a correct prediction for the field of entry from which the input text was entered.

16. The method of claim 14 wherein said step of providing said first output text prediction utilizing the text previously entered into said one of said second fields of entry represented by said second identification number comprises selecting said one of said second fields of entry which has the first identification of the pair that is the same as the identification number of the field of entry from which the input text was entered.

17. The method of claim 14 wherein said step of providing said first output text prediction utilizing the text previously entered into said one of said second fields of entry represented by said second identification number comprises selecting said one of said second fields of entry which has the first identification of the pair that is the same as the identification number of the field of entry from which the input text was entered and said first identification number represents the first identification number that occurs most in said list of pairs.

18. In a computer system, a computer-implemented method for predicting text for presentation on a display monitor, comprising the steps of:

receiving input text from a text input device;

providing multiple text predictions for said input text;

providing belief factors for each of said predictions, said belief factors based on the number of successful predictions provided within particular text fields of entry from which said text predictions were provided; and

displaying the text prediction having the highest belief factor to said display monitor.

19. In a computer system, a computer-implemented method of organizing text within a text prediction system,
said text prediction system operating within a computer
system having a first window and a second window, comprising the steps of:
receiving input text from a text input device;
storing said input text, entered within either said first
window or said second window, in a common text
history array; and
independently associating text entered within said first
window from text entered within said second window;
and
providing a first output text prediction based on text entered
within said first window, said first output text prediction
based on text stored in said common text history array
and providing a second output text prediction for text
entered within said second window, said second output
text prediction based on text stored in said common text
history array.
20. A computer-implemented method of processing text
for a word prediction system, comprising the steps of:
receiving a plurality of input character sequences in a
computer system;
inputting each of said input character sequences in
ordered positions in an array of text;
generating a number identifier representative of each of
said input character sequences;
using said number identifier as an address into a database;
storing, in said database at said address, the position of
said character sequence in said ordered array of text;
and
replacing, in said database at said address, the position of
said character sequence with the position of the most
recent occurrence of said character sequence in said array
of text.
21. A computer-implemented method of locating text in a
text history array of a word prediction system, comprising
the steps of:
generating a pseudo-random number identifier for an input
character sequence, said number identifier being an
address location for a database operable to store
pointers, said stored pointers pointing to characters
previously entered into said text history array of characters;
determining whether a pointer exists at said address
location for said input character sequence, said pointer
pointing to the most recent occurrence of said input
character sequence in said text history array;
locating in said text history array of characters the position of the longest most recent occurrence of said
character sequence; and
displaying a text prediction which completes the word
beginning with said longest most recent occurrence of
said character sequence.
22. In a computer system, a computer-implemented
method for storing text utilized for predicting text for
presentation on a display monitor, comprising the steps of:
storing character sequences in a text history array in the
order in which said character sequences are input;
in response to characters being input to the history array,
calculating a database address that is representative of
the current input character sequence;
accessing said database with said database address to
determine if a valid position for the text history array is
stored at said database address, thereby indicating that
said character sequence has previously occurred in said
text history array;
storing the position of the character sequence of the text
history array in the database at the calculated address
for said input character sequence, if the character
sequence has not previously occurred in the text history
array;
if the character sequence has previously occurred, storing
the position of the current input character sequence in
the database at said calculated database address so as to
replace the position of the text history array stored for
the previous occurrence of the input sequence located
at said database address;
after locating the position in the history array for which
the current character sequence has previously occurred,
comparing the adjacent preceding characters from the
most recent occurrence position in the history array
against the adjacent preceding characters of said input
character sequence of the history array to locate the
character and the position at which the sequence does
not match thereby identifying a non-matching character
sequence which begins with the current input character
sequence;
calculating a database address for the non-matching
sequence that is representative of the non-matching
input character sequence; and
storing the position of the non-matching sequence to the
database at the address calculated for the non-matching
sequence.
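The storage and replacement steps recited in claim 20 above (append each input sequence to an ordered text array, hash it into a number identifier used as a database address, and replace the stored position with that of the most recent occurrence) can be sketched minimally as follows. The hash function, table size, and all names are assumptions for illustration, not the issued implementation:

```python
HISTORY: list[str] = []        # ordered array of input text (the history array)
DATABASE: dict[int, int] = {}  # address -> position of most recent occurrence

def identifier(seq: str) -> int:
    # Assumed hash producing a number identifier for a character sequence.
    h = 0
    for ch in seq:
        h = h * 131 + ord(ch)
    return h % 4096  # number identifier used as an address into the database

def store(seq: str) -> None:
    # Append the sequence at the next ordered position, then store that
    # position at the hashed address, replacing any earlier position there.
    HISTORY.append(seq)
    DATABASE[identifier(seq)] = len(HISTORY)

for s in ["se", "a", "se"]:
    store(s)

# "se" was first stored at position 1, then replaced by its latest position 3.
assert DATABASE[identifier("se")] == 3
```

Claim 22's additional step, comparing preceding characters of the prior occurrence and indexing the longer non-matching sequence at its own address, would extend `store` with a backward comparison loop; it is omitted here for brevity.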
* * * * *
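The window-specific addressing described in the specification (a per-window number constant folded into the hash, so the same character sequence maps to different database addresses in different windows, while an overall system constant yields a cross-window match) might be sketched as follows. The constants, table size, and hash are illustrative assumptions, not the patent's actual values:

```python
SYSTEM_CONSTANT = 0                  # assumed overall word prediction constant
WINDOW_CONSTANTS = {1: 101, 2: 202}  # assumed per-window number constants

def address(seq: str, constant: int, size: int = 4096) -> int:
    # Fold the window (or system) constant into the hash so identical
    # sequences get distinct database addresses per context.
    h = constant
    for ch in seq:
        h = (h * 131 + ord(ch)) % size
    return h

# The same sequence hashes consistently within one window, but to
# distinct addresses across windows and the system-wide context.
a1 = address("sells", WINDOW_CONSTANTS[1])
a2 = address("sells", WINDOW_CONSTANTS[2])
asys = address("sells", SYSTEM_CONSTANT)
assert a1 == address("sells", WINDOW_CONSTANTS[1])
assert len({a1, a2, asys}) == 3  # distinct addresses for these constants
```

Querying with the window constant then returns the best match within the current window, while querying with the system constant returns the best match in any context, matching the two-candidate scheme the description sets out.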