PersonalWeb Technologies LLC v. Google, Inc. et al
Filing
1
COMPLAINT for Patent Infringement against All Defendants ( Filing fee $ 350 receipt number 0540-3354008.), filed by PersonalWeb Technologies LLC. (Attachments: # 1 Civil Cover Sheet, # 2 Exhibit A - Patent '791, # 3 Exhibit B - Patent '280, # 4 Exhibit C - Patent '442, # 5 Exhibit D - Patent '310, # 6 Exhibit E - Patent '539, # 7 Exhibit F - Patent '544, # 8 Exhibit G - Patent '662, # 9 Exhibit H - Patent '096)(Tribble, Max) (Additional attachment(s) added on 12/14/2011: # 10 Exhibit A Searchable, # 11 Exhibit B Searchable, # 12 Exhibit C Searchable, # 13 Exhibit D Searchable, # 14 Exhibit E Searchable, # 15 Exhibit F Searchable, # 16 Exhibit G Searchable, # 17 Exhibit H Searchable) (mjc, ). (Entered: 12/08/2011)
Case 6:11-cv-00656 Document 1-4
Filed 12/08/11 Page 1 of 63 PageID #: 128
EXHIBITC
EXHIBIT C
!.
Case 6:11-cv-00656 Document 1-4
Filed 12/08/11 Page 2 of 63 PageID #: 129
111111
1111111111111111111111111111111111111111111111111111111111111
US006928442B2
(54)
(75)
United States Patent
(10)
Farber et ai.
(12)
(45)
ENFORCEMENT AND POLICING OF
LICENSED CONTENT USING
CONTENT-BASED IDENTIFIERS
FOREIGN PATENT DOCUMENTS
EP
Notice:
(21)
Appl. No.: 09/987,723
(22)
Filed:
Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.c. 154(b) by 0 days.
Nov. 15,2001
Prior Publication Data
(65)
(Continued)
US 2002/0052884 A1 May 2, 2002
Primary Examiner-Luke S Wossum
Assistant Examiner-Khanh Ph am
(74) Attorney, Agent, or Firm-Davidson Berquist Jackson
& Gowdey, LLP
Related U.S. Application Data
(63)
(51)
(52)
(58)
Continuation of application No. 09/283,160, filed on Apr. 1,
1999, now Pat. No. 6,415,280, which is a division of
application No. 08/960,079, filed on Oct. 24, 1997, now Pat.
No. 5,978,791, which is a continuation of application No.
08/425,160, filed on Apr. 11, 1995, now abandoned.
(57)
References Cited
U.S. PATENT DOCUMENTS
3,668,647
4,215,402
4,290,105
4,376,299
4,405,829
4,412,285
A
A
A
A
A
A
6/1972
7/1980
9/1981
3/1983
9/1983
10/1983
ABSTRACT
Data files are distributed across a plurality of computers. The
computers may form a network such as a content delivery
network (CDN) or a peer-to-peer network. The network may
operate as a TCP/IP network such as the Internet. Data files
may represent may represent digital messages, images,
videos or audio signals. For content---data items or files in
the system-a name is obtained (or determined), where the
name is based, at least in part, on a given function of the data
in a data item or file. The given function may be a message
digest or hash function, and it may be MD4, MD5, and SHA.
A cony of a requested file is only provided to licensed (or
authorized) parties. The system may check one or more
computers for unauthorized or unlicensed content. Content
is served based on a measure of availability of servers.
Int. CI? ................................................ G06F 17/30
U.S. CI. ............................. 707/10; 707/3; 707/101;
707/200; 709/203; 709/219; 709/229
Field of Search .............................. 707/3, 6, 9, 10,
707/101,200; 709/203, 219, 229
(56)
4/1994
Gwertzman, James, et al. "The Case for Geographical PushCaching." Technical Report HU TR 34-94 (excerpt), Harvard University, DAS, Cambridge, MA02138, 1994, 2 pgs.
Grigni, Michelangelo, et al. "Tight Bounds on Minimum
Broadcasts Networks." SIAM Journal of Discrete Mathematics, vol. 4, No.2, May 1991, pp. 207-222.
Devine, Robert. "Design and Implementation of DDH: A
Distributed Dynamic Hashing Algorithm." In Proceedings
of 4th International Conference on Foundations of Data
Organizations and Algorithms, 1993, pp. 101-114.
Deering, Stephen, et al. "Multicast Routing in Datagram
Internetworks and Extended LANs." ACM Transactions on
Computer Systems, vol. 8, No.2, May 1990, pp. 85-110.
Assignees: Kinetech, Inc., Northbrook, IL (US);
Savvis, Inc., Town & Country, MO
(US)
( *)
0592045
OTHER PUBLICATIONS
Inventors: David A. Farber, Ojai, CA (US);
Ronald D. Lachman, Northbrook, IL
(US)
(73)
Patent No.:
US 6,928,442 B2
Date of Patent:
Aug. 9,2005
Evangelisti
Mitchell
Cichelli
Rivest
Rivest
Neches
56 Claims, 31 Drawing Sheets
(Continued)
SIMPL
DATA ITEM
TRUE NAME
~
Case 6:11-cv-00656 Document 1-4
Filed 12/08/11 Page 3 of 63 PageID #: 130
US 6,928,442 B2
Page 2
U.S. PATENT DOCUMENTS
4,414,624
4,441,155
4,464,713
4,490,782
4,571,700
4,577,293
4,642,793
4,675,810
4,691,299
4,725,945
4,773,039
4,887,235
4,888,681
4,922,414
4,922,417
4,972,367
5,025,421
5,050,074
5,050,212
5,057,837
5,077,658
5,129,081
5,129,082
5,144,667
5,179,680
5,202,982
5,208,858
5,276,901
5,287,499
5,301,286
5,301,316
5,341,477
5,343,527
5,357,623
5,384,565
5,404,508
5,452,447
5,459,860
5,542,087
5,581,758
5,638,443
5,640,564
5,781,629
5,802,291
5,809,494
5,835,087
5,907,704
6,006,018
6,134,603
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
A
11/1983
4/1984
8/1984
12/1984
2/1986
3/1986
2/1987
6/1987
9/1987
2/1988
9/1988
12/1989
12/1989
5/1990
5/1990
11/1990
6/1991
9/1991
9/1991
10/1991
12/1991
7/1992
7/1992
9/1992
1/1993
4/1993
5/1993
1/1994
2/1994
4/1994
4/1994
8/1994
8/1994
10/1994
1/1995
4/1995
9/1995
10/1995
7/1996
12/1996
6/1997
6/1997
* 7/1998
9/1998
9/1998
* 11/1998
5/1999
12/1999
10/2000
Summer, Jr.
Fletcher
Benhase
Dixon
Emry, Jr.
Matick
Meaden
Gruner
Rivest
Kronstadt
Zamora
Holloway
Barnes
Holloway
Churm et al. .................. 707/1
Burke
Cho
Marca
Dyson
Colwell
Bendert
Kobayashi
Tirling
Pogue, Jr.
Colwell
Gramlich et al. .............. 707/2
Vollert
Howell
Nemes .......................... 707/2
Rajani
Hamilton
Pitkin et al. ................ 709/226
Moore
Megory-Cohen
Cannon
Konrad
Nelson et al. .............. 707/205
Burnett
Neimat et al. ................ 707/10
Burnett
Stefik et al. .................. 705/54
Hamilton et al. ........... 709/303
Haber et al. ................ 713/177
Balick et al. ............... 709/202
Nguyen ......................... 707/1
Herz et al. .................. 345/810
Gudmundson et al.
Burnett et al. ......... 395/200.49
Jones et al. ................. 709/330
OlliER PUBLICATIONS
Cormen, Thomas H., et al. Introduction to Algorithms, The
MIT Press, Cambridge, Massachusetts, 1994, pp. 219-243,
991-993.
Naor, Moni, et al. "The Load, Capacity and Availability of
Quorum Systems." In Proceedings of the 35th IEEE Symposium on Foundations of Computer Science, Nov. 1994,
pp.214-225.
Nisan, Noam. "Pseudorandom Generators for SpaceBounded Computation." In Proceedings of the TwentySecond Annual ACM Symposium on Theory of Computing,
May 1990, pp. 204-212.
Palmer, Mark, et al. "Fido: A Cache that Learns to Fetch."
In Proceedings of the 17th International Conference on Very
Large Data Bases, Sep. 1991, pp. 255-264.
Peleg, David, et al. "The Availability of Quorum Systems."
Information and Computation 123, 1995, 210-223.
Rabin, Michael. "Efficient Dispersal of Information for
Security, Load Balancing, and Fault Tolerance." Journal of
the ACM, vol. 36, No.2, Apr. 1989, pp. 335-348.
Ravi, R., "Rapid Rumor Ramification: Approximating the
Minimum Broadcast Time." In Proceedings of the 35th
IEEE Symposium on Foundation of Computer Science, Nov.
1994, pp. 202-213.
Schmidt, Jeanette, et al. "Chernoff-Hoeffding Bounds for
Applications with Limited Independence." In Proceedings
of the 4th ACS-SIAM Symposium on Discrete Algorithms,
1993, pp. 331-340.
Tarjan, Robert Endre, et al. "Storing a Sparse Table."
Communications of the ACM, vol. 22, No. 11, Nov. 1979,
pp. 606-611.
Wegman, Mark, et al. "New Hash Functions and Their Use
in Authentication and Set Equality." Journal of Computer
and System Sciences vol. 22, Jun. 1981, pp. 265-279.
Vitter, Jeffrey Scott, et al. "Optimal Pre fetching via Data
Compression." In Proceedings of 32nd IEEE Symposium on
Foundations of Computer Science, Nov. 1991, pp. 121-130.
Fredman, Michael, et al. "Storing a Sparse Table with 0(1)
Worst Case Access Time." Journal of the Association for
Computing Machinery, vol. 31, No.3, Jul. 1984, pp.
538-544.
Yao, Andrew Chi-Chih. "Should Tables be Sorted?" Journal
of the Association for Computing Machinery, vol. 28, No.3,
Jul. 1981, pp. 615-628.
Floyd, Sally, et al. "A reliable Multicast Framework for
Light-Weight Sessions and Application Level Framing." In
Proceeding of ACM SIGCOMM '95, pp. 342-356.
Feeley, Michael, et al. "Implementing Global Memory Management in a Workstation Cluster." In Proceedings of the
15th ACM Symposium on Operating Systems Principles,
1995, pp. 201-212 .
Carter, J. Lawrence, et al. "Universal Classes of Hash
Functions." Journal of Computer and System Sciences, vol.
18, No.2, Apr. 1979, pp. 143-154.
Patent Abstracts of Japan, "Electronic Mail Multiplexing
System and Communication Control Method in The System." Jun. 30, 1993, JP 05162529.
Kim et aI., "Experiences with Tripwire: Using Integrity
Checkers For Intrusion Detection", COAST Labs. Dept. of
Computer Sciences Purdue University, Feb. 22, 1995, pp.
1-12.
Kim et aI., "The Design and Implementation of Tripwire: A
file System Integrity Checker", COAST Labs. Dept. of
Computer Sciences Purdue University, Nov. 19, 1993, pp.
1-21.
Bert dem Boer et aI., Collisions for the compression function
of MDs, pp. 292-304.
Sakti Pramanik et aI., Multi-Directory Hasing, 1993, Info.
Sys., vol. 18, No.1, pp. 63-74.
Murlidhar Koushik, Dynamic Hashing with Distributed
Overflow Space: A File Organization with Good Insertion
Performance, 1993, Info. Sys., vol. 18, No.5, pp. 299-317.
Witold Litwin et aI., LH* -Linear Hashing for Distributed
Files, HP Labs Tech. Report No. HPL-93-21, Jun. 1993, pp.
1-22.
Yuliang Zheng et aI., HAVAL-A One-Way Hashing Algorithm with Variable Length of Output (Extended Abstract),
pp. 83-105.
Chris Charnes and Josef Pieprzky, Linear Nonequivalence
versus Nonlinearity, Pieprzky, pp. 156-164.
Case 6:11-cv-00656 Document 1-4
Filed 12/08/11 Page 4 of 63 PageID #: 131
US 6,928,442 B2
Page 3
Witold Litwin et aI., Linear Hashing for Distributed Files,
ACM SIGMOD, May 1993, pp. 327-336.
Ming-Ling Lo et aI., On Optimal Processor Allocation to
Support Pipelined Hash loins, ACM SIGMOD, pp. 69-78,
May 1993.
Thomas A. Berson, Differential Cryptanalysis Mod 232 with
Applications to MD5, pp. 69-81.
William Perrizo et aI., Distributed loin Processing Performance Evaluation, Twenty-Seventh Hawaii International
Conference on System Sciences, vol. II, pp. 236-244.
Vijay Kumar, A Concurrency Control Mechanism Based on
Extendible Hashing for Main Memory Database Systems,
ACM, vol. 3, 1989, pp. 109-113.
Birgit Pfitzman, Sorting Out Signature Schemes, Nov. 1993,
1st Conf. Computer & Comm. Security '93, p. 74-85.
Zhiyu Tian et aI., A New Hashing Function: Statistical
Behaviour and Algorithm, pp. 3-13.
G. L. Friedman, Digital Camera with Apparatu for Authentication of Images Produced from an Image File, NASA
Case No. NPO-19108-1-CU, U.S. Appl. No. 08/159,980,
filed Nov. 24, 1993.
H. Goodman, Ada, Object-Oriented Techniques, and Concurrency in Teaching Data Structures and File Management
Report Documentation p. AD-A275 385 - 94-04277.
Advances in Cryptology-Eurocrypt'93, Workshop on the
Theory and Application of Cryptographic Techniques
Lofthus, Norway, May 23-27, 1993 Proceedings.
Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, vol. 22, Issue 2, lun. 1993.
Advances in Cryptology-AUSCRYPT '92-Workshop on
the Theory and Application of Cryptographic Techniques
Gold Coast, Queensland, Australia, Dec. 13-16, 1992 Proceedings.
Peter Deutsch (peterd@bunyip.com), "Re: MD5 and LiFNs
(was: Misc Comments)", www.acl.lanl.gov/URl!archive/
uri-94q2.messages/0106.html, Apr. 26, 1994.
Alexander Dupuy (dupuy@smarts.com), "RE: MD5 and
LIFNs (was: Misc Comments)", www.acl.lanl.gov/URl!archive/uri-94q2.messages/0113.html, Apr. 26, 1994.
Alexander Dupuy (dupuy@smarts.com), "MD5 and LIFNs
(was: Misc Comments)", www.acl.lanl.gov/URl!archive/
uri-94q2.messages/0081.html, Apr. 17, 1994.
Albert Langer (cmf851@anu.oz.au), http://groups.google.com/groups?selm=
1991Aug7.225159.786%40newshost.anu.edu.au&oe=
UTF-8&output=gplain, Aug. 7, 1991.
Clifford Lynch (Calur@uccmvsa.bitnet), "ietf urI/uri overview draft paper (long)", www.acl.lanl.gov/URl!archive/
uri-93q1.messages/0015.html, Mar. 25, 1993.
K. Sollins and L. Masinter, "Functional Requirements for
Uniform Resource Names", www.w3.org/Addressing/
rfc1737.txt, Dec. 1994, pp. 1-7.
W3C:ID, HTTP: A protocol for networked information,
"Basic HTTP as defined in 1992", www.w3.org/Protocols/
HTTP2.html, 1992.
European Search Report issued Dec. 23, 2004.
* cited by examiner
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Aug. 9,2005
Filed 12/08/11 Page 5 of 63 PageID #: 132
Sheet 1 of 31
<0
US 6,928,442 B2
0::
C
'W"'"
0
til
N
0::
0
0
~
fa
u
(J)
tn
W
N
,..
0
0
c..
0::
0
0
a::
c..
•
•
I
•
I
0::
-•
0
r.n
CJ)
N
w
0
,..
(.)
C>
lL..
I
0
I
,
0::
c..
0::
,~W'
~ ~E
~ 0 fij
0
N
0
""""
It/)
•
•
'¢
~
C2
0
~
[ij
)'U,C
"
en
(/)
C'!
0
Q
0
0
•
c>W
0
c..
"
t- C
'W
til
til
W
W
(.)
0
a::
0..
~
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Aug. 9,2005
Filed 12/08/11 Page 6 of 63 PageID #: 133
Sheet 2 of 31
US 6,928,442 B2
r---- --------------------------------------------------,
I
I
I
I
I
I
I
I
I
I
I
I
I
I
~
I
I
o
1
I
I
I
I.
I
I
I
I
1(\1
I
I
I
I
I
10
I_
I
I
I
I
I
I
I
I
I
I
I
I
I
~
~--~~
a
!\
CI)
en
W
()
a
0::
c..
-
w
I
~ j~
co ::>
o c..
(!) W
t2
~ ~3 en ~
~
()
U
~
~
!
I
I
I
I
:
I
\)
I _______________________________________________________ J
L
..........
.c
.
(!)
1.L
I
FILE
SYSTEM
~
~
,
I
I
117
REGION
~
I
117
REGION
.....
.....
REGION
• • •
~17
=
I
REGION
~
I
118
({Q
~~
118
118
N
C
C
(It
DIRECTORY
•• •
DIRECTORY
'Jl
=-
~
~
.....
~
1
FILE'
\
( FILE)
.1
122
o
....,
'""'"
~
( FILE
12~
SEGMENT
SEGMENT
1
J
I
SEGMENT
e
rJ'l
-..CJ\
\0
N
00
~
~
N
~
N
Filed 12/08/11 Page 7 of 63 PageID #: 134
DIRECTORY
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Aug. 9,2005
Filed 12/08/11 Page 8 of 63 PageID #: 135
Sheet 4 of 31
US 6,928,442 B2
FIG.3
138·
Region ID
Pathname
True Name
Type
File ID
Time of last access
Time of last modification
Safe fl.ag
Lock
fl~
Size
owner
FIG.4
140
True Name
File ID
Compressed File ID
Source IDs
Dependent processors
Use count
Time of last access
Expiration
Grooming delete count
142
Reqion ID
Region file system
Region pathname
Region status
Mirror processor(s)
Mirror duplication count
Policy
FIG.5
144
FIG.6
~
source IO
source type
~
.....
.....
~
=
source rights
source availability
source location
~
146
({Q
original Name
~~
operation
N
C
C
Ul
FIG.7
Type
Timestamp
'JJ.
Pathname
.....
=-
~
~
Ul
True Name
o
....,
~
148
FIG.8
t:::~::Ul
True Name
I::n:::e ---- - - - FIG.9
150
.. -I
'""'"
e
rJ'l
-..CJ\
\0
N
00
~
~
N
~
N
Filed 12/08/11 Page 9 of 63 PageID #: 136
Processor IO
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Aug. 9,2005
Filed 12/08/11 Page 10 of 63 PageID #: 137
Sheet 6 of 31
US 6,928,442 B2
FIG. 10(a)
S/MPL
DATA/TEM
, ..
-------------S~i8-----------\
:
I
,
,
\
5212
COMPUTEMDFUNcnONON
DATA ITEM
I
I
S214
APPEND LENGTH MODULO 32 OF
DATA ITEM
,
,
\
\
I
\
I
,I
\
',----------------------------TRUE NAME
~
'
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Aug. 9,2005
8216
r- YES
Filed 12/08/11 Page 11 of 63 PageID #: 138
Sheet 7 of 31
US 6,928,442 B2
0,_ _--.
FIG. IO{b)
DATA ITEM
SIMPLE?
8220
PARTmON DATA ITEM INTO
SEGMENTS
5222
ASSIMILATE EACH SEGMENT
(COMPUTING ITS TRUE NAME)
r
,~----s2f8-----'"
I
I
: COMPUTE TRUE :
I
I NAME OF SIMPLE :
I
:
DATA ITEM
:
,
....
_----- ------,
I
8224
CREATE INDIRECT BLOCK OF
SEGMENT TRUE NAMES
8226
ASSIMILATE INDIRECT BLOCK
(COMPUTING ITS TRUE NAME)
5228
REPLACE FINAL 32 BITS OF TRUE
NAME WITH LENGHT MOD 32 OF DATA
ITEM
I
~
FIG. II
~
.....
.....
DETERMINE
TRUE NAME
~
=
>
=
({Q
~~
N
c
c
YES
Ul
=-
~
~
S236
.....
00
o
....,
* CREATE NEW ENTRY
* SET USE COUNT TO 1
* STORE FILE 10
* seT OTHER FIELDS
~
'""'"
S238
S239
DELETE FILE 10
STORE FILE lD
e
rJ'l
-..CJ\
\0
N
00
~
~
N
~
N
Filed 12/08/11 Page 12 of 63 PageID #: 139
'JJ.
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Aug. 9,2005
Filed 12/08/11 Page 13 of 63 PageID #: 140
Sheet 9 of 31
US 6,928,442 B2
FIG.12
YES
S240
UPDATE
DEPENDENCY
LIST
NO
8242
SEND MESSAGE TO
~--------------~ CACHESERVERTO
UPDATE CACHE
S244
COMPRESS
(IF DESIRED)
8246
MIRROR
(IF DESIRED)
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Filed 12/08/11 Page 14 of 63 PageID #: 141
US 6,928,442 B2
Sheet 10 of 31
Aug. 9,2005
FIG.13
S250
SEARCH FOR
THE
~~~~~U-_ _~~
FAIL
PATHNAME
FOUND
8258
ASSIMILATE
FILE 10
S256
FREEZE
DIRECTORY
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Aug. 9,2005
Filed 12/08/11 Page 15 of 63 PageID #: 142
US 6,928,442 B2
Sheet 11 of 31
5260
CONFIRM THAT
TRUE NAME
EXISTS LOCALLY
8262
FIG.14
SEARCH FOR
PATHNAMEIN
LDETABLE
8264
CONFIRM THAT
DIRECTORY
EXISTS
YES
8268
DELETE
TRUE FILE
8270
CREATE
ENTRY IN LDE
& UPDATE
~
~
NO
NEGATIVE
RESPONSE
8278
.....
.....
~
=
YES
"
/'
REQUEST
MOUNT
SEND RTF
MESSAGE &
WAIT FOR
RESPONSE
POSITIVE
RESPONSE
~
({Q
~~
N
C
C
Ul
'JJ.
=-
~
~
.....
'N "
""'
o
....,
~
8276
8280
ENTER TRUE FILE
FIND FilE
RETURNED INTO
TFR
'""'"
5282
L..----------.J~I
FIG.15
VERIFY TRUE
e
rJ'l
DESIRED)
-..CJ\
\0
FILE (IF
N
00
~
~
N
~
N
Filed 12/08/11 Page 16 of 63 PageID #: 143
FAIL
8274
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
~
8284
~
.....
.....
CLIENT
SELECTS
PROCESSOR(S)
~
=
~
~o
~I
FAIL
I
({Q
~~
N
C
C
Ul
S286
NEGATIVE
RESPONSE
OR I
TIMEOUT
I
CLIENT
BROADCASTS
=-
FIG.16(a)
~
~
.....
'""'"
~
o
....,
~
'""'"
CLIENT
WAITS
POSItiVE
RESPONSE
-----*------
e
rJ'l
-..CJ\
\0
N
00
~
~
N
~
N
Filed 12/08/11 Page 17 of 63 PageID #: 144
'JJ.
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
~
~
.....
.....
~
0
STORE
PROCESSOR 10
~
=
~
({Q
~~
N
C
C
Ul
FIG.16(b)
'JJ.
=-
~
~
.....
'~
""'"
o
....,
~
'""'"
S291c
SEND MESSAGE TO
RESERVE TRUE FILE
ON SOURCE
PROCESSOR
S290C-
~ "-/
o
OURCE IS"'>-
PUBLISHING
SYSTEM? ,.
-1
YES
S291d
DETERMINE
EXPIRATION DATE
AND ADD TO LIST
e
rJ'l
-..CJ\
\0
N
00
~
~
N
~
N
Filed 12/08/11 Page 18 of 63 PageID #: 145
5290B
LOOK UP TFR FOR
TRUE NAME & ADD
SOURCE LOCATION 10
TO SOURCE IDS FOR
TRUE NAME
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
~
~
.....
.....
~
NO
FIG. 17(0)
~INI~K~UKIKUc
,
...
~
(
FAIL
1
=
>
=
({Q
~~
N
C
C
Ul
DONE
I
'JJ.
=-
~
~
.....
'""'"
Ul
0
....,
NO
~
'""'"
YES
I
8298
DECOMPRESsl
e
rJ'l
-..CJ\
\0
NO
N
00
~
~
N
~
N
Filed 12/08/11 Page 19 of 63 PageID #: 146
YES
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
-----
---~---
~
~
J
S302
NOTIFY
USER
I
.....
.....
~
=
---STORE 10
>
=
({Q
S308·~
LOCATE
REMOTE FILE
NO MORE
SOURCE ID
S304
T
N
c
c
Ul
'JJ.
DONE
S306
=-
~
~
.....
'""'"
0'1
o
....,
REALIZE TRUE
FILE FROM
SQURCE(S)
~
'""'"
e
FIG.17(b)
rJ'l
-..CJ\
\0
N
00
~
~
N
~
N
Filed 12/08/11 Page 20 of 63 PageID #: 147
SELECT
SOURCE IDS
~~
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
~
FIG.18(a}
>--YES.
~
.....
.....
~
=
~
({Q
~~
YEs{;]
>--YE5.
N
C
C
Ul
'JJ.
DELETE
.....
'"""
-..J
=-
~
~
TRUE FILE
8320
CREATENBN ~I~~______~
SCRATCH FILE
o
....,
8322
~
'"""
MAKE TRUE
FILE LOCAL
e
rJ'l
-..CJ\
\0
DONE
N
00
~
~
N
~
N
Filed 12/08/11 Page 21 of 63 PageID #: 148
8318
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Filed 12/08/11 Page 22 of 63 PageID #: 149
Aug. 9,2005
Sheet 18 of 31
,......
.c
-......-.
CO
-
•
(!)
-
US 6,928,442 B2
.
~
~
.....
.....
8332
~
=
INCREMENT
FREEZE LOCK
I
FIG. 19(0)
~
FOR EACH
SUBORDINATE
FILE AND
DIRECTORY IN THE
\IVEN DIRECTOR,)
~~
N
C
C
Ul
1
-~
5334
S336
FREEZE IF
..
ASSIMILATE
DIRECTORY" UNASSIMlLATED
FILE
'JJ.
=-
~
~
.....
'""""
~
o
....,
~
'""""
...
5337
CREATE NEW
e
\Jl
DATA ITEM
-..CJ\
\0
__ __
.I
00
N
~
~
N
~
N
Filed 12/08/11 Page 23 of 63 PageID #: 150
. "".
({Q
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
-------,--
I
----
~
•
~
~
I
FOR EACH
S338
SUBORDINATE
ADD ENTRY TO
t-----1~.. NEW DATA
FILE AND
DIRECTORY IN THE
ITEM
GIVEN DIRECTORY
I
\
I
I
~
)
RECORD
ADDITIONAL
DESIRED
INFORMATION
~
=
~
({Q
~~
N
C
C
Ul
8342
'JJ.
=-
ASSIMILATE THE
NEW DATA ITEM
~
~
.....
FIG.19(b)
.
S344
DECREMENT
THE FREEZE
LOCK
N
C
o
....,
~
'""'"
e
rJ'l
-..CJ\
\0
N
....
00
~
~
N
~
N
Filed 12/08/11 Page 24 of 63 PageID #: 151
.
5340
~
.....
.....
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
~
1
.....
.....
~
=
MAKE TRUE
FILE LOCAL
,
NO MORE
ENTRIES
.
\.
~
~
po
8353
FOR EACH
DIRECTORY
ENTRY
J~
({Q
'"
S348
MORE
~ENTRIE~
~
.....
~
READ
~~
N
C
C
(It
DIRECTORY
., Ir
S350
CREATE FULL
PATHNAME
'JJ.
=-
~
~
....
N
""""
o
....,
~
""""
-~
S352
LINK PATH TO
TRUE NAME
e
rJ'l
-..CJ\
\0
N
00
~
~
N
~
N
Filed 12/08/11 Page 25 of 63 PageID #: 152
[ . DONE~
~
~
5346
FIG. 20
8354
r
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Aug. 9,2005
Filed 12/08/11 Page 26 of 63 PageID #: 153
Sheet 22 of 31
US 6,928,442 B2
5354
WAIT FOR
FREEZE LOCK
TO TURN OFF
S356
FIND TFR
FIG.21
ENTRY
S358
DECREMENT
REFERENCE
COUNT
YES
S362
DELETE
TRUE FILE
NO
S364
REM eVE FILE ID
...- - - - - - - - - - - i AND COMPRESSED
FILE 10
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Aug. 9,2005
Filed 12/08/11 Page 27 of 63 PageID #: 154
US 6,928,442 B2
Sheet 23 of 31
5365
GET
OPERATION
FIG. 22
">-_YES_ _ _ _ _
S368
~
ASSIMILATE
S369
NEW TRUE
FILE
.-_YES
8378
S370
MODIFY USE
COUNT OF EACH
COMPONENT
5379
FOR EACH PARENT
DIRECTORY OR FILE,
UPDATE USE COUNT,
LAST ACCESS AND
MODIFY TIMES
RECORD TRUE
NAME IN AUDIT
FILE
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Aug. 9,2005
Filed 12/08/11 Page 28 of 63 PageID #: 155
Sheet 24 of 31
US 6,928,442 B2
5382
FIG. 23
VERIFY
GROOMING
LOCK OFF
8384
SET
GROOMING
LOCK
"
S386
SET GROOM
COUNTS
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Filed 12/08/11 Page 29 of 63 PageID #: 156
Aug. 9,2005
Sheet 25 of 31
US 6,928,442 B2
S388
FIND LDE
RECORD
FIG. 24
5390
FINDTFR
RECORD
5392
INCREMENT
GROOMING
DELETE COUNT
...
,
S394
ADJUST FILE
SIZES
Case 6:11-cv-00656 Document 1-4
u.s. Patent
Filed 12/08/11 Page 30 of 63 PageID #: 157
Aug. 9,2005
Sheet 26 of 31
US 6,928,442 B2
FIG. 25
S396
DELETE
FILE
5398
UNLOCK
GROOMING
LOCK
"
,.-____..... ;0
~
YES_ _-..
FIG. 26(0)
~
~
.....
.....
~
=
'0--..t
I
5404
PROHIBIT
OPEN
S408
DETERMINE
REGION
I
~
({Q
~~
N
C
C
Ul
YES
=-
~
~
.....
N
-..J
o
....,
~
'""'"
YES,
5422
PROHIBIT
OPEN
e
rJ'l
-..CJ\
\0
N
00
~
~
N
~
N
Filed 12/08/11 Page 31 of 63 PageID #: 158
'JJ.
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
~
~
.....
.....
~
=
~..
I
LOCK IF NOT
LOCKED
~
({Q
-------r--
~~
N
C
C
Ul
'JJ.
=-
8417
CREATE
SCRATCH
~ERASEFILE
COpy
5406
CREATE
SCRATCH FILE
~
~
...
S420
MAKE LOCAL
VERSION &
RETURN FILE 10
FROMTFR
S424
RETURN
SCRATCH FILE 1l1li
'4111
10
FIG.26(b)
...
....
N
00
o
....,
~
'""'"
e
rJ'l
-..CJ\
\0
N
00
~
~
N
~
N
Filed 12/08/11 Page 32 of 63 PageID #: 159
0_---,
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
~
~
.....
.....
S422
~
=
DETERMINE LOE &
RTENTRY
RECORDS FOR
FILE
~
({Q
~~
N
C
C
Ul
PROHIBIT
DELETION
'JJ.
=-
~
~
.....
N
~
o
....,
...
8424
NO
~
'""'"
IDENTIFY TRUE
FILE FROM TRUE
NAME
FIG. 27(0)
e
rJ'l
-..CJ\
\0
N
_.--L. __
00
~
~
N
~
N
Filed 12/08/11 Page 33 of 63 PageID #: 160
,YES--..
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
~
~
.....
.....
~
nO _____-,
~
=
~
({Q
~~
S427
:)
Ul
'JJ.
=-
~
~
.....
~
c
8430
DELETE
TRUE FllE
o
....,
~
'""'"
8431
S428
1
,.'
REDUCE USE
COUNT BY ONE
__ IADD ENTRY TO
AUDIT FILE
e
rJ'l
-..CJ\
\0
FIG. 27(b)
N
00
'l.
~
N
~
N
Filed 12/08/11 Page 34 of 63 PageID #: 161
DELETE
SCRATCH COPY
OF FILE
XES
N
C
C
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
~
~
S432
FIG. 28
LOOKUP
TRUE NAME
/'.....
YES
.....
.....
~
=
>
NO
=
({Q
~~
N
C
C
Ul
=-
~
~
.....
o
8442
~I FORWARD
REQUEST
~
I..
'"
o"'"
....,
YES
~
'""'"
e
rJ'l
S444
POSITIVE
RESPONSE
NEGATIVE
RESPONSE
-..CJ\
\0
N
00
~
~
N
~
N
Filed 12/08/11 Page 35 of 63 PageID #: 162
'JJ.
Case 6:11-cv-00656 Document 1-4
d
•
rJl
•
Case 6:11-cv-00656 Document 1-4
Filed 12/08/11 Page 36 of 63 PageID #: 163
US 6,928,442 B2
1
2
ENFORCEMENT AND POLICING OF
LICENSED CONTENT USING CONTENTBASED IDENTIFIERS
being files, directories, records in the database, objects in
object-oriented programming, locations in memory or on a
physical device, or the like) are always defined relative to a
specific context. For instance, the file identified by a particular file name can only be determined when the directory
containing the file (the context) is known. The file identified
by a pathname can be determined only when the file system
(context) is known. Similarly, the addresses in a process
address space, the keys in a database table, or domain names
on a global computer network such as the Internet are
meaningful only because they are specified relative to a
context.
In prior art systems for identifying data items there is no
direct relationship between the data names and the data item.
The same data name in two different contexts may refer to
different data items, and two different data names in the
same context may refer to the same data item.
In addition, because there is no correlation between a data
name and the data it refers to, there is no a priori way to
confirm that a given data item is in fact the one named by a
data name. For instance, in a DP system, if one processor
requests that another processor deliver a data item with a
given data name, the requesting processor cannot, in
general, verify that the data delivered is the correct data
(given only the name). Therefore it may require further
processing, typically on the part of the requester, to verify
that the data item it has obtained is, in fact, the item it
requested.
A common operation in a DP system is adding a new data
item to the system. When a new data item is added to the
system, a name can be assigned to it only by updating the
context in which names are defined. Thus such systems
require a centralized mechanism for the management of
names. Such a mechanism is required even in a multiprocessing system when data items are created and identified
at separate processors in distinct locations, and in which
there is no other need for communication when data items
are added.
In many data processing systems or environments, data
items are transferred between different locations in the
system. These locations may be processors in the data
processing system, storage devices, memory, or the like. For
example, one processor may obtain a data item from another
processor or from an external storage device, such as a
floppy disk, and may incorporate that data item into its
system (using the name provided with that data item).
However, when a processor (or some location) obtains a
data item from another location in the DP system, it is
possible that this obtained data item is already present in the
system (either at the location of the processor or at some
other location accessible by the processor) and therefore a
duplicate of the data item is created. This situation is
common in a network data processing environment where
proprietary software products are installed from floppy disks
onto several processors sharing a common file server. In
these systems, it is often the case that the same product will
be installed on several systems, so that several copies of
each file will reside on the common file server.
In some data processing systems in which several processors are connected in a network, one system is designated
as a cache server to maintain master copies of data items,
and other systems are designated as cache clients to copy
local copies of the master data items into a local cache on an
as-needed basis. Before using a cached item, a cache client
must either reload the cached item, be informed of changes
to the cached item, or confirm that the master item corre-
This is a continuation of application Ser. No. 09/283,160,
filed Apr. 1, 1999, now U.S. Pat. No. 6,415,280, which is a
division of application Ser. No. 08/960,079, filed Oct. 24,
1997, now U.S. Pat. No. 5,978,791 filed Oct. 24,2001 which
is a continuation of Ser. No. 08/425,160, filed Apr. 11, 1995,
now abandoned.
S
10
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to data processing systems and,
more particularly, to data processing systems wherein data
items are identified by substantially unique identifiers which
depend on all of the data in the data items and only on the
data in the data items.
2. Background of the Invention
Data processing (DP) systems, computers, networks of
computers, or the like, typically offer users and programs
various ways to identify the data in the systems.
Users typically identify data in the data processing system
by giving the data some form of name. For example, a
typical operating system (OS) on a computer provides a file
system in which data items are named by alphanumeric
identifiers. Programs typically identify data in the data
processing system using a location or address. For example,
a program may identify a record in a file or database by using
a record number which serves to locate that record.
In all but the most primitive operating systems, users and
programs are able to create and use collections of named
data items, these collections themselves being named by
identifiers. These named collections can then, themselves,
be made part of other named collections. For example, an
OS may provide mechanisms to group files (data items) into
directories (collections). These directories can then, themselves be made part of other directories. A data item may
thus be identified relative to these nested directories using a
sequence of names, or a so-called pathname, which defines
a path through the directories to a particular data item (file
or directory).
As another example, a database management system may
group data records (data items) into tables and then group
these tables into database files (collections). The complete
address of any data record can then be specified using the
database file name, the table name, and the record number of
that data record.
Other examples of identifying data items include: identifying files in a network file system, identifying objects in an
object-oriented database, identifying images in an image
database, and identifying articles in a text database.
In general, the terms "data" and "data item" as used herein
refer to sequences of bits. Thus a data item may be the
contents of a file, a portion of a file, a page in memory, an
object in an object-oriented program, a digital message, a
digital scanned image, a part of a video or audio signal, or
any other entity which can be represented by a sequence of
bits. The term "data processing" herein refers to the processing of data items, and is sometimes dependent on the
type of data item being processed. For example, a data
processor for a digital image may differ from a data processor for an audio signal.
In all of the prior data processing systems the names or
identifiers provided to identify data items (the data items
15
20
25
30
35
40
45
50
55
60
65
Case 6:11-cv-00656 Document 1-4
Filed 12/08/11 Page 37 of 63 PageID #: 164
US 6,928,442 B2
3
4
sponding to the cached item has not changed. In other words,
a cache client must synchronize its data items with those on
the cache server. This synchronization may involve reloading data items onto the cache client. The need to keep the
cache synchronized or reload it adds significant overhead to 5
existing caching mechanisms.
In view of the above and other problems with prior art
systems, it is therefore desirable to have a mechanism which
allows each processor in a multiprocessor system to determine a common and substantially unique identifier for a data 10
item, using only the data in the data item and not relying on
any sort of context.
It is further desirable to have a mechanism for reducing
multiple copies of data items in a data processing system and
to have a mechanism which enables the identification of 15
identical data items so as to reduce multiple copies. It is
further desirable to determine whether two instances of a
data item are in fact the same data item, and to perform
various other systems' functions and applications on data
items without relying on any context information or prop- 20
erties of the data item.
It is also desirable to provide such a mechanism in such
a way as to make it transparent to users of the data
processing system, and it is desirable that a single mecha - 25
nism be used to address each of the problems described
above.
SUMMARY OF THE INVENTION
This invention provides, in a data processing system, a
method and apparatus for identifying a data item in the
system, where the identity of the data item depends on all of
the data in the data item and only on the data in the data item.
Thus the identity of a data item is independent of its name,
origin, location, address, or other information not derivable
directly from the data, and depends only on the data itself.
This invention further provides an apparatus and a method
for determining whether a particular data item is present in
the system or at a location in the system, by examining only
the data identities of a plurality of data items.
Using the method or apparatus of the present invention,
the efficiency and integrity of a data processing system can
be improved. The present invention improves the design and
operation of a data storage system, file system, relational
database, object-oriented database, or the like that stores a
plurality of data items, by making possible or improving the
design and operation of at least some or all of the following
features:
the system stores at most one copy of any data item at a
given location, even when multiple data names in the
system refer to the same contents;
the system avoids copying data from source to destination
locations when the destination locations already have
the data;
the system provides transparent access to any data item by
reference only to its identity and independent of its
present location, whether it be local, remote, or offline;
the system caches data items from a server, so that only
the most recently accessed data items need be retained;
when the system is being used to cache data items,
problems of maintaining cache consistency are
avoided;
the system maintains a desired level of redundancy of data
items in a network of servers, to protect against failure
by ensuring that multiple copies of the data items are
present at different locations in the system;
30
35
40
the system automatically archives data items as they are
created or modified;
the system provides the-size, age, and location of groups
of data items in order to decide whether they can be
safely removed from a local file system;
the system can efficiently record and preserve any collection of data items;
the system can efficiently make a copy of any collection
of data items, to support a version control mechanism
for groups of the data items;
the system can publish data items, allowing other, possibly anonymous, systems in a network to gain access to
the data items and to rely on the availability of the data
items;
the system can maintain a local inventory of all the data
items located on a given removable medium, such as a
diskette or CD-ROM, the inventory is independent of
other properties of the data items such as their name,
location, and date of creation;
the system allows closely related sets of data items, such
as matching or corresponding directories on disconnected computers, to be periodically resynchronized
with one another;
the system can verify that data retrieved from another
location is the desired or requested data, using only the
data identifier used to retrieve the data;
the system can prove possession of specific data items by
content without disclosing the content of the data items,
for purposes of later legal verification and to provide
anonymity;
the system tracks possession of specific data items according to content by owner, independent of the name, date,
or other properties of the data item, and tracks the uses
of specific data items and files by content for accounting purposes.
Other objects, features, and characteristics of the present
invention as well as the methods of operation and functions
of the related elements of structure, and the combination of
parts and economies of manufacture, will become more
apparent upon consideration of the following description
and the appended claims with reference to the accompanying drawings, all of which form a part of this specification.
45
BRIEF DESCRIPTION OF THE DRAWINGS
50
FIGS. l(a) and l(b) depicts a typical data processing
system in which a preferred embodiment of the present
invention operates;
FIG. 2 depicts a hierarchy of data items stored at any
location in such a data processing system;
FIGS. 3-9 depict data structures used to implement an
embodiment of the present invention; and
FIGS. 10(a)-28 are flow charts depicting operation of
various aspects of the present invention.
55
DETAILED DESCRIPTION OF THE
PRESENTLY PREFERRED EXEMPLARY
EMBODIMENTS
60
65
An embodiment of the present invention is now described
with reference to a typical data processing system 100,
which, with reference to FIGS. l(a) and l(b), includes one
or more processors (or computers) 102 and various storage
devices 104 connected in some way, for example by a bus
106.
Each processor 102 includes a CPU 108, a memory 110
and one or more local storage devices 112. The CPU 108,
Case 6:11-cv-00656 Document 1-4
Filed 12/08/11 Page 38 of 63 PageID #: 165
US 6,928,442 B2
5
6
memory 110, and local storage device 112 may be internally
processor 102 in the system, a memory of a particular
connected, for example by a bus 114. Each processor 102
processor, a storage device, a removable storage medium
may also include other devices (not shown), such as a
(such as a floppy disk or compact disk), or any other physical
keyboard, a display, a printer, and the like.
location in the system. The term "local" with respect to a
In a data processing system 100, wherein more than one 5 particular processor 102 refers to the memory and storage
processor 102 is used, that is, in a multiprocessor system, the
devices of that particular processor.
processors may be in one of various relationships. For
In the following, the terms "True Name", "data identity"
example, two processors 102 may be in a client/server,
and "data identifier" refer to the substantially unique data
client/client, or a server/server relationship. These interidentifier for a particular data item. The term "True File"
processor relationships may be dynamic, changing depend- 10
refers to the actual file, segment, or data item identified by
ing on particular situations and functions. Thus, a particular
a True Name.
processor 102 may change its relationship to other procesA file system for a data processing system 100 is now
sors as needed, essentially setting up a peer-to-peer relationdescribed which is intended to work with an existing opership with other processors. In a peer-to-peer relationship,
sometimes a particular processor 102 acts as a client
ating system by augmenting some of the operating system's
processor, whereas at other times the same processor acts as 15 file management system codes. The embodiment provided
a server processor. In other words, there is no hierarchy
relies on the standard file management primitives for actuimposed on or required of processors 102.
ally storing to and retrieving data items from disk, but uses
the mechanisms of the present invention to reference and
In a multiprocessor system, the processors 102 may be
access those data items.
homogeneous or heterogeneous. Further, in a multiprocessor
data processing system 100, some or all of the processors 20
The processes and mechanisms (services) provided in this
102 may be disconnected from the network of processors for
embodiment are grouped into the following categories:
periods of time. Such disconnection may be part of the
primitive mechanisms, operating system mechanisms,
normal operation of the system 100 or it may be because a
remote mechanisms, background mechanisms, and extended
particular processor 102 is in need of repair.
mechanisms.
25
Within a data processing system 100, the data may be
Primitive mechanisms provide fundamental capabilities
organized to form a hierarchy of data storage elements,
used to support other mechanisms. The following primitive
wherein lower level data storage elements are combined to
mechanisms are described:
form higher level elements. This hierarchy can consist of, for
1. Calculate True Name;
example, processors, file systems, regions, directories, data 30
2. Assimilate Data Item;
files, segments, and the like. For example, with reference to
3. New True File;
FIG. 2, the data items on a particular processor 102 may be
organized or structured as a file system 116 which comprises
4. Get True Name from Path;
regions 117, each of which comprises directories 118, each
5. Link path to True Name;
of which can contain other directories 118 or files 120. Each
35
6. Realize True File from Location;
file 120 being made up of one or more data segments 122.
7. Locate Remote File;
In a typical data processing system, some or all of these
8. Make True File Local;
elements can be named by users given certain implementa9. Create Scratch File;
tion specific naming conventions, the name (or pathname) of
an element being relative to a context. In the context of a 40
10. Freeze Directory;
data processing system 100, a pathname is fully specified by
11. Expand Frozen Directory;
a processor name, a filesystem name, a sequence of zero or
12. Delete True File;
more directory names identifying nested directories, and a
13. Process Audit File Entry;
final file name. (Usually the lowest level elements, in this
14. Begin Grooming;
case segments 122, cannot be named by users.)
45
15. Select For Removal; and
In other words, a file system 116 is a collection of
16. End Grooming.
directories 118. A directory 118 is a collection of named files
Operating system mechanisms provide typical familiar
120---both data files 120 and other directory files 118. A file
file system mechanisms, while maintaining the data struc120 is a named data item which is either a data file (which
may be simple or compound) or a directory file 118. A 50 tures required to offer the mechanisms of the present invention. Operating system mechanisms are designed to augment
simple file 120 consists of a single data segment 122. A
existing operating systems, and in this way to make the
compound file 120 consists of a sequence of data segments
present invention compatible with, and generally transparent
122. A data segment 122 is a fixed sequence of bytes. An
to, existing applications. The following operating system
important property of any data segment is its size, the
55 mechanisms are described:
number of bytes in the sequence.
A single processor 102 may access one or more file
1. Open File;
systems 116, and a single storage device 104 may contain
2. Close File;
one or more file systems 116, or portions of a file system 116.
3. Read File;
For instance, a file system 116 may span several storage
4. Write File;
devices 104.
60
5. Delete File or Directory;
In order to implement controls in a file system, file system
6. Copy File or Directory;
116 may be divided into distinct regions, where each region
7. Move File or Directory;
is a unit of management and control. A region consists of a
8. Get File Status; and
given directory 118 and is identified by the pathname (user
defined) of the directory.
65
9. Get Files in Directory.
In the following, the term "location", with respect to a
Remote mechanisms are used by the operating system in
data processing system 100, refers to any of a particular
responding to requests from other processors. These mecha-
Case 6:11-cv-00656 Document 1-4
Filed 12/08/11 Page 39 of 63 PageID #: 166
US 6,928,442 B2
7
8
nisms enable the capabilities of the present invention in a
The data structures described are assumed to reside on
peer-to-peer network mode of operation. The following
individual peer processors 102 in the data processing system
remote mechanisms are described:
100. However, they can also be shared by placing them on
a remote, shared file server (for instance, in a local area
1. Locate True File;
5 network of machines). In order to accommodate sharing data
2. Reserve True File;
structures, it is necessary that the processors accessing the
3. Request True File;
shared database use the appropriate locking techniques to
4. Retire True File;
ensure that changes to the shared database do not interfere
with one another but are appropriately serialized. These
5. Cancel Reservation;
10 locking techniques are well understood by ordinarily skilled
6. Acquire True File;
programmers of distributed applications.
7. Lock Cache;
It is sometimes desirable to allow some regions to be local
8. Update Cache; and
to a particular processor 102 and other regions to be shared
9. Check Expiration Date.
among processors 102. (Recall that a region is a unit of file
Background mechanisms are intended to run occasionally 15 system management and control consisting of a given direcand at a low priority. These provide automated management
tory identified by the pathname of the directory.) In the case
capabilities with respect to the present invention. The folof local and shared regions, there would be both local and
lowing background mechanisms are described:
shared versions of each data structure. Simple changes to the
processes described below must be made to ensure that
1. Mirror True File;
20 appropriate data structures are selected for a given operation.
2. Groom Region;
The local directory extensions (LDE) table 124 is a data
3. Check for Expired Links; and
structure which provides information about files 120 and
4. Verify Region; and
directories 118 in the data processing system 100. The local
5. Groom Source List.
directory extensions table 124 is indexed by a pathname or
Extended mechanisms run within application programs 25 contextual name (that is, a user provided name) of a file and
over the operating system. These mechanisms provide soluincludes the True Name for most files. The information in
tions to specific problems and applications. The following
local directory extension table 124 is in addition to that
extended mechanisms are described:
provided by the native file system of the operating system.
1. Inventory Existing Directory;
The True File registry (TFR) 126 is a data store for listing
2. Inventory Removable, Read-only Files;
30 actual data items which have True Names, both files 120 and
segments 122. When such data items occur in the True File
3. Synchronize directories;
registry 126 they are known as True Files. True Files are
4. Publish Region;
identified in True File registry 126 by their True Names or
5. Retire Directory;
identities. The table True File registry 126 also stores
6. Realize Directory at location;
35 location, dependency, and migration information about True
7. Verify True File;
Files.
8. Track for accounting purposes; and
The region table (RT) 128 defines areas in the network
storage which are to be managed separately. Region table
9. Track for licensing purposes.
The file system herein described maintains sufficient
128 defines the rules for access to and migration of files 120
information to provide a variety of mechanisms not ordi- 40 among various regions with the local file system 116 and
narily offered by an operating system, some of which are
remote peer file systems.
listed and described here. Various processing performed by
The source table (ST) 130 is a list of the sources of True
Files other than the current True File registry 126. The
this embodiment of the present invention will now be
described in greater detail.
source table 130 includes removable volumes and remote
In some embodiments, some files 120 in a data processing 45 processors.
The audit file (AF) 132 is a list of records indicating
system 100 do not have True Names because they have been
recently received or created or modified, and thus their True
changes to be made in local or remote files, these changes to
Names have not yet been computed. A file that does not yet
be processed in background.
have a True Name is called a scratch file. The process of
The accounting log (AL) 134 is a log of file transactions
assigning a True Name to a file is referred to as assimilation, 50 used to create accounting information in a manner which
and is described later. Note that a scratch file may have a
preserves the identity of files being tracked independent of
user provided name.
their name or location.
Some of the processing performed by the present invenThe license table (LT) 136 is a table identifying files,
tion can take place in a background mode or on a delayed or
which may only be used by licensed users, in a manner
as-needed basis. This background processing is used to 55 independent of their name or location, and the users licensed
to use them.
determine information that is not immediately required by
Detailed Descriptions of the Data Structures
the system or which may never be required. As an example,
in some cases a scratch file is being changed at a rate greater
The following table summarizes the fields of an local
than the rate at which it is useful to determine its True Name.
directory extensions table entry, as illustrated by record 138
In these cases, determining the True Name of the file can be 60 in FIG. 3.
postponed or performed in the background.
Data Structures
The following data structures, stored in memory 110 of
Field
Description
one of more processors 102 are used to implement the
mechanisms described herein. The data structures can be 65
Region ID
identifies the region in which this file is
contained.
local to each processor 102 of the system 100, or they can
reside on only some of the processors 102.
Case 6:11-cv-00656 Document 1-4
Filed 12/08/11 Page 40 of 63 PageID #: 167
US 6,928,442 B2
9
-continued
Field
Pathname
True Name
Type
Scratch
File ID
Time of
last
access
Time of
last
modification
Safe flag
Lock flag
Size
Owner
10
-continued
the user provided name or contextual name
of the file or directory, relative to the
region in which it occurs.
the computed True Name or identity of the
file or directory. This True Name is not
always up to date, and it is set to a
special value when a file is modified and
is later recomputed in the background.
indicates whether the file is a data file
or a directory.
the physical location of the file in the
file system, when no True Name has been
calculated for the file. As noted above,
such a file is called a scratch file.
the last access time to this file. If this
file is a directory, this is the last
access time to any file in the directory.
the time of last change of this file. If
this file is a directory, this is the last
modification time of any file in the
directory.
indicates that this file (and, if this file
is a directory, all of its subordinate
files) have been backed up on some other
system, and it is therefore safe to remove
them.
indicates whether a file is locked, that
is, it is being modified by the local processor or a remote processor. Only one
processor may modify a file at a time.
the full size of this directory (including
all subordinate files), if all files in it
were fully expanded and duplicated. For a
file that is not a directory this is the
size of the actual True File.
the identity of the user who owns this
file, for accounting and license tracking
purposes.
Each record of the True File registry 126 has the fields
shown in the True File registry record 140 in FIG. 4. The
True File registry 126 consists of the database described in
the table below as well as the actual True Files identified by
the True File IDs below.
Field
Description
True Name
computed True Name or identity of
the file.
compressed version of the True File
may be stored instead of, or in
addition to, an uncompressed
version. This field provides the
identity of the actual
representation of the compressed
version of the file.
tentative count of how many
references have been selected for
deletion during a grooming
operation.
most recent date and time the
content of this file was accessed.
date and time after which this file
may be deleted by this server.
processor IDs of other processors
which contain references to this
True File.
source ID(s) of zero or more
sources from which this file or
data item may be retrieved.
identity or disk location of the
actual physical representation of
Compressed
File ID
Grooming
delete count
Time of last
access
Expiration
Dependent
processors
Source IDs
True File 10
Field
Description
5
10
Use count
15
20
25
Description
the file or file segment. It is
sufficient to use a filename in the
registration directory of the
underlying operating system. The
True File ID is absent if the
actual file is not currently
present at the current location.
number of other records on this
processor which identify this True
File.
A region table 128, specified by a directory pathname,
records storage policies which allow files in the file system
to be stored, accessed and migrated in different ways.
Storage policies are programmed in a configurable way
using a set of rules described below.
Each region table record 142 of region table 128 includes
the fields described in the following table (with reference to
FIG. 5):
Field
Description
Region ID
internally used identifier for this
region.
file system on the local processor of
which this region is a part.
a pathname relative to the region file
system which defines the location of
this region. The region consists of
all files and directories subordinate
to this pathname, except those in a
region subordinate to this region.
zero or more identifiers of processors
which are to keep mirror or archival
copies of all files in the current
region. Multiple mirror processors
can be defined to form a mirror group.
number of copies of each file in this
region that should be retained in a
mirror group.
specifies whether this region is local
to a single processor 102, shared by
several processors 102 (if, for
instance, it resides on a shared file
server), or managed by a remote
processor.
the migration policy to apply to this
region. A single region might
participate in several policies. The
policies are as follows (parameters in
brackets are specified as part of the
policy):
Region file system
30
35
40
Region pathname
Mirror processor(s)
Mirror duplication
count
Region status
45
Policy
50
55
60
65
region is a cached version from
[processor 10];
region is a member of a mirror set
defined by [processor 10].
region is to be archived on
[processor 10].
region is to be backed up locally,
by placing new copies in [region
ID].
region is read only and may not be
changed.
region is published and expires on
[date].
Files in this region should be
compressed.
A source table 130 identifies a source location for True
Files. The source table 130 is also used to identify client
processors making reservations on the current processor.
Case 6:11-cv-00656 Document 1-4
Filed 12/08/11 Page 41 of 63 PageID #: 168
US 6,928,442 B2
11
12
Each source record 144 of the source table 130 includes the
fields summarized in the following table, with reference to
FIG. 6:
licensed to have access to it. Each license table record 150
includes the information summarized in the following table
"with reference to FIG. 9:
5
Field
Description
Field
Description
source ID
internal identifier used to identify a
particular source.
type of source location:
Removable Storage Volume
Local Region
Cache Server
Mirror Group Server
Cooperative Server
Publishing Server
Client
includes information about the rights
of this processor, such as whether it
can ask the local processor to store
data items for it.
measurement of the bandwidth, cost,
and reliability of the connection to
this source of True Files. The availability
is used to select from among
several possible sources.
information on how the local processor
is to access the source. This may be,
for example, the name of a removable
storage volume, or the processor ID
and region path of a region on a
remote processor.
True Name
licensee
True Name of a data item subject to license validation.
identity of a user authorized to have
access to this object.
source
type
10
Various other data structures are employed on some or all
of the processors 102 in the data processing system 100.
Each processor 102 has a global freeze lock (GFL) 152
15 (FIG. 1), which is used to prevent synchronization errors
source
when a directory is frozen or copied. Any processor 102 may
rights
include a special archive directory (SAD) 154 into which
directories may be copied for the purposes of archival. Any
source
processor 102 may include a special media directory (SMD)
availability
20 156, into which the directories of removable volumes are
stored to form a media inventory. Each processor has a
grooming lock 158, which is set during a grooming operasource
tion. During this period the grooming delete count of True
location
File registry entries 140 is active, and no True Files should
25 be deleted until grooming is complete. While grooming is in
effect, grooming information includes a table of pathnames
selected for deletion, and keeps track of the amount of space
that would be freed if all of the files were deleted.
Primitive Mechanisms
The audit file 132 is a table of events ordered by
The first of the mechanisms provided by the present
timestamp, each record 146 in audit file 132 including the 30
invention, primitive mechanisms, are now described. The
fields summarized in the following table (with reference to
FIG. 7):
mechanisms described here depend on underlying data management mechanisms to create, copy, read, and delete data
items in the True File registry 126, as identified by a True
35 File ID. This support may be provided by an underlying
Description
Field
operating system or disk storage manager.
The following primitive mechanisms are described:
Original Name
path of the file in question.
Operation
whether the file was created, read,
1. Calculate True Name;
written, copied or deleted.
2. Assimilate Data Item;
specifies whether the source is a file
40
or a directory.
3. New True File;
Processor ID
ID of the remote processor generating
4. Get True Name from Path;
this event (if not local).
Timestamp
time and date file was closed (required
5. Link Path to True Name;
only for accessed/modified files).
6. Realize True File from Location;
Pathname
Name of the file (required only for
45
rename).
7. Locate Remote File;
True Name
computed True Name of the file. This is
8. Make True File Local;
used by remote systems to mirror changes
to the directory and is filled in during
9. Create Scratch File;
background processing.
10. Freeze Directory;
50
11. Expand Frozen Directory;
Each record 148 of the accounting log 134 records an
12. Delete True File;
event which may later be used to provide information for
13. Process Audit File Entry;
billing mechanisms. Each accounting log entry record 148
14. Begin Grooming;
includes at least the information summarized in the following table, with reference to FIG. 8:
15. Select For Removal; and
55
16. End Grooming.
1. Calculate True Name
A True Name is computed using a function, MD, which
Field
Description
reduces a data block B of arbitrary length to a relatively
date of entry
date and time of this log entry.
60 small, fixed size identifier, the True Name of the data block,
type of
Entry types include create file,
such that the True Name of the data block is virtually
delete file, and transmit file.
entry
guaranteed to represent the data block B and only data block
True Name of data item in question.
True Name
B.
identity of the user responsible for
owner
this action.
The function MD must have the following properties:
65
1. The domain of the function MD is the set of all data
items. The range of the function MD is the set of True
Each record 150 of the license table 136 records a
relationship between a licensable data item and the user
Names.
Case 6:11-cv-00656 Document 1-4
Filed 12/08/11 Page 42 of 63 PageID #: 169
US 6,928,442 B2
13
14
2. The function MD must take a data item of arbitrary
entire universe of data items, as long as sufficient uniqueness
length and reduce it to an integer value in the range 0
is provided per category of data items. This is because the
to N-l, where N is the cardinality of the set of True
tags provide an additional level of uniqueness.
Names. That is, for an arbitrary length data block B,
A mechanism for calculating a True Name given a data
O
Disclaimer: Justia Dockets & Filings provides public litigation records from the federal appellate and district courts. These filings and docket sheets should not be considered findings of fact or liability, nor do they necessarily reflect the view of Justia.
Why Is My Information Online?