PersonalWeb Technologies LLC v. Google, Inc. et al

Filing 1

COMPLAINT for Patent Infringement against All Defendants ( Filing fee $ 350 receipt number 0540-3354008.), filed by PersonalWeb Technologies LLC. (Attachments: # 1 Civil Cover Sheet, # 2 Exhibit A - Patent '791, # 3 Exhibit B - Patent '280, # 4 Exhibit C - Patent '442, # 5 Exhibit D - Patent '310, # 6 Exhibit E - Patent '539, # 7 Exhibit F - Patent '544, # 8 Exhibit G - Patent '662, # 9 Exhibit H - Patent '096)(Tribble, Max) (Additional attachment(s) added on 12/14/2011: # 10 Exhibit A Searchable, # 11 Exhibit B Searchable, # 12 Exhibit C Searchable, # 13 Exhibit D Searchable, # 14 Exhibit E Searchable, # 15 Exhibit F Searchable, # 16 Exhibit G Searchable, # 17 Exhibit H Searchable) (mjc, ). (Entered: 12/08/2011)

Download PDF
Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 1 of 63 PageID #: 128 EXHIBITC EXHIBIT C !. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 2 of 63 PageID #: 129 111111 1111111111111111111111111111111111111111111111111111111111111 US006928442B2 (54) (75) United States Patent (10) Farber et ai. (12) (45) ENFORCEMENT AND POLICING OF LICENSED CONTENT USING CONTENT-BASED IDENTIFIERS FOREIGN PATENT DOCUMENTS EP Notice: (21) Appl. No.: 09/987,723 (22) Filed: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.c. 154(b) by 0 days. Nov. 15,2001 Prior Publication Data (65) (Continued) US 2002/0052884 A1 May 2, 2002 Primary Examiner-Luke S Wossum Assistant Examiner-Khanh Ph am (74) Attorney, Agent, or Firm-Davidson Berquist Jackson & Gowdey, LLP Related U.S. Application Data (63) (51) (52) (58) Continuation of application No. 09/283,160, filed on Apr. 1, 1999, now Pat. No. 6,415,280, which is a division of application No. 08/960,079, filed on Oct. 24, 1997, now Pat. No. 5,978,791, which is a continuation of application No. 08/425,160, filed on Apr. 11, 1995, now abandoned. (57) References Cited U.S. PATENT DOCUMENTS 3,668,647 4,215,402 4,290,105 4,376,299 4,405,829 4,412,285 A A A A A A 6/1972 7/1980 9/1981 3/1983 9/1983 10/1983 ABSTRACT Data files are distributed across a plurality of computers. The computers may form a network such as a content delivery network (CDN) or a peer-to-peer network. The network may operate as a TCP/IP network such as the Internet. Data files may represent may represent digital messages, images, videos or audio signals. For content---data items or files in the system-a name is obtained (or determined), where the name is based, at least in part, on a given function of the data in a data item or file. The given function may be a message digest or hash function, and it may be MD4, MD5, and SHA. A cony of a requested file is only provided to licensed (or authorized) parties. The system may check one or more computers for unauthorized or unlicensed content. Content is served based on a measure of availability of servers. Int. CI? ................................................ G06F 17/30 U.S. CI. ............................. 707/10; 707/3; 707/101; 707/200; 709/203; 709/219; 709/229 Field of Search .............................. 707/3, 6, 9, 10, 707/101,200; 709/203, 219, 229 (56) 4/1994 Gwertzman, James, et al. "The Case for Geographical PushCaching." Technical Report HU TR 34-94 (excerpt), Harvard University, DAS, Cambridge, MA02138, 1994, 2 pgs. Grigni, Michelangelo, et al. "Tight Bounds on Minimum Broadcasts Networks." SIAM Journal of Discrete Mathematics, vol. 4, No.2, May 1991, pp. 207-222. Devine, Robert. "Design and Implementation of DDH: A Distributed Dynamic Hashing Algorithm." In Proceedings of 4th International Conference on Foundations of Data Organizations and Algorithms, 1993, pp. 101-114. Deering, Stephen, et al. "Multicast Routing in Datagram Internetworks and Extended LANs." ACM Transactions on Computer Systems, vol. 8, No.2, May 1990, pp. 85-110. Assignees: Kinetech, Inc., Northbrook, IL (US); Savvis, Inc., Town & Country, MO (US) ( *) 0592045 OTHER PUBLICATIONS Inventors: David A. Farber, Ojai, CA (US); Ronald D. Lachman, Northbrook, IL (US) (73) Patent No.: US 6,928,442 B2 Date of Patent: Aug. 9,2005 Evangelisti Mitchell Cichelli Rivest Rivest Neches 56 Claims, 31 Drawing Sheets (Continued) SIMPL DATA ITEM TRUE NAME ~ Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 3 of 63 PageID #: 130 US 6,928,442 B2 Page 2 U.S. PATENT DOCUMENTS 4,414,624 4,441,155 4,464,713 4,490,782 4,571,700 4,577,293 4,642,793 4,675,810 4,691,299 4,725,945 4,773,039 4,887,235 4,888,681 4,922,414 4,922,417 4,972,367 5,025,421 5,050,074 5,050,212 5,057,837 5,077,658 5,129,081 5,129,082 5,144,667 5,179,680 5,202,982 5,208,858 5,276,901 5,287,499 5,301,286 5,301,316 5,341,477 5,343,527 5,357,623 5,384,565 5,404,508 5,452,447 5,459,860 5,542,087 5,581,758 5,638,443 5,640,564 5,781,629 5,802,291 5,809,494 5,835,087 5,907,704 6,006,018 6,134,603 A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A 11/1983 4/1984 8/1984 12/1984 2/1986 3/1986 2/1987 6/1987 9/1987 2/1988 9/1988 12/1989 12/1989 5/1990 5/1990 11/1990 6/1991 9/1991 9/1991 10/1991 12/1991 7/1992 7/1992 9/1992 1/1993 4/1993 5/1993 1/1994 2/1994 4/1994 4/1994 8/1994 8/1994 10/1994 1/1995 4/1995 9/1995 10/1995 7/1996 12/1996 6/1997 6/1997 * 7/1998 9/1998 9/1998 * 11/1998 5/1999 12/1999 10/2000 Summer, Jr. Fletcher Benhase Dixon Emry, Jr. Matick Meaden Gruner Rivest Kronstadt Zamora Holloway Barnes Holloway Churm et al. .................. 707/1 Burke Cho Marca Dyson Colwell Bendert Kobayashi Tirling Pogue, Jr. Colwell Gramlich et al. .............. 707/2 Vollert Howell Nemes .......................... 707/2 Rajani Hamilton Pitkin et al. ................ 709/226 Moore Megory-Cohen Cannon Konrad Nelson et al. .............. 707/205 Burnett Neimat et al. ................ 707/10 Burnett Stefik et al. .................. 705/54 Hamilton et al. ........... 709/303 Haber et al. ................ 713/177 Balick et al. ............... 709/202 Nguyen ......................... 707/1 Herz et al. .................. 345/810 Gudmundson et al. Burnett et al. ......... 395/200.49 Jones et al. ................. 709/330 OlliER PUBLICATIONS Cormen, Thomas H., et al. Introduction to Algorithms, The MIT Press, Cambridge, Massachusetts, 1994, pp. 219-243, 991-993. Naor, Moni, et al. "The Load, Capacity and Availability of Quorum Systems." In Proceedings of the 35th IEEE Symposium on Foundations of Computer Science, Nov. 1994, pp.214-225. Nisan, Noam. "Pseudorandom Generators for SpaceBounded Computation." In Proceedings of the TwentySecond Annual ACM Symposium on Theory of Computing, May 1990, pp. 204-212. Palmer, Mark, et al. "Fido: A Cache that Learns to Fetch." In Proceedings of the 17th International Conference on Very Large Data Bases, Sep. 1991, pp. 255-264. Peleg, David, et al. "The Availability of Quorum Systems." Information and Computation 123, 1995, 210-223. Rabin, Michael. "Efficient Dispersal of Information for Security, Load Balancing, and Fault Tolerance." Journal of the ACM, vol. 36, No.2, Apr. 1989, pp. 335-348. Ravi, R., "Rapid Rumor Ramification: Approximating the Minimum Broadcast Time." In Proceedings of the 35th IEEE Symposium on Foundation of Computer Science, Nov. 1994, pp. 202-213. Schmidt, Jeanette, et al. "Chernoff-Hoeffding Bounds for Applications with Limited Independence." In Proceedings of the 4th ACS-SIAM Symposium on Discrete Algorithms, 1993, pp. 331-340. Tarjan, Robert Endre, et al. "Storing a Sparse Table." Communications of the ACM, vol. 22, No. 11, Nov. 1979, pp. 606-611. Wegman, Mark, et al. "New Hash Functions and Their Use in Authentication and Set Equality." Journal of Computer and System Sciences vol. 22, Jun. 1981, pp. 265-279. Vitter, Jeffrey Scott, et al. "Optimal Pre fetching via Data Compression." In Proceedings of 32nd IEEE Symposium on Foundations of Computer Science, Nov. 1991, pp. 121-130. Fredman, Michael, et al. "Storing a Sparse Table with 0(1) Worst Case Access Time." Journal of the Association for Computing Machinery, vol. 31, No.3, Jul. 1984, pp. 538-544. Yao, Andrew Chi-Chih. "Should Tables be Sorted?" Journal of the Association for Computing Machinery, vol. 28, No.3, Jul. 1981, pp. 615-628. Floyd, Sally, et al. "A reliable Multicast Framework for Light-Weight Sessions and Application Level Framing." In Proceeding of ACM SIGCOMM '95, pp. 342-356. Feeley, Michael, et al. "Implementing Global Memory Management in a Workstation Cluster." In Proceedings of the 15th ACM Symposium on Operating Systems Principles, 1995, pp. 201-212 . Carter, J. Lawrence, et al. "Universal Classes of Hash Functions." Journal of Computer and System Sciences, vol. 18, No.2, Apr. 1979, pp. 143-154. Patent Abstracts of Japan, "Electronic Mail Multiplexing System and Communication Control Method in The System." Jun. 30, 1993, JP 05162529. Kim et aI., "Experiences with Tripwire: Using Integrity Checkers For Intrusion Detection", COAST Labs. Dept. of Computer Sciences Purdue University, Feb. 22, 1995, pp. 1-12. Kim et aI., "The Design and Implementation of Tripwire: A file System Integrity Checker", COAST Labs. Dept. of Computer Sciences Purdue University, Nov. 19, 1993, pp. 1-21. Bert dem Boer et aI., Collisions for the compression function of MDs, pp. 292-304. Sakti Pramanik et aI., Multi-Directory Hasing, 1993, Info. Sys., vol. 18, No.1, pp. 63-74. Murlidhar Koushik, Dynamic Hashing with Distributed Overflow Space: A File Organization with Good Insertion Performance, 1993, Info. Sys., vol. 18, No.5, pp. 299-317. Witold Litwin et aI., LH* -Linear Hashing for Distributed Files, HP Labs Tech. Report No. HPL-93-21, Jun. 1993, pp. 1-22. Yuliang Zheng et aI., HAVAL-A One-Way Hashing Algorithm with Variable Length of Output (Extended Abstract), pp. 83-105. Chris Charnes and Josef Pieprzky, Linear Nonequivalence versus Nonlinearity, Pieprzky, pp. 156-164. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 4 of 63 PageID #: 131 US 6,928,442 B2 Page 3 Witold Litwin et aI., Linear Hashing for Distributed Files, ACM SIGMOD, May 1993, pp. 327-336. Ming-Ling Lo et aI., On Optimal Processor Allocation to Support Pipelined Hash loins, ACM SIGMOD, pp. 69-78, May 1993. Thomas A. Berson, Differential Cryptanalysis Mod 232 with Applications to MD5, pp. 69-81. William Perrizo et aI., Distributed loin Processing Performance Evaluation, Twenty-Seventh Hawaii International Conference on System Sciences, vol. II, pp. 236-244. Vijay Kumar, A Concurrency Control Mechanism Based on Extendible Hashing for Main Memory Database Systems, ACM, vol. 3, 1989, pp. 109-113. Birgit Pfitzman, Sorting Out Signature Schemes, Nov. 1993, 1st Conf. Computer & Comm. Security '93, p. 74-85. Zhiyu Tian et aI., A New Hashing Function: Statistical Behaviour and Algorithm, pp. 3-13. G. L. Friedman, Digital Camera with Apparatu for Authentication of Images Produced from an Image File, NASA Case No. NPO-19108-1-CU, U.S. Appl. No. 08/159,980, filed Nov. 24, 1993. H. Goodman, Ada, Object-Oriented Techniques, and Concurrency in Teaching Data Structures and File Management Report Documentation p. AD-A275 385 - 94-04277. Advances in Cryptology-Eurocrypt'93, Workshop on the Theory and Application of Cryptographic Techniques Lofthus, Norway, May 23-27, 1993 Proceedings. Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, vol. 22, Issue 2, lun. 1993. Advances in Cryptology-AUSCRYPT '92-Workshop on the Theory and Application of Cryptographic Techniques Gold Coast, Queensland, Australia, Dec. 13-16, 1992 Proceedings. Peter Deutsch (peterd@bunyip.com), "Re: MD5 and LiFNs (was: Misc Comments)", www.acl.lanl.gov/URl!archive/ uri-94q2.messages/0106.html, Apr. 26, 1994. Alexander Dupuy (dupuy@smarts.com), "RE: MD5 and LIFNs (was: Misc Comments)", www.acl.lanl.gov/URl!archive/uri-94q2.messages/0113.html, Apr. 26, 1994. Alexander Dupuy (dupuy@smarts.com), "MD5 and LIFNs (was: Misc Comments)", www.acl.lanl.gov/URl!archive/ uri-94q2.messages/0081.html, Apr. 17, 1994. Albert Langer (cmf851@anu.oz.au), http://groups.google.com/groups?selm= 1991Aug7.225159.786%40newshost.anu.edu.au&oe= UTF-8&output=gplain, Aug. 7, 1991. Clifford Lynch (Calur@uccmvsa.bitnet), "ietf urI/uri overview draft paper (long)", www.acl.lanl.gov/URl!archive/ uri-93q1.messages/0015.html, Mar. 25, 1993. K. Sollins and L. Masinter, "Functional Requirements for Uniform Resource Names", www.w3.org/Addressing/ rfc1737.txt, Dec. 1994, pp. 1-7. W3C:ID, HTTP: A protocol for networked information, "Basic HTTP as defined in 1992", www.w3.org/Protocols/ HTTP2.html, 1992. European Search Report issued Dec. 23, 2004. * cited by examiner Case 6:11-cv-00656 Document 1-4 u.s. Patent Aug. 9,2005 Filed 12/08/11 Page 5 of 63 PageID #: 132 Sheet 1 of 31 <0 US 6,928,442 B2 0:: C 'W"'" 0 til N 0:: 0 0 ~ fa u (J) tn W N ,.. 0 0 c.. 0:: 0 0 a:: c.. • • I • I 0:: -• 0 r.n CJ) N w 0 ,.. (.) C> lL.. I 0 I , 0:: c.. 0:: ,~W' ~ ~E ~ 0 fij 0 N 0 """" It/) • • '¢ ~ C2 0 ~ [ij )'U,C " en (/) C'! 0 Q 0 0 • c>W 0 c.. " t- C 'W til til W W (.) 0 a:: 0.. ~ Case 6:11-cv-00656 Document 1-4 u.s. Patent Aug. 9,2005 Filed 12/08/11 Page 6 of 63 PageID #: 133 Sheet 2 of 31 US 6,928,442 B2 r---- --------------------------------------------------, I I I I I I I I I I I I I I ~ I I o 1 I I I I. I I I I 1(\1 I I I I I 10 I_ I I I I I I I I I I I I I ~ ~--~~ a !\ CI) en W () a 0:: c.. - w I ~ j~ co ::> o c.. (!) W t2 ~ ~3 en ~ ~ () U ~ ~ ! I I I I : I \) I _______________________________________________________ J L .......... .c . (!) 1.L I FILE SYSTEM ~ ~ , I I 117 REGION ~ I 117 REGION ..... ..... REGION • • • ~17 = I REGION ~ I 118 ({Q ~~ 118 118 N C C (It DIRECTORY •• • DIRECTORY 'Jl =- ~ ~ ..... ~ 1 FILE' \ ( FILE) .1 122 o ...., '""'" ~ ( FILE 12~ SEGMENT SEGMENT 1 J I SEGMENT e rJ'l -..CJ\ \0 N 00 ~ ~ N ~ N Filed 12/08/11 Page 7 of 63 PageID #: 134 DIRECTORY Case 6:11-cv-00656 Document 1-4 d • rJl • Case 6:11-cv-00656 Document 1-4 u.s. Patent Aug. 9,2005 Filed 12/08/11 Page 8 of 63 PageID #: 135 Sheet 4 of 31 US 6,928,442 B2 FIG.3 138· Region ID Pathname True Name Type File ID Time of last access Time of last modification Safe fl.ag Lock fl~ Size owner FIG.4 140 True Name File ID Compressed File ID Source IDs Dependent processors Use count Time of last access Expiration Grooming delete count 142 Reqion ID Region file system Region pathname Region status Mirror processor(s) Mirror duplication count Policy FIG.5 144 FIG.6 ~ source IO source type ~ ..... ..... ~ = source rights source availability source location ~ 146 ({Q original Name ~~ operation N C C Ul FIG.7 Type Timestamp 'JJ. Pathname ..... =- ~ ~ Ul True Name o ...., ~ 148 FIG.8 t:::~::Ul True Name I::n:::e ---- - - - FIG.9 150 .. -I '""'" e rJ'l -..CJ\ \0 N 00 ~ ~ N ~ N Filed 12/08/11 Page 9 of 63 PageID #: 136 Processor IO Case 6:11-cv-00656 Document 1-4 d • rJl • Case 6:11-cv-00656 Document 1-4 u.s. Patent Aug. 9,2005 Filed 12/08/11 Page 10 of 63 PageID #: 137 Sheet 6 of 31 US 6,928,442 B2 FIG. 10(a) S/MPL DATA/TEM , .. -------------S~i8-----------\ : I , , \ 5212 COMPUTEMDFUNcnONON DATA ITEM I I S214 APPEND LENGTH MODULO 32 OF DATA ITEM , , \ \ I \ I ,I \ ',----------------------------TRUE NAME ~ ' Case 6:11-cv-00656 Document 1-4 u.s. Patent Aug. 9,2005 8216 r- YES Filed 12/08/11 Page 11 of 63 PageID #: 138 Sheet 7 of 31 US 6,928,442 B2 0,_ _--. FIG. IO{b) DATA ITEM SIMPLE? 8220 PARTmON DATA ITEM INTO SEGMENTS 5222 ASSIMILATE EACH SEGMENT (COMPUTING ITS TRUE NAME) r ,~----s2f8-----'" I I : COMPUTE TRUE : I I NAME OF SIMPLE : I : DATA ITEM : , .... _----- ------, I 8224 CREATE INDIRECT BLOCK OF SEGMENT TRUE NAMES 8226 ASSIMILATE INDIRECT BLOCK (COMPUTING ITS TRUE NAME) 5228 REPLACE FINAL 32 BITS OF TRUE NAME WITH LENGHT MOD 32 OF DATA ITEM I ~ FIG. II ~ ..... ..... DETERMINE TRUE NAME ~ = > = ({Q ~~ N c c YES Ul =- ~ ~ S236 ..... 00 o ...., * CREATE NEW ENTRY * SET USE COUNT TO 1 * STORE FILE 10 * seT OTHER FIELDS ~ '""'" S238 S239 DELETE FILE 10 STORE FILE lD e rJ'l -..CJ\ \0 N 00 ~ ~ N ~ N Filed 12/08/11 Page 12 of 63 PageID #: 139 'JJ. Case 6:11-cv-00656 Document 1-4 d • rJl • Case 6:11-cv-00656 Document 1-4 u.s. Patent Aug. 9,2005 Filed 12/08/11 Page 13 of 63 PageID #: 140 Sheet 9 of 31 US 6,928,442 B2 FIG.12 YES S240 UPDATE DEPENDENCY LIST NO 8242 SEND MESSAGE TO ~--------------~ CACHESERVERTO UPDATE CACHE S244 COMPRESS (IF DESIRED) 8246 MIRROR (IF DESIRED) Case 6:11-cv-00656 Document 1-4 u.s. Patent Filed 12/08/11 Page 14 of 63 PageID #: 141 US 6,928,442 B2 Sheet 10 of 31 Aug. 9,2005 FIG.13 S250 SEARCH FOR THE ~~~~~U-_ _~~ FAIL PATHNAME FOUND 8258 ASSIMILATE FILE 10 S256 FREEZE DIRECTORY Case 6:11-cv-00656 Document 1-4 u.s. Patent Aug. 9,2005 Filed 12/08/11 Page 15 of 63 PageID #: 142 US 6,928,442 B2 Sheet 11 of 31 5260 CONFIRM THAT TRUE NAME EXISTS LOCALLY 8262 FIG.14 SEARCH FOR PATHNAMEIN LDETABLE 8264 CONFIRM THAT DIRECTORY EXISTS YES 8268 DELETE TRUE FILE 8270 CREATE ENTRY IN LDE & UPDATE ~ ~ NO NEGATIVE RESPONSE 8278 ..... ..... ~ = YES " /' REQUEST MOUNT SEND RTF MESSAGE & WAIT FOR RESPONSE POSITIVE RESPONSE ~ ({Q ~~ N C C Ul 'JJ. =- ~ ~ ..... 'N " ""' o ...., ~ 8276 8280 ENTER TRUE FILE FIND FilE RETURNED INTO TFR '""'" 5282 L..----------.J~I FIG.15 VERIFY TRUE e rJ'l DESIRED) -..CJ\ \0 FILE (IF N 00 ~ ~ N ~ N Filed 12/08/11 Page 16 of 63 PageID #: 143 FAIL 8274 Case 6:11-cv-00656 Document 1-4 d • rJl • ~ 8284 ~ ..... ..... CLIENT SELECTS PROCESSOR(S) ~ = ~ ~o ~I FAIL I ({Q ~~ N C C Ul S286 NEGATIVE RESPONSE OR I TIMEOUT I CLIENT BROADCASTS =- FIG.16(a) ~ ~ ..... '""'" ~ o ...., ~ '""'" CLIENT WAITS POSItiVE RESPONSE -----*------ e rJ'l -..CJ\ \0 N 00 ~ ~ N ~ N Filed 12/08/11 Page 17 of 63 PageID #: 144 'JJ. Case 6:11-cv-00656 Document 1-4 d • rJl • ~ ~ ..... ..... ~ 0 STORE PROCESSOR 10 ~ = ~ ({Q ~~ N C C Ul FIG.16(b) 'JJ. =- ~ ~ ..... '~ ""'" o ...., ~ '""'" S291c SEND MESSAGE TO RESERVE TRUE FILE ON SOURCE PROCESSOR S290C- ~ "-/ o OURCE IS"'>- PUBLISHING SYSTEM? ,. -1 YES S291d DETERMINE EXPIRATION DATE AND ADD TO LIST e rJ'l -..CJ\ \0 N 00 ~ ~ N ~ N Filed 12/08/11 Page 18 of 63 PageID #: 145 5290B LOOK UP TFR FOR TRUE NAME & ADD SOURCE LOCATION 10 TO SOURCE IDS FOR TRUE NAME Case 6:11-cv-00656 Document 1-4 d • rJl • ~ ~ ..... ..... ~ NO FIG. 17(0) ~INI~K~UKIKUc , ... ~ ( FAIL 1 = > = ({Q ~~ N C C Ul DONE I 'JJ. =- ~ ~ ..... '""'" Ul 0 ...., NO ~ '""'" YES I 8298 DECOMPRESsl e rJ'l -..CJ\ \0 NO N 00 ~ ~ N ~ N Filed 12/08/11 Page 19 of 63 PageID #: 146 YES Case 6:11-cv-00656 Document 1-4 d • rJl • ----- ---~--- ~ ~ J S302 NOTIFY USER I ..... ..... ~ = ---STORE 10 > = ({Q S308·~ LOCATE REMOTE FILE NO MORE SOURCE ID S304 T N c c Ul 'JJ. DONE S306 =- ~ ~ ..... '""'" 0'1 o ...., REALIZE TRUE FILE FROM SQURCE(S) ~ '""'" e FIG.17(b) rJ'l -..CJ\ \0 N 00 ~ ~ N ~ N Filed 12/08/11 Page 20 of 63 PageID #: 147 SELECT SOURCE IDS ~~ Case 6:11-cv-00656 Document 1-4 d • rJl • ~ FIG.18(a} >--YES. ~ ..... ..... ~ = ~ ({Q ~~ YEs{;] >--YE5. N C C Ul 'JJ. DELETE ..... '""" -..J =- ~ ~ TRUE FILE 8320 CREATENBN ~I~~______~ SCRATCH FILE o ...., 8322 ~ '""" MAKE TRUE FILE LOCAL e rJ'l -..CJ\ \0 DONE N 00 ~ ~ N ~ N Filed 12/08/11 Page 21 of 63 PageID #: 148 8318 Case 6:11-cv-00656 Document 1-4 d • rJl • Case 6:11-cv-00656 Document 1-4 u.s. Patent Filed 12/08/11 Page 22 of 63 PageID #: 149 Aug. 9,2005 Sheet 18 of 31 ,...... .c -......-. CO - • (!) - US 6,928,442 B2 . ~ ~ ..... ..... 8332 ~ = INCREMENT FREEZE LOCK I FIG. 19(0) ~ FOR EACH SUBORDINATE FILE AND DIRECTORY IN THE \IVEN DIRECTOR,) ~~ N C C Ul 1 -~ 5334 S336 FREEZE IF .. ASSIMILATE DIRECTORY" UNASSIMlLATED FILE 'JJ. =- ~ ~ ..... '"""" ~ o ...., ~ '"""" ... 5337 CREATE NEW e \Jl DATA ITEM -..CJ\ \0 __ __ .I 00 N ~ ~ N ~ N Filed 12/08/11 Page 23 of 63 PageID #: 150 . "". ({Q Case 6:11-cv-00656 Document 1-4 d • rJl • -------,-- I ---- ~ • ~ ~ I FOR EACH S338 SUBORDINATE ADD ENTRY TO t-----1~.. NEW DATA FILE AND DIRECTORY IN THE ITEM GIVEN DIRECTORY I \ I I ~ ) RECORD ADDITIONAL DESIRED INFORMATION ~ = ~ ({Q ~~ N C C Ul 8342 'JJ. =- ASSIMILATE THE NEW DATA ITEM ~ ~ ..... FIG.19(b) . S344 DECREMENT THE FREEZE LOCK N C o ...., ~ '""'" e rJ'l -..CJ\ \0 N .... 00 ~ ~ N ~ N Filed 12/08/11 Page 24 of 63 PageID #: 151 . 5340 ~ ..... ..... Case 6:11-cv-00656 Document 1-4 d • rJl • ~ 1 ..... ..... ~ = MAKE TRUE FILE LOCAL , NO MORE ENTRIES . \. ~ ~ po 8353 FOR EACH DIRECTORY ENTRY J~ ({Q '" S348 MORE ~ENTRIE~ ~ ..... ~ READ ~~ N C C (It DIRECTORY ., Ir S350 CREATE FULL PATHNAME 'JJ. =- ~ ~ .... N """" o ...., ~ """" -~ S352 LINK PATH TO TRUE NAME e rJ'l -..CJ\ \0 N 00 ~ ~ N ~ N Filed 12/08/11 Page 25 of 63 PageID #: 152 [ . DONE~ ~ ~ 5346 FIG. 20 8354 r Case 6:11-cv-00656 Document 1-4 d • rJl • Case 6:11-cv-00656 Document 1-4 u.s. Patent Aug. 9,2005 Filed 12/08/11 Page 26 of 63 PageID #: 153 Sheet 22 of 31 US 6,928,442 B2 5354 WAIT FOR FREEZE LOCK TO TURN OFF S356 FIND TFR FIG.21 ENTRY S358 DECREMENT REFERENCE COUNT YES S362 DELETE TRUE FILE NO S364 REM eVE FILE ID ...- - - - - - - - - - - i AND COMPRESSED FILE 10 Case 6:11-cv-00656 Document 1-4 u.s. Patent Aug. 9,2005 Filed 12/08/11 Page 27 of 63 PageID #: 154 US 6,928,442 B2 Sheet 23 of 31 5365 GET OPERATION FIG. 22 ">-_YES_ _ _ _ _ S368 ~ ASSIMILATE S369 NEW TRUE FILE .-_YES 8378 S370 MODIFY USE COUNT OF EACH COMPONENT 5379 FOR EACH PARENT DIRECTORY OR FILE, UPDATE USE COUNT, LAST ACCESS AND MODIFY TIMES RECORD TRUE NAME IN AUDIT FILE Case 6:11-cv-00656 Document 1-4 u.s. Patent Aug. 9,2005 Filed 12/08/11 Page 28 of 63 PageID #: 155 Sheet 24 of 31 US 6,928,442 B2 5382 FIG. 23 VERIFY GROOMING LOCK OFF 8384 SET GROOMING LOCK " S386 SET GROOM COUNTS Case 6:11-cv-00656 Document 1-4 u.s. Patent Filed 12/08/11 Page 29 of 63 PageID #: 156 Aug. 9,2005 Sheet 25 of 31 US 6,928,442 B2 S388 FIND LDE RECORD FIG. 24 5390 FINDTFR RECORD 5392 INCREMENT GROOMING DELETE COUNT ... , S394 ADJUST FILE SIZES Case 6:11-cv-00656 Document 1-4 u.s. Patent Filed 12/08/11 Page 30 of 63 PageID #: 157 Aug. 9,2005 Sheet 26 of 31 US 6,928,442 B2 FIG. 25 S396 DELETE FILE 5398 UNLOCK GROOMING LOCK " ,.-____..... ;0 ~ YES_ _-.. FIG. 26(0) ~ ~ ..... ..... ~ = '0--..t I 5404 PROHIBIT OPEN S408 DETERMINE REGION I ~ ({Q ~~ N C C Ul YES =- ~ ~ ..... N -..J o ...., ~ '""'" YES, 5422 PROHIBIT OPEN e rJ'l -..CJ\ \0 N 00 ~ ~ N ~ N Filed 12/08/11 Page 31 of 63 PageID #: 158 'JJ. Case 6:11-cv-00656 Document 1-4 d • rJl • ~ ~ ..... ..... ~ = ~.. I LOCK IF NOT LOCKED ~ ({Q -------r-- ~~ N C C Ul 'JJ. =- 8417 CREATE SCRATCH ~ERASEFILE COpy 5406 CREATE SCRATCH FILE ~ ~ ... S420 MAKE LOCAL VERSION & RETURN FILE 10 FROMTFR S424 RETURN SCRATCH FILE 1l1li '4111 10 FIG.26(b) ... .... N 00 o ...., ~ '""'" e rJ'l -..CJ\ \0 N 00 ~ ~ N ~ N Filed 12/08/11 Page 32 of 63 PageID #: 159 0_---, Case 6:11-cv-00656 Document 1-4 d • rJl • ~ ~ ..... ..... S422 ~ = DETERMINE LOE & RTENTRY RECORDS FOR FILE ~ ({Q ~~ N C C Ul PROHIBIT DELETION 'JJ. =- ~ ~ ..... N ~ o ...., ... 8424 NO ~ '""'" IDENTIFY TRUE FILE FROM TRUE NAME FIG. 27(0) e rJ'l -..CJ\ \0 N _.--L. __ 00 ~ ~ N ~ N Filed 12/08/11 Page 33 of 63 PageID #: 160 ,YES--.. Case 6:11-cv-00656 Document 1-4 d • rJl • ~ ~ ..... ..... ~ nO _____-, ~ = ~ ({Q ~~ S427 :) Ul 'JJ. =- ~ ~ ..... ~ c 8430 DELETE TRUE FllE o ...., ~ '""'" 8431 S428 1 ,.' REDUCE USE COUNT BY ONE __ IADD ENTRY TO AUDIT FILE e rJ'l -..CJ\ \0 FIG. 27(b) N 00 'l. ~ N ~ N Filed 12/08/11 Page 34 of 63 PageID #: 161 DELETE SCRATCH COPY OF FILE XES N C C Case 6:11-cv-00656 Document 1-4 d • rJl • ~ ~ S432 FIG. 28 LOOKUP TRUE NAME /'..... YES ..... ..... ~ = > NO = ({Q ~~ N C C Ul =- ~ ~ ..... o 8442 ~I FORWARD REQUEST ~ I.. '" o"'" ...., YES ~ '""'" e rJ'l S444 POSITIVE RESPONSE NEGATIVE RESPONSE -..CJ\ \0 N 00 ~ ~ N ~ N Filed 12/08/11 Page 35 of 63 PageID #: 162 'JJ. Case 6:11-cv-00656 Document 1-4 d • rJl • Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 36 of 63 PageID #: 163 US 6,928,442 B2 1 2 ENFORCEMENT AND POLICING OF LICENSED CONTENT USING CONTENTBASED IDENTIFIERS being files, directories, records in the database, objects in object-oriented programming, locations in memory or on a physical device, or the like) are always defined relative to a specific context. For instance, the file identified by a particular file name can only be determined when the directory containing the file (the context) is known. The file identified by a pathname can be determined only when the file system (context) is known. Similarly, the addresses in a process address space, the keys in a database table, or domain names on a global computer network such as the Internet are meaningful only because they are specified relative to a context. In prior art systems for identifying data items there is no direct relationship between the data names and the data item. The same data name in two different contexts may refer to different data items, and two different data names in the same context may refer to the same data item. In addition, because there is no correlation between a data name and the data it refers to, there is no a priori way to confirm that a given data item is in fact the one named by a data name. For instance, in a DP system, if one processor requests that another processor deliver a data item with a given data name, the requesting processor cannot, in general, verify that the data delivered is the correct data (given only the name). Therefore it may require further processing, typically on the part of the requester, to verify that the data item it has obtained is, in fact, the item it requested. A common operation in a DP system is adding a new data item to the system. When a new data item is added to the system, a name can be assigned to it only by updating the context in which names are defined. Thus such systems require a centralized mechanism for the management of names. Such a mechanism is required even in a multiprocessing system when data items are created and identified at separate processors in distinct locations, and in which there is no other need for communication when data items are added. In many data processing systems or environments, data items are transferred between different locations in the system. These locations may be processors in the data processing system, storage devices, memory, or the like. For example, one processor may obtain a data item from another processor or from an external storage device, such as a floppy disk, and may incorporate that data item into its system (using the name provided with that data item). However, when a processor (or some location) obtains a data item from another location in the DP system, it is possible that this obtained data item is already present in the system (either at the location of the processor or at some other location accessible by the processor) and therefore a duplicate of the data item is created. This situation is common in a network data processing environment where proprietary software products are installed from floppy disks onto several processors sharing a common file server. In these systems, it is often the case that the same product will be installed on several systems, so that several copies of each file will reside on the common file server. In some data processing systems in which several processors are connected in a network, one system is designated as a cache server to maintain master copies of data items, and other systems are designated as cache clients to copy local copies of the master data items into a local cache on an as-needed basis. Before using a cached item, a cache client must either reload the cached item, be informed of changes to the cached item, or confirm that the master item corre- This is a continuation of application Ser. No. 09/283,160, filed Apr. 1, 1999, now U.S. Pat. No. 6,415,280, which is a division of application Ser. No. 08/960,079, filed Oct. 24, 1997, now U.S. Pat. No. 5,978,791 filed Oct. 24,2001 which is a continuation of Ser. No. 08/425,160, filed Apr. 11, 1995, now abandoned. S 10 BACKGROUND OF THE INVENTION 1. Field of the Invention This invention relates to data processing systems and, more particularly, to data processing systems wherein data items are identified by substantially unique identifiers which depend on all of the data in the data items and only on the data in the data items. 2. Background of the Invention Data processing (DP) systems, computers, networks of computers, or the like, typically offer users and programs various ways to identify the data in the systems. Users typically identify data in the data processing system by giving the data some form of name. For example, a typical operating system (OS) on a computer provides a file system in which data items are named by alphanumeric identifiers. Programs typically identify data in the data processing system using a location or address. For example, a program may identify a record in a file or database by using a record number which serves to locate that record. In all but the most primitive operating systems, users and programs are able to create and use collections of named data items, these collections themselves being named by identifiers. These named collections can then, themselves, be made part of other named collections. For example, an OS may provide mechanisms to group files (data items) into directories (collections). These directories can then, themselves be made part of other directories. A data item may thus be identified relative to these nested directories using a sequence of names, or a so-called pathname, which defines a path through the directories to a particular data item (file or directory). As another example, a database management system may group data records (data items) into tables and then group these tables into database files (collections). The complete address of any data record can then be specified using the database file name, the table name, and the record number of that data record. Other examples of identifying data items include: identifying files in a network file system, identifying objects in an object-oriented database, identifying images in an image database, and identifying articles in a text database. In general, the terms "data" and "data item" as used herein refer to sequences of bits. Thus a data item may be the contents of a file, a portion of a file, a page in memory, an object in an object-oriented program, a digital message, a digital scanned image, a part of a video or audio signal, or any other entity which can be represented by a sequence of bits. The term "data processing" herein refers to the processing of data items, and is sometimes dependent on the type of data item being processed. For example, a data processor for a digital image may differ from a data processor for an audio signal. In all of the prior data processing systems the names or identifiers provided to identify data items (the data items 15 20 25 30 35 40 45 50 55 60 65 Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 37 of 63 PageID #: 164 US 6,928,442 B2 3 4 sponding to the cached item has not changed. In other words, a cache client must synchronize its data items with those on the cache server. This synchronization may involve reloading data items onto the cache client. The need to keep the cache synchronized or reload it adds significant overhead to 5 existing caching mechanisms. In view of the above and other problems with prior art systems, it is therefore desirable to have a mechanism which allows each processor in a multiprocessor system to determine a common and substantially unique identifier for a data 10 item, using only the data in the data item and not relying on any sort of context. It is further desirable to have a mechanism for reducing multiple copies of data items in a data processing system and to have a mechanism which enables the identification of 15 identical data items so as to reduce multiple copies. It is further desirable to determine whether two instances of a data item are in fact the same data item, and to perform various other systems' functions and applications on data items without relying on any context information or prop- 20 erties of the data item. It is also desirable to provide such a mechanism in such a way as to make it transparent to users of the data processing system, and it is desirable that a single mecha - 25 nism be used to address each of the problems described above. SUMMARY OF THE INVENTION This invention provides, in a data processing system, a method and apparatus for identifying a data item in the system, where the identity of the data item depends on all of the data in the data item and only on the data in the data item. Thus the identity of a data item is independent of its name, origin, location, address, or other information not derivable directly from the data, and depends only on the data itself. This invention further provides an apparatus and a method for determining whether a particular data item is present in the system or at a location in the system, by examining only the data identities of a plurality of data items. Using the method or apparatus of the present invention, the efficiency and integrity of a data processing system can be improved. The present invention improves the design and operation of a data storage system, file system, relational database, object-oriented database, or the like that stores a plurality of data items, by making possible or improving the design and operation of at least some or all of the following features: the system stores at most one copy of any data item at a given location, even when multiple data names in the system refer to the same contents; the system avoids copying data from source to destination locations when the destination locations already have the data; the system provides transparent access to any data item by reference only to its identity and independent of its present location, whether it be local, remote, or offline; the system caches data items from a server, so that only the most recently accessed data items need be retained; when the system is being used to cache data items, problems of maintaining cache consistency are avoided; the system maintains a desired level of redundancy of data items in a network of servers, to protect against failure by ensuring that multiple copies of the data items are present at different locations in the system; 30 35 40 the system automatically archives data items as they are created or modified; the system provides the-size, age, and location of groups of data items in order to decide whether they can be safely removed from a local file system; the system can efficiently record and preserve any collection of data items; the system can efficiently make a copy of any collection of data items, to support a version control mechanism for groups of the data items; the system can publish data items, allowing other, possibly anonymous, systems in a network to gain access to the data items and to rely on the availability of the data items; the system can maintain a local inventory of all the data items located on a given removable medium, such as a diskette or CD-ROM, the inventory is independent of other properties of the data items such as their name, location, and date of creation; the system allows closely related sets of data items, such as matching or corresponding directories on disconnected computers, to be periodically resynchronized with one another; the system can verify that data retrieved from another location is the desired or requested data, using only the data identifier used to retrieve the data; the system can prove possession of specific data items by content without disclosing the content of the data items, for purposes of later legal verification and to provide anonymity; the system tracks possession of specific data items according to content by owner, independent of the name, date, or other properties of the data item, and tracks the uses of specific data items and files by content for accounting purposes. Other objects, features, and characteristics of the present invention as well as the methods of operation and functions of the related elements of structure, and the combination of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification. 45 BRIEF DESCRIPTION OF THE DRAWINGS 50 FIGS. l(a) and l(b) depicts a typical data processing system in which a preferred embodiment of the present invention operates; FIG. 2 depicts a hierarchy of data items stored at any location in such a data processing system; FIGS. 3-9 depict data structures used to implement an embodiment of the present invention; and FIGS. 10(a)-28 are flow charts depicting operation of various aspects of the present invention. 55 DETAILED DESCRIPTION OF THE PRESENTLY PREFERRED EXEMPLARY EMBODIMENTS 60 65 An embodiment of the present invention is now described with reference to a typical data processing system 100, which, with reference to FIGS. l(a) and l(b), includes one or more processors (or computers) 102 and various storage devices 104 connected in some way, for example by a bus 106. Each processor 102 includes a CPU 108, a memory 110 and one or more local storage devices 112. The CPU 108, Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 38 of 63 PageID #: 165 US 6,928,442 B2 5 6 memory 110, and local storage device 112 may be internally processor 102 in the system, a memory of a particular connected, for example by a bus 114. Each processor 102 processor, a storage device, a removable storage medium may also include other devices (not shown), such as a (such as a floppy disk or compact disk), or any other physical keyboard, a display, a printer, and the like. location in the system. The term "local" with respect to a In a data processing system 100, wherein more than one 5 particular processor 102 refers to the memory and storage processor 102 is used, that is, in a multiprocessor system, the devices of that particular processor. processors may be in one of various relationships. For In the following, the terms "True Name", "data identity" example, two processors 102 may be in a client/server, and "data identifier" refer to the substantially unique data client/client, or a server/server relationship. These interidentifier for a particular data item. The term "True File" processor relationships may be dynamic, changing depend- 10 refers to the actual file, segment, or data item identified by ing on particular situations and functions. Thus, a particular a True Name. processor 102 may change its relationship to other procesA file system for a data processing system 100 is now sors as needed, essentially setting up a peer-to-peer relationdescribed which is intended to work with an existing opership with other processors. In a peer-to-peer relationship, sometimes a particular processor 102 acts as a client ating system by augmenting some of the operating system's processor, whereas at other times the same processor acts as 15 file management system codes. The embodiment provided a server processor. In other words, there is no hierarchy relies on the standard file management primitives for actuimposed on or required of processors 102. ally storing to and retrieving data items from disk, but uses the mechanisms of the present invention to reference and In a multiprocessor system, the processors 102 may be access those data items. homogeneous or heterogeneous. Further, in a multiprocessor data processing system 100, some or all of the processors 20 The processes and mechanisms (services) provided in this 102 may be disconnected from the network of processors for embodiment are grouped into the following categories: periods of time. Such disconnection may be part of the primitive mechanisms, operating system mechanisms, normal operation of the system 100 or it may be because a remote mechanisms, background mechanisms, and extended particular processor 102 is in need of repair. mechanisms. 25 Within a data processing system 100, the data may be Primitive mechanisms provide fundamental capabilities organized to form a hierarchy of data storage elements, used to support other mechanisms. The following primitive wherein lower level data storage elements are combined to mechanisms are described: form higher level elements. This hierarchy can consist of, for 1. Calculate True Name; example, processors, file systems, regions, directories, data 30 2. Assimilate Data Item; files, segments, and the like. For example, with reference to 3. New True File; FIG. 2, the data items on a particular processor 102 may be organized or structured as a file system 116 which comprises 4. Get True Name from Path; regions 117, each of which comprises directories 118, each 5. Link path to True Name; of which can contain other directories 118 or files 120. Each 35 6. Realize True File from Location; file 120 being made up of one or more data segments 122. 7. Locate Remote File; In a typical data processing system, some or all of these 8. Make True File Local; elements can be named by users given certain implementa9. Create Scratch File; tion specific naming conventions, the name (or pathname) of an element being relative to a context. In the context of a 40 10. Freeze Directory; data processing system 100, a pathname is fully specified by 11. Expand Frozen Directory; a processor name, a filesystem name, a sequence of zero or 12. Delete True File; more directory names identifying nested directories, and a 13. Process Audit File Entry; final file name. (Usually the lowest level elements, in this 14. Begin Grooming; case segments 122, cannot be named by users.) 45 15. Select For Removal; and In other words, a file system 116 is a collection of 16. End Grooming. directories 118. A directory 118 is a collection of named files Operating system mechanisms provide typical familiar 120---both data files 120 and other directory files 118. A file file system mechanisms, while maintaining the data struc120 is a named data item which is either a data file (which may be simple or compound) or a directory file 118. A 50 tures required to offer the mechanisms of the present invention. Operating system mechanisms are designed to augment simple file 120 consists of a single data segment 122. A existing operating systems, and in this way to make the compound file 120 consists of a sequence of data segments present invention compatible with, and generally transparent 122. A data segment 122 is a fixed sequence of bytes. An to, existing applications. The following operating system important property of any data segment is its size, the 55 mechanisms are described: number of bytes in the sequence. A single processor 102 may access one or more file 1. Open File; systems 116, and a single storage device 104 may contain 2. Close File; one or more file systems 116, or portions of a file system 116. 3. Read File; For instance, a file system 116 may span several storage 4. Write File; devices 104. 60 5. Delete File or Directory; In order to implement controls in a file system, file system 6. Copy File or Directory; 116 may be divided into distinct regions, where each region 7. Move File or Directory; is a unit of management and control. A region consists of a 8. Get File Status; and given directory 118 and is identified by the pathname (user defined) of the directory. 65 9. Get Files in Directory. In the following, the term "location", with respect to a Remote mechanisms are used by the operating system in data processing system 100, refers to any of a particular responding to requests from other processors. These mecha- Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 39 of 63 PageID #: 166 US 6,928,442 B2 7 8 nisms enable the capabilities of the present invention in a The data structures described are assumed to reside on peer-to-peer network mode of operation. The following individual peer processors 102 in the data processing system remote mechanisms are described: 100. However, they can also be shared by placing them on a remote, shared file server (for instance, in a local area 1. Locate True File; 5 network of machines). In order to accommodate sharing data 2. Reserve True File; structures, it is necessary that the processors accessing the 3. Request True File; shared database use the appropriate locking techniques to 4. Retire True File; ensure that changes to the shared database do not interfere with one another but are appropriately serialized. These 5. Cancel Reservation; 10 locking techniques are well understood by ordinarily skilled 6. Acquire True File; programmers of distributed applications. 7. Lock Cache; It is sometimes desirable to allow some regions to be local 8. Update Cache; and to a particular processor 102 and other regions to be shared 9. Check Expiration Date. among processors 102. (Recall that a region is a unit of file Background mechanisms are intended to run occasionally 15 system management and control consisting of a given direcand at a low priority. These provide automated management tory identified by the pathname of the directory.) In the case capabilities with respect to the present invention. The folof local and shared regions, there would be both local and lowing background mechanisms are described: shared versions of each data structure. Simple changes to the processes described below must be made to ensure that 1. Mirror True File; 20 appropriate data structures are selected for a given operation. 2. Groom Region; The local directory extensions (LDE) table 124 is a data 3. Check for Expired Links; and structure which provides information about files 120 and 4. Verify Region; and directories 118 in the data processing system 100. The local 5. Groom Source List. directory extensions table 124 is indexed by a pathname or Extended mechanisms run within application programs 25 contextual name (that is, a user provided name) of a file and over the operating system. These mechanisms provide soluincludes the True Name for most files. The information in tions to specific problems and applications. The following local directory extension table 124 is in addition to that extended mechanisms are described: provided by the native file system of the operating system. 1. Inventory Existing Directory; The True File registry (TFR) 126 is a data store for listing 2. Inventory Removable, Read-only Files; 30 actual data items which have True Names, both files 120 and segments 122. When such data items occur in the True File 3. Synchronize directories; registry 126 they are known as True Files. True Files are 4. Publish Region; identified in True File registry 126 by their True Names or 5. Retire Directory; identities. The table True File registry 126 also stores 6. Realize Directory at location; 35 location, dependency, and migration information about True 7. Verify True File; Files. 8. Track for accounting purposes; and The region table (RT) 128 defines areas in the network storage which are to be managed separately. Region table 9. Track for licensing purposes. The file system herein described maintains sufficient 128 defines the rules for access to and migration of files 120 information to provide a variety of mechanisms not ordi- 40 among various regions with the local file system 116 and narily offered by an operating system, some of which are remote peer file systems. listed and described here. Various processing performed by The source table (ST) 130 is a list of the sources of True Files other than the current True File registry 126. The this embodiment of the present invention will now be described in greater detail. source table 130 includes removable volumes and remote In some embodiments, some files 120 in a data processing 45 processors. The audit file (AF) 132 is a list of records indicating system 100 do not have True Names because they have been recently received or created or modified, and thus their True changes to be made in local or remote files, these changes to Names have not yet been computed. A file that does not yet be processed in background. have a True Name is called a scratch file. The process of The accounting log (AL) 134 is a log of file transactions assigning a True Name to a file is referred to as assimilation, 50 used to create accounting information in a manner which and is described later. Note that a scratch file may have a preserves the identity of files being tracked independent of user provided name. their name or location. Some of the processing performed by the present invenThe license table (LT) 136 is a table identifying files, tion can take place in a background mode or on a delayed or which may only be used by licensed users, in a manner as-needed basis. This background processing is used to 55 independent of their name or location, and the users licensed to use them. determine information that is not immediately required by Detailed Descriptions of the Data Structures the system or which may never be required. As an example, in some cases a scratch file is being changed at a rate greater The following table summarizes the fields of an local than the rate at which it is useful to determine its True Name. directory extensions table entry, as illustrated by record 138 In these cases, determining the True Name of the file can be 60 in FIG. 3. postponed or performed in the background. Data Structures The following data structures, stored in memory 110 of Field Description one of more processors 102 are used to implement the mechanisms described herein. The data structures can be 65 Region ID identifies the region in which this file is contained. local to each processor 102 of the system 100, or they can reside on only some of the processors 102. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 40 of 63 PageID #: 167 US 6,928,442 B2 9 -continued Field Pathname True Name Type Scratch File ID Time of last access Time of last modification Safe flag Lock flag Size Owner 10 -continued the user provided name or contextual name of the file or directory, relative to the region in which it occurs. the computed True Name or identity of the file or directory. This True Name is not always up to date, and it is set to a special value when a file is modified and is later recomputed in the background. indicates whether the file is a data file or a directory. the physical location of the file in the file system, when no True Name has been calculated for the file. As noted above, such a file is called a scratch file. the last access time to this file. If this file is a directory, this is the last access time to any file in the directory. the time of last change of this file. If this file is a directory, this is the last modification time of any file in the directory. indicates that this file (and, if this file is a directory, all of its subordinate files) have been backed up on some other system, and it is therefore safe to remove them. indicates whether a file is locked, that is, it is being modified by the local processor or a remote processor. Only one processor may modify a file at a time. the full size of this directory (including all subordinate files), if all files in it were fully expanded and duplicated. For a file that is not a directory this is the size of the actual True File. the identity of the user who owns this file, for accounting and license tracking purposes. Each record of the True File registry 126 has the fields shown in the True File registry record 140 in FIG. 4. The True File registry 126 consists of the database described in the table below as well as the actual True Files identified by the True File IDs below. Field Description True Name computed True Name or identity of the file. compressed version of the True File may be stored instead of, or in addition to, an uncompressed version. This field provides the identity of the actual representation of the compressed version of the file. tentative count of how many references have been selected for deletion during a grooming operation. most recent date and time the content of this file was accessed. date and time after which this file may be deleted by this server. processor IDs of other processors which contain references to this True File. source ID(s) of zero or more sources from which this file or data item may be retrieved. identity or disk location of the actual physical representation of Compressed File ID Grooming delete count Time of last access Expiration Dependent processors Source IDs True File 10 Field Description 5 10 Use count 15 20 25 Description the file or file segment. It is sufficient to use a filename in the registration directory of the underlying operating system. The True File ID is absent if the actual file is not currently present at the current location. number of other records on this processor which identify this True File. A region table 128, specified by a directory pathname, records storage policies which allow files in the file system to be stored, accessed and migrated in different ways. Storage policies are programmed in a configurable way using a set of rules described below. Each region table record 142 of region table 128 includes the fields described in the following table (with reference to FIG. 5): Field Description Region ID internally used identifier for this region. file system on the local processor of which this region is a part. a pathname relative to the region file system which defines the location of this region. The region consists of all files and directories subordinate to this pathname, except those in a region subordinate to this region. zero or more identifiers of processors which are to keep mirror or archival copies of all files in the current region. Multiple mirror processors can be defined to form a mirror group. number of copies of each file in this region that should be retained in a mirror group. specifies whether this region is local to a single processor 102, shared by several processors 102 (if, for instance, it resides on a shared file server), or managed by a remote processor. the migration policy to apply to this region. A single region might participate in several policies. The policies are as follows (parameters in brackets are specified as part of the policy): Region file system 30 35 40 Region pathname Mirror processor(s) Mirror duplication count Region status 45 Policy 50 55 60 65 region is a cached version from [processor 10]; region is a member of a mirror set defined by [processor 10]. region is to be archived on [processor 10]. region is to be backed up locally, by placing new copies in [region ID]. region is read only and may not be changed. region is published and expires on [date]. Files in this region should be compressed. A source table 130 identifies a source location for True Files. The source table 130 is also used to identify client processors making reservations on the current processor. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 41 of 63 PageID #: 168 US 6,928,442 B2 11 12 Each source record 144 of the source table 130 includes the fields summarized in the following table, with reference to FIG. 6: licensed to have access to it. Each license table record 150 includes the information summarized in the following table "with reference to FIG. 9: 5 Field Description Field Description source ID internal identifier used to identify a particular source. type of source location: Removable Storage Volume Local Region Cache Server Mirror Group Server Cooperative Server Publishing Server Client includes information about the rights of this processor, such as whether it can ask the local processor to store data items for it. measurement of the bandwidth, cost, and reliability of the connection to this source of True Files. The availability is used to select from among several possible sources. information on how the local processor is to access the source. This may be, for example, the name of a removable storage volume, or the processor ID and region path of a region on a remote processor. True Name licensee True Name of a data item subject to license validation. identity of a user authorized to have access to this object. source type 10 Various other data structures are employed on some or all of the processors 102 in the data processing system 100. Each processor 102 has a global freeze lock (GFL) 152 15 (FIG. 1), which is used to prevent synchronization errors source when a directory is frozen or copied. Any processor 102 may rights include a special archive directory (SAD) 154 into which directories may be copied for the purposes of archival. Any source processor 102 may include a special media directory (SMD) availability 20 156, into which the directories of removable volumes are stored to form a media inventory. Each processor has a grooming lock 158, which is set during a grooming operasource tion. During this period the grooming delete count of True location File registry entries 140 is active, and no True Files should 25 be deleted until grooming is complete. While grooming is in effect, grooming information includes a table of pathnames selected for deletion, and keeps track of the amount of space that would be freed if all of the files were deleted. Primitive Mechanisms The audit file 132 is a table of events ordered by The first of the mechanisms provided by the present timestamp, each record 146 in audit file 132 including the 30 invention, primitive mechanisms, are now described. The fields summarized in the following table (with reference to FIG. 7): mechanisms described here depend on underlying data management mechanisms to create, copy, read, and delete data items in the True File registry 126, as identified by a True 35 File ID. This support may be provided by an underlying Description Field operating system or disk storage manager. The following primitive mechanisms are described: Original Name path of the file in question. Operation whether the file was created, read, 1. Calculate True Name; written, copied or deleted. 2. Assimilate Data Item; specifies whether the source is a file 40 or a directory. 3. New True File; Processor ID ID of the remote processor generating 4. Get True Name from Path; this event (if not local). Timestamp time and date file was closed (required 5. Link Path to True Name; only for accessed/modified files). 6. Realize True File from Location; Pathname Name of the file (required only for 45 rename). 7. Locate Remote File; True Name computed True Name of the file. This is 8. Make True File Local; used by remote systems to mirror changes to the directory and is filled in during 9. Create Scratch File; background processing. 10. Freeze Directory; 50 11. Expand Frozen Directory; Each record 148 of the accounting log 134 records an 12. Delete True File; event which may later be used to provide information for 13. Process Audit File Entry; billing mechanisms. Each accounting log entry record 148 14. Begin Grooming; includes at least the information summarized in the following table, with reference to FIG. 8: 15. Select For Removal; and 55 16. End Grooming. 1. Calculate True Name A True Name is computed using a function, MD, which Field Description reduces a data block B of arbitrary length to a relatively date of entry date and time of this log entry. 60 small, fixed size identifier, the True Name of the data block, type of Entry types include create file, such that the True Name of the data block is virtually delete file, and transmit file. entry guaranteed to represent the data block B and only data block True Name of data item in question. True Name B. identity of the user responsible for owner this action. The function MD must have the following properties: 65 1. The domain of the function MD is the set of all data items. The range of the function MD is the set of True Each record 150 of the license table 136 records a relationship between a licensable data item and the user Names. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 42 of 63 PageID #: 169 US 6,928,442 B2 13 14 2. The function MD must take a data item of arbitrary entire universe of data items, as long as sufficient uniqueness length and reduce it to an integer value in the range 0 is provided per category of data items. This is because the to N-l, where N is the cardinality of the set of True tags provide an additional level of uniqueness. Names. That is, for an arbitrary length data block B, A mechanism for calculating a True Name given a data O<MD(B)<N. 5 item is now described, with reference to FIGS. 10(a) and 3. The results of MD(B) must be evenly and randomly 10(b). distributed over the range of N, in such a way that A simple data item is a data item whose size is less than simple or regular changes to B are virtually guaranteed a particular given size (which must be defined in each to produce a different value of MD(B). particular implementation of the invention). To determine 4. It must be computationally difficult to find a different 10 the True Name of a simple data item, with reference to FIG. value B' such that MD(B)=MD(B'). 10(a), first compute the MD function (described above) on 5. The function MD(B) must be efficiently computed. the given simple data item (step S212). Then append to the A family of functions with the above properties are the resulting 128 bits, the byte length modulo 32 of the data item so-called message digest functions, which are used in digital (Step S214). The resulting 160-bit value is the True Name of security systems as techniques for authentification of data. 15 the simple data item. A compound data item is one whose size is greater than These functions (or algorithms) include MD4, MD5, and the particular given size of a simple data item. To determine SHA. the True Name of an arbitrary (simple or compound) data In the presently preferred embodiments, either MD5 or item, with reference to FIG. 10(b), first determine if the data SHA is employed as the basis for the computation of True Names. Whichever of these two message digest functions is 20 item is a simple or a compound data item (Step S216). If the employed, that same function must be employed on a data item is a simple data item, then compute its True Name system-wide basis. in step S218 (using steps S212 and S214 described above), It is impossible to define a function having a unique otherwise partition the data item into segments (Step S220) and assimilate each segment (Step S222) (the primitive output for each possible input when the number of elements in the range of the function is smaller than the number of 25 mechanism, Assimilate a Data Item, is described below), elements in its domain. However, a crucial observation is computing the True Name of the segment. Then create an that the actual data items that will be encountered in the indirect block consisting of the computed segment True Names (Step S224). An indirect block is a data item which operation of any system embodying this invention form a very sparse subset of all the possible inputs. consists of the sequence of True Names of the segments. A colliding set of data items is defined as a set wherein, 30 Then, in step S226, assimilate the indirect block and comfor one or more pairs x and y in the set, MD(x)=MD(y). pute its True Name. Finally, replace the final thirty-two (32) Since a function conforming to the requirements for MD bits of the resulting True Name (that is, the length of the must evenly and randomly distribute its outputs, it is indirect block) by the length modulo 32 of the compound data item (Step S228). The result is the True Name of the possible, by making the range of the function large enough, to make the probability arbitrarily small that actual inputs 35 compound data item. encountered in the operation of an embodiment of this Note that the compound data item may be so large that the invention will form a colliding set. indirect block of segment True Names is itself a compound data item. In this case the mechanism is invoked recursively To roughly quantify the probability of a collision, assume until only simple data items are being processed. that there are no more than 230 storage devices in the world, Both the use of segments and the attachment of a length and that each storage device has an average of at most 2 20 40 different data items. Then there are at most 2 50 data items to the True Name are not strictly required in a system using in the world. If the outputs of MD range between 0 and 2128 , the present invention, but are currently considered desirable it can be demonstrated that the probability of a collision is features in the preferred embodiment. approximately 1 in 229 . Details on the derivation of these 2. Assimilate Data Item probability values are found, for example, in P. Flajolet and 45 A mechanism for assimilating a data item (scratch file or A. M. Odlyzko, "Random Mapping Statistics," Lecture segment) into a file system, given the scratch file ID of the Notes in Computer Science 434: Advances in Cryptologydata item, is now described with reference to FIG. 11. The Eurocrypt '89 Proceedings, Springer-Verlag, pp. 329-354. purpose of this mechanism is to add a given data item to the Note that for some less preferred embodiments of the True File registry 126. If the data item already exists in the present invention, lower probabilities of uniqueness may be 50 True File registry 126, this will be discovered and used acceptable, depending on the types of applications and during this process, and the duplicate will be eliminated. mechanisms used. In some embodiments it may also be Thereby the system stores at most one copy of any data item or file by content, even when multiple names refer to useful to have more than one level of True Names, with some of the True Names having different degrees of uniquethe same content. First, determine the True Name of the data item correness. If such a scheme is implemented, it is necessary to 55 ensure that less unique True Names are not propagated in the sponding to the given scratch File ID using the Calculate system. True Name primitive mechanism (Step S230). Next, look for While the invention is described herein using only the an entry for the True Name in the True File registry 126 True Name of a data item as the identifier for the data item, (Step S232) and determine whether a True Name entry, other preferred embodiments use tagged, typed, categorized 60 record 140, exists in the True File registry 126. If the entry or classified data items and use a combination of both the record includes a corresponding True File ID or compressed True Name and the tag, type, category or class of the data File ID (Step S237), delete the file with the scratch File ID item as an identifier. Examples of such categorizations are (Step S238). Otherwise store the given True File ID in the files, directories, and segments; executable files and data entry record (step S239). If it is determined (in step S232) that no True Name entry files, and the like. Examples of classes are classes of objects 65 in an object-oriented system. In such a system, a lower exists in the True File registry 126, then, in Step S236, create degree of True Name uniqueness is acceptable over the a new entry in the True File registry 126 for this True Name. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 43 of 63 PageID #: 170 US 6,928,442 B2 15 16 Set the True Name of the entry to the calculated True Name, entry record and other data structures as follows: fill in the set the use count for the new entry to one, store the given True Name field of the entry with the specified True Name; True File ID in the entry and set the other fields of the entry increment the use count for the True File registry entry as appropriate. record 140 of the corresponding True Name; note whether Because this procedure may take some time to compute, 5 the entry is a directory by reading the True File to see if it it is intended to run in background after a file has ceased to contains a tag (magic number) indicating that it represents a change. In the meantime, the file is considered an unassimifrozen directory (see also the description of the Freeze lated scratch file. Directory primitive mechanism regarding the tag); and com3. New True File pute and set the other fields of the local directory extensions The New True File process is invoked when processing 10 appropriately. For instance, search the region table 128 to the audit file 132, some time after a True File has been identify the region of the path, and set the time of last access assimilated (using the Assimilate Data Item primitive and time of last modification to the current time. mechanism). Given a local directory extensions table entry 6. Realize True File from Location record 138 in the local directory extensions table 124, the This mechanism is used to try to make a local copy of a New True File process can provide the following steps (with reference to FIG. 12), depending on how the local processor 15 True File, given its True Name and the name of a source location (processor or media) that may contain the True File. is configured: This mechanism is now described with reference to FIG. 15. First, in step S238, examine the local directory extensions First, in step S272, determine whether the location specitable entry record 138 to determine whether the file is locked fied is a processor. If it is determined that the location by a cache server. If the file is locked, then add the ID of the cache server to the dependent processor list of the True File 20 specified is a processor, then send a Request True File message (using the Request True File remote mechanism) to registry table 126, and then send a message to the cache the remote processor and wait for a response (Step S274). If server to update the cache of the current processor using the a negative response is received or no response is received Update Cache remote mechanism (Step 242). If desired, compress the True File (Step S246), and, if after a timeout period, this mechanism fails. If a positive desired, mirror the True File using the Mirror True File 25 response is received, enter the True File returned in the True background mechanism (Step S248). File registry 126 (Step S276). (If the file received was compressed, enter the True File ID in the compressed File ID 4. Get True Name from Path field.) The True Name of a file can be used to identify a file by contents, to confirm that a file matches its original contents, If, on the other hand, it is determined in step S272 that the or to compare two files. The mechanism to get a True Name 30 location specified is not a processor, then, if necessary, given the pathname of a file is now described with reference request the user or operator to mount the indicated volume to FIG. 13. (Step S278). Then (Step S280) find the indicated file on the First, search the local directory extensions table 124 for given volume and assimilate the file using the Assimilate the entry record 138 with the given pathname (Step S250). Data Item primitive mechanism. If the volume does not If the pathname is not found, this process fails and no True 35 contain a True File registry 126, search the media inventory Name corresponding to the given pathname exists. Next, to find the path of the file on the volume. If no such file can determine whether the local directory extensions table entry be found, this mechanism fails. record 138 includes a True Name (Step S252), and if so, the At this point, whether or not the location is determined (in mechanism's task is complete. Otherwise, determine step S272) to be a processor, if desired, verify the True File whether the local directory extensions table entry record 138 40 (in step S282). identifies a directory (Step 5254), and if so, freeze the 7. Locate Remote File directory (Step S256) (the primitive mechanism Freeze This mechanism allows a processor to locate a file or data Directory is described below). item from a remote source of True Files, when a specific Otherwise, in step S258, assimilate the file (using the source is unknown or unavailable. A client processor system Assimilate Data Item primitive mechanism) defined by the 45 may ask one of several or many sources whether it can File ID field to generate its True Name and store its True supply a data object with a given True Name. The steps to perform this mechanism are as follows (with reference to Name in the local directory extensions entry record. Then FIGS. 16(a) and 16(b)). return the True Name identified by the local directory The client processor 102 uses the source table 145 to extensions table 124. 5. Link Path to True Name 50 select one or more source processors (Step S284). If no source processor can be found, the mechanism fails. Next, The mechanism to link a path to a True Name provides a way of creating a new directory entry record identifying an the client processor 102 broadcasts to the selected sources a existing, assimilated file. This basic process may be used to request to locate the file with the given True Name using the copy, move, and rename files without a need to copy their Locate True File remote mechanism (Step S286). The contents. The mechanism to link a path to a True Name is 55 request to locate may be augmented by asking to propagate now described with reference to FIG. 14. this request to distant servers. The client processor then waits for one or more servers to respond positively (Step First, if desired, confirm that the True Name exists locally by searching for it in the True Name registry or local S288). After all servers respond negatively, or after a timeout directory extensions table 135 (Step S260). Most uses of this period with no positive response, the mechanism repeats mechanism will require this form of validation. Next, search 60 selection (Step S284) to attempt to identify alternative for the path in the local directory extensions table 135 (Step sources. If any selected source processor responds, its processor ID is the result of this mechanism. Store the processor S262). Confirm that the directory containing the file named ID in the source field of the True File registry entry record in the path already exists (Step S264). If the named file itself exists, delete the File using the Delete True File operating 140 of the given True Name (Step S290). 65 If the source location of the True Name is a different system mechanism (see below) (Step S268). processor or medium than the destination (Step S290a), Then, create an entry record in the local directory extenperform the following steps: sions with the specified path (Step S270) and update the Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 44 of 63 PageID #: 171 US 6,928,442 B2 17 18 (i) Look up the True File registry entry record 140 for the corresponding True Name, and add the source location ID to the list of sources for the True Name (Step S290b); and (ii) If the source is a publish ing system, determine the expiration date on the publishing system for th e True Name and add that to the list of sources. If the source is not a publishing system, send a message to reserve the True File on the source processor (Step S290c). Source selection in step S284 may be based on optimizations involving general availability of the source, access time, bandwidth, and transmission cost, and ignoring previously selected processors which did not respond in step S288. 8. Make True File Local This mechanism is used when a True Name is known and a locally accessible copy of the corresponding file or data item is required. This mechanism makes it possible to actually read the data in a True File. The mechanism takes a True Name and returns when there is a local, accessible copy of the True File in the True File registry 126. This mechanism is described here with reference to the flow chart of FIGS. 17(a) and 17(b). First, look in the True file registry 126 for a True File entry record 140 for the corresponding True Name (Step S292). If no such entry is found this mechanism fails. If there is already a True File ID for the entry (Step S294), this mechanism's task is complete. If there is a compressed file ID for the entry (Step S296), decompress the file corresponding to the file ID (Step S296) and store the decompressed file ID in the entry (Step S300). This mechanism is then complete. If there is no True File ID for the entry (Step S294) and there is no compressed file ID for the entry (Step S296), then continue searching for the requested file. At this time it may be necessary to notify the user that the system is searching for the requested file. If there are one or more source IDs, then select an order in which to attempt to realize the source ID (Step S304). The order may be based on optimizations involving general availability of the source, access time, bandwidth, and transmission cost. For each source in the order chosen, realize the True File from the source location (using the Realize True File from Location primitive mechanism), until the True File is realized (Step S306). If it is realized, continue with step S294. If no known source can realize the True File, use the Locate Remote File primitive mechanism to attempt to find the True File (Step S308). If this succeeds, realize the True File from the identified source location and continue with step S296. 9. Create Scratch File A scratch copy of a file is required when a file is being created or is about to be modified. The scratch copy is stored in the file system of the underlying operating system. The scratch copy is eventually assimilated when the audit file record entry 146 is processed by the Process Audit File Entry primitive mechanism. This Create Scratch File mechanism requires a local directory extensions table entry record 138. When it succeeds, the local directory extensions table entry record 138 contains the scratch file ID of a scratch file that is not contained in the True File registry 126 and that may be modified. This mechanism is now described with reference to FIGS. 18(a) and 18(b). First determine whether the scratch file should be a copy of the existing True File (Step S310). If so, continue with step S312. Otherwise, determine whether the local directory extensions table entry record 138 identifies an existing True File (Step S316), and if so, delete the True File using the Delete True File primitive mechanism (Step S318). Then create a new, empty scratch file and store its scratch file ID in the local directory extensions table entry record 138 (Step S320). This mechanism is then complete. If the local directory extensions table entry record 138 identifies a scratch file ID (Step S312), then the entry already has a scratch file, so this mechanism succeeds. If the local directory extensions table entry record 138 identifies a True File (S316), and there is no True File ID for the True File (S312), then make the True File local using the Make True File Local primitive mechanism (Step S322). If there is still no True File ID, this mechanism fails. There is now a local True File for this file. If the use count in the corresponding True File registry entry record 140 is one (Step S326), save the True File ID in the scratch file ID of the local directory extensions table entry record 138, and remove the True File registry entry record 140 (Step S328). (This step makes the True File into a scratch file.) This mechanism's task is complete. Otherwise, if the use count in the corresponding True File registry entry record 140 is not one (in step S326), copy the file with the given True File ID to a new scratch file, using the Read File OS mechanism and store its file ID in the local directory extensions table entry record 138 (step S330), and reduce the use count for the True File by one. If there is insufficient space to make a copy, this mechanism fails. 10. Freeze Directory This mechanism freezes a directory in order to calculate its True Name. Since the True Name of a directory is a function of the files within the directory, they must not change during the computation of the True Name of the directory. This mechanism requires the pathname of a directory to freeze. This mechanism is described with reference to FIGS. 19(a) and 19(b). In step S332, add one to the global freeze lock. Then search the local directory extensions table 124 to find each subordinate data file and directory of the given directory, and freeze each subordinate directory found using the Freeze Directory primitive mechanism (Step S334). Assimilate each unassimilated data file in the directory using the Assimilate Data Item primitive mechanism (Step S336). Then create a data item which begins with a tag or marker (a "magic number") being a unique data item indicating that this data item is a frozen directory (Step S337). Then list the file name and True Name for each file in the current directory (Step s338). Record any additional information required, such as the type, time of last access and modification, and size (Step S340). Next, in step S342, using the Assimilate Data Item primitive mechanism, assimilate the data item created in step S338. The resulting True Name is the True Name of the frozen directory. Finally, subtract one from the global freeze lock (Step S344). 11. Expand Frozen Directory This mechanism expands a frozen directory in a given location. It requires a given pathname into which to expand the directory, and the True Name of the directory and is described with reference to FIG. 20. First, in step S346, make the True File with the given True Name local using the Make True File Local primitive mechanism. Then read each directory entry in the local file created in step S346 (Step S348). For each such directory entry, do the following: Create a full pathname using the given pathname and the file name of the entry (Step S350); and link the created path to the True Name (Step S352) using the Link Path to True Name primitive mechanism. 5 10 15 20 25 30 35 40 45 50 55 60 65 Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 45 of 63 PageID #: 172 US 6,928,442 B2 19 20 12. Delete True File files selected for deletion (Step S384). For each True File in This mechanism deletes a reference to a True Name. The the True File registry 126, set the delete count to zero (Step underlying True Pile is not removed from the True File S386). registry 126 unless there are no additional references to the 15. Select For Removal file. With reference to FIG. 21, this mechanism is performed 5 This grooming mechanism tentatively selects a pathname as follows: to allow its corresponding True File to be removed. With If the global freeze lock is on, wait until the global freeze reference to FIG. 24, first find the local directory extensions lock is turned off (Step S354). This prevents deleting a table entry record 138 corresponding to the given pathname True File while a directory which might refer to it is (Step S388). Then find the True File registry entry record being frozen. Next, find the True File registry entry 10 140 corresponding to the True File name in the local record 140 given the True Name (Step S356). If the directory extensions table entry record 138 (Step S390). Add reference count field of the True File registry 126 is one to the grooming delete count in the True File registry greater than zero, subtract one from the reference count entry record 140 and add the pathname to a list of files field (Step S358). If it is determined (in step S360) that selected for deletion (Step S392). If the grooming delete the reference count field of the True File registry entry 15 count of the True File registry entry record 140 is equal to record 140 is zero, and if there are no dependent the use count of the True File registry entry record 140, and systems listed in the True File registry entry record 140, if the there are no entries in the dependency list of the True then perform the following steps: File registry entry record 140, then add the size of the file (i) If the True File is a simple data item, then delete the indicated by the True File ID and or compressed file ID to True File, otherwise, 20 the total amount of space freed during grooming (Step (ii) (the True File is a compound data item) for each True S394). Name in the data item, recursively delete the True File 16. End Grooming corresponding to the True Name (Step S362). This grooming mechanism ends the grooming phase and (iii) Remove the file indicated by the True File ID and removes all files selected for removal. With reference to compressed file ID from the True File registry 126, and 25 FIG. 25, for each file in the list of files selected for deletion, remove the True File registry entry record 140 (Step delete the file (Step S396) and then unlock the global S364). grooming lock (Step S398). 13. Process Audit File Entry Operating System Mechanisms This mechanism performs tasks which are required to The next of the mechanisms provided by the present maintain information in the local directory extensions table 30 invention, operating system mechanisms, are now described. 124 and True File registry 126, but which can be delayed The following operating system mechanisms are while the processor is busy doing more time-critical tasks. described: Entries 142 in the audit file 132 should be processed at a 1. Open File; background priority as long as there are entries to be 2. Close File; processed. With reference to FIG. 22, the steps for process- 35 3. Read File; ing an entry are as follows: 4. Write File; Determine the operation in the entry 142 currently being 5. Delete File or Directory; processed (Step S365). If the operation indicates that a file 6. Copy File or Directory; was created or written (Step S366), then assimilate the file 7. Move File or Directory; using the Assimilate Data Item primitive mechanism (Step 40 8. Get File Status; and S368), use the New True File primitive mechanism to do 9. Get Files in Directory. additional desired processing (such as cache update, 1. Open File compression, and mirroring) (Step S369), and record the A mechanism to open a file is described with reference to newly computed True Name for the file in the audit file 45 FIGS. 26(a) and 26(b). This mechanism is given as input a record entry (Step S370). pathname and the type of access required for the file (for Otherwise, if the entry being processed indicates that a compound data item or directory was copied (or deleted) example, read, write, read/write, create, etc.) and produces (Step S376), then for each component True Name in the either the File ID of the file to be opened or an indication that no file should be opened. The local directory extensions compound data item or directory, add (or subtract) one to the use count of the True File registry entry record 140 corre- 50 table record 138 and region table record 142 associated with sponding to the component True Name (Step S378). the opened file are associated with the open file for later use In all cases, for each parent directory of the given file, in other processing functions which refer to the file, such as update the size, time of last access, and time of last read, write, and close. First, determine whether or not the named file exists modification, according to the operation in the audit record 55 locally by examining the local directory extensions table 124 (Step S379). to determine whether there is an entry corresponding to the Note th at the audit record is not removed after given pathname (Step S400). If it is determined that the file processing, but is retained for some reasonable period so that name does not exist locally, then, using the access type, it may be used by the Synchronize Directory extended mechanism to allow a disconnected remote processor to determine whether or not the file is being created by this 60 opening process (Step S402). If the file is not being created, update its representation of the local system. 14. Begin Grooming prohibit the open (Step S404). If the file is being created, This mechanism makes it possible to select a set of files create a zero-length scratch file using an entry in local for removal and determine the overall amount of space to be directory extensions table 124 and produce the scratch file recovered. With reference to FIG. 23, first verify that the ID of this scratch file as the result (Step S406). If, on the other hand, it is determined in step S400 that the global grooming lock is currently unlocked (Step S382). 65 Then set the global grooming lock, set the total amount of file name does exist locally, then determine the region in space freed during grooming to zero a nd empty the list of which the file is located by searching the region table 128 to Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 46 of 63 PageID #: 173 US 6,928,442 B2 21 22 iii. Determine the File ID of the True File specified find the record 142 with the longest region path which is a prefix of the file pathname (Step S408). This record identiby the True Name corresponding to this segment. iv. Use the Read File mechanism (recursively) to fies the region of the specified file. read from this segment into the corresponding Next, determine using the access type, whether the file is location in the specified buffer. being opened for writing or whether it is being opened only 5 4. Write File for reading (Step S410). If the file is being opened for File writing uses the file ID and data management capareading only, then, if the file is a scratch file (Step S419), bilities of the underlying operating system. File access return the scratch File ID of the file (Step S424). Otherwise get the True Name from the local directory extensions table (Make File Local described above) can be deferred until the 124 and make a local version of the True File associated with first read or write. 10 5. Delete File or Directory the True Name using the Make True File Local primitive The process of deleting a file, for a given pathname, is mechanism, and then return the True File ID associated with the True Name (Step S420). described here with reference to FIGS. 27(a) and 27(b). If the file is not being opened for reading only (Step First, determine the local directory extensions table entry S410), then, if it is determined by inspecting the region table record 138 and region table entry record 142 for the file entry record 142 that the file is in a read-only directory (Step 15 (Step S422). If the file has no local directory extensions table S416), then prohibit the opening (Step S422). entry record 138 or is locked or is in a read-only region, If it is determined by inspecting the region table 128 that prohibit the deletion. the file is in a cached region (Step S423), then send a Lock Identify the corresponding True File given the True Name Cache message to the corresponding cache server, and wait of the file being deleted using the True File registry 126 for a return message (Step S418). If the return message says 20 (Step S424). If the file has no True Name, (Step S426) then the file is already locked, prohibit the opening. delete the scratch copy of the file based on its scratch file ID If the access type indicates that the file being modified is in the local directory extensions table 124 (Step S427), and being rewritten completely (Step S419), so that the original continue with step S428. data will not be required, then Delete the File using the If the file has a True Name and the True File's use count Delete File OS mechanism (Step S421) and perform step 25 is one (Step S429), then delete the True File (Step S430), and S406. Otherwise, make a scratch copy of the file (Stop S417) continue with step S428. and produce the scratch file ID of the scratch file as the result If the file has a True Name and the True File's use count (Step S424). is greater than one, reduce its use count by one (Step S431). 2. Close File Then proceed with step S428. This mechanism takes as input the local directory exten30 In Step S428, delete the local directory extensions table sions table entry record 138 of an open file and the data entry record, and add an entry to the audit file 132 indicating maintained for the open file. To close a file, add an entry to the time and the operation performed (delete). the audit file indicating the time and operation (create, read 6. Copy File or Directory or write). The audit file processing (using the Process Audit A mechanism is provided to copy a file or directory given File Entry primitive mechanism) will take care of assimi35 a source and destination processor and pathname. The Copy lating the file and thereby updating the other records. File mechanism does not actually copy the data in the file, 3. Read File only the True Name of the file. This mechanism is performed To read a file, a program must provide the offset and as follows: length of the data to be read, and the location of a buffer into (A) Given the source path, get the True Name from the which to copy the data read. 40 path. If this step fails, the mechanism fails. The file to be read from is identified by an open file (B) Given the True Name and the destination path, link descriptor which includes a File ID as computed by the Open the destination path to the True Name. File operating system mechanism defined above. The File ID (C) If the source and destination processors have different may identify either a scratch file or a True File (or True File True File registries, find (or, if necessary, create) an segment). If the File ID identifies a True File, it may be 45 entry for the True Name in the True File registry table either a simple or a compound True File. Reading a file is 126 of the destination processor. Enter into the source accomplished by the following steps: ID field of this new entry the source processor identity. In the case where the File ID identifies a scratch file or a (D) Add an entry to the audit file 132 indicating the time simple True File, use the read capabilities of the underand operation performed (copy). lying operating system. 50 This mechanism addresses capability of the system to In the case where the File ID identifies a compound file, avoid copying data from a source location to a destination break the read operation into one or more read operalocation when the destination already has the data. In tions on component segments as follows: addition, because of the ability to freeze a directory, this A. Identify the segment(s) to be read by dividing the specified file offset and length each by the fixed size 55 mechanism also addresses capability of the system immediately to make a copy of any collection of files, thereby to of a segment (a system dependent parameter), to support an efficient version control mechanisms for groups determine the segment number and number of segof files. ments that must be read. 7. Move File or Directory B. For each segment number computed above, do the A mechanism is described which moves (or renames) a 60 following: file from a source path to a destination path. The move 1. Read the compound True File index block to operation, like the copy operation, requires no actual transfer determine the True Name of the segment to be of data, and is performed as follows: read. (A) Copy the file from the source path to the destination ii. Use the Realize True File from Location primitive path. mechanism to make the True File segment avail- 65 (B) If the source path is different from the destination able locally. (If that mechanism fails, the Read File mechanism fails). path, delete the source path. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 47 of 63 PageID #: 174 US 6,928,442 B2 23 24 8. Get File Status First determine if the True File is available locally or if This mechanism takes a file pathname and provides there is some indication of where the True File is located (for information about the pathname. First the local directory example, in the Source IDs field). Look up the requested extensions table entry record 138 corresponding to the True Name in the True File registry 126 (Step S432). If a True File registry entry record 140 is not found for this pathname given is found. If no such entry exists, 10 then this 5 True Name (Step S434), and the flag indicates that the mechanism fails, otherwise, gather information about the file request is not to be forwarded (Step S436), respond negaand its corresponding True File from the local directory extensions table 124. The information can include any tively (Step S438). That is, respond to the effect that the True information shown in the data structures, including the size, File is not available. type, owner, True Name, sources, time oflast access, time of 10 One the other hand, if a True File registry entry record 140 last modification, state (local or not, assimilated or not, is not found (Step S434), and the flag indicates that the compressed or not), use count, expiration date, and reserrequest for this True File is to be forwarded (Step S436), vations. then forward a request for this True File to some other 9. Get Files in Directory processors in the system (Step S442). If the source table for This mechanism enumerates the files in a directory. It is 15 the current processor identifies one or more publishing used (implicitly) whenever it is necessary to determine servers which should have a copy of this True File, then whether a file exists (is present) in a directory. For instance, forward the request to each of those publishing servers (Step it is implicitly used in the Open File, Delete File, Copy File S436). or Directory, and Move File operating system mechanisms, If a True File registry entry record 140 is found for the because the files operated on are referred to by pathnames required True File (Step S434), and if the entry includes a containing directory names. The mechanism works as fol- 20 True File ID or Compressed File ID (Step S440), respond lows: positively (Step S444). If the entry includes a True File ID The local directory extensions table 124 is searched for an then this provides the identity or disk location of the actual entry 138 with the given directory pathname. If no such physical representation of the file or file segment required. entry is found, or if the entry found is not a directory, then 25 If the entry include a Compressed File ID, then a comthis mechanism fails. pressed version of the True File may be stored instead of, or If there is a corresponding True File field in the local in addition to, an uncompressed version. This field provides directory extensions table record, then it is assumed that the the identity of the actual representation of the compressed True File represents a frozen directory. version of the file. The Expand Frozen Directory primitive mechanism is 30 If the True File registry entry record 140 is found (Step used to expand the existing True File into directory entries S434) but does not include a True File ID (the File ID is in the local directory extensions table. absent if the actual file is not currently present at the current Finally, the local directory extensions table 124 is again location) (Step S440), and if the True File registry entry searched, this time to find each directory subordinate to the record 140 includes one or more source processors, and if given directory. The names found are provided as the result. 35 the request can be forwarded, then forward the request for Remote Mechanisms this True File to one or more of the source processors (Step The remote mechanisms provided by the present invenS444). tion are now described. Recall that remote mechanisms are 2. Reserve True File used by the operating system in responding to requests from This mechanism allows a remote processor to indicate other processors. These mechanisms enable the capabilities that it depends on the local processor for access to a specific of the present invention in a peer-to-peer network mode of 40 True File. It takes a True Name as input. This mechanism is operation. described here. In a presently preferred embodiment, processors commu(A) Find the True File registry entry record 140 associated nicate with each other using a remote procedure call (RPC) with the given True File. If no entry exists, reply style interface, running over one of any number of communegatively. nication protocols such as IPX/SPX or TCP/IP. Each peer 45 (B) If the True File registry entry record 140 does not processor which provides access to its True File registry 126 include a True File ID or compressed File ID, and if the or file regions, or which depends on another peer processor, True File registry entry record 140 includes no source provides a number of mechanisms which can be used by its IDs for removable storage volumes, then this processor peers. does not have access to a copy of the given file. Reply 50 The following remote mechanisms are described: negatively. 1. Locate True File; (C) Add the ID of the sending processor to the list of 2. Reserve True File; dependent processors for the True File registry entry 3. Request True File; record 140. Reply positively, with an indication of 4. Retire True File; whether the reserved True File is on line or off line. 55 5. Cancel Reservation; 3. Request True File 6. Acquire True File; This mechanism allows a remote processor to request a 7. Lock Cache; copy of a True File from the local processor. It requires a True Name and responds positively by sending a True File 8. Update Cache; and 60 back to the requesting processor. The mechanism operates as 9. Check Expiration Date. follows: 1. Locate True File (A) Find the True File registry entry record 140 associated This mechanism allows a remote processor to determine with the given True Name. If there is no such True File whether the local processor contains a copy of a specific registry entry record 140, reply negatively. True File. The mechanism begins with True Name and a flag indicating whether to forward requests for this file to other 65 (B) Make the True File local using the Make True File servers. This mechanism is now described with reference to Local primitive mechanism. If this mechanism fails, the FIG. 28. Request True File mechanism also fails. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 48 of 63 PageID #: 175 US 6,928,442 B2 25 (C) Send the local True File in either it is uncompressed or compressed form to the requesting remote processor. Note that if the True File is a compound file, the components are not sent. (D) If the remote file is listed in the dependent process list of the True File registry entry record 140, remove it. 4. Retire True File This mechanism allows a remote processor to indicate that it no longer plans to maintain a copy of a given True File. An alternate source of the True File can be specified, if, for instance, the True File is being moved from one server to another. It begins with a True Name, a requesting processor ID, and an optional alternate source. This mechanism operates as follows: (A) Find a True Name entry in the True File registry 126. If there is no entry for this True Name, this mechanism's task is complete. (B) Find the requesting processor on the source list and, if it is there, remove it. (C) If an alternate source is provided, add it to the source list for the True File registry entry record 140. (D) If the source list of the True File registry entry record 140 has no items in it, use the Locate Remote File primitive mechanism to search for another copy of the file. If it fails, raise a serious error. 5. Cancel Reservation This mechanism allows a remote processor to indicate that it no longer requires access to a True File stored on the local processor. It begins with a True Name and a requesting processor ID and proceeds as follows: (A) Find the True Name entry in the True File registry 126. If there is no entry for this True Name, this mechanism's task is complete. (B) Remove the identity of the requesting processor from the list of dependent processors, if it appears. (C) If the list of dependent processors becomes zero and the use count is also zero, delete the True File. 6. Acquire True File This mechanism allows a remote processor to insist that a local processor make a copy of a specified True File. It is used, for example, when a cache client wants to write through a new version of a file. The Acquire True File mechanism begins with a data item and an optional True Name for the data item and proceeds as follows: (A) Confirm that the requesting processor has the right to require the local processor to acquire data items. If not, send a negative reply. (B) Make a local copy of the data item transmitted by the remote processor. (C) Assimilate the data item into the True File registry of the local processor. (D) If a True Name was provided with the file, the True Name calculation can be avoided, or the mechanism can verify that the file received matches the True Name sent. (E) Add an entry in the dependent processor list of the true file registry record indicating that the requesting processor depends an this copy of the given True File. (F) Send a positive reply. 7. Lock Cache This mechanism allows a remote cache client to lock a local file so that local users or other cache clients cannot change it while the remote processor is using it. The mechanism begins with a pathname and proceeds as follows: (A) Find the local directory extensions table entry record 138 of the specified pathname. If no such entry exists, reply negatively. 26 (B) If an local directory extensions table entry record 138 exists and is already locked, reply negatively that the file is already locked. (C) If an local directory extensions table entry record 138 5 exists and is not locked, lock the entry. Reply positively. 8. Update Cache This mechanism allows a remote cache client to unlock a local file and update it with new contents. It begins with a 10 pathname and a True Name. The file corresponding to the True Name must be accessible from the remote processor. This mechanism operates as follows: Find the local directory extensions table entry record 138 corresponding to the given pathname. Reply negatively if no such entry exists or if the entry is not locked. 15 Link the given pathname to the given True Name using the Link Path to True Name primitive mechanism. Unlock the local directory extensions table entry record 138 and return positively. 20 9. Check Expiration Date Return current or new expiration date and possible alternative source to caller. Background Processes and Mechanisms The background processes and mechanisms provided by 25 the present invention are now described. Recall that background mechanisms are intended to run occasionally and at a low priority to provide automated management capabilities with respect to the present invention. The following background mechanisms are described: 30 1. Mirror True File; 2. Groom Region; 3. Check for Expired Links; 4. Verify Region; and 35 5. Groom Source List. 1. Mirror True File This mechanism is used to ensure that files are available in alternate locations in mirror groups or archived on archival servers. The mechanism depends on application-specific 40 migration/archival criteria (size, time since last access, number of copies required, number of existing alternative sources) which determine under what conditions a file should be moved. The Mirror True File mechanism operates as follows, using the True File specified, perform the fol45 lowing steps: (A) Count the number of available locations of the True File by inspecting the source list of the True File registry entry record 140 for the True File. This step 50 determines how many copies of the True File are available in the system. (B) If the True File meets the specified migration criteria, select a mirror group server to which a copy of the file should be sent. Use the Acquire True File remote mechanism to copy the True File to the selected mirror 55 group server. Add the identity of the selected system to the source list for the True File. 2. Groom Region This mechanism is used to automatically free up space in 60 a processor by deleting data items that may be available elsewhere. The mechanism depends on application-specific grooming criteria (for instance, a file may be removed if there is an alternate online source for it, it has not been accessed in a given number of days, and it is larger than a 65 given size). This mechanism operates as follows: Repeat the following steps (i) to (iii) with more aggressive grooming criteria until sufficient space is freed or until Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 49 of 63 PageID #: 176 US 6,928,442 B2 27 28 all grooming criteria have been exercised. Use groomthe affected True Files to determine whether there are too many mirror copies. This can be done with the following ing information to determine how much space has been steps: freed. Recall that, while grooming is in effect, groomFor each affected True File, ing information includes a table of pathnames selected (A) Search the local directory extensions table to find for deletion, and keeps track of the amount of space that 5 each region that refers to the True File. would be freed if all of the files were deleted. (B) Create a set of "required sources", initially empty. (i) Begin Grooming (using the primitive mechanism). (C) For each region found, (ii) For each pathname in the specified region, for the True (a) determine the mirroring criteria for that region, File corresponding to the pathname, if the True File is (b) determine which sources for the True File satisfy present, has at least one alternative source, and meets 10 the mirroring criteria, and application specific grooming criteria for the region, (c) add these sources to the set of required sources. select the file for removal (using the primitive (D) For each source in the True File registry entry, if the mechanism). source identifies a remote processor (as opposed to (iii) End Grooming (using the primitive mechanism). removable media), and if the source is not a publisher, If the region is used as a cache, no other processors are 15 and if the source is not in the set of required sources, dependent an True Files to which it refers, and all such True then eliminate the source, and use the Cancel ReserFiles are mirrored elsewhere. In this case, True Files can be vation remote mechanism to eliminate the given proremoved with impunity. For a cache region, the grooming cessor from the list of dependent processors recorded at criteria would ordinarily eliminate the least recently the remote processor identified by the source. accessed True Files first. This is best done by sorting the 20 Extended Mechanisms True Files in the region by the most recent access time The extended mechanisms provided by the present invenbefore performing step (ii) above. The application specific tion are now described. Recall that extended mechanisms criteria would thus be to select for removal every True File run within application programs over the operating system encountered (beginning with the least recently used) until to provide solutions to specific problems and applications. 25 the required amount of free space is reached. The following extended mechanisms are described: 3. Check for Expired Links 1. Inventory Existing Directory; This mechanism is used to determine whether dependen2. Inventory Removable, Read-only Files; cies on published files should be refreshed. The following 3. Synchronize Directories; steps describe the operation of this mechanism: 4. Publish Region; For each pathname in the specified region, for each True 30 File corresponding to the pathname, perform the following 5. Retire Directory; step: 6. Realize Directory at Location; If the True File registry entry record 140 corresponding to 7. Verify True File; the True File contains at least one source which is a 8. Track for Accounting Purposes; and publishing server, and if the expiration date on the depen- 35 9. Track for Licensing Purposes. dency is past or close, then perform the following steps: 1. Inventory Existing Directory (A) Determine whether the True File registry entry record This mechanism determines the True Names of files in an contains other sources which have not expired. existing on-line directory in the underlying operating sys(B) Check the True Name expiration of the server. If the 40 tem. One purpose of this mechanism is to install True Name expiration date has been extended, or an alternate mechanisms in an existing file system. source is suggested, add the source to the True File An effect of such an installation is to eliminate immediregistry entry record 140. ately all duplicate files from the file system being traversed. (C) If no acceptable alternate source was found in steps If several file systems are inventoried in a single True File (A) or (B) above, make a local copy of the True File. 45 registry, duplicates across the volumes are also eliminated. (D) Remove the expired source. (A) Traverse the underlying file system in the operating 4. Verify Region system. For each file encountered, excluding This mechanism can be used to ensure that the data items directories, perform the following: in the True File registry 126 have not been damaged acci(i) Assimilate the file encountered (using the Assimilate dentally or maliciously. The operation of this mechanism is 50 File primitive mechanism). This process computes described by the following steps: its True Name and moves its data into the True File registry 126. (A) Search the local directory extensions table 124 for each pathname in the specified region and then perform (ii) Create a pathname consisting of the path to the the following steps: volume directory and the relative path of the file on (i) Get the True File name corresponding to the path- 55 the media. Link this path to the computed True Name using the Link Path to True Name primitive mechaname; msm. (ii) If the True File registry entry 140 for the True File 2. Inventory Removable, Read-only Files does not have a True File ID or compressed file ID, A system with access to removable, read-only media ignore it. (iii) Use the Verify True File mechanism (see extended 60 volumes (such as WORM disks and CD-ROMS) can create a usable inventory of the files on these disks without having mechanisms below) to confirm that the True File specified is correct. to make online copies. These objects can then be used for archival purposes, directory overlays, or other needs. An S. Groom Source List The source list in a True File entry should be groomed operator must request that an inventory be created for such sometimes to ensure there are not too many mirror or archive 65 a volume. This mechanism allows for maintaining inventories of the copies. When a file is deleted or when a region definition or its mirror criteria are changed, it may be necessary to inspect contents of files and data items on removable media, such as Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 50 of 63 PageID #: 177 US 6,928,442 B2 29 30 diskettes and CD-ROMs, independent of other properties of after a period of no communication between the local and remote processor. Such a period might occur if the local the files such as name, location, and date of creation. processor were a mobile processor detached from its server, The mechanism creates an online inventory of the files on or if two distant processors were run independently and one or more removable volumes, such as a floppy disk or CD-ROM, when the data on the volume is represented as a 5 updated nightly. An advantage of the described synchronization process is directory. The inventory service uses a True Name to identify each file, providing a way to locate the data independent that it does not depend on synchronizing the clocks of the of its name, date of creation, or location. local and remote processors. However, it does require that The inventory can be used for archival of data (making it the local processor track its position in the remote procespossible to avoid archiving data when that data is already on 10 sor's audit file. a separate volume), for grooming (making it possible to This mechanism does not resolve changes made simultadelete infrequently accessed files if they can be retrieved neously to the same file at several sites. If that occurs, an from removable volumes), for version control (making it external resolution mechanism such as, for example, operapossible to generate a new version of a CD-ROM without tor intervention, is required. 15 having to copy the old version), and for other purposes. The mechanism takes as input a start time, a local The inventory is made by creating a volume directory in directory pathname, a remote processor name, and a remote the media inventory in which each file named identifies the directory pathname name, and it operates by the following data item on the volume being inventoried. Data items are steps: not copied from the removable volume during the inventory (A) Request a copy of the audit file 132 from the remote process. 20 processor using the Request True File remote mechaAn operator must request that an inventory be created for msm. a specific volume. Once created, the volume directory can be (B) For each entry 146 in the audit file 132 after the start frozen or copied like any other directory. Data items from time, if the entry indicates a change to a file in the either the physical volume or the volume directory can be remote directory, perform the following steps: accessed using the Open File operating system mechanism 25 (i) Compute the pathname of the corresponding file in which will cause them to be read from the physical volume the local directory. Determine the True Name of the using the Realize True File from Location primitive mechacorresponding file. msm. (ii) If the True Name of the local file is the same as the To create an inventory the following steps are taken: old True Name in the audit file, or if there is no local (A) A volume directory in the media inventory is created 30 file and the audit entry indicates a new file is being to correspond to the volume being inventoried. Its created, link the new True Name in the audit file to contextual name identifies the specific volume. the local pathname using the Link Path to True Name (B) A source table entry 144 for the volume is created in primitive mechanism. the source table 130. This entry 144 identifies the (iii) Otherwise, note that there is a problem with the physical source volume and the volume directory cre- 35 synchronization by sending a message to the operaated in step (A). tor or to a problem resolution program, indicating the (C) The filesystem on the volume is traversed. For each local pathname, remote pathname, remote processor, file encountered, excluding directories, the following and time of change. steps are taken: (C) After synchronization is complete, record the time of (i) The True Name of the file is computed. An entry is 40 the final change. This time is to be used as the new start created in the True Name registry 124, including the time the next time this directory is synchronized with True Name of the file using the primitive mechathe same remote processor. nism. The source field of the True Name registry 4. Publish Region entry 140 identifies the source table entry 144. The publish region mechanism allows a processor to offer (ii) A pathname is created consisting of the path to the 45 the files in a region to any client processors for a limited volume directory and the relative path of the file on period of time. the media. This path is linked to the computed True The purpose of the service is to eliminate any need for Name using Link Path to True Name primitive client processors to make reservations with the publishing mechanism. processor. This in turn makes it possible for the publishing (D) After all files have been inventoried, the volume 50 processor to service a much larger number of clients. directory is frozen. The volume directory serves as a When a region is published, an expiration date is defined table of contents for the volume. It can be copied using for all files in the region, and is propagated into the pubthe Copy File or Directory primitive mechanism to lishing system's True File registry entry record 140 for each create an "overlay" directory which can then be file. modified, making it possible to edit a virtual copy of a 55 When a remote file is copied, for instance using the Copy read-only medium. File operating system mechanism, the expiration date is 3. Synchronize Directories copied into the source field of the client's True File registry entry record 140. When the source is a publishing system, no Given two versions of a directory derived from the same dependency need be created. starting point, this mechanism creates a new, synchronized The client processor must occasionally and in version which includes the changes from each. Where a file 60 is changed in both versions, this mechanism provides a user background, check for expired links, to make sure it still has exit for handling the discrepancy. By using True Names, access to these files. This is described in the background mechanism Check for Expired Links. comparisons are instantaneous, and no copies of files are 5. Retire Directory necessary. This mechanism makes it possible to eliminate safely the This mechanism lets a local processor synchronize a 65 directory to account for changes made at a remote processor. True Files in a directory, or at least dependencies on them, Its purpose is to bring a local copy of a directory up to date after ensuring that any client processors depending on those Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 51 of 63 PageID #: 178 US 6,928,442 B2 31 32 files remove their dependencies. The files in the directory are (D) Confirm that the calculated True Name is equal to the given True Name. not actually deleted by this process. The directory can be (E) If the True Names are not equal, there is an e~ror in deleted with the Delete File operating system mechanism. the True File registry 126. Remove the True FIle ID The mechanism takes the pathname of a given directory, from the True File registry entry record 140 and place and optionally, the identification of a preferred alternate 5 it somewhere else. Indicate that the True File registry source processor for clients to use. The mechanism performs entry record 140 contained an error. the following steps: 8. Track for Accounting Purposes (A) Traverse the directory. For each file in the directory, This mechanism provides a way to know reliably which perform the following steps: files have been stored on a system or transmitted from one c (i) Get the True Name of the file from its path and find 10 system to another. The mechanism can be used as a basis ,or the True File registry entry 140 associated with the a value-based accounting system in which charges are based True Name. on the identity of the data stored or transmitted, rather than (ii) Determine an alternate source for the True File. If simply on the number of bits. the source IDs field of the TFR entry includes the This mechanism allows the system to track possession of preferred alternate source, that is the alternate 15 specific data items according to content by owner, indepensource. If it does not, but includes some other source, dent of the name, date, or other properties of the data item, that is the alternate source. If it contains no alternate and tracks the uses of specific data items and files by content sources, there is no alternate source. for accounting purposes. True names make it possible to (iii) For each dependent processor in the True File identify each file briefly yet uniquely for this purpose. registry entry 140, ask that processor to retire the 20 Tracking the identities of files requires maintaining an True File, specifying an alternate source if one was accounting log 134 and processing it for accounting or determined, using the remote mechanism. billing purposes. The mechanism operates in the following 6. Realize Directory at Location steps: This mechanism allows the user or operating system to (A) Note every time a file is created or deleted, for force copies of files from some source location to the True 25 instance by monitoring audit entries in the Process File registry 126 at a given location. The purpose of the Audit File Entry primitive mechanism. When such an mechanism is to ensure that files are accessible in the event event is encountered, create an entry 148 in the the source location becomes inaccessible. This can happen accounting log 134 that shows the responsible party for instance if the source or given location are on mobile and the identity of the file created or deleted. computers, or are on removable media, or if the network 30 (B) Every time a file is transmitted, for instance when a connection to the source is expected to become unavailable, file is copied with a Request True File remote mechaor if the source is being retired. nism or an Acquire True File remote mechanism, create This mechanism is provided in the following steps for an entry in the accounting log 134 that shows the each file in the given directory, with the exception of responsible party, the identity of the file, and the source subdirectories: 35 and destination processors. (A) Get the-local directory extensions table entry record (C) Occasionally run an accounting program to process 138 given the pathname of the file. Get the True Name the accounting log 134, distributing the events to the of the local directory extensions table entry record 138. account records of each responsible party. The account This service assimilates the file if it has not already records can eventually be summarized for billing pur40 been assimilated. poses. (B) Realize the corresponding True File at the given 9. Track for Licensing Purposes location. This service causes it to be copied to the given This mechanism ensures that licensed files are not used by location from a remote system or removable media. unauthorized parties. The True Name provides a safe way to 7. Verify True File 45 identify licensed material. This service allows proof of This mechanism is used to verify that the data item in a possession of specific files according to their contents withTrue File registry 126 is indeed the correct data item given out disclosing their contents. its True Name. Its purpose is to guard against device errors, Enforcing use of valid licenses can be active (for example, malicious changes, or other problems. by refusing to provide access to a file without authorization) If an error is found, the system has the ability to "heal" 50 or passive (for example, by creating a report of users who do itself by finding another source for the True File with the not have proper authorization). given name. It may also be desirable to verify that the error One possible way to perform license validation is to has not propagated to other systems, and to log the problem perform occasional audits of employee systems. The service or indicate it to the computer operator. These details are not described herein relies on True Names to support such an described here. 55 audit, as in the following steps: To verify a data item that is not in a True File registry 126, (A) For each licensed product, record in the license table use the Calculate True Name primitive mechanism described 136 the True Name of key files in the product (that is, above. files which are required in order to use the product, and The basic mechanism begins with a True Name, and which do not occur in other products) Typically, for a operates in the following steps: software product, this would include the main execut60 (A) Find the True File registry entry record 140 correable image and perhaps other major files such as sponding to the given True Name. clip-art, scripts, or online help. Also record the identity (B) If there is a True File ID for the True File registry of each system which is authorized to have a copy of entry record 140 then use it. Otherwise, indicate that no the file. file exists to verify. 65 (B) Occasionally, compare the contents of each user (C) Calculate the True Name of the data item given the file processor against the license table 136. For each True ID of the data item. Name in the license table do the following: Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 52 of 63 PageID #: 179 US 6,928,442 B2 33 34 (i) Unless the user processor is authorized to have a components of the data item, copying of an entire data item copy of the file, confirm that the user processor does can be avoided. Since some (or all) of the components of a not have a copy of the file using the Locate True File large data item may already be present at a destination location, only those components which are not present there mechanism. (ii) If the user processor is found to have a file that it 5 need be copied. This property derives from the manner in is not authorized to have, record the user processor which True Names are determined. and True Name in a license violation table. When a file is copied by the Copy File or Directory The System in Operation operating system mechanism, only the True Name of the file Given the mechanisms described above, the operation of is actually replicated. When a file is opened (using the Open File operating a typical DP system employing these mechanisms is now 10 described in order to demonstrate how the present invention system mechanism), it uses the Make True File Local meets its requirements and capabilities. primitive mechanism (either directly or indirectly through In operation, data items (for example, files, database the Create Scratch File primitive mechanism) to create a records, messages, data segments, data blocks, directories, local copy of the file. The Open File operating system instances of object classes, and the like) in a DP system 15 mechanism uses the Make True File Local primitive employing the present invention are identified by substanmechanism, which uses the Realize True File from Location primitive mechanism, which, in turn uses the Request True tially unique identifiers (True Names), the identifiers File remote mechanism. depending on all of the data in the data items and only on the The Request True File remote mechanism copies only a data in the data items. The primitive mechanisms Calculate True Name and Assimilate Data Item support this property. 20 single data item from one processor to another. If the data For any given data item, using the Calculate True Name item is a compound file, its component segments are not primitive mechanism, a substantially unique identifier or copied, only the indirect block is copied. The segments are True Name for that data item can be determined. copied only when they are read (or otherwise needed). The Read File operating system mechanism actually reads Further, in operation of a DP system incorporating the present invention, multiple copies of data items are avoided 25 data. The Read File mechanism is aware of compound files (unless they are required for some reason such as backups or and indirect blocks, and it uses the Realize True File from mirror copies in a fault-tolerant system). Multiple copies of Location primitive mechanism to make sure that component data items are avoided even when multiple names refer to segments are locally available, and then uses the operating system file mechanisms to read data from the local file. the same data item. The primitive mechanisms Assimilate Thus, when a compound file is copied from a remote Data Items and New True File support this property. Using 30 the Assimilate Data Item primitive mechanism, if a data item system, only its True Name is copied. When it is opened, already exists in the system, as indicated by an entry in the only its indirect block is copied. When the corresponding file True File registry 126, this existence will be discovered by is read, the required component segments are realized and this mechanism, and the duplicate data item (the new data therefore copied. item) will be eliminated (or not added). Thus, for example, 35 In operation data items can be accessed by reference to if a data file is being copied onto a system from a floppy their identities (True N ames) independent of their present disk, if, based on the True Name of the data file, it is location. The actual data item or True File corresponding to determined that the data file already exists in the system (by a given data identifier or True Name may reside anywhere in the same or some other name), then the duplicate copy will the system (that is, locally, remotely, offline, etc). If a not be installed. If the data item was being installed on the 40 required True File is present locally, then the data in the file system by some name other than its current name, then, can be accessed. If the data item is not present locally, there are a number of ways in which it can be obtained from using the Link Path to True Name primitive mechanism, the other (or new) name can be linked to the already existing wherever it is present. Using the source IDs field of the True data item. File registry table, the location(s) of copies of the True File In general, the mechanisms of the present invention 45 corresponding to a given True Name can be determined. The operate in such a way as to avoid recreating an actual data Realize True File from Location primitive mechanism tries to make a local copy of a True File, given its True Name and item at a location when a copy of that data item is already present at that location. In the case of a copy from a floppy the name of a source location (processor or media) that may disk, the data item (file) may have to be copied (into a contain the True File. If, on the other hand, for some reason scratch file) before it can be determined that it is a duplicate. 50 it is not known where there is a copy of the True File, or if the processors identified in the source IDs field do not This is because only one processor is involved. On the other respond with the required True File, the processor requiring hand, in a multiprocessor environment or DP system, each processor has a record of the True Names of the data items the data item can make a general request for the data item on that processor. When a data item is to be copied to using the Request True File remote mechanism from all another location (another processor) in the DP system, all 55 processors in the system that it can contact. As a result, the system provides transparent access to any that is necessary is to examine the True Name of the data item prior to the copying. If a data item with the same True data item by reference to its data identity, and independent of its present location. Name already exists at the destination location (processor), In operation, data items in the system can be verified and then there is no need to copy the data item. Note that if a data item which already exists locally at a destination location is 60 have their integrity checked. This is from the manner in which True Names are determined. This can be used for still copied to the destination location (for example, because the remote system did not have a True Name for the data security purposes, for instance, to check for viruses and to item or because it arrives as a stream of un-named data), the verify that data retrieved from another location is the desired Assimilate Data Item primitive mechanism will prevent and requested data. For example, the system might store the 65 True Names of all executable applications on the system and multiple copies of the data item from being created. Since the True Name of a large data item (a compound then periodically redetermine the True Names of each of these applications to ensure that they match the stored True data item) is derived from and based on the True Names of Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 53 of 63 PageID #: 180 US 6,928,442 B2 35 36 these data items. True Names are globally unique identifiers Names. Any change in a True Name potentially signals which can be published simply by copying them. For corruption in the system and can be further investigated. The example, a user might create a textual representation of a file Verify Region background mechanism and the Verify True on system A with True Name N (for instance as a hexadeciFile extended mechanisms provide direct support for this mode of operation. The Verify Region mechanism is used to 5 mal string), and post it on a computer bulletin board. Another user on system B could create a directory entry F ensure that the data items in the True File registry have not for this True Name N by using the Link Path to True Name been damaged accidentally or maliciously. The Verify True primitive mechanism. (Alternatively, an application could File mechanism verifies that a data item in a True File be developed which hides the True Name from the users, but registry is indeed the correct data item given its True Name. 10 provides the same public transfer service.) Once a processor has determined where (that is, at which When a program on system B attempts to open pathname other processor or location) a copy of a data item is in the P linked to True Name N, the Locate Remote File primitive DP system, that processor might need that other processor or mechanism would be used, and would use the Locate True location to keep a copy of that data item. For example, a File remote mechanism to search for True Name N on one processor might want to delete local copies of data items to or more remote processors, such as system A. If system B make space available locally while knowing that it can rely 15 has access to system A, it would be able to realize the True on retrieving the data from somewhere else when needed. To File (using the Realize True File from Location primitive this end the system allows a processor to Reserve (and mechanism) and use it locally. Alternatively, system B could cancel the reservation of) True Files at remote locations find True Name N by accessing any publicly available True (using the remote mechanism). In this way the remote Name server, if the server could eventually forward the locations are put on notice that another location is relying on 20 request to system A. Clients of a local server can indicate that they depend on the presence of the True File at their location. a given True File (using the Reserve True File remote A DP system employing the present invention can be mechanism) so that the True File is not deleted from the made into a fault-tolerant system by providing a certain server registry as long as some client requires access to it. amount of redundancy of data items at multiple locations in the system. Using the Acquire True File and Reserve True 25 (The Retire True File remote mechanism is used to indicate that a client no longer needs a given True File.) File remote mechanisms, a particular processor can impleA publishing server, on the other hand, may want to ment its own form of fault-tolerance by copying data items provide access to many clients, and possibly anonymous to other processors and then reserving them there. However, ones, without incurring the overhead of tracking dependenthe system also provides the Mirror True File background mechanism to mirror (make copies) of the True File avail- 30 cies for each client. Therefore, a public server can provide able elsewhere in the system. Any degree of redundancy expiration dates for True Files in its registry. This allows (limited by the number of processors or locations in the client systems to safely maintain references to a True File on system) can be implemented. As a result, this invention the public server. The Check For Expired Links background maintains a desired degree or level of redundancy in a mechanism allows the client of a publishing server to network of processors, to protect against failure of any 35 occasionally confirm that its dependencies on the publishing server are safe. particular processor by ensuring that multiple copies of data items exist at different locations. In a variation of this aspect of the invention, a processor that is newly connected (or reconnected after some absence) The data structures used to implement various features to the system can obtain a current version of all (or of and mechanisms of this invention store a variety of useful information which can be used, in conjunction with the 40 needed) data in the system by requesting it from a server various mechanisms, to implement storage schemes and processor. Any such processor can send a request to update policies in a DP system employing the invention. For or resynchronize all of its directories (starting at a root example, the size, age and location of a data item (or of directory), simply by using the Synchronize Directories groups of data items) is provided. This information can be extended mechanism on the needed directories. used to decide how the data items should be treated. For 45 Using the accounting log or some other user provided example, a processor may implement a policy of deleting mechanism, a user can prove the existence of certain data local copies of all data items over a certain age if other items at certain times. By publishing (in a public place) a list of all True Names in the system on a given day (or at some copies of those data items are present elsewhere in the system. The age (or variations on the age) can be determined given time), a user can later refer back to that list to show using the time of last access or modification in the local 50 that a particular data item was present in the system at the directory extensions table, and the presence of other copies time that list was published. Such a mechanism is useful in tracking, for example, laboratory notebooks or the like to of the data item can be determined either from the Safe Flag prove dates of conception of inventions. Such a mechanism or the source IDs, or by checking which other processors in the system have copies of the data item and then reserving also permits proof of possession of a data item at a particular at least one of those copies. 55 date and time. The accounting log file can also track the use of specific In operation, the system can keep track of data items regardless of how those items are named by users (or data items and files by content for accounting purposes. For regardless of whether the data items even have names). The instance, an information utility company can determine the system can also track data items that have different names data identities of data items that are stored and transmitted (in different or the same location) as well as different data 60 through its computer systems, and use these identities to items that have the same name. Since a data item is identified provide bills to its customers based on the identities of the by the data in the item, without regard for the context of the data items being transmitted (as defined by the substantially data, the problems of inconsistent naming in a DP system are unique identifier). The assignment of prices for storing and overcome. transmitting specific True Files would be made by the In operation, the system can publish data items, allowing 65 information utility and/or its data suppliers; this information other, possibly anonymous, systems in a network to gain would be joined periodically with the information in the access to the data items and to rely on the availability of accounting log file to produce customer statements. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 54 of 63 PageID #: 181 US 6,928,442 B2 37 38 Mirror True File background mechanism if the True File is Backing up data items in a DP system employing the in a mirrored or archived region. This mechanism causes one present invention can be done based on the True Names of the data items. By tracking backups using True Names, or more copies of the new file to be made on remote duplication in the backups is prevented. In operation, the processors. system maintains a backup record of data identifiers of data 5 In operation, the system can efficiently record and preitems already backed up, and invokes the Copy File or serve any collection of data items. The Freeze Directory Directory operating system mechanism to copy only those primitive mechanism creates a True File which identifies all of the files in the directory and its subordinates. Because this data items whose data identifiers are not recorded in the True File includes the True Names of its constituents, it backup record. Once a data item has been backed up, it can be restored by retrieving it from its backup location, based 10 represents the exact contents of the directory tree at the time on the identifier of the data item. Using the backup record it was frozen. The frozen directory can be copied with its produced by the backup to identify the data item, the data components preserved. The Acquire True File remote mechanism (used in miritem can be obtained using, for example, the Make True File roring and archiving) preserves the directory tree structure Local primitive mechanism. In operation, the system can be used to cache data items 15 by ensuring that all of the component segments and True from a server, so that only the most recently accessed data Files in a compound data item are actually copied to a items need be retained. To operate in this way, a cache client remote system. Of course, no transfer is necessary for data is configured to have a local registry (its cache) with a items already in the registry of the remote system. remote Local Directory Extensions table (from the cache In operation, the system can efficiently make a copy of server). Whenever a file is opened (or read), the Local 20 any collection of data items, to support a version control Directory Extensions table is Used to identify the True mechanism for groups of the data items. Name, and the Make True File Local primitive mechanism The Freeze Directory primitive mechanism is used to create a collection of data items. The constituent files and inspects the local registry. When the local registry already segments referred to by the frozen directory are maintained has a copy, the file is already cached. Otherwise, the Locate True File remote mechanism is used to get a copy of the file. 25 in the registry, without any need to make copies of the This mechanism consults the cache server and uses the constituents each time the directory is frozen. Request True File remote mechanism to make a local copy, Whenever a pathname is traversed, the Get Files in effectively loading the cache. Directory operating system mechanism is used, and when it The Groom Cache background mechanism flushes the encounters a frozen directory, it uses the Expand Frozen cache, removing the least-recently-used files from the cache 30 Directory primitive mechanism. client's True File registry. While a file is being modified on A frozen directory can be copied from one pathname to a cache client, the Lock Cache and Update Cache remote another efficiently, merely by copying its True Name. The mechanisms prevent other clients from trying to modify the Copy File operating system mechanism is used to copy a same file. frozen directory. In operation, when the system is being used to cache data 35 Thus it is possible to efficiently create copies of different versions of a directory, thereby creating a record of its items, the problems of maintaining cache consistency are avoided. history (hence a version control system). To access a cache and to fill it from its server, a key is In operation, the system can maintain a local inventory of required to identify the data item desired. Ordinarily, the key all the data items located on a given removable medium, is a name or address (in this case, it would be the pathname 40 such as a diskette or CD-ROM. The inventory is indepenof a file). If the data associated with such a key is changed, dent of other properties of the data items such as their name, the client's cache becomes inconsistent; when the cache location, and date of creation. client refers to that name, it will retrieve the wrong data. In The Inventory Existing Directory extended mechanism provides a way to create True File Registry entries for all of order to maintain cache consistency it is necessary to notify every client immediately whenever a change occurs on the 45 the files in a directory. One use of this inventory is as a way to pre-load a True File registry with backup record inforserver. By using an embodiment of the present invention, the mation. Those files in the registry (such as previously cache key uniquely identifies the data it represents. When installed software) which are on the volumes inventoried need not be backed up onto other volumes. the data associated with a name changes, the key itself The Inventory Removable, Read-only Files extended changes. Thus, when a cache client wishes to access the 50 modified data associated with a given file name, it will use mechanism not only determines the True Names for the files a new key (the True Name of the new file) rather than the key on the medium, but also records directory entries for each to the old file contents in its cache. The client will always file in a frozen directory structure. By copying and modirequest the correct data, and the old data in its cache will be fying this directory, it is possible to create an on line patch, eventually aged and flushed by the Groom Cache back- 55 or small modification of an existing read-only file. For ground mechanism. example, it is possible to create an online representation of a modified CD-ROM, such that the unmodified files are Because it is not necessary to immediately notify clients actually on the CD-ROM, and only the modified files are when changes on the cache server occur, the present invenonline. tion makes it possible for a single server to support a much 60 In operation, the system tracks possession of specific data larger number of clients than is otherwise possible. In operation, the system automatically archives data items items according to content by owner, independent of the name, date, or other properties of the data item, and tracks as they are created or modified. After a file is created or the uses of specific data items and files by content for modified, the Close File operating system mechanism creates an audit file record, which is eventually processed by accounting purposes. Using the Track for Accounting Purthe Process Audit File Entry primitive mechanism. This 65 poses extended mechanism provides a way to know reliably mechanism uses the New True File primitive mechanism for which files have been stored on a system or transmitted from any file which is newly created, which in turn uses the one system to another. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 55 of 63 PageID #: 182 US 6,928,442 B2 39 40 True Names in Relational and Object-Oriented Databases (a) tracking which files have been stored on a computer; Although the preferred embodiment of this invention has and been presented in the context of a file system, the invention (b) tracking which files have been transmitted from a of True Names would be equally valuable in a relational or computer. object-oriented database. A relational or object-oriented 5 6. A method, in a system in which a plurality of files are database system using True Names would have similar distributed across a plurality of computers, wherein data in benefits to those of the file system employing the invention. a file in the system may represent a digital message, a digital For instance, such a database would permit efficient elimiimage, a video signal or an audio signal, the method comnation of duplicate records, support a cache for records, prising: simplify the process of maintaining cache consistency, pro- 10 obtaining a name for a data file, the name having been vide location-independent access to records, maintain determined using an MD5 function of the data, wherein archives and histories of records, and synchronize with the data used by the MD5 function comprises the distant or disconnected systems or databases. contents of the data file; and The mechanisms described above can be easily modified in response to a request for the data file, the request to serve in such a database environment. The True Name 15 including at least the name of the data file, providing a registry would be used as a repository of database records. copy of the data file from a given one of the plurality All references to records would be via the True Name of the of computers, said providing being based at least in part record. (The Local Directory Extensions table is an example on the obtained name, and wherein a copy of the of a primary index that uses the True Name as the unique requested file is only provided to licensed parties. identifier of the desired records.) 20 7. A method, in a system in which a plurality of files are In such a database, the operations of inserting, updating, distributed across a plurality of computers, wherein some of and deleting records would be implemented by first assimithe computers communicate with each other using a TCP/IP lating records into the registry, and then updating a primary communication protocol, the method comprising: key index to map the key of the record to its contents by obtaining a name for a data file, the contents of said data using the True Name as a pointer to the contents. 25 file representing a digital image, the name having been The mechanisms described in the preferred embodiment, determined using at least a given function of the data in or similar mechanisms, would be employed in such a the data file, wherein the data used by the given system. These mechanisms could include, for example, the function to determine the name comprises the contents mechanisms for calculating true names, assimilating, of the data file; and locating, realizing, deleting, copying, and moving True 30 in response to a request for the data file, the request Files, for mirroring True Files, for maintaining a cache of including at least the name of the data file, providing a True Files, for grooming True Files, and other mechanisms copy of the file from a given one of the plurality of based on the use of substantially unique identifiers. computers, wherein a copy of the requested file is not While the invention has been described in connection provided to unlicensed parties or to unauthorized parwith what is presently considered to be the most practical 35 ties. and preferred embodiments, it is to be understood that the 8. A method, in a network comprising a plurality of invention is not to be limited to the disclosed embodiment, computers, some of the computers functioning as servers but on the contrary, is intended to cover various modificaand some of the computers functioning as clients, wherein tions and equivalent arrangements included within the spirit 40 some computers in the network communicate with each and scope of the appended claims. other using a TCP/IP communication protocol, wherein a What is claimed is: key is required to identify a file on the network, the method 1. In a system in which a plurality of files are distributed comprising: across a plurality of computers, a method comprising: storing some files on a first computer in the network and obtaining a name for a data file, the name being based at storing copies of some of the files from the first least in part on a given function of the data, wherein the 45 computer on a set of computers distinct from the first data used by the given function to determine the name computer; comprises the contents of the data file; and for a particular file, determining a different cache key in response to a request for the a data file, the request from an ordinarily used key for the file, the different including at least the name of the particular file, causing key being determined at least in part using a message a copy of the file to be provided from a given one of the 50 function MD5 of the data, wherein the data used by the plurality of computers, wherein a copy of the requested function to determine the name comprises the contents file is only provided to licensed parties. of the particular file; and 2. A method as in claim 1 further comprising: responsive to a request for the particular file, the request determining, using at least the name, whether a copy of 55 including the different key for the file, causing a copy the data file is present on a particular one of said of the particular file to be provided to the requester, computers. wherein the requested file is not provided to unlicensed 3. A method as in claim 1 further comprising: parties, and determining, using at least the name, whether an unauwherein the contents of the file may represent: a page in thorized or unlicensed copy of the data file is present on 60 memory, a digital message, a digital image, a video a particular one of said computers. signal or an audio signal. 4. A method as in claim 1, further comprising: 9. A content delivery method, comprising: maintaining accounting information relating to the data distributing files across a network of servers; files. for a particular file having a contextual name specifying 5. A method as in claim 4, wherein the maintaining of 65 accounting information includes at least some of activities locations in the network at which the file may be selected from: located, determining another name for the particular Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 56 of 63 PageID #: 183 US 6,928,442 B2 41 42 file, the other name including at least a data identifier 22. A method, in a system in which a plurality of files are determined using a given function of the data, where distributed across a plurality of computers, the method said data used by the given function to determine the comprising: other name comprises the contents of the particular file; obtaining a name for a data file, the name being based at obtaining a request for the particular file, the request 5 least in part on an MD5 function of the data which including the contextual name and the other name of comprises the contents of the particular file; and the particular file; and determining, using at least the obtained name, whether an responsive to the request, providing a copy of the parunauthorized or unlicensed copy of the data file is ticular file from one of the servers of the network of present on a at least one of said computers. servers, said providing being based, at least in part, on 10 23. A method comprising: the other name of the particular item, wherein the obtaining a list of file names, at least one file name for requested file is not provided to unlicensed parties. 10. A method, in a system in which a plurality of files are each of a plurality of files, each of said file names distributed across a plurality of computers, the method having been determined, at least in part, by applying a 15 comprising: function to the contents of the corresponding file; and obtaining a name for a data file, the name being based at using at least said list to determine whether unauthorized least in part on a given function of the data, wherein the or unlicensed copies of some of the plurality of data data used by the function comprises the contents of the files are present on a particular computer. particular file; 24. A method as in claim 23 further comprising: determining, using at least the name, whether a copy of in response to a request for a particular data file, allowing the data file is present on at least one of said computers; 20 the contents of the data file to be provided from a and computer determined to have a licensed or authorized determining whether a copy of the data file that is present copy of the data file. on a at least one of said computers is an unauthorized 25. A method as in claim 23 wherein the particular copy or an unlicensed copy of the data file. 25 computer is part of a peer-to-peer network of computers. 11. A method as in claim 10 further comprising: 26. A method as in claim 23 further comprising: allowing the file to be provided from one of the computers if the computer is found to have a file that it is not having an authorized or licensed copy of the file. authorized or licensed to have, recording information 12. A method as in claim 10 wherein at least some of the about the computer and about the file. 30 plurality of computers comprise a peer-to-peer network. 27. A method as in claim 23 wherein the function is a 13. A method, in a system in which a plurality of files are message digest function or a hash function. distributed across a plurality of computers which form a 28. A method as in claim 23 wherein the function is peer-to-peer network, the method comprising: selected from the functions: MD4, MD5, and SHA. obtaining a True Name for a data file, the True Name 29. A method as in claim 23 wherein the given function being based at least in part on a given function of the 35 randomly distributes its outputs. data, wherein the data used by the given function 30. A method as in claim 23 wherein the function procomprises the contents of the particular file; and duces a substantially unique value based on the data comdetermining, using at least the name, whether an unliprising the data file. censed or unauthorized copy of the data file is present 31. A method comprising: on a particular computer. 40 obtaining a list of True Names, one for each of a plurality 14. A method comprising: of files, wherein, for each of the files, the True Name for obtaining a name for a data file, the name being based at that file is determined using a function of the contents least in part on a function of the data, wherein the data of the file; used by the function comprise at least the contents of for at least some computers that make up part of a the file; and 45 peer-to-peer network of computers, comparing at least in response to a request for the data file, the request some of the contents of the computers to the list of True including at least the obtained name of the data file, Names to determine whether unauthorized or unlicausing the contents of the data file to be provided from censed copies of some of the plurality of data files are a computer having a licensed copy of the data file. present on those computers; and 15. A method as in claim 14 wherein the function is a 50 based at least in part on said comparing, if a computer is message digest function or a hash function. found to have content that it is not authorized or 16. A method as in claim 14 wherein the function is licensed to have, recording information about the comselected from the functions: MD4, MD5, and SHA. puter and about the unauthorized or unlicensed content. 17. A method as in claim 14 wherein the given function 32. A method as in claim 31 wherein the True Names are 55 randomly distributes its outputs. determined using a message digest function or a hash 18. A method as in claim 14 wherein the function profunction. duces a substantially unique value based on the data com33. A method as in claim 31 wherein the function IS prising the data file. selected from the functions: MD4, MD5, and SHA. 19. A method as in claim 14 wherein a data file may 34. A method as in claim 31, further comprising: comprise a file, a portion of a file, a page in memory, a digital 60 in response to a request for the data file, allowing a copy message, a digital image, a video signal or an audio signal. of the file to be provided from a given one of the 20. A method as in claim 14 wherein certain processors in plurality of computers having an authorized or licensed the network communicate with each other using a TCP/IP copy of the file. communication protocol. 35. A method comprising: 21. A method as in claim 14 wherein said name for said 65 data file, as determined using said function, will change obtaining a list of True Names, one for each of a plurality of files, wherein, for each of the files, the True Name for when the data file is modified. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 57 of 63 PageID #: 184 US 6,928,442 B2 43 44 that file is determined using an MD5 function of the by applying a function to the contents of the correcontents of the file; sponding file; and determine, using at least said list, whether unauthorized or comparing the True Names of at least some of the contents unlicensed copies of some of the plurality of data files of a computer to the list of True Names to determine whether unauthorized or unlicensed copies of some of 5 are present on a particular computer. 43. A computer system as in claim 42 wherein the the plurality of data files are present on that computer; function is a message digest function or a hash function. and 44. A computer system as in claim 42 wherein the based at least in part on said comparing, if a computer is function is an MD5 function. found to have content that it is not authorized or 45. In a system in which a plurality of files are distributed licensed to have, recording information about the com- 10 across a plurality of computers, at least some of the computer and about the unauthorized or unlicensed content. puters forming a peer-to-peer network, a method compris36. In a system in which a data file is distributed across ing: a plurality of computers, a method comprising: obtaining a name for a data file, the name being based at obtaining a name for the data file, the name being based least in part on a given function of the data, wherein the at least in part on a given function of the data, wherein 15 data used by the given function comprises the contents the data used by the function which comprise the of the data file; and contents of the data file; and in response to a request for the a data file, the request determining, using at least the name, whether an unauincluding at least the name of the particular file, thorized or unlicensed copy of the data file is present on attempting to cause a copy of the file to be provided 20 a particular one of said computers. from a given one of the plurality of computers, wherein 37. Computer-readable media tangibly embodying a prothe requested file is only provided to authorized or gram of instructions executable by at least one computer, the licensed parties. program comprising code to: 46. A method, in a system in which a plurality of files are obtain a name for a data file, the name being based at least in part on a given function of the data, wherein the data 25 distributed across a plurality of computers, at least some of the computers forming a peer-to-peer network, the method used by the function comprises the contents of the data comprising: file; and obtaining a name for a data file, the name being based at in response to a request for the a data file, the request least in part on a given function of the data, wherein the including at least the name of the particular file, cause data used by the function comprises the contents of the a copy of the file to be provided from a given one of the 30 data file; plurality of computers, wherein the copy is only prodetermining whether a copy of the data file is present on vided to licensed parties. a at least one of said computers; and 38. Computer-readable media tangibly embodying a program of instructions executable by at least one computer, the determining, using at least the name, whether a copy of 35 the data file that is present on a at least one of said program comprising code to: computers is an unlicensed copy of the file. obtain a True Name for a data file, the True Name being 47. A method, in a system in which a plurality of files are based at least in part on a given function of the data, distributed across a plurality of computers, the method wherein the data used by the function comprises the comprising: contents of the particular file; and 40 determining whether a copy of a data file is present on a determine, using at least the name, whether an unauthoat least one of said computers; rized or unlicensed copy of the data file is present on a obtaining a name for a data file, the name being based at particular computer. 39. Computer-readable media tangibly embodying a proleast in part on a given function of the data which gram of instructions executable by at least one computer, the 45 comprises the contents of the data file; and, program comprising code to: using at least the name, attempting to determine whether a copy of the data file that is present on the at least one obtain a list of file names, one for each of a plurality of files, each of said file names having been determined, of said computers is an unlicensed copy of the file. at least in part, by applying a function to the contents 48. A method as in claim 47 wherein the given function of the corresponding file; and 50 comprises a message digest or a hash function. 49. A method as in claim 48 wherein the given function determine, using at least said list, whether unauthorized or is selected from the functions: MD4, MD5 and SHA. unlicensed copies of some of the plurality of data files 50. A method as in claim 47 further comprising: are present on a particular computer. if a computer is found to have a file that it is not licensed 40. Computer-readable media tangibly embodying a proto have, recording information about the computer. gram of instructions executable by at least one computer, the 55 program comprising code to: 51. A method as in claim 47, further comprising: obtain a name for the data file, the name being based at maintaining accounting information relating to data files in the system. least in part on a given function of the data which comprise the contents of the data file; and 52. A method as in anyone of claims 4 and 51, further determine, using at least the name, whether an unautho- 60 comprising: using the accounting information as a basis for a system rized or unlicensed copy of the data file is present on a in which charges are based on an identity of the data particular one of said computers. 41. Media as in claim 40 wherein the given function is a files. message digest function or a hash function. 53. A method as in claim 47, further comprising, for at 42. A computer system programmed to: 65 least one computer in the system: obtain a list of file names for a plurality of files, each of (a) tracking which data files have been stored on a said file names having been determined, at least in part, computer; and Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 58 of 63 PageID #: 185 US 6,928,442 B2 45 46 (b) tracking which data files have been transmitted from (b) obtaining a name for the data file, the name being a computer. based at least in part on an MD5 function of the data 54. A method, in network in which a plurality of files are which comprises the contents of the data file; distributed across a plurality of computers of the network, (c) when a copy of the data file is found to be present 5 the method comprising: on one of the computers, determining, using at least determining whether a copy of a data file is present on a the name, whether the copy of the data file-is an at least one of said computers, said data file representauthorized or licensed copy of the data file; and, ing one or more of a digital image, a video signal or an (d) if a computer is found to have a file that it is not audio signal; authorized or licensed to have, recording information obtaining a name for the data file, the name being based 10 about the computer or the data file. at least in part on an MD5 function of the data which 56. A method, in a system in which a plurality of files are comprises the contents of the data file; and, distributed across a plurality of computers, the method when a copy of the data file is found to be present on one of the computers, determining, using at least the name, 15 comprising: whether the copy of the data file is an authorized or obtaining a name for a data file, the name being based at licensed copy of the data file; and least in part on a given function of the data, wherein the if a computer is found to have a file that it is not data used by the given function comprises the contents authorized or licensed to have, recording information of the data file; and about the computer or the data file. 20 determining, based at least in part on the obtained name, 55. In network in which a plurality of files arc distributed whether a copy of the data file that is present on a at across a plurality of computers of the network, a method comprising: least one of said computers is an unauthorized or unlicensed copy of the file. for each file of a plurality of data files, (a) determining whether a copy of the file data file is 25 present on a at least one of said computers; * * * * * Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 59 of 63 PageID #: 186 111111 1111111111111111111111111111111111111111111111111111111111111 US006928442C 1 (12) EX PARTE REEXAMINATION CERTIFICATE (7609th) United States Patent (10) Farber et al. (4S) (S4) Inventors: David A. Farber, Ojai, CA (US); Ronald D. Lachman, Northbrook, lL (US) (73) 4,914,586 4,949,302 5,007,658 5,014,192 5,047,918 5,084,815 5,117,351 5,163,147 5,182,799 5,199,073 5,202,982 5,204,897 ENFORCEMENT AND POLICING OF LICENSE CONTENT USING CONTENT-BASED IDENTIFIERS (7S) Assignee: Level 3 Communications, LLC, Broomfield, CO (US) Reexamination Request: No. 90/010,260, Aug. 29, 2008 5/1988 5/1989 9/1993 OTHER PUBLICATIONS Continuation of application No. 091283,160, filed on Apr. 1, 1999, now Pat. No. 6,415,280, which is a division of application No. 08/960,079, filed on Oct. 24, 1997, now Pat. No. 5,978,791, which is a continuation of application No. 08/425,160, filed on Apr. 11, 1995, now abandoned. (Sl) Int. Cl. G06F 17/30 David R. Cheriton and Timothy P. Mann "Decentralizing a Global Naming Service for Improved Performance and Fault Tolerance," published by ACM Transactions on Computer Systems, vol. 7, No.2 (1989).* (Continued) Primary Examiner-Christopher E Lee (2006.01) U.S. Cl. ............... 707/10; 707/999.01; 707/E17.0l; 709/203; 709/219; 709/229 Field of Classification Search .................... 70S/S2, 70S/S9; 707/2; 713/168, 180 See application file for complete search history. References Cited U.S. PATENT DOCUMENTS 3,835,260 4,096,568 4,221,003 4,558,413 4,658,093 4,821,184 0268069 A2 0315425 0558945 A2 (Continued) (63) (S6) 707/2 FOREIGN PATENT DOCUMENTS EP EP EP Related U.S. Application Data (S8) A 4/1990 Swinehart et al. A 8/1990 Arnoldet al. A 4/1991 Blechschmidt et aI. A 5/1991 Mansfield et aI. A 9/1991 Schwartz et al. A 111992 Mazzario A 5/1992 Miller A 1111992 Orita A 111993 Tamura et al. A 3/1993 Scott A * 4/1993 Gramlich et al. ............... A 4/1993 Wyman (Continued) Reexamination Certificate for: Patent No.: 6,928,442 Issued: Aug. 9, 2005 Appl. No.: 09/987,723 Filed: Nov. 15, 2001 (S2) Number: US 6,928,442 Cl Certificate Issued: Jul. 13,2010 A 911974 A 611978 A 911980 A 1211985 A * 411987 A 411989 Prescher et al. Bennett et al. Chang et aI. Schmidt et aI. Hellman ...................... Clancy et al. 705/52 s'MPL (S7) ABSTRACT Data files are distributed across a plurality of computers. The computers may form a network such as a content delivery network (CDN) or a peer-to-peer network. The network may operate as a TCP/lP network such as the Internet. Data files may represent may represent digital messages, images, videos or audio signals. For content---data items or files in the system-a name is obtained (or determined), where the name is based, at least in part, on a given function of the data in a data item or file. The given function may be a message digest or hash function, and it may be MD4, MDS, and SHA. A cony of a requested file is only provided to licensed (or authorized) parties. The system may check one or more computers for unauthorized or unlicensed content. Content is served based on a measure of availability of servers. DATA ITEM Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 60 of 63 PageID #: 187 US 6,928,442 Cl Page 2 u.s. PATENT DOCUMENTS 5,204,958 5,204,966 5,247,620 5,260,999 5,276,869 5,287,514 5,297,279 5,317,693 5,347,653 5,351,302 5,357,440 5,359,523 5,361,356 5,371,897 5,394,555 5,403,639 5,438,508 5,442,343 5,448,718 5,454,039 5,465,365 5,467,471 5,475,826 5,491,817 5,499,294 5,504,879 5,537,585 5,553,143 5,581,615 5,581,764 5,588,147 5,600,834 5,604,803 5,604,892 5,630,067 5,632,031 5,649,196 5,677,952 5,678,038 5,694,596 5,701,316 5,710,922 5,724,425 5,724,552 5,742,807 5,745,879 5,757,913 5,757,915 5,826,049 5,864,683 5,907,619 5,940,504 5,978,791 5,991,414 6,135,646 6,415,280 6,732,180 6,928,442 200210052884 200210082999 2003/0078888 2003/0078889 2003/0095660 2004/0139097 2005/0010792 2005/0114296 2007/0185848 2008/0065635 2008/0066191 2008/0071855 A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A Bl Bl B2 Al Al Al Al Al Al Al Al Al Al Al Al * * * * 411993 411993 911993 1111993 111994 211994 311994 511994 911994 911994 1011994 1011994 1111994 1211994 211995 411995 811995 811995 911995 911995 1111995 1111995 1211995 211996 311996 411996 711996 911996 1211996 1211996 1211996 211997 211997 211997 511997 511997 711997 1011997 1011997 1211997 1211997 111998 311998 311998 411998 411998 511998 511998 1011998 111999 511999 811999 1111999 1111999 1012000 7/2002 5/2004 8/2005 5/2002 6/2002 4/2003 4/2003 5/2003 7/2004 112005 5/2005 8/2007 3/2008 3/2008 3/2008 Cheng et al. Wittenberg et al. Fukuzawa et al. Wyman Forrest et al. Gram Bannon et al. Cuenod et al. Flynn et al. Leighton et al. Talbott et al. Talbott et al. Clark et al. Brown et al. Hunter et al. Belsan et al. Wyman Cato et al. Cohn et al. Coppersmith et al. Winterbottom Bader Fischer Gopal et al. Friedman Eisenberg et al. Blickenstaff et al. Ross et al. .................... 705/59 Stern .......................... 713/180 Fitzgerald et al. Neeman et al. Howard Aziz Nuttall et al. Kindell et al. Velissaropoulos et al. Woodhill et al. Blakley, III et al. Dockter et al. Campbell Alferness et al. Alleyetal. Chang et al. Taoda Masinter Wyman Bellareetal. ............... 713/168 Aucsmith et al. Ogata et al. Boebert et al. Davis Griswold ..................... 705/59 Farber et al. Garayet al. Kahn et al. Farber et al. Hale et al. Farber et al. Farber et al. Lee et al. Lee et al. Lee et al. Lee et al. Farber et al. Carpentier et al. Farber et al. Farber et al. Farber et al. Farber et al. Farber et al. 2008/0082551 Al 4/2008 Farber et al. FOREIGN PATENT DOCUMENTS EP EP EP EP GB JP JP JP JP JP JP JP WO WO WO WO WO 0566967 0631226 0654920 0658022 2294132 59058564 63-106048 63-273961 2-127755 05162529 06187384 06348558 WO 92120021 WO 94/06087 WO 94/20913 WO 95/01599 W097/43717 A2 Al A2 A2 A A2 A 10/1993 12/1994 5/1995 6/1995 4/1996 4/1984 5/1988 1111988 5/1990 6/1993 7/1994 12/1994 1111992 3/1994 9/1994 111995 1111997 OTHER PUBLICATIONS Berners-Lee, T. et aI., RFC 1738, "Uniform Resource Locators (URL)," pp. 1-25, The Internet Engineering Task Force (lETF), Network Working Group, Dec. 1994. Berners-Lee, T. et aI., RFC 1945, "Hypertext Transfer Protoco1-HTTPIl.0," The Internet Engineering Task Force (lETF), Network Working Group, May 1996, pp. 1-60. Berners-Lee, T., RFC 1630 "Universal Resource Identifiers in WWW," The Internet Engineering Task Force (lETF), Network Working Group, Jun. 1994, pp. 1-28. Bowman, C. Mic, et aI., "Harvest: A Scalable, Customizab1e Discovery and Access System," Tech. Report CU-CS-732-94, Dept. Compo Sci., U. of. Colorado-Boulder, original date Aug. 1994, revised Mar. 1995, pp. 1-29. Browne, Shirley et aI., "Location-Independent Naming for Virtual Distributed Software Repositories," Univ. ofTennessee, Dept. Compo Sci., Feb. 1995, also In ACM-S1GSOFT 1995 Symp. on Software Reusability, Seattle, Wash. Apr. 1995, 7 pages. Fa1strom, P. et aI., RFC 1914, "How to Interact with a Whois++ Mesh," The Internet Engineering Task Force (lETF), Network Working Group, Feb. 1996, pp. 1-10. Fielding, R. et aI., RFC 2068, "Hypertext Transfer Protoco1-HTTP/l.l," The Internet Engineering Task Force (lETF), Network Working Group, Jan. 1997, pp. 1-163. Fielding, R. et aI., RFC 2616, "Hypertext Transfer Protoco1-HTTP/l.l," The Internet Engineering Task Force (lETF), Network Working Group, Jun. 1999, pp. 1-176. Moats, R., RFC 2141, "URN Syntax," The Internet Engineering Task Force (lETF), Network Working Group, May 1997,pp.1-8. Vincenzetti, D. et aI., "Anti Tampering Program," Proc. Fourth {USEN1X} Security Symp. , Santa Clara, (USEN1X Association) CA, Oct. 1993, 11 pages. Vincenzetti, D. et aI., "Anti Tampering Program," Proc. Fourth {USEN1X} Security Symp., Santa Clara, CA, Oct. 1993, printed from http://www.ja.net/CER1IVincenzetti_ and_CotrozzilATP_Anti_Tamp on Mar. 22, 2006, (USEN1X Association), 8 pgs. Fowler, et a1. "A User-Level Replicated File System," AT&T Bell Laboratories Technical Memorandum 0112670-930414-05, Apr. 1993, and USEN1X 1993 Summer Conference Proceedings, Cincinnati, OH, Jun. 1993. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 61 of 63 PageID #: 188 US 6,928,442 Cl Page 3 Greene, D., et aI., "Multi-Index Hashing for lnfonnation Retrieval", Nov. 20-22, 1994, Proceedings, 35th Annual Symp on Foundations of Computer Science, IEEE, pp. 722-73l. Hirano, et aI, "Extendible hashing for concurrent insertions and retrievals," in Proc 4th Euromicro Workshop on Parallel and Distributed Processing, 1996 (PDP '96), Jan. 24, 1996 to Jan. 26, 1996, pp. 235-242, Braga, Portugal. Preneel et aI., "The Cryptographic Hash Function RlPEMD-160", appeared in CryptoBytes RSA Laboratories, vol. 3, No.2, pp. 9-14, Fall, 1997 (also Bosselaers et aI., "The RlPEMD-160 Cryptographic Hash Function", Jan. 1997, Dr. Dobb's Journal, pp. 24-28). Prusker et aI., "The Siphon: Managing Distant Replicated Repositories" Nov. 8-9, 1990, Proc. Management of Replicated Data IEEE. Reply to Examination Report, Munich, Nov. 18, 2009, in Application No. EP 96 910 762.2 [19 pgs.]. Rich, K. et aI, "Hobgoblin: A File and Directory Auditor", Sep. 30--0ct. 3, 1991, Lisa v., San Diego, CA. Barbara, D., et aI., "Exploiting symmetries for low-cost comparison of file copies," 8th lnCl Conf. on Distributed Computing Systems, Jun. 1988, pp. 471-479, San Jose, CA. Bowman, C. Mic, et aI., "Harvest: A Scalable, Customizable Discovery and Access System," Aug. 4, 1994, pp. 1-27. Brisco, T., "DNS Support for Load Balancing," IETF RFC l794,Apr.1995,pp.1-7. Browne, Shirley et aI., "Location-Independent Naming for Virtual Distributed Software Repositories," Nov. 11, 1994, printed from http:/www.netlib.org/utk/papers/lifnlmain.html on Mar. 22, 2006, 18 pages. Campbell, M., "The Design of Text Signatures for Text Retrieval Systems," Tech. Report, Sep. 5, 1994, Deakin University, School of Computing & Math., Geelong, Australia. Chang, W. W. et aI., "A signature access method for the Starburst database system," in Proc. 15th lnCl Conf. on Very Large Data Bases (Amsterdam), The Netherlands), pp. 145-153. Danzig, P.B., et aI., "Distributed Indexing: A Scalable Mechanism For Distributed lnfonnation Retrieval," Proc. 14th Annual lnt'! ACM SlGlR Conf. on Research and Development in Information Retrieval, pp. 220-229, Oct. l3-l6,199l. Faloutsos, C. "Access methods for text," ACM Comput. Surv. 17, 1 (Mar. 1985),49-74. Faloutsos, C. et aI., "Description and perfonnance analysis of signature file methods for office filing," ACM Trans. lnf. Syst. 5, 3 (Jul. 1987),237-257. Faloutsos, C. et aI., "Signature files: an access method for documents and its analytical performance evaluation," ACM Trans. lnf. Syst. 2, 4 (Oct. 1984),267-288. Federal Information Processing Standards (FlPS) Publication 180-1; Secure Hash Standard, Apr. 17, 1995 [17 pgs.] Feigenbaum, J. et aI., "Cryptographic protection of databases and software," in Distributed Computing and Cryptography: Proc. DlMACS Workshop, Apr. 1991, pp. 161-172, American Mathematical Society, Boston, Mass. Harrison, M. C., "Implementation of the substring test by hashing," Commun. ACM 14, 12 (Dec. 1971),777-779. Hauzeur, B. M., "A Model For Naming, Addressing, And Routing," ACM Trans. lnf. Syst. 4, Oct. 4, 1986), 293-311. IEEE, The Authoritative Dictionary of IEEE Standards Tenns, 7th ed., Copyright 2000, pp. 107, 176,209,240,241, 432,468,505,506,682, 1016, 1113, 1266, and 1267. International Search Report dated Jun. 24, 1996 in international application PCTIUS 1996/004733. Ishikawa, Y., et aI., "Evaluation of signature files as set access facilities in OODBs," In Proc. of the 1993 ACM SlGMOD Inter. Conf. on Management of Data (Washington, D.C., U.S., May 1993). P. Buneman & S. Jajodia, Eds. SlGMOD '93. ACM, NY, NY, 247-256. Karp, R. M. and Rabin, M. 0., "Efficient randomized pattern-matching algorithms," IBM J. Res. Dev. 31, 2 (Mar. 1987), 249-260. Khare, R. and Lawrence, S., RFC 2817, "Upgrading to TLS Within HTTP/l.l," May 2000, pp. 1-12. Khoshafian, S. N. et al. 1986. Object identity. In Conf. Proc. on Object-Oriented Programming Systems, Languages and Applications (Portland, Oregon, United States, Sep. 29-0ct. 2, 1986), N. Meyrowitz, Ed. OOPLSA '86. ACM Press, New York, NY, 406-416. Kim et al. "The Design and Implementation of Tripwire: A file System Integrity Checker", COAST Labs. Dept. of Computer Sciences Purdue University, Feb. 23, 1995, pp. 1-18. Kim, Gene H., and Spafford, Eugene H., "Writing, Supporting, and Evaluating Tripwire: A Publicly Available Security Tool." COAST Labs. Dept. of Computer Sciences Purdue University, Mar. 12, 1994, pp. 1-23. Knuth, D. E., "The Art of Computer Programming," 1973, vol. 3 "Sorting and Searching," Ch. 6.4 "Hashing," pp. 506-549. Kumar, Vijay, A Concurrency Control Mechanism Based on Extendible Hashing for Main Memory Database Systems, ACM, vol. 3, 1989, pp. 109-113. Lantz, K. A., et aI., "Towards a universal directory service." In Proc. 4th AnnualACM Symp. on Principles of Distributed Computing (Minaki, Ontario, Canada). PODC '85. ACM Press, New York, NY, 250-260. Leach, P. J., et al. The file system of an integrated local network. In Proc. 1985 ACM 13th Annual Conf. on Compo Sci. CSC '85. ACM Press, NY, NY, 309-324. Leach, P.J., et aI., "UlDs as Internal Names in a Distributed File System," In Proc. 1st ACM SlGACT-SlGOPS Symp. on Principles of Distributed Computing (Ottawa, Canada, Aug. 18-20, 1982). PODC '82. ACM Press, New York, NY, 34-4l. Ma, C. 1992. On building very large naming systems. In Proc. 5th Workshop on ACM SlGOPS European Workshop: Models and Paradigms For Distributed Systems Structuring (France, Sep. 21-23, 1992). EW 5. ACM Press, New York, NY, 1-51. Myers, J. and Rose, M., RFC 1864, "The Content-MD5 Header Field," Oct. 1995, pp. 1-4. Panagopoulos, G., et aI., "Bit-sliced signature files for very large text databases on a parallel machine architecture," In Proc. ofthe 4th Inter. Conf. on Extending Database Technology (EDBT), Cambridge, u.K., Mar. 1994, pp. 379-392 (Proc. LNCS 779 Springer 1994, ISBN 3-540-57818-8) [14 pgs.] Patent Abstract, "Management System for Plural Versions," Pub. No. 63273961 A, published Nov. 11, 1988, NEC Corp. Patent Abstracts of Japan, "Data Processor," Appln. No. 05135620, filed Jun. 7,1993, Toshiba Corp. Patent Abstracts of Japan, "Device for Generating Database and Method for the Same," Application No. 03-080504, Sun Microsyst. Inc., published Jun. 1993,38 pages. Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 62 of 63 PageID #: 189 US 6,928,442 Cl Page 4 Patent Abstracts of Japan, "Method for Registering and Retrieving Data Base," Application No. 03-187303, Nippon Telegr. & Teleph. Corp., published Feb. 1993, 11 pages. Peterson, L. L. 1988. A yellow-pages service for a local-area network. In Proc. ACM Workshop on Frontiers in Computer Communications Technology (Vennont, 1987). J. J. Garcia-Luna-Aceves, Ed. SIGCOMM '87. ACM Press, New York, NY, 235-242. Ravindran, K. and Ramakrishnan, K. K. 1991. A naming system for feature-based service specification in distributed operating systems. SIGSMALLIPC Notes 17, 3-4 (Sep. 1991), 12-21. Reed Wade (wade@cs.utk.edu), "re: Dienst and BFD/LIFN emails," Aug. 8, 1994, printed from http://www.webhistory. orglwww.lists/www-talkI994q3/0416.htmlon Mar. 22, 2006, (7 pages). Rivest, R., RFC 1320, "The MD4 Message-Digest Algorithm," Apr. 1992. Rivest, R., RFC 1321, "The MD5 Message-Digest Algorithm," Apr. 1992, pp. 1-19 and errata sheet (1 page). Rose, M., RFC 1544, "The Content-MD5 Header Field," Nov. 1993, pp. 1-3. Ross, K., "Hash-Routing for Collections of Shared Web Caches," IEEE Network Magazine, pp. 37-44, Nov.-Dec. 1997. Sacks-Davis, R., et a!., "Multikey access methods based on superimposed coding techniques," ACM Trans. Database Syst. 12,4 (Nov. 1987),655-696. Schneier, B., "One-Way Hash Functions, Using Crypographic Algorithms for Hashing," 1991, printed from http:// 202.179135 A/datalDDJ/articlesI1991191 09/91909g/ 9109g.htm on Mar. 22, 2006 [9 pgs.] Schwartz, M., et a!. 1987. A name service for evolving heterogeneous systems. In Proc. 11 th ACM Symp. on OS Principles (Texas, Nov. 8-11, 1987). SOSP '87. ACM Press, NY, NY, 52-62. Shaheen-Gouda, A. and Loucks, L. 1992. Name borders. In Proc. 5th Workshop onACM SIGOPS European Workshop: Models and Paradigms For Distributed Systems Structuring (Mont Saint-Michel, France, Sep. 21-23, 1992). EW 5. ACM Press, NY, NY, 1-6. Siegel, A., et a!., "Deceit: a Flexible Distributed File System," Proc. Workshop on the Management of Replicated Data, Houston, TX, pp. 15-17, Nov. 8-9,1990. Siegel, A., et a!., "Deceit: a Flexible Distributed File System," Technical Report, TR89-1042, Cornell University, Nov. 1989. Sun Microsystems, Inc., RFC 1094, "NFS: Network File System Protocol Specification," Mar. 1989, pp. 1-27. Terry, D. B. 1984. An analysis of naming conventions for distributed computer systems. In Proc. ACM SIGCOMM Symp. on Communications Architectures and Protocols: Tutorials & Symp. SIGCOMM '84. ACM Press, NY, NY, 218-224. Vincenzetti, David and Cotrrozzi, Massimo, "Anti Tampering Program," Proceedings of the Fourth {USENIX} Security Symposium, Santa Clara, CA, 1993, 11 pages. * cited by examiner Case 6:11-cv-00656 Document 1-4 Filed 12/08/11 Page 63 of 63 PageID #: 190 US 6,928,442 CI 1 2 EX PARTE REEXAMINATION CERTIFICATE ISSUED UNDER 35 U.S.C. 307 AS A RESULT OF REEXAMINATION, IT HAS BEEN DETERMINED THAT: THE PATENT IS HEREBY AMENDED AS INDICATED BELOW. 5 The patentability of claims 1-12, 14-21, 23-35, 37, 39 and 42-55 is confinned. Claims 13, 22, 36, 38, 40, 41 and 56 are cancelled. * * * * *

Disclaimer: Justia Dockets & Filings provides public litigation records from the federal appellate and district courts. These filings and docket sheets should not be considered findings of fact or liability, nor do they necessarily reflect the view of Justia.


Why Is My Information Online?