Wisconsin Alumni Research Foundation v. Apple Inc.

Filing 193

ORDER: Defendant and counter claimant Apple, Inc.'s motion for summary judgment (dkt. #116) is denied as to its counterclaims and defenses of anticipation by Steely and indefiniteness, and denied as to plaintiff's willful infringement claim premised on (1) Apple's claim construction, (2) anticipation by Steely, and (3) indefiniteness of claims 5 and 6. The court reserves on the motion in all other respects. Plaintiff's motion for summary judgment (dkt. #117) is granted. Signed by District Judge William M. Conley on 8/5/2016. (voc)

IN THE UNITED STATES DISTRICT COURT FOR THE WESTERN DISTRICT OF WISCONSIN WISCONSIN ALUMNI RESEARCH FOUNDATION, Plaintiff, OPINION AND ORDER v. 14-cv-062-wmc APPLE, INC., Defendant. In this lawsuit, plaintiff Wisconsin Alumni Research Foundation (“WARF”) alleges that defendant Apple, Inc. infringes U.S. Patent No. 5,781,752 (the “’752 patent”), which concerns a “table based data speculation circuit for parallel processing computer.” Before the court are the parties’ cross-motions for summary judgment and claim construction. (Dkt. ##116, 117.) For the reasons that follow, the court will adopt WARF’s proposed construction of the term “prediction” and grant summary judgment to WARF on Apple’s counterclaims and defenses based on anticipation under 35 U.S.C. § 102 with respect to U.S. Patent No. 5,619,662 (“Steely” or the “Steely patent”), as well as indefiniteness under 35 U.S.C. § 112 ¶ 2 with respect to claims 5 and 6. In turn, the court will deny Apple’s motion for summary judgment based on those same defenses and counterclaims. As for Apple’s motion for summary judgment on WARF’s claim of willful infringement, the court will deny Apple’s motion with respect to any defenses premised on (1) Apple’s claim construction, (2) anticipation by Steely, and (3) indefiniteness of claims 5 and 6, but will reserve on Apple’s motion in all other respects. UNDISPUTED FACTS1 A. The Parties and Overview of This Lawsuit Plaintiff Wisconsin Alumni Research Foundation (“WARF”) is a Wisconsin corporation, with its principal place of business in Madison, Wisconsin. WARF is the owner of the ’752 patent. Defendant Apple, Inc. is a California corporation, with its principal place of business in Cupertino, California. On January 31, 2014, WARF filed suit against Apple alleging infringement of the ’752 patent. Apple answered and asserted counterclaims for declaratory judgment of non-infringement and invalidity of the ’752 patent. 
Material to the present motions, Apple contends that claims 1-3, 5, 6, and 9 of the ’752 patent are invalid as anticipated by the Steely patent. Apple also alleges that claims 5 and 6 of the ’752 patent are invalid as indefinite.

B. Technology Overview

A modern computer device includes both hardware and software. Hardware typically includes memory, a microprocessor and peripherals, while software typically consists of sequences of instructions or “programs” that run on the hardware. At a general level, the microprocessor is responsible for fetching instructions and data, executing those instructions to modify the data, and then saving the results.2 Typically, individual instructions call for the performance of a relatively simple task, such as reading a value from or writing a value to a memory location, or adding, subtracting or comparing two numbers.

1 Except as otherwise noted, for purposes of summary judgment, the court finds the following facts to be material and undisputed.

2 While the court uses the term “executing,” the court acknowledges that the parties have agreed on a construction for the term “in fact executed” described below in subheading “E” of this Facts section.

There are generally three types of software instructions: (1) memory instructions; (2) computing instructions; and (3) control instructions. Memory instructions are instructions that “when executed, cause data to be loaded into the processing unit from memory or stored from the processing unit to memory.” (’752 Patent (dkt. #1-1) 1:38-40.) So-called “LOAD” instructions copy or read a value stored at a memory location specified by an address and return a value. LOAD instructions are also called “data consuming instructions,” because they consume data by obtaining data from memory, though as Apple cautions, other types of software instructions also “consume” data. “STORE” instructions, on the other hand, copy or write a value to a memory location specified by an address. 
For that reason, STORE instructions are also called “data producing instructions,” as they produce data by providing data to memory. (Apple similarly points out that other types of instructions “produce” data.) When a STORE instruction executes, it overwrites any value previously stored at that memory location. Both LOAD and STORE instructions are memory instructions. Generally speaking, software instructions in a program have a predefined “program order,” where the processor performs the instructions sequentially. Instructions, however, need not always be executed in the listed order. Instead, they may be executed “out of order.” In out-of-order executions, instructions are typically executed when ready -- in other words, based on the availability of their input data, or “operands,” rather than 3 a specified program order.3 There are some obvious benefits to permitting instructions to execute out of order. For instance, because some instructions in a program take longer to execute than others, performing instructions in program order may slow processor performance since it requires waiting for earlier instructions to execute before performing later instructions in the program. Out-of-order execution, therefore, may result in increased efficiency since it allows the processor to use free time to execute other instructions that are ready to be processed. On the other hand, out-of-order execution may have a detrimental effect on performance if it leads to errors that require the processor to expend resources to correct. A key requirement of efficient out-of-order execution, therefore, is that it must yield the same results as would the execution of instructions in program order. 
This requirement touches on the concept of “instruction dependency.” A dependent instruction is one that must wait for the result of an earlier-in-order execution before it can safely execute.4 For example, data dependency exists when an earlier-in-order STORE instruction writes data to the same address that is accessed by a later-in-order LOAD instruction. In that situation, the STORE and LOAD must execute in program order for the LOAD to read the correct data from the shared memory address that both instructions access.

3 Apple clarifies that out-of-order executions also depend on the availability of necessary hardware.

4 While “[a] processor may permit dependent instructions to execute out-of-order and then invoke a recovery process to return to a correct machine state,” as Apple describes, Apple fails to dispute WARF’s point that to execute “safely,” the dependent instruction must wait to execute until after the instruction on which it depends has been executed. (Pl.’s Reply to Pl.’s PFOFs (dkt. #157) ¶ 21.)

In some situations, whether a given LOAD instruction depends on a STORE instruction from an earlier-in-order program step cannot be known until after one or both of the instructions are executed. In other words, the processor lacks sufficient information to resolve whether or not a dependency actually exists. This uncertainty is known as “ambiguous dependency.” Ambiguous dependencies may occur, for example, when the memory addresses that must be accessed by a given LOAD or STORE instruction are computed “on the fly” as the program executes. In those circumstances, the processor may have to perform additional computations with data that are not currently available in order to resolve whether one instruction is dependent on another. To maximize processing speed, however, the processor may elect to execute a LOAD instruction before an earlier STORE instruction. 
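By way of illustration only, the data dependency described above can be sketched in a few lines of Python. Nothing in this sketch is drawn from the record or from the ’752 patent: the `run` helper, the addresses 0x10 and 0x20, and the stored values are all hypothetical, and memory is modeled as a simple dictionary. The sketch shows only that a LOAD executed ahead of an earlier-in-order STORE to the same address reads the old value, while a LOAD to an unrelated address is unaffected by the reordering.

```python
def run(order, memory):
    """Execute STORE/LOAD operations in the given order.

    Returns the value read by the (single) LOAD in `order`.
    Hypothetical model: memory is a dict from address to value.
    """
    mem = dict(memory)          # work on a copy so each run starts fresh
    loaded = None
    for op in order:
        if op[0] == "STORE":
            _, addr, value = op
            mem[addr] = value   # a STORE overwrites the prior value
        else:                   # "LOAD"
            _, addr = op
            loaded = mem[addr]  # a LOAD reads whatever is there right now
    return loaded


# Hypothetical initial memory contents.
memory = {0x10: "old", 0x20: "unrelated"}

store = ("STORE", 0x10, "new")     # earlier in program order
dependent_load = ("LOAD", 0x10)    # later in program order, same address
independent_load = ("LOAD", 0x20)  # later in program order, different address

# Program order: the LOAD sees the value the STORE wrote.
in_order = run([store, dependent_load], memory)

# LOAD executed first: it obtains the old value instead.
out_of_order = run([dependent_load, store], memory)

# No dependency: either order yields the same result.
indep_a = run([store, independent_load], memory)
indep_b = run([independent_load, store], memory)

print(in_order, out_of_order, indep_a, indep_b)
```

In the ambiguous case the opinion describes, the processor would not yet know whether the two addresses match at the time it must decide whether to let the LOAD go first.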
The out-of-order execution of instructions without knowing if there is an actual dependency between them is known as “speculation” or “speculative execution,” because the processor is speculating that there is no actual dependency. Speculation can be advantageous if it turns out to be correct (i.e., the LOAD instruction in fact was not dependent on the STORE instruction); then the out-of-order execution will yield the correct result and the performance will improve. 5 In contrast, if a LOAD instruction is speculatively executed ahead of a STORE instruction of earlier program order and it turns out that the speculation was incorrect (i.e., the 5 The parties dispute whether “it is quite often the case that an ambiguous dependency is resolved as no dependency at all,” as the ’752 patent represents. (’752 patent (dkt. #1-1) 2:26-27.) As Apple contends, “[t]he degree to which ambiguous dependencies will turn out to be resolved as no dependency depends upon the workload.” (Def.’s Resp. to Pl.’s PFOFs (dkt. #141) ¶ 30.) At the same time, Apple also proposes on summary judgment that “[m]any program instructions are ‘independent’ of each other and can safely execute-out-of-order with respect to each other.” (Def.’s PFOFs (dkt. #119) ¶ 24.) Even if this could be construed as a dispute, it is not material to the issues before the court on summary judgment. 
5 LOAD instruction was in fact dependent on the earlier STORE instruction), then the instructions will cause an error -- the prematurely executed LOAD instruction having obtained incorrect or stale data.6 In the patent-in-suit, this error is referred to as “mis-speculation,” although the Steely patent -- as described below -- refers to it as a “collision.” As discussed generally already, and as the patent-in-suit explains specifically, a mis-speculation can be detrimental to processor performance because it requires “the results of the prematurely executed dependent instructions [to] be discarded” or “squashed,” and the instruction will need to be re-executed in program order. (’752 patent (dkt. #1-1) 2:46-49; Def.’s PFOFs (dkt. #119) ¶ 33.)7 C. The ’752 Patent i. Overview and Prosecution History The ’752 patent, entitled “Table Based Data Speculation Circuit for Parallel Processing Computer,” was filed on December 26, 1996, and issued on July 14, 1998. The listed inventors are Drs. Andreas I. Moshovos, Scott E. Breach, Terani N. Vijaykumar, Gurindar S. Sohi. Plaintiff WARF is listed as the original assignee. WARF maintains that the named inventors conceived of the claimed invention no later than December 11, 1995. 6 Apple maintains that there will be no error, at least technically, if the yet-to-execute STORE instruction would not change the value already written to the memory address, because the LOAD instruction still obtains the correct data. (Def.’s Resp. to Pl.’s PFOFs (dkt. #141) ¶ 31.) 7 Apple contends that there may be times that the performance cost of mis-speculation does not outweigh the performance benefit of speculation. (Def.’s Resp. to Pl.’s PFOFs (dkt. #141) ¶ 34.) 6 During prosecution of the ’752 patent, the named inventors provided no prior art to the Patent Office. On October 8, 1997, the patent examiner issued a Notice of References Cited, which listed four pieces of prior art. 
The patent examiner rejected pending claims 1-2, 6 and 8-11 as anticipated in light of U.S. Patent No. 5,555,432 (“Hinton”). On January 5, 1998, WARF filed a response cancelling pending claims 9 and 10, but arguing that claims 1, 2, 6, 8 and 11 were allowable over Hinton. On February 3, 1998, the examiner allowed those claims. ii. Objectives and Specification The ’752 patent recognizes that “[t]he performance cost is a function of the frequency that speculation is required, the probability of mis-speculation and the time required to recover from a mis-speculation.” (’752 patent at 3:14-18.) The ’752 patent also observes that “most data dependent mis-speculations can be attributed to a few static STORE/LOAD instruction pairs,” and that mis-speculations typically “exhibit ‘temporal locality,’” such that “if one LOAD/STORE pair causes a data mis-speculation at a given point in time, it is highly likely that a later instance of the same pair will soon cause another mis-speculation.” (Id. at 3:51-57.) The patent further observes that The present inventors believe that a relatively limited number of LOAD/STORE pairs will create mis-speculation and so the operation described above prevents the majority of the LOAD/STORE pairs from being slowed in execution. The list of critical LOAD/STORE pairs is prepared dynamically in a synchronization method for those LOAD/STORE pairs . . . . 7 (Id. at 14:15-22.) Based on these observations, the inventors concluded that load-based memory dependencies may be amenable to history-based prediction.8 As such, the ’752 patent associates predictions with particular LOAD instructions that have mis-speculated in the past. The specification of the ’752 patent describes a processor containing a “data speculation circuit” that detects dependence between LOAD and STORE instructions. 
The data speculation circuit also detects mis-speculations where a LOAD instruction that is dependent for its data on a STORE instruction appearing earlier in program order is in fact executed before the STORE instruction. According to the preferred embodiment, if a mis-speculation is detected, the data speculation circuit sends a “mis-speculation indication” to a “predictor circuit,” which then uses the indication to produce a prediction. The greater the “prediction,” the greater the likelihood that the speculative execution of its associated LOAD instruction will cause a mis-speculation; the lower a prediction at a given time, the lower the likelihood of mis-speculation. The processor uses each prediction to decide whether its associated LOAD instruction should be allowed to execute speculatively.

8 Apple disputes that the named inventors were the first to develop history-based techniques for load-based memory dependencies. (Def.’s Resp. to Pl.’s PFOFs (dkt. #141) ¶ 42.)

The patent discloses a “three-tiered approach” to dealing with ambiguous dependency. The first tier considers whether a LOAD instruction has a history of mis-speculation. “If there is no history of data mis-speculation, [the instruction] is executed without further inquiry.” (’752 patent (dkt. #1-1) 3:64-66.) At this tier, the ’752 patent describes a “prediction table,” in which entries are created when the processor detects a mis-speculation by a LOAD instruction. “[I]f no entry is found in the prediction table,” then “the reasonable assumption is that speculation can proceed.” (Id. at 11:22-24.) The second tier becomes relevant when a LOAD instruction has mis-speculated in the past. In this tier, “a predictor based on the past history of mis-speculations for that LOAD instruction is employed to determine whether the instruction should be executed or delayed.” (Id. at 4:1-4.) 
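The first two tiers just described can be pictured with a short Python sketch. It is offered as an aid to understanding under stated assumptions, not as the patented circuit or any party's implementation: the class name `PredictionTable`, the use of a simple counter as the prediction, the `threshold` value of 2, and the instruction addresses are all hypothetical. The sketch also models only increases in a prediction after a mis-speculation; how (or whether) a prediction is lowered is not modeled.

```python
class PredictionTable:
    """Hypothetical model of tiers one and two.

    Maps a LOAD instruction's address to a prediction, here an
    integer count of detected mis-speculations (an assumption).
    """

    def __init__(self, threshold=2):
        self.threshold = threshold  # hypothetical cutoff for delaying a LOAD
        self.entries = {}           # load address -> prediction

    def record_mis_speculation(self, load_addr):
        # An entry is created on the first detected mis-speculation
        # and updated on each later one.
        self.entries[load_addr] = self.entries.get(load_addr, 0) + 1

    def may_speculate(self, load_addr):
        # Tier one: no entry means no history of mis-speculation,
        # so the LOAD executes without further inquiry.
        if load_addr not in self.entries:
            return True
        # Tier two: a history exists, so consult the prediction.
        return self.entries[load_addr] < self.threshold


table = PredictionTable(threshold=2)
load_a, load_b = 0x400, 0x404        # hypothetical LOAD instruction addresses

first = table.may_speculate(load_a)  # tier one: no entry yet
table.record_mis_speculation(load_a)
second = table.may_speculate(load_a) # tier two: prediction below threshold
table.record_mis_speculation(load_a)
third = table.may_speculate(load_a)  # tier two: prediction at threshold
other = table.may_speculate(load_b)  # unrelated LOAD is never slowed

print(first, second, third, other)
```

Because the table holds entries only for LOAD instructions that have actually mis-speculated, every other LOAD proceeds through tier one untouched, which matches the specification's stated aim of not slowing the majority of LOAD/STORE pairs.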
With respect to the second tier, the patent explains that “it is an object of the invention to provide a predictor circuit that may identify data dependencies on an on-going and dynamic basis.” (Id. at 4:31-33.) Finally, in cases where the prediction indicates that the LOAD instruction should not be executed speculatively, the third tier may be employed to decide when the LOAD instruction should be allowed to execute. This part of the patent describes a “synchronization table,” which “indicates whether there is in fact a pending LOAD instruction awaiting its dependent STORE instruction.” (Id. at 11:45-47.)

iii. Claim Construction

a) “Prediction”

The heart of the parties’ dispute turns on the meaning of the term “prediction” as used in claim 1 and all other independent claims. As context, WARF contends that the term “prediction” should be construed to mean “a variable that indicates the likelihood that the data speculative execution of a load instruction will result in a mis-speculation,” where “a ‘prediction’ must be capable of receiving ongoing updates.” (Pl.’s PFOFs (dkt. #122) ¶ 67.) In contrast, Apple contends that “prediction” need not be capable of receiving updates, and it, therefore, proposes a construction of “a value that indicates the likelihood that the data speculative execution of a load instruction will result in a mis-speculation,” but does not necessarily contemplate a revision to that value based on regular updates. (Id. at ¶ 68.) 
b) Other Agreed-Upon Terms

The parties agree to the following constructions of claim terms:

• “data speculation circuit” (claims 1 and 9): “a circuit that detects data dependence between load and store instructions and that detects mis-speculation by load instructions”

• “mis-speculation” (claims 1, 6, and 9): “when a load instruction that is dependent for its data on a store instruction appearing earlier in the program order is in fact executed before the store instruction wrote its data to a memory address shared with the load instruction”

• “in fact executed” (claims 1 and 9): “when a load instruction has actually accessed a memory address that has not yet been updated by a store instruction appearing earlier in the program order”

• “predictor” (claim 1): “a circuit that receives a mis-speculation indication from the data speculation circuit to produce a prediction”

D. State of the Prior Art

i. Overview

By 1995, out-of-order processing was well-known in the field of computer architecture design. Also by 1995, techniques for detecting data dependence were well-known in the art. On this much, the parties are in agreement.

Apple further maintains that by 1995, data speculation was well-known in the art, as were techniques for detecting and recovering from mis-speculations. WARF disputes this, asserting that the prior art techniques do not resemble the solutions proposed by the ’752 patent inventors. Apple also contends that by 1995, prediction techniques to improve the accuracy of speculation in an out-of-order processor were well-known in the art. WARF also disputes this, and in particular contends that the techniques disclosed in the prior art did not satisfy the “prediction” claimed in the ’752 patent -- the heart of the parties’ dispute addressed in the opinion below. Finally, Apple also maintains that by 1995, data speculation involving LOAD and STORE instructions was well-known in the art. 
WARF disputes that too, arguing the prior art techniques bore no resemblance to the solutions proposed in the ’752 patent.9 ii. The Steely Patent The “Steely patent” is titled “Memory Reference Tagging” and names Simon C. Steely, Jr., David J. Sager and David B. Fite, Jr. as inventors. The application was filed on August 12, 1995, and claims priority to an earlier application filed on November 12, 1992. The Steely patent issued as U.S. Patent No. 5,619,662 on April 8, 1997, and was assigned to DEC. As such, it is prior art to the ’752 patent. Apple contends that the Steely patent anticipates claims 1-3, 5, 6, and 9 of the ’752 patent. 9 Apple proposes finding of facts about other prior art references, including: a technique developed by Digital Equipment Corporation (“DEC”); U.S. Patent No. 5,666,506 (“Hesson”); and a commercial processor known as the Alpha 21264 or “EV6.” As best as the court can discern, however, these prior art references are only material to Apple’s motion for summary judgment on the objective prong of WARF’s willful infringement claim. As discussed below, the court reserves on that based on any arguments not developed fully at summary judgment, waiting instead to hear the evidence of infringement and invalidity to be introduced during the first phase of the trial. 11 Pertinent to this anticipation defense, all claims of the ’752 patent require a “prediction” associated with a LOAD instruction or with a LOAD/STORE pair. Apple maintains that the Steely patent describes a processor that executes instructions out of order and uses a prediction to determine whether to allow speculation for LOAD and STORE instructions. WARF asserts that Steely fails to disclose any “prediction” capable of receiving ongoing updates -- or even a “prediction” under Apple’s proposed construction of that claim term. E. 
Person of Ordinary Skill in the Art The parties dispute what characteristics a person of ordinary skill in the art would possess, though this dispute is not material to the parties’ respective motions for summary judgment, or at least the reasons for this court’s disposition of those motions. Apple maintains that for purposes of the ’752 patent, a person of ordinary skill in the art would have a Ph.D. in electrical engineering, computer engineering or computer science with a focus on computer architecture or microprocessor design; or an M.S. or B.S. degree in electrical engineering, computer engineering or computer science with significant work experience relating to computer architecture or microprocessor design. WARF maintains that a person of ordinary skill in the art would have at least a bachelor’s degree in electrical engineering or computer science, and at least three to five years of experience in computer design and computer architecture. Alternately, WARF asserts a person of ordinary skill in the art would have a master’s degree in electrical engineering or computer science, and at least two to three years of experience in 12 computer design and computer architecture. The experience could be derived from either industry or academia. F. IPR Decision Finally, Apple filed a petition with the United States Patent and Trademark Office’s Patent Trial and Appeal Board (“PTAB”) seeking inter partes review (“IPR”) of all claims of the ’752 patent. In the petition, Apple argued that claims 1-9 are invalid as obvious in view of Hesson and Steely, relying on a declaration of its expert Dr. Colwell. On April 15, 2015, after briefing by Apple and WARF, the PTAB denied Apple’s petition “as to all challenged claims,” finding that Apple “has not shown . . . that there is a reasonable likelihood that it will prevail” on its obviousness theory for any claim of the ’752 patent. (4/17/2015 Declaration of Christopher Abernathy (“4/17/15 Abernathy Decl.”), Ex. A (dkt. 
#151-1) p.3.) In particular, the PTAB construed “prediction” as “a variable that indicates the likelihood that the data speculative execution of a load instruction will result in a misspeculation.” (Id. at p.10.) In so finding, the PTAB reasoned: We agree that in the ’752 patent, the mis-speculation prediction at any point in time is a function of the misspeculation history of load-store instruction pairs. Thus, the prediction is a variable. The fact that the prediction has a particular value at each point in time is merely an indication of its functional relationship and does not change the nature of the prediction from a variable to a constant value. (Id.)10 10 In Wisconsin Alumni Research Foundation v. Intel Corp., No. 08-cv-78-bbc (W.D. Wis. filed Feb. 5, 2008), Judge Crabb similarly construed “prediction” in the same patent to mean “a variable that 13 OPINION I. Claim Construction “It is a ‘bedrock principle’ of patent law that ‘the claims of a patent define the invention to which the patentee is entitled the right to exclude.’” Phillips v. AWH Corp., 415 F.3d 1303, 1312 (Fed. Cir. 2005) (en banc) (quoting Innova/Pure Water, Inc. v. Safari Water Filtration Sys., Inc., 381 F.3d 1111, 1115 (Fed. Cir. 2004)). The court exclusively determines claim construction as a matter of law. Markman v. Westview Instruments, Inc., 517 U.S. 370, 372 (1996). The words of the claims are always the “appropriate starting point” for proper construction, Comark Commc’ns, Inc. v. Harris Corp., 156 F.3d 1182, 1186 (Fed. Cir. 1998), with the court asking “how a person of ordinary skill in the art understands a claim term” as an “objective baseline from which to begin claim interpretation,” Phillips, 415 F.3d at 1313. “Importantly, the person of ordinary skill in the art is deemed to read the claim term not only in the context of the particular claim in which the disputed term appears, but in the context of the entire patent, including the specification.” Id. 
In fact, “[t]he best source for understanding a technical term is the specification from which it arose, informed, as needed, by the prosecution history.” Multiform Desiccants, Inc. v. Medzam, Ltd., 133 F.3d 1473, 1478 (Fed. Cir. 1998). As the Federal Circuit has recognized, indicates the likelihood that the data speculative execution of a load instruction will result in misspeculation,” and later clarified that “a ‘prediction’ must be capable of receiving ongoing updates.” Wis. Alumni Research Found. v. Intel Corp., No. 08-cv-78-bbc, 2008 WL 4279975, at *7 (W.D. Wis. Sept. 18, 2008); Wis. Alumni Research Found. v. Intel Corp., 656 F. Supp. 2d 898, 922 (W.D. Wis. 2009). While the court agrees with Apple that this decision has no binding effect on this court (Def.’s Opening Br. (dkt. #118) 23), any more than the PTAB’s decision does, the court will obviously consider Judge Crabb’s and the PTAB’s reasoning and analysis in the opinion below. 14 however, “there is sometimes a fine line between reading a claim in light of the specification, and reading a limitation into the claim from the specification.” Liebel- Flarsheim Co. v. Medrad, Inc., 358 F.3d 898, 904 (Fed. Cir. 2004) (quoting Comark Commc’ns, 156 F.3d at 1186-87). “[A]n inherent tension exists as to whether a statement is a clear lexicographic definition or a description of a preferred embodiment. The problem is to interpret claims ‘in view of the specification’ without unnecessarily importing limitations from the specification into the claims.” E-Pass Techs., Inc. v. 3Com Corp., 343 F.3d 1364, 1369 (Fed. Cir. 2003). In addition to intrinsic evidence like the specification and prosecution history, the Federal Circuit has “authorized district courts to rely on extrinsic evidence, which ‘consists of all evidence external to the patent and prosecution history, including expert and inventor testimony, dictionaries, and learned treatises.’” Phillips, 415 F.3d at 1317 (quoting Markman v. 
Westview Instruments, Inc., 52 F.3d 967, 980 (Fed. Cir. 1995)). “However, while extrinsic evidence ‘can shed useful light on the relevant art,’ [the Federal Circuit has] explained that it is ‘less significant than the intrinsic record in determining “the legally operative meaning of claim language.”’” Id. (quoting C.R. Bard, Inc. v. U.S. Surgical Corp., 388 F.3d 858, 862 (Fed. Cir. 2004)). Accordingly, the court can consider extrinsic evidence in construing patent claims, but it must do so in the context of the intrinsic evidence and while keeping in mind the flaws inherent in each type of extrinsic evidence. Id. at 1318.

As previously mentioned, the parties dispute the proper construction of only one claim term, “prediction,” which appears in claims 1, 2, 3, 5 and 9 of the ’752 patent.11 The parties propose the following constructions for that term:

“Prediction”

Plaintiff WARF’s Proposed Construction: “A variable that indicates the likelihood that the data speculative execution of a load instruction will result in a mis-speculation” A “prediction” must be capable of receiving ongoing updates.

Defendant Apple’s Proposed Construction: “A value that indicates the likelihood that the data speculative execution of a load instruction will result in a mis-speculation” A “prediction” need not be capable of receiving ongoing updates.

The obvious and sole substantive difference between the parties’ competing constructions is whether the prediction must be capable of change (while implicit in the “value”/“variable” dispute, the second sentence of each definition makes that disagreement explicit). According to WARF, a “prediction” must be able to receive updates -- in other words, it must be dynamic. Apple, on the other hand, argues that a “prediction” may be dynamic, but it may also be static -- that is, incapable of receiving ongoing updates and changing to reflect those updates. 
The language of claim 1 reads as follows: In a processor capable of executing program instructions in an execution order differing from their program order, the processor further having a data speculation circuit for detecting data dependence between instructions and detecting a mis-speculation where a data consuming instruction dependent for its data on a data producing 11 The parties also mention a possible dispute about the meaning of “table,” but neither party sought construction at summary judgment. 16 instruction of earlier program order, is in fact executed before the data producing instruction, a data speculation decision circuit comprising: a) a predictor receiving a mis-speculation indication from the data speculation circuit to produce a prediction associated with the particular data consuming instruction and based on the mis-speculation indication; and b) a prediction threshold detector preventing data speculation for instructions having a prediction within a pre-determined range. (’752 patent, 14:36-52 (emphasis added).) The court can dispense with one of Apple’s arguments at the outset. Apple points out that none of the claims at issue contain an express limitation requiring the prediction to be updated on an ongoing basis, suggesting that this means predictions need not be capable of update. (Def.’s Br. Support Summ. J. (dkt. #118) 19-20.) As appealing as that simple construction might be, since it would alleviate the need for further analysis, the lack of an express limitation actually requires further inquiry: if the claims contained an additional limitation requiring a dynamic prediction, construing the term “prediction” to be intrinsically dynamic would render that limitation superfluous. See Digital-Vending Servs. Int’l, LLC v. Univ. of Phoenix, Inc., 672 F.3d 1270, 1275 (Fed. Cir. 
2012) (discussing the “well-established rule that claims are interpreted with an eye toward giving effect to all terms in the claim”) (internal quotation marks and citation omitted); cf. LSI Indus., Inc. v. ImagePoint, Inc., 279 F. App’x 964, 972 (Fed. Cir. 2008) (“Some claims specifically recite ‘an illuminated display device,’ while others recite only ‘a display device.’ . . . Thus, the language of the claims counsels against imposing an illumination limitation on the display device term because it would make the limitation superfluous where it explicitly 17 appears.”); Phillips, 415 F.3d at 1314 (“To take a simple example, the claim in this case refers to ‘steel baffles,’ which strongly implies that the term ‘baffles’ does not inherently mean objects made of steel.”). Thus, although the claims include no limitations explicitly requiring predictions to be dynamic, the word “prediction” itself still might (or might not) include that requirement depending on the claim language, specification, prosecution history and extrinsic evidence. WARF relies heavily on the fact that the claimed “data speculation decision circuit” prevents data speculation for instructions having a prediction “within a predetermined range.” According to WARF, the claimed function of determining whether a prediction falls within a given range makes sense only if, “at any given time after the ‘prediction’ is produced, it might be ‘within a predetermined range’ or it might not be.” (Pl.’s Br. Opp’n Summ. J. (dkt. #148) 8 (emphasis in original).) Apple argues in response that this interpretation narrows the claims in a way not supported by the text or the state of the prior art. In particular, Apple points out that prior art in the field, including an article entitled “A Study of Branch Prediction Strategies” by James E. Smith (Decl. of Bryan S. Conley, Ex. 12 (dkt. 
#124-12) [hereinafter “Smith” or the “Smith article”]), used the word “prediction” in the context of speculation strategies tracking single past events, rather than a dynamic history of such events.12 According to Apple, 12 The Smith article deals with control speculation, rather than data speculation. Control speculation involves “branch prediction.” In the words of the ’752 patent, it “might involve executing an instruction that follows a branch instruction without knowing the outcome of the branch (and thus whether the following instruction should have been executed or was branched around).” (’752 patent, 2:32-36.) 18 the claims certainly permit a dynamic prediction but are also broad enough to encompass a prediction incapable of receiving updates. In Wisconsin Alumni Research Foundation v. Intel Corp., 656 F. Supp. 2d 898 (W.D. Wis. 2009), this court relied in part on the same language WARF cites, finding that: Claim 1 describes “producing” a “prediction” from a “misspeculation indication” generated in a data speculation circuit and determining whether that “prediction” is “within a predetermined range” to decide whether to prevent data speculation. Thus, the claim language itself establishes that a “prediction” is something other than a stored “indication” and is capable of having a “range” of values . . . On its face, this language suggests that a prediction must be capable of changing over time. Id. at 922. Revisiting this same claim language here, the court again finds the contemplated use of a predetermined range of values to assess whether instructions should be permitted to speculate favors WARF’s narrower interpretation. By way of example, imagine a pair of instructions that mis-speculates for the first time. 
The parties agree that a single mis-speculation is enough to produce a prediction; thus, in this instance, the predictor of the invention would receive that mis-speculation indication from the data speculation circuit and use it to produce a prediction of “1,” representing the single mis-speculation. Under Apple’s construction, the development of the prediction can end here, because it need not be capable of further updates. Thus, the prediction would be set permanently at its initial value of “1.” Under this approach, the next time the instructions execute, there is no need for the prediction threshold detector to assess whether the prediction falls within a “predetermined range.” In a static situation, there are only two possibilities: either there is a prediction with a value of “1,” because the instructions have mis-speculated a single time; or there is no prediction, because the instructions have not yet mis-speculated and, therefore, the prediction has not yet been created. Thus, under Apple’s construction, the question for the prediction threshold detector is a binary determination of whether a prediction exists at all, rather than whether a prediction “falls within a given range.” Indeed, there would be no need for “a data speculation decision circuit” in claim 1 at all, feeding ongoing mis-speculation outcomes, since the “data speculation circuit” itself would provide the single piece of information required for a static prediction. Effectively, the prediction threshold detector would prevent data speculation.13 Said another way, Apple’s construction would read out the words “within a predetermined range” from subsection (b) of claim 1, or at least render them superfluous in the context of “predictions” incapable of receiving updates; in those cases, the prediction threshold detector would prevent data speculation “for instructions having a prediction,” full stop.
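The contrast between a static and a dynamic prediction can be sketched in a few lines of Python. The sketch is illustrative only and is not drawn from the ’752 patent or the record: the class and function names, the counter-style update rule, and the example range of 2 through 3 are all assumptions chosen for the example.

```python
from typing import Optional

# Assumed "predetermined range": predictions of 2 or 3 prevent speculation.
PREDETERMINED_RANGE = range(2, 4)


class Predictor:
    """Toy predictor for a single load/store pair (hypothetical names)."""

    def __init__(self, dynamic: bool) -> None:
        self.dynamic = dynamic                  # False models a static prediction
        self.prediction: Optional[int] = None   # None: no prediction produced yet

    def on_misspeculation(self) -> None:
        if self.prediction is None:
            self.prediction = 1                 # first mis-speculation produces "1"
        elif self.dynamic:
            self.prediction += 1                # later mis-speculations raise it

    def on_successful_speculation(self) -> None:
        if self.dynamic and self.prediction:
            self.prediction -= 1                # lower it when no conflict occurred


def prevents_speculation(p: Predictor, rng: range = PREDETERMINED_RANGE) -> bool:
    """Threshold test: is there a prediction, and is it within the range?"""
    return p.prediction is not None and p.prediction in rng
```

With a static predictor, the prediction is either absent or fixed at 1, so for any range that includes 1 the range test gives the same answer as asking whether a prediction exists at all. Only the dynamic predictor can move into and out of the range as outcomes accumulate.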
Of course, as Apple argues, a “range” can consist of a single value, which could technically allow for a “predetermined range” including only the value 1. Superficially, this provides some support for Apple’s construction, but it still does not explain why it would ever be necessary to compare an existing prediction to a range of 1 for so-called 13 Theoretically, it is possible that the prediction of “1” would not fall within the predetermined range and the instruction would be allowed to execute regardless of the previous mis-speculation. But if that were so, the invention would appear to serve no purpose, since the prediction would not prevent speculation and could not change, much less improve, the processor’s performance over time. Likely for this reason, Apple does not advance this argument, so the court does not consider it further. 20 static “predictions.” The choice is still binary -- either there is no prediction or the prediction is set to 1 -- and so the notion of “comparison” remains a poor fit for the kind of theoretical static “predictions” Apple posits, regardless of whether the predetermined range is set to encompass multiple values or a single value. The remainder of the ’752 patent further supports WARF’s construction. The brief summary of the invention describes a three-tiered approach for determining when an instruction should execute. The first tier encompasses instructions with no history of mis-speculation; they may execute “without further inquiry.” (’752 patent, 3:66.) The second tier implicates instructions that have previously mis-speculated. At that point, according to the description, the invention employs a predictor “to determine whether the instruction should be executed or delayed.” (Id. at 4:1-4.) If the prediction were static, however, the mere fact of its existence would be enough to prevent execution. 
In contrast, the predictor as described in the ’752 patent instead uses “the past history of mis-speculations” to determine whether the instruction may execute, allowing those that are “typically not dependent” to execute immediately. (Id. at 4:1-5 (emphasis added).) This language, too, suggests a prediction capable of update; it makes little sense to speak of instructions that are “typically not dependent” when a single instance of mis-speculation could, under Apple’s construction, foreclose future speculative execution without the possibility of updates to reflect what typically occurs. If the predictor ultimately delays the instruction, the third tier then employs a synchronization table to determine when the instruction should execute, delaying it “until after the execution of the particular data producing instruction” on which it depends. (Id. at 4:5-7, 27-28.) Furthermore, this three-tiered approach appears in the brief summary of the invention, rather than as a description of a single embodiment, making it more persuasive as a source of support for WARF’s narrower construction. See C.R. Bard, Inc. v. U.S. Surgical Corp., 388 F.3d 858, 864 (Fed. Cir. 2004) (“Statements that describe the invention as a whole, rather than statements that describe only preferred embodiments, are more likely to support a limiting definition of a claim term. . . . Statements that describe the invention as a whole are more likely to be found in certain sections of the specification, such as the Summary of the Invention.”). While less persuasive given its location in the patent, the detailed description of the invention provides further context suggesting that a “prediction” must be dynamic. As this court recognized in describing the preferred embodiment of the invention in Intel, “the specification explains in unequivocal terms that ‘[t]he prediction provided by the predictor circuit 33 . . . 
is updated based on historical mis-speculations detected by the data speculation circuit 30. For this reason, the data speculation circuit 30 must communicate with the predictor circuit 33 on an ongoing basis.’” Intel, 656 F. Supp. 2d at 922 (quoting ’752 patent, 8:7-11) (emphasis added). WARF also points to other examples supporting its position in the description of the preferred embodiment, including the description of the way the prediction normally “is incremented and decremented” such that “the higher the prediction 109, the more likelihood of misspeculation[.]” (’752 patent, 11:29-35 (emphasis added).) Indeed, throughout the description of the preferred embodiment, the specification consistently refers to the prediction as dynamic. (See, e.g., id. at 12:14-17 (“[T]he prediction that there was a need 22 to synchronize was wrong and so at process block 120 the prediction 109 is decremented toward the do not synchronize state.”) (emphasis added); 12:52-54 (“[T]he prediction 109 is updated toward the synchronize condition indicating that the prediction that there was a need to synchronize was correct[.]”) (emphasis added); 12:67-13:3 (“If [a misspeculation occurs and the pair is already in the prediction table] then at process block 302, the prediction 109 is updated toward synchronize so that this mis-speculation may be avoided in the future.”) (emphasis added).) Acknowledging, as it must, that the preferred embodiment describes a dynamic prediction that receives ongoing updates (Def.’s Br. Support Summ. J. (dkt. #118) 21; Def.’s Br. Opp’n Summ. J. (dkt. #140) 16 ), Apple relies on the general principle that “it is improper to read limitations from a preferred embodiment described in the specification -- even if it is the only embodiment -- into the claims absent a clear indication that the patentee intended the claims to be so limited.” Liebel-Flarsheim Co., 358 F.3d at 913. 
But this is not a case in which the “claim language is sufficiently broad that it can be read to encompass features not described in the written description, either by general characterization or by example in any of the illustrative embodiments.” Id. at 905. Rather, as described above, the claims themselves suggest that the contemplated “prediction” must be capable of change; the preferred embodiment merely provides further support for that conclusion. Use of the preferred embodiment as context, rather than as a source of limitations that do not otherwise appear in the claims, is permissible. Compare, e.g., Teleflex Inc. v. Ficosa N. Am. Corp., 299 F.3d 1313, 1327-28 (Fed. Cir. 2002) (district court erred in 23 holding that “clip” was limited to a “single pair of legs,” even where that was the only embodiment described, where claim language did not support that limitation, specification and prosecution history included no statements of restriction and the ordinary meaning of “clip” was not so restricted), with Toro Co. v. White Consolidated Indus., Inc., 199 F.3d 1295, 1301-02 (Fed. Cir. 1999) (holding that construction of “including” required attachment between structures where that was the only embodiment disclosed and where nothing in the remainder of the specification supported an unattached embodiment; “[T]he specification describes the advantages of the unitary structure as important to the invention. . . . No other, broader concept was described as embodying the applicant’s invention, or shown in any of the drawings, or presented for examination.”). Still, Apple argues that the ’752 patent does expressly contemplate alternative embodiments of the invention, pointing out that the detailed description of the invention states: It will be understood that the prediction 109 may be obtained by methods other than simply incrementing it in value for each speculation as described herein. 
For example, various weighting schemes can be provided to cause the predictor circuit 33, for example, to be less sensitive to the earliest mis-speculations. More complex pattern matching techniques may also be used, for example, to catch situations where mis-speculations occur in groups or regular patterns. (’752 patent, 14:6-14.) Apple contends that a person of ordinary skill in the art would read this discussion to allow for alternative embodiments in which a prediction is not updated on an ongoing basis and urges the court not to “improperly exclude a disclosed embodiment” by adopting WARF’s construction. See Broadcom Corp. v. Emulex Corp., 732 F.3d 1325, 1333 (Fed. Cir. 2013). The problem with this argument is that the patent simply does not disclose the embodiment Apple advocates. As a beginning point, neither of the alternative embodiments disclosed in the specification contemplates a static “prediction.” To the contrary, both proposed alternatives implicitly contemplate arrangements involving dynamic predictions. Both schemes that assign a different weight to later mis-speculations and techniques that identify mis-speculations occurring in groups or regular patterns assume a developing history of mis-speculations that the predictor circuit can use to obtain its prediction. There is no need to weight different instances of mis-speculation if the prediction is static and will never be updated to reflect those weights.14 There is likewise no need to develop complex matching techniques to identify patterns in mis-speculation if the prediction can never take that information into account in determining how likely an instruction is to mis-speculate. Adopting WARF’s construction, therefore, does not exclude a “disclosed embodiment” from the scope of the claims. Nor is there support for Apple’s proposed construction in the intrinsic evidence, Apple’s arguments to the contrary notwithstanding.
For instance, Apple argues that the specification makes clear that a single mis-speculation is enough to produce a prediction. (Def.’s Br. Support Summ. J. (dkt. #118) 18.) This is true enough, but it has no bearing 14 Apple uses the mention of “weighting schemes” to propose its own take on an alternative embodiment as well -- a “weighting scheme that always prevents speculation by a load instruction for which mis-speculation recovery would be especially costly.” (Def.’s Br. Support Summ. J. (dkt. #118) 23.) Such a weighting scheme would not, however, be a means of producing a “prediction.” Both parties agree that a prediction indicates the likelihood that a pair of instructions will mis-speculate; Apple’s embodiment has nothing to do with the likelihood of mis-speculation, but rather assesses whether the costs associated with a single mis-speculation are prohibitive, regardless of how likely or unlikely that mis-speculation is. 25 on whether a prediction must be capable of update after its creation -- nor does the fact that the claim language “does not specify any minimum number of times that the instruction must mis-speculate before the ‘prediction’ is above the threshold required to prevent speculation.” (Id.) Apple also argues that a “prediction” must be construed as a “value” (i.e., static) rather than a “variable” (i.e., dynamic) because the specification “explicitly describes the prediction as a ‘value.’” (Id. at 15-16.) But the examples it cites speak of the prediction being set to a “default value” or being “incremented in value.” All this confirms is that the prediction is some number that has a value; it does not suggest the value of that prediction cannot change. To the contrary, the portions of the specification Apple cites refer to “incrementing” the value of the prediction, suggesting that it can and does change. 
Thus, the court again adopts Judge Crabb’s conclusion in Intel that “[n]either the claim language nor the specification supports defendant’s proposed construction that a ‘prediction’ may include values that are fixed once to indicate a single incident of mis-speculation.” 656 F. Supp. 2d at 922. Finally, Apple contends that its construction finds support in extrinsic evidence, citing to the Smith article discussed above, as well as the reports of its two experts, Dr. August and Dr. Colwell. Both reports, however, primarily rehash Apple’s legal arguments by purporting to analyze the language in the specification and claims. (See August Report (dkt. #103) ¶¶ 136-47; Colwell Report (dkt. #104) ¶¶ 131-41.) The court rejects Apple’s positions on those issues, and so, too, expert reports that echo those same arguments. 26 Apple is, therefore, left with the Smith article and another paper, “Memory Dependence Prediction [U]sing Store Sets,” by George Z. Chrysos and Joel S. Emer (the “Chrysos paper”) (Conley Decl., Ex. 1 (dkt. #143-1)), both of which Apple contends describe techniques that produce static predictions. Even if Apple’s characterization were accurate, these two extrinsic references are wholly underwhelming compared to the language of the patent itself and contrary intrinsic evidence. Moreover, extrinsic evidence “can be used only to help the court come to the proper understanding of the claims”; it cannot be used to vary or contradict the claim language or specification. Vitronics Corp. v. Conceptronic, Inc., 90 F.3d 1576, 1584 (Fed. Cir. 1996). For all these reasons, the court finds WARF’s proposed construction of the term “prediction” compelling and will construe that term as requiring a prediction that is capable of receiving updates. II. Invalidity A. Anticipation by Steely On summary judgment, both parties devote most of their invalidity briefing to the question of whether the ’752 patent is invalid as anticipated by the Steely patent, U.S. Patent No. 
5,619,662. Evaluating a claim of anticipation involves a two-step inquiry. The first step requires proper construction of the meaning and scope of the claims. Power Mosfet Techs., L.L.C. v. Siemens AG, 378 F.3d 1396, 1406 (Fed. Cir. 2004). “The second step in the analysis requires a comparison of the properly construed term to the prior art[.]” Id. To demonstrate anticipation, “the proponent must show ‘that the four corners of a single, prior art document describe every element of the claimed invention.’” Net MoneyIN, Inc. v. VeriSign, Inc., 545 F.3d 1359, 1369 (Fed. Cir. 2008) (quoting Xerox Corp. v. 3Com Corp., 458 F.3d 1310, 1322 (Fed. Cir. 2006)). Although anticipation is ultimately a question of fact, “it may be decided on summary judgment if the record reveals no genuine dispute of material fact.” Leggett & Platt, Inc. v. VUTEk, Inc., 537 F.3d 1349, 1352 (Fed. Cir. 2008) (quoting Golden Bridge Tech., Inc. v. Nokia, Inc., 527 F.3d 1318, 1321 (Fed. Cir. 2008)). Both parties have moved for summary judgment of anticipation in light of Steely, and each relies on its own, preferred construction of the disputed term “prediction.” (Def.’s Br. Support Summ. J. (dkt. #118) 25-37; Pl.’s Br. Support Summ. J. (dkt. #120) 38-64.) Because the court has just rejected Apple’s construction of that term, Apple’s motion for summary judgment will be denied. Even if the court adopts WARF’s construction, however, Apple maintains that numerous disputed issues of fact preclude entry of summary judgment against it on grounds of anticipation. (Def.’s Br. Opp’n Summ. J. (dkt. #140) 27.) It is to that question the court now turns. i. Background of the Steely Patent The Steely patent is entitled “Memory Reference Tagging” and describes a processor that “includes a memory reference tagging store associated with the instruction scheduler so that the scheduler can reorder memory reference instructions without knowing the actual memory location addressed by the memory reference instruction.” (U.S. Patent No. 
5,619,662 (dkt. #131-4) Abstract.) Most relevant to the issue of anticipation, Steely discloses four different techniques in which a “write buffer” assigns “memory reference tags” to load and store instructions involved in a mis-speculation. Each of those techniques appears in the section of the patent entitled “Memory Reference Tagging.” (See id. at 47:35-49:8.) In the first technique, a mis-speculation generates a memory reference tag from a portion of the address in memory that resulted in the LOAD-STORE collision. (Id. at 48:2-4.) Once that portion is placed in the memory tag store, every time an instruction is retrieved from memory to be executed, the memory reference tag circuit “will provide the tag bits to be used by the instruction scheduler.” (Id. at 48:30-33.) If the instructions appear with identical tag bits (indicating a previous mis-speculation), the instruction scheduler will not reorder them. (Id. at 48:33-36.) In the second technique, the pair of instructions after a mis-speculation is tagged not with a portion of the memory address, but instead with “a problem number which could be a number provided from a counter.” (Id. at 48:55-57.) As a result, “[t]wo memory reference instructions with the same address and number will not reorder.” (Id. at 48:57-59.) “However, if the two memory reference instructions have a different number, the instructions will reorder.” (Id. at 48:59-61.) The counter does not appear to increment for the same pair of instructions once it has assigned the “problem number”; rather, it increments when a mis-speculation occurs with respect to a different pair of instructions. For instance, if a pair of instructions mis-speculates and is assigned the problem number 0, the next pair to mis-speculate might be assigned the problem number 1. The third technique is to assign an instruction a “bit to indicate that an instruction should not be reordered.” (Id. at 48:62-63.)
Thus, using this technique, “for a store that previously caused a problem in the write buffer, the instruction is tagged with a bit indicating that the ISCHED 38 [instruction scheduler] cannot reorder memory reference instructions around the Instruction tagged with the bit.” (Id. at 48:63-67.) The final technique is to “turn off reordering” entirely in response to a mis-speculation under certain circumstances. (Id. at 49:4-5.) For example, the patent suggests turning off reordering when entering a subroutine, based on the general observation that “during a subroutine call, there are some initial stores and some exiting loads” and “[i]t would not be desirable to reorder the exiting loads before the initial stores.” (Id. at 49:4-9.) ii. Analysis Construing “prediction” as a dynamic (updating) “variable that indicates the likelihood that the data speculative execution of a load instruction will result in a misspeculation,” the remaining question for deciding the anticipation issue before the court is whether Steely discloses a prediction that can change over time. In Intel, this court found that it did not, holding that the requirement of a dynamic prediction was “fatal to defendant’s contention that the four techniques disclosed in the ’662 patent anticipate the ’752 patent.” 656 F. Supp. 2d at 922. WARF urges the court to adopt the same result here, arguing that Steely’s four techniques do not disclose predictions that update on an ongoing basis. According to WARF, those techniques simply involve tagging instructions to reflect a single mis-speculation event, without providing a mechanism to update those tags. See also Intel, 656 F. Supp. 2d at 922 (“For each [technique], the tag is designed to indicate only that a mis-speculation has occurred, not keep track of misspeculations on an ongoing basis.”). Unsurprisingly, Apple objects to this characterization.
Apple instead contends that the outcome of Steely’s tag comparison “can change over time for the same pair of load and store instructions.” (Def.’s Br. Opp’n Summ. J. (dkt. #140) 29.) According to Apple’s expert, Dr. Colwell, that can occur if, after the write buffer assigns the tags, additional mis-speculations involving one of the pair of tagged instructions occur. As an example, Dr. Colwell presumes a situation in which a load instruction, “Inst 1011,” and store instruction, “Inst 1007,” have been tagged with the same memory address of “10010,” such that Steely would prevent speculation. (Colwell Report (dkt. #104) ¶ 303.) Colwell then posits another situation in which a different load instruction, Inst 1012, also mis-speculates with store instruction Inst 1007: Another load instruction, for instance Inst 1012, may later be reordered ahead of store instruction Inst 1007, both Inst 1012 and Inst 1007 accessing the same memory address, this address ending in a different set of 5 bits, for instance “00110.” Inst 1007 would then be associated with the tag “00110,” which would no longer be identical to the tag “10010” associated with the load instruction Inst 1011. Because the tags for Inst 1007 and Inst 1011 are no longer identical, Steely predicts they are not dependent and may reorder them. Thus, the “prediction” disclosed by Steely is a “variable” that is “capable of receiving ongoing updates,” as required by WARF’s proposed construction of the term “prediction.” (Id.) Whether Steely actually discloses this means of “updating” its tags within the four corners of the patent is certainly open to debate. Apple asks the court to infer as much, 31 based on the fact that: (1) the memory reference tag store is large enough to store just one tag per instruction; and (2) the patent describes how tags for each mis-speculation “will be stored” regardless of other tags that may already exist for those instructions. 
According to Apple, these two facts demonstrate that Steely overwrites previously stored tags, thereby “updating” the result of any comparison that Steely performs between the two. The flaws in this argument are multiple. Essentially, Apple and its experts assume what amounts to a defect in Steely, which prevents it from assigning more than one tag preventing mis-speculation to a single instruction (in Dr. Colwell’s example, Inst 1007), even though the example posits tags drawn from different memory addresses (“10010” and “00110”), depending on the load instruction with which the store instruction Inst 1007 is paired (here, Inst 1011 and Inst 1012). Not only is this assumption contradicted by the language in Steely, see U.S. Patent No. 5,619,662, at 47:37-43 (“The memory reference tag store . . . provides at least one bit associated with said instruction . . .”) (emphasis added), but it would undermine the whole purpose of Steely, which is to prevent future mis-speculations, since it would result in a never-ending loop for load instructions causing multiple mis-speculations each time the 10010 and 00110 tags overwrite one another.15 15 The inventor of Steely does appear to have admitted in his deposition that in his view, this is how his invention would function, though he was asked the question out of context and without being asked about the obvious defect this would appear to create in his patent. (See Def.’s Br. Opp’n Summ. J. (dkt. #140) 31.) As WARF points out, this is why after-the-fact testimony of the inventor is of limited relevance when unsupported by the patent itself. See Howmedica Osteonics Corp. v. Wright Med. Tech., Inc., 540 F.3d 1337, 1346 (Fed. Cir. 
2008) (“The testimony of an inventor ‘cannot be relied on to change the meaning of the claims.’”) (quoting Markman, 52 F.3d at 983). What matters for purposes of anticipation is what the patent actually discloses, not what the inventor says it would do in a situation the patent does not clearly address.
More importantly for summary judgment purposes, even assuming one might infer that such overwriting occurs and constitutes “updating,” the above-described “tag replacement system” would hardly constitute a “prediction” as this court has construed the term. Properly construed, a “prediction” must communicate the likelihood of mis-speculation and must be capable of update. Using the example offered by Dr. Colwell for the sake of simplicity, Inst 1011 and Inst 1007 proved to be dependent and were, accordingly, tagged with the same memory address. Thereafter, another load instruction, Inst 1012, also proves to be dependent on store instruction Inst 1007. Accordingly, Steely overwrites the first tag on Inst 1007, tagging it to match Inst 1012, but that is not so much an “update” of the comparison between Inst 1007 and Inst 1011 as it is the wholesale elimination of that comparison. By Apple’s and Dr. Colwell’s own description, no record of the previous mis-speculation remains; the next time the tags are compared under Steely, they fail to reflect that any mis-speculation has occurred in the past and, therefore, fail to communicate the likelihood that the data speculative execution of the load instruction 1011 and store instruction 1007 will result in a mis-speculation. In contrast, the invention of the ’752 patent incrementally increases the prediction for each mis-speculation associated with an instruction pair, while it decrements the prediction associated with a pair of instructions when they do not mis-speculate, thereby updating its assessment of the likelihood of mis-speculations in the future. Steely’s tag replacement system, even as explained by Dr. Colwell, discards the prediction associated with a pair of instructions when a different pair of instructions mis-speculates. While this data elimination admittedly yields a change in the result of the tag comparison, that change has nothing to do with updating the likelihood that the first pair of instructions will mis-speculate again in the future. Accordingly, no reasonable jury could find that the Steely patent discloses each and every limitation of the ’752 patent as properly construed. The court will, therefore, grant summary judgment to WARF on defendant’s Steely anticipation defense and counterclaim. B. Indefiniteness Finally, Apple contends that claims 5 and 6 of the ’752 patent are invalid as indefinite. “[A] patent is invalid for indefiniteness if its claims, read in light of the specification delineating the patent, and the prosecution history, fail to inform, with reasonable certainty, those skilled in the art about the scope of the invention.” Nautilus, Inc. v. Biosig Instruments, Inc., 134 S. Ct. 2120, 2124 (2014). A party raising an indefiniteness challenge, like other invalidity challenges, bears the burden of proving that invalidity by clear and convincing evidence. Microsoft Corp. v. i4i Ltd. P’ship, 131 S. Ct. 2238, 2242 (2011); see also 35 U.S.C. § 282. Here, Apple contends that claims 5 and 6 of the ’752 patent should be held indefinite under Nautilus solely because certain terms in those claims lack an antecedent basis.
Claim 5 is a dependent claim and reads: The data speculation decision circuit of claim 2 wherein the instruction synchronization circuit includes a synchronization table associating the certain data consuming instructions and the certain data producing instructions each with a flag value indicating whether the respective certain data producing instruction has been executed and wherein the instruction synchronization circuit delays the particular data consuming instruction only: i) when the prediction associated with the data consuming instruction is within a predetermined range; and ii) when the particular data consuming instruction is in the prediction table; and iii) when the flag indicates the particular data producing instruction has not been executed. (’752 patent, 15:7-20 (emphasis added).) Claim 6 likewise depends from claim 2 and reads: The data speculation decision circuit of claim 2 wherein the instruction synchronization circuit creates an entry in the synchronization table including the particular data consuming instructions and data producing instructions and the flag value only after a mis-speculation indication is received for the particular data consuming instruction and the particular data producing instruction. (’752 patent, 15:21-27 (emphasis added).) Apple focuses on the italicized portions of each of the above claims in making its § 112 argument. According to Apple, the use of the definite article “the” in each of the above italicized instances suggests that the terms that article introduces must refer to specific claim elements already previously discussed. (Def.’s Br. Support Summ. J. (dkt. #118) 39.) See also, e.g., Warner-Lambert Co. v. Apotex Corp., 316 F.3d 1348, 1356 (Fed. Cir. 2003) (“[I]t is a rule of law well established that the definite article ‘the’ particularizes the subject which it precedes. It is a word of limitation as opposed to the indefinite or generalizing force of ‘a’ or ‘an.’”) (quoting Am. Bus Ass’n v. 
Slater, 231 F.3d 1, 4-5 (D.C. Cir. 2000)). As Apple points out, the italicized terms above do not appear elsewhere in 35 claims 5 and 6 themselves, or in claims 1 and 2, on which both claims 5 and 6 ultimately depend. In Apple’s view, this makes it impossible for a person of skill in the art to determine the scope of claims 5 and 6, rendering them indefinite. In Halliburton Energy Services, Inc. v. M-I LLC, 514 F.3d 1244 (Fed. Cir. 2008), the Federal Circuit held that “a claim could be indefinite if a term does not have proper antecedent basis where such basis is not otherwise present by implication or the meaning is not reasonably ascertainable.” Id. at 1249; see also Energizer Holdings, Inc. v. Int’l Trade Comm’n, 435 F.3d 1366, 1370 (Fed. Cir. 2006) (citing Slimfold Mfg. Co. v. Kinkead Indus., Inc., 810 F.2d 1113, 1116 (Fed. Cir. 1987)). The specification can, however, provide sufficient context for a person skilled in the field of the art to understand the claim to render it definite. See, e.g., In re Skvorecz, 580 F.3d 1262, 1268 (Fed. Cir. 2009) (“We agree with Mr. Skvorecz that the clause ‘welded to said wire legs at the separation’ does not require further antecedent basis in claim 1, for a person skilled in the field of the invention would understand the claim when viewed in the context of the specification.”). Here, the terms in question are “reasonably ascertainable” in light of the patent’s specification. Taking first the terms “the certain data consuming instructions” and “the certain data producing instructions” in claim 5, the patent’s specification summarizes the invention and notes that the invention’s instruction synchronization circuit: may also include a synchronization table associating certain data consuming instructions and certain data producing instructions, each with a flag indicating whether the respective data producing instruction has been executed. 
The instruction synchronization circuit delays the subsequent instances of the certain data consuming instruction only when the prediction associated with the data consuming instruction is within a predetermined range and when the particular data consuming instruction is in the prediction table and when the flag indicates that particular data producing instruction has not been executed.

(’752 patent, 4:54-65 (emphasis added).) As WARF points out, this portion of the specification tracks the language of claim 5 almost exactly. There is no reason why a person of ordinary skill in the art would not read “the certain data consuming instructions” and “the certain data producing instructions” to be those included in the synchronization table in light of the specification. At the very least, the brief summary of the invention allows one skilled in the art to proceed with “reasonable certainty,” as Nautilus requires.

The term “the prediction table” in subsection (ii) of claim 5 would likewise inform a person of ordinary skill in the art that the “prediction table” is contained in the instruction synchronization circuit. As the brief summary of the invention states, “[t]he instruction synchronization circuit may include a prediction table listing certain data consuming instructions and certain data producing instructions each associated with a prediction.” (’752 patent, 4:39-42 (emphasis added).) The instruction synchronization circuit then employs the entries in that prediction table in determining whether to delay subsequent instances of the data consuming instruction -- the instruction must be in the prediction table for delay to take place. (Id. at 4:48-53.)

As for claim 6, the “synchronization table” is the one that “may” be included in the instruction synchronization circuit (which is explicitly claimed in independent claim 2) and “associate[s] certain data consuming instructions and certain data producing instructions, each with a flag indicating whether the respective data producing instruction has been executed.” (’752 patent, 4:54-58.) The “flag value” likewise takes its meaning from this portion of the specification, which indicates that each pair of instructions in the synchronization table has “a flag indicating whether the respective data producing instruction has been executed.” (See id.) The invention then uses the flag to determine when to delay execution of subsequent instances of the data consuming instruction. (Id. at 4:58-65.) A person of ordinary skill in the art would understand the scope of “the flag value” in claim 6 in light of this relatively clear context. (Id.)

Importantly, because Apple does not dispute that the specification offers context for the claim terms it identifies, that argument is waived. See Fresenius USA, Inc. v. Baxter Int’l, Inc., 582 F.3d 1288, 1296 (Fed. Cir. 2009) (“If a party fails to raise an argument before the trial court, or presents only a skeletal or undeveloped argument to the trial court, we may deem that argument waived on appeal.”); Jordan v. Binns, 712 F.3d 1123, 1134 (7th Cir. 2013) (undeveloped arguments considered waived); Ultratec, Inc. v. Sorenson Commc’ns, Inc., No. 13-cv-346-bbc, 2014 WL 3565409, at *1 (W.D. Wis. July 17, 2014).

Regardless, Apple takes an entirely different tack, one which requires a bit of explanation. According to Apple, in light of the antecedent basis problems in claims 5 and 6, a person of ordinary skill in the art might simply look to the specification to understand the scope of the invention.
However, Apple argues, she might also assume that claims 5 and 6 do not, in fact, depend from claim 2 but instead were intended to depend from claims 3 and 5, respectively, which would provide the requisite antecedent basis for the identified terms, but would also include additional limitations by virtue of depending from different claims. (Def.’s Br. Opp’n Summ. J. (dkt. #140) 41.)

The court does not find Apple’s argument persuasive. Apple cites no cases in which courts found indefiniteness due solely to a lack of antecedent basis, at least where the specification so clearly delineates the structure of what the patent intended to claim. Instead, Apple cites Novo Industries, L.P. v. Micro Molds Corp., 350 F.3d 1348 (Fed. Cir. 2003), for the proposition that claims are indefinite where “in light of the mistakes in the claims there is no clear choice as to how to interpret their scope.” (Def.’s Br. Opp’n Summ. J. (dkt. #140) 46.) But Novo involved an obvious typographical error amenable to no fewer than four possible interpretations (at least one of which would have significant substantive implications for the scope of the claims).16 Novo does not support this court reading in a typographical error to create ambiguity where the specification otherwise indisputably provides context to delineate the scope of the invention “with reasonable certainty.” Nautilus, 134 S. Ct. at 2124.

16 In Novo, the claim included a “stop means formed on a rotatable with said support finger.” 350 F.3d at 1352 (emphasis removed). Novo suggested correcting the claim either by deleting the words “a rotatable with” or by deleting the words “with said.” Id. at 1357. The district court raised another possibility by changing the word “a” to “and.” Id. And Micro Molds proposed as a fourth possibility that a word, such as “skirt” or “disk,” might have been erroneously omitted, which would add an additional substantive limitation to the claims. Id. Because the Federal Circuit “[could not] know what correction [was] necessarily appropriate or how the claim should be interpreted,” it concluded that the claim was necessarily indefinite “in its present form.” Id. No comparable indefiniteness is even arguable in this case.

The other case Apple cites, Automed Technologies, Inc. v. Microfil, LLC, 244 F. App’x 354 (Fed. Cir. 2007), is similarly unhelpful to its indefiniteness argument. In Automed, the Federal Circuit vacated and remanded a grant of summary judgment of non-infringement because the district court had based its ruling on a finding that the accused systems lacked a “controller” -- a limitation that was actually absent from the asserted claims. Id. at 359. In the midst of that discussion, the Federal Circuit observed:

We also note that claim 27 of the ’671 patent, which recites “the controller,” appears to be mistakenly dependent on claim 20, in which this term finds no antecedent basis. . . . Because claim 21 - and not claim 20 - recites a “controller” limitation, perhaps claim 27 was intended to depend from claim 21.

Id. Even so, the Federal Circuit said nothing about that potential error rendering claim 27 indefinite. Rather, it “[left] to AutoMed any corrective action it deem[ed] necessary.” Id. The Federal Circuit’s observation that claim 27 might have been intended to depend from claim 21, not claim 20, certainly does not compel, or even do much to support, a finding of indefiniteness in this case.

Accordingly, the court finds that the specification provides ample guidance as to what elements the claims are referencing when they refer to “the certain data consuming instructions,” “the certain data producing instructions” and “the prediction table” (claim 5), as well as “the synchronization table” and “the flag value” (claim 6).
Even the authority upon which defendant relies indicates that a lack of antecedent basis renders a claim indefinite only if “it would be unclear as to what element the limitation was making reference.” Manual of Patent Examining Procedure § 2173.05(e) (9th ed. 2014); see also Halliburton, 514 F.3d at 1249. That is simply not the case here.

III. Willful Infringement

WARF has alleged a claim that Apple’s infringement was willful, thereby permitting (but not requiring) the court to award enhanced damages. 35 U.S.C. § 284 (“[T]he court may increase the damages up to three times the amount found or assessed.”); Beatrice Foods Co. v. New Eng. Printing & Lithographing Co., 923 F.2d 1576, 1578 (Fed. Cir. 1991) (“It is well-settled that enhancement of damages must be premised on willful infringement or bad faith.”) (citations omitted). Apple seeks summary judgment on this claim on the basis that WARF cannot as a matter of law meet the threshold for proving willfulness on an objective basis.

To establish willful infringement, WARF “must show by clear and convincing evidence” (1) that “the infringer acted despite an objectively high likelihood that its actions constituted infringement of a valid patent,” and (2) that “this objectively-defined risk . . . was either known or so obvious that it should have been known to the accused infringer.” In re Seagate Tech., 497 F.3d at 1371. The former “objective determination of recklessness” is a question for the court, not the jury. Bard Peripheral Vascular, Inc. v. W.L. Gore & Assocs., Inc., 682 F.3d 1003, 1006-07 (Fed. Cir. 2012). “[T]he ‘objective’ prong of Seagate tends not to be met where an accused infringer relies on a reasonable defense to a charge of infringement.” Id. at 1005-06 (internal citation and quotation marks omitted); see also Spine Solutions, Inc. v. Medtronic Sofamor Danek USA, Inc., 620 F.3d 1305, 1319 (Fed. Cir.
2010) (overturning jury’s finding of willful infringement, finding that defendant raised a “substantial question as to the obviousness” of the patent in suit); Douglas Dynamics, 747 F. Supp. 2d at 1112 (granting summary judgment on willful infringement claim where there was “reasonable difference of opinion” and a “close question”).

In cursory fashion, Apple’s opening brief advances a wide range of arguments for seeking summary judgment on this objective prong. Some of the bases were fully briefed for review on the merits – namely, Apple’s claim construction of “prediction,” its related argument on anticipation by Steely and its indefiniteness defense and counterclaim as to claims 5 and 6. The court will take up Apple’s motion on these bases in the discussion below.

Other bases, including ones on which Apple bears the burden of proof like obviousness, were not, however, the subject of the parties’ motions for summary judgment. While the court appreciates that it is WARF’s burden to demonstrate that Apple’s defenses to infringement or claims of invalidity are not objectively reasonable, Apple’s scattershot approach in its motion renders the task nearly impossible to resolve on summary judgment. Perhaps if Apple had identified two or three of its strongest arguments, this might have been a manageable task. Instead, Apple’s treatment of each basis is limited to a paragraph or two in its opening brief and reflects ships passing in the night in reply to WARF’s responses.17 In any event, WARF did come forward with evidence and law that, despite Apple’s attempt to refute it in reply, could lead to a finding that Apple’s belief that it either did not infringe the ’752 patent or that the patent was invalid was not objectively reasonable.18 As such, the court will await a more robust demonstration of the merits of Apple’s defenses and WARF’s infringement claims at trial.19

17 Perhaps most telling, the few defenses that Apple moved on the merits do not offer grounds for the court to find for Apple on the objective prong of WARF’s willful infringement claim.

18 Certainly, Apple seems to make an objectively reasonable argument as to claims 1 and 2 being obvious, and perhaps as to claims 3 and 9, but the court cannot say on this record whether the supposed links drawn between Steely, Hesson, Chen and EV6 are obvious or pure sophistry. Similarly, while Apple raises a number of arguments that appear to objectively establish noninfringement on a literal basis, it has left the court unconvinced as to infringement under the doctrine of equivalents.

19 To clarify, while the jury is deliberating on liability, the court can take up any additional evidence and argument relevant to the objective prong of WARF’s willful infringement claim and likely will render a decision on the objective prong before the parties present any evidence on the subjective prong during the second phase of the trial (assuming the jury finds infringement and does not find invalidity).

Returning to those bases which were fully briefed for review on the merits, Apple’s claim construction is arguably “objectively reasonable” if viewed purely in a vacuum. Apple presented some evidence that “prediction” can describe a static value in the context of computer circuits and speculation, for example, in the form of the Smith article and Chrysos paper; Apple is also correct that the patent does not explicitly define “prediction,” ostensibly leaving at least some room for debate. The problem is that nothing in the patent -- not the claim language, not the specification, not the purpose of the invention -- supports Apple’s construction. As discussed above, the claim language from the outset suggests that a prediction must be dynamic in the context of this particular invention. The specification, including both the brief summary of the invention and the detailed description of the embodiments, further supports this construction. And Apple’s resort to extrinsic evidence fails to render its arguments to the contrary any more reasonable, given that extrinsic evidence cannot be used to vary the intrinsic evidence under settled principles of claim construction. While superficially appealing, not unlike a siren’s song, Apple’s construction crashes against the rocks of the patent language itself and intrinsic evidence.

Given how strongly the patent itself supports WARF’s narrower construction, and how little Apple has to offer in support of its broader one, Apple’s position is not objectively reasonable. Compare Cohesive Techs., Inc. v. Waters Corp., 543 F.3d 1351, 1374 (Fed. Cir. 2008) (no willfulness where disputed term “was susceptible to a reasonable construction under which [the] products did not infringe”), with SSL Services, LLC v. Citrix Systems, Inc., 769 F.3d 1073, 1091 (Fed. Cir. 2014) (affirming district court’s finding of willful infringement, in part, because defendant’s non-infringement defense based on an unwarranted limitation of a claim term was not objectively reasonable); cf. Raylon, LLC v. Complus Data Innovations, Inc., 700 F.3d 1361, 1369 (Fed. Cir. 2012) (finding position on claim construction frivolous under Rule 11 where proffered construction was “contrary to all the intrinsic evidence and does not conform to the standard canons of claim construction”). Finally, while Judge Crabb granted summary judgment to the defendant in Intel on WARF’s willful infringement claim, she did so on a basis unrelated to claim construction and one not before this court. Intel, 656 F. Supp. 2d at 924 (finding defendant’s licensing defense objectively reasonable).
Accordingly, the court will deny defendant’s motion for summary judgment on plaintiff’s willful infringement claim that depends on Apple’s claim construction, finding this defense objectively unreasonable.

As for Apple’s anticipation challenge to the validity of the ’752 patent based solely on Steely, the court finds this defense not objectively reasonable as well, though it will reserve on any obviousness defense involving Steely. Much of Apple’s anticipation argument depended upon its claim construction, which was not objectively reasonable as discussed above. Admittedly, Apple attempted to maintain an anticipation defense even under WARF’s construction, but its dependence on Steely’s purported, defective “tag overwriting” scheme is likewise unreasonable, given that this dubious overwriting defect would simply dispose of previous predictions, rather than “updating” them to reflect an increased likelihood of future mis-speculation.

The court will also deny Apple’s motion with respect to its indefiniteness defense. Apple points to no case in which a lack of antecedent basis led to a finding of indefiniteness despite clear context providing that basis in the specification. Even the case law Apple cites explains that there is no invalidity for indefiniteness so long as the antecedent basis is present by implication, and Apple waived any contention that the specification did not serve to provide such context. As a whole then, this defense was not objectively reasonable, and Apple cannot use it to escape the possibility of enhanced damages.

ORDER

IT IS ORDERED that:

1) Plaintiff Wisconsin Alumni Research Foundation’s construction of the disputed term “prediction” is ADOPTED as described in this opinion.

2) Defendant and counter claimant Apple, Inc.’s motion for summary judgment (dkt. #116) is DENIED as to its counterclaims and defenses of anticipation by Steely and indefiniteness, and DENIED as to plaintiff’s willful infringement claim premised on (1) Apple’s claim construction, (2) anticipation by Steely, and (3) indefiniteness of claims 5 and 6. The court RESERVES on the motion in all other respects.

3) Plaintiff’s motion for summary judgment (dkt. #117) is GRANTED.

Entered this 5th day of August, 2015.

BY THE COURT:
/s/ __________________________________
WILLIAM M. CONLEY
District Judge
