The New York Times Company v. Microsoft Corporation et al

Filing 344

OPINION & ORDER re: 236 LETTER MOTION for Conference to Compel addressed to Magistrate Judge Ona T. Wang from Elana Nightingale Dawson, Vera Ranieri, and Michelle Ybarra dated September 3, 2024. filed by OpenAI, Inc., OpenAI OpCo LLC, OpenAI Holdings, LLC, OpenAI LP, OpenAI Global LLC, OpenAI LLC, OpenAI GP, LLC, OAI Corporation, LLC. This case is about whether Defendant trained their LLMs using Plaintiff's copyrighted material, and whether that use constitutes copyright infringement. (ECF 170, ¶¶ 158-168). It is not a referendum on the benefits of Gen AI, on Plaintiff's business practices, or about whether any of Plaintiff's employees use Gen AI at work. The broad scope of document production sought here is simply not relevant to Defendant's purported fair use defense. For example, if a copyright holder sued a video game manufacturer for copyright infringement, the copyright holder might be required to produce documents relating to their interactions with that video game manufacturer, but the video game manufacturer would not be entitled to wide-ranging discovery concerning the copyright holder's employees' gaming history, statements about video games generally, or even their licensing of different content to other video game manufacturers. Accordingly, because Defendant has failed to demonstrate the relevance of the information sought, Defendant's motion to compel is DENIED. The Clerk of Court is respectfully directed to close ECF 236. (Signed by Magistrate Judge Ona T. Wang on 11/22/2024) (sgz)

UNITED STATES DISTRICT COURT
SOUTHERN DISTRICT OF NEW YORK

THE NEW YORK TIMES COMPANY, Plaintiff,
-against-
MICROSOFT CORPORATION, OPENAI, INC., et al., Defendants.

23-cv-11195 (SHS) (OTW)

OPINION & ORDER

ONA T. WANG, United States Magistrate Judge:

I. BACKGROUND

The New York Times (the “Times” or “Plaintiff”) brought this action alleging, inter alia, that Defendants unlawfully used Plaintiff’s copyrighted works to train Defendants’ large language models (“LLMs”). Defendant OpenAI, Inc. (“Defendant”) seeks to compel[1] production of: (1) the Times’s use of nonparties’ generative artificial intelligence (“Gen AI”) tools; (2) the Times’s creation and use of its own Gen AI products; and (3) the Times’s position regarding Gen AI (e.g., positions expressed outside of litigation, knowledge about the training of third-party Gen AI tools using the Times’s works). (ECF 236). Defendant asserts that this outstanding discovery is relevant to their fair use defense. (ECF 236). Plaintiff asserts that the disputed discovery concerning Plaintiff’s interactions with their own and nonparties’ Gen AI tools are neither relevant nor proportional to the needs of the case. (ECF 238). Because Defendant has not demonstrated the relevance of the information sought, their motion to compel is DENIED.

[1] Plaintiff has already provided or agreed to produce: (1) documents regarding the Times’s use of the Defendants’ Gen AI tools in reporting or presentation of content, and documents regarding the Times’s trainings about Defendants’ Gen AI products; (2) documents relating to the Times’s A.I. Initiatives program; and (3) nonprivileged documents and communications with third parties about the Defendants’ use of Times content in their Gen AI products and this litigation and whether to license Times works to OpenAI. (ECF 238).

II. LEGAL STANDARD

Federal Rule of Civil Procedure 26(b)(1) permits discovery of “any nonprivileged matter that is relevant to any party’s claim or defense and proportional to the needs of the case.” The party moving to compel, here OpenAI, “bears the initial burden of demonstrating relevance and proportionality.” See Winfield v. City of New York, No. 15-CV-5236 (LTS) (KHP), 2018 WL 840085, at *3 (S.D.N.Y. Feb. 12, 2018). “Motions to compel and motions to quash a subpoena are both entrusted to the sound discretion of the court.” Howard v. City of New York, No. 12-CV-933 (JMF), 2013 WL 174210, at *1 (S.D.N.Y. Jan. 16, 2013).

III. DISCUSSION

The Copyright Act (the “Act”) allows for certain “fair” uses of copyrighted works and sets out four non-exclusive factors for courts to consider in determining whether a particular use is “fair”:

I. the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
II. the nature of the copyrighted work;
III. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
IV. the effect of the use upon the potential market for or value of the copyrighted work.

Hachette Book Group, Inc. v. Internet Archive, 115 F.4th 163, 178-79 (2d Cir. 2024). Each of these factors requires scrutiny of a defendant’s purported use of the copyrighted work(s), and whether that defendant’s use may constitute “fair use” under the Act. The factors do not require a court to examine statements or comments a copyright holder may have made about a defendant’s general industry, whether the copyright holder has used tools in the defendant’s general industry, whether the copyright holder has admitted that other uses of its copyrights may or may not constitute fair use, or whether the copyright holder has entered into business relationships with other entities in the defendant’s industry.

Defendant argues that the discovery they seek is relevant to “the Times’s own claim that the mere existence of this technology is a threat to its business model and the enterprise of journalism.” (See ECF 236, at 2). However, the “statement” referenced by Defendant is not a claim or defense; it is a heading in the Amended Complaint: “GenAI Products Threaten High-Quality Journalism,” which precedes paragraphs 47 through 54. (ECF 170, at 14). This section discusses the Times’s protection of its own journalistic content, the limited content available to search engines, and prior discussions with Defendant to “explore the possibility of an amicable resolution,” which apparently were unsuccessful. (ECF 170 ¶ 54). There is no wholesale indictment of Gen AI tools, nor is there any suggestion that the Times allows third parties unfettered, unpaid access to its copyrighted journalistic content.[2] The AC is tightly focused on Defendant’s particular Gen AI products and their alleged use of the Times’s copyrighted content.

[2] Nor is any broader discovery warranted based on Defendant’s speculative and conclusory assertion that “if the Times knew about multiple third parties using the Times’s works to train generative AI tools but did nothing, that would suggest recognition by the Times of the reasons that such training is protected by fair use – e.g. that no workable market exists for licensing the volume of data required; that it offers significant public benefits; and that it stands to achieve purposes distinct from that of its underlying works.” [sic]. (ECF 236 at 3) (emphasis added). Moreover, the Times is already producing documents about its knowledge and awareness of Defendant’s training. See supra, n. 1.

None of the cases cited by Defendant support the assertion that the discovery sought is relevant to their fair use defense or to the heading in the Amended Complaint. For example, Google v. Oracle does not support a modification of the fourth fair use factor to include discovery about Plaintiff’s views on or statements about the “public benefits” of Gen AI in journalism. 593 U.S. 1, 35-36. Rather, the Supreme Court suggested a more nuanced view of the market effects, one that requires consideration of the importance of the “public benefits the copying will likely produce” to “copyright’s concern for the creative production of new expression” and a balancing against the potential loss to the copyright owner, “taking into account … the nature of the source of the loss.” Id. at 35-36 (internal quotations omitted). Discovery regarding the loss to the copyright owner would consist of documents concerning licensing discussions, which the Times has already agreed to produce, (see, supra n. 1), and discovery from Defendant on how its use might “kill demand for the original.” Cf. Oracle, 593 U.S. at 35 (“But a potential loss of revenue is not the whole story. We here must consider not just the amount but also the source of the loss. As we pointed out in Campbell, a lethal parody, like a scathing theatre review, may kill demand for the original… Yet this kind of harm, even if directly translated into foregone dollars, is not cognizable under the Copyright Act.”) (internal quotations omitted). Similarly, discovery concerning the “public benefits [from] the copying” would be directed to the Defendant and the public benefits of its copying, not whether nonparties’ Gen AI tools (which presumably were developed without copying) serve a general public benefit. The Second Circuit took the same approach in Am. Geophysical Un. v. Texaco Inc., focusing on how Texaco’s copying, and its use of those copies, met (or did not meet) the fair use factors. 60 F.3d 913, 927 (2d Cir. 1995) (“Since we are concerned with the claim of fair use in copying the eight individual articles from [the journal] Catalysis, the analysis under the fourth factor must focus on the effect of Texaco’s photocopying upon the potential market for or value of these individual articles.”). The copyright holder’s other use or licensing of their own works to other nonparties was simply not at issue in the fair use determination, and Google and Texaco do not support a finding of relevance here for the same.

Similarly, Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith does not stand for the proposition that “the Times’s creation, use and positions on [others’ Gen AI] generally is directly relevant” to Defendant’s fair use defense. (ECF 236 at 1) (“[T]he technology yields transformative and productive benefits for the enterprise of journalism specifically.”) (emphasis added). Whether nonparties’ Gen AI tools confer benefits on the journalism industry is not relevant to a determination of whether Defendant’s acts—i.e., the alleged copying involving Defendant’s Gen AI tools—constitute fair use.[3] The fair use factors are concerned with “the copier’s use of an original work.” See Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith, 598 U.S. 508, 528 (2023).

[3] OpenAI seems to suggest that if the Times’s journalists use any form of Gen AI tools in their work, that Gen AI then “benefits” journalism, and if Gen AI tools “benefit” journalism, that “benefit” would be relevant to OpenAI’s fair use defense. But the Supreme Court specifically states that a discussion of “public benefits” must relate to the benefits from the copying. Oracle, 593 U.S. at 35.

IV. CONCLUSION

This case is about whether Defendant trained their LLMs using Plaintiff’s copyrighted material, and whether that use constitutes copyright infringement. (ECF 170, ¶¶ 158-168). It is not a referendum on the benefits of Gen AI, on Plaintiff’s business practices, or about whether any of Plaintiff’s employees use Gen AI at work. The broad scope of document production sought here is simply not relevant to Defendant’s purported fair use defense. For example, if a copyright holder sued a video game manufacturer for copyright infringement, the copyright holder might be required to produce documents relating to their interactions with that video game manufacturer, but the video game manufacturer would not be entitled to wide-ranging discovery concerning the copyright holder’s employees’ gaming history, statements about video games generally, or even their licensing of different content to other video game manufacturers.

Accordingly, because Defendant has failed to demonstrate the relevance of the information sought, Defendant’s motion to compel is DENIED. The Clerk of Court is respectfully directed to close ECF 236.

SO ORDERED.

s/ Ona T. Wang
Ona T. Wang
United States Magistrate Judge

Dated: November 22, 2024
New York, New York


