The New York Times Company v. Microsoft Corporation et al
Filing
344
OPINION & ORDER re: 236 LETTER MOTION for Conference to Compel addressed to Magistrate Judge Ona T. Wang from Elana Nightingale Dawson, Vera Ranieri, and Michelle Ybarra dated September 3, 2024. filed by OpenAI, Inc., OpenAI OpCo LLC, OpenAI Holdings, LLC, OpenAI LP, OpenAI Global LLC, OpenAI LLC, OpenAI GP, LLC, OAI Corporation, LLC. This case is about whether Defendant trained their LLMs using Plaintiff's copyrighted material, and whether that use constitutes copyright infringement. (ECF 170, ¶¶ 158-168). It is not a referendum on the benefits of Gen AI, on Plaintiff's business practices, or about whether any of Plaintiff's employees use Gen AI at work. The broad scope of document production sought here is simply not relevant to Defendant's purported fair use defense. For example, if a copyright holder sued a video game manufacturer for copyright infringement, the copyright holder might be required to produce documents relating to their interactions with that video game manufacturer, but the video game manufacturer would not be entitled to wide-ranging discovery concerning the copyright holder's employees' gaming history, statements about video games generally, or even their licensing of different content to other video game manufacturers. Accordingly, because Defendant has failed to demonstrate the relevance of the information sought, Defendant's motion to compel is DENIED. The Clerk of Court is respectfully directed to close ECF 236. (Signed by Magistrate Judge Ona T. Wang on 11/22/2024) (sgz)
UNITED STATES DISTRICT COURT
SOUTHERN DISTRICT OF NEW YORK
--------------------------------------------------------------x
THE NEW YORK TIMES COMPANY,
Plaintiff,
-against-
MICROSOFT CORPORATION, OPENAI, INC.,
et al.,
Defendants.
23-cv-11195 (SHS) (OTW)
OPINION & ORDER
--------------------------------------------------------------x
ONA T. WANG, United States Magistrate Judge:
I.
BACKGROUND
The New York Times (the “Times” or “Plaintiff”) brought this action alleging, inter alia,
that Defendants unlawfully used Plaintiff’s copyrighted works to train Defendants’
large-language models (“LLMs”). Defendant OpenAI, Inc. (“Defendant”) seeks to compel[1] production
of: (1) the Times’s use of nonparties’ generative artificial intelligence (“Gen AI”) tools; (2) the
Times’s creation and use of its own Gen AI products; and (3) the Times’s position regarding Gen
AI (e.g., positions expressed outside of litigation, knowledge about the training of third-party
Gen AI tools using the Times’s works). (ECF 236). Defendant asserts that this outstanding
discovery is relevant to their fair use defense. (ECF 236). Plaintiff asserts that the disputed
discovery concerning Plaintiff’s interactions with their own and nonparties’ Gen AI tools is
neither relevant nor proportional to the needs of the case. (ECF 238). Because Defendant has
not demonstrated the relevance of the information sought, their motion to compel is DENIED.
[1] Plaintiff has already provided or agreed to produce: (1) documents regarding the Times’s use of the Defendants’
Gen AI tools in reporting or presentation of content, and documents regarding the Times’s trainings about
Defendants’ Gen AI products; (2) documents relating to the Times’s A.I. Initiatives program; and (3) nonprivileged
documents and communications with third parties about the Defendants’ use of Times content in their Gen AI
products and this litigation and whether to license Times works to OpenAI. (ECF 238).
II.
LEGAL STANDARD
Federal Rule of Civil Procedure 26(b)(1) permits discovery of “any nonprivileged matter
that is relevant to any party’s claim or defense and proportional to the needs of the case.” The
party moving to compel, here OpenAI, “bears the initial burden of demonstrating relevance and
proportionality.” See Winfield v. City of New York, No. 15-CV-5236 (LTS) (KHP), 2018 WL
840085, at *3 (S.D.N.Y. Feb. 12, 2018). “Motions to compel and motions to quash a subpoena
are both entrusted to the sound discretion of the court.” Howard v. City of New York, No. 12-CV-933 (JMF), 2013 WL 174210, at *1 (S.D.N.Y. Jan. 16, 2013).
III.
DISCUSSION
The Copyright Act (the “Act”) allows for certain “fair” uses of copyrighted works and sets
out four non-exclusive factors for courts to consider in determining whether a particular use is
“fair”:
I. the purpose and character of the use, including whether such
use is of a commercial nature or is for nonprofit educational
purposes;
II. the nature of the copyrighted work;
III. the amount and substantiality of the portion used in relation to
the copyrighted work as a whole; and
IV. the effect of the use upon the potential market for or value of
the copyrighted work.
Hachette Book Group, Inc. v. Internet Archive, 115 F.4th 163, 178-79 (2d Cir. 2024). Each of
these factors requires scrutiny of a defendant’s purported use of the copyrighted work(s), and
whether that defendant’s use may constitute “fair use” under the Act. The factors do not
require a court to examine statements or comments a copyright holder may have made about a
defendant’s general industry, whether the copyright holder has used tools in the defendant’s
general industry, whether the copyright holder has admitted that other uses of its copyrights
may or may not constitute fair use, or whether the copyright holder has entered into business
relationships with other entities in the defendant’s industry.
Defendant argues that the discovery they seek is relevant to “the Times’s own claim that
the mere existence of this technology is a threat to its business model and the enterprise of
journalism.” (See ECF 236, at 2). However, the “statement” referenced by Defendant is not a
claim or defense; it is a heading in the Amended Complaint: “GenAI Products Threaten
High-Quality Journalism,” which precedes paragraphs 47 through 54. (ECF 170, at 14). This section
discusses the Times’s protection of its own journalistic content, the limited content available to
search engines, and prior discussions with Defendant to “explore the possibility of an amicable
resolution,” which apparently were unsuccessful. (ECF 170 ¶ 54). There is no wholesale
indictment of Gen AI tools, nor is there any suggestion that the Times allows third parties
unfettered, unpaid access to its copyrighted journalistic content.[2] The Amended Complaint is
tightly focused on Defendant’s particular Gen AI products and their alleged use of the Times’s
copyrighted content.
[2] Nor is any broader discovery warranted based on Defendant’s speculative and conclusory assertion that “if the
Times knew about multiple third parties using the Times’s works to train generative AI tools but did nothing, that
would suggest recognition by the Times of the reasons that such training is protected by fair use – e.g. that no
workable market exists for licensing the volume of data required; that it offers significant public benefits; and that
it stands to achieve purposes distinct from that of its underlying works.” [sic]. (ECF 236 at 3) (emphasis added).
Moreover, the Times is already producing documents about its knowledge and awareness of Defendant’s training.
See supra, n. 1.
None of the cases cited by Defendant support the assertion that the discovery sought is
relevant to their fair use defense or to the heading in the Amended Complaint. For example,
Google v. Oracle does not support a modification of the fourth fair use factor to include
discovery about Plaintiff’s views on or statements about the “public benefits” of Gen AI in
journalism. 593 U.S. 1, 35-36. Rather, the Supreme Court suggested a more nuanced view of
the market effects, one that requires consideration of the importance of the “public benefits
the copying will likely produce” to “copyright’s concern for the creative production of new
expression” and a balancing against the potential loss to the copyright owner, “taking into
account … the nature of the source of the loss.” Id. at 35-36 (internal quotations omitted).
Discovery regarding the loss to the copyright owner would consist of documents concerning
licensing discussions, which the Times has already agreed to produce, (see supra n. 1), and
discovery from Defendant on how its use might “kill demand for the original.” Cf. Oracle, 593
U.S. at 35 (“But a potential loss of revenue is not the whole story. We here must consider not
just the amount but also the source of the loss. As we pointed out in Campbell, a lethal parody,
like a scathing theatre review, may kill demand for the original… Yet this kind of harm, even if
directly translated into foregone dollars, is not cognizable under the Copyright Act.”) (internal
quotations omitted). Similarly, discovery concerning the “public benefits [from] the copying”
would be directed to the Defendant and the public benefits of its copying, not whether
nonparties’ Gen AI tools (which presumably were developed without copying) serve a general
public benefit.
The Second Circuit took the same approach in Am. Geophysical Un. v. Texaco Inc.,
focusing on how Texaco’s copying, and its use of those copies, met (or did not meet) the fair
use factors. 60 F.3d 913, 927 (2d Cir. 1995) (“Since we are concerned with the claim of fair use
in copying the eight individual articles from [the journal] Catalysis, the analysis under the fourth
factor must focus on the effect of Texaco’s photocopying upon the potential market for or value
of these individual articles.”). The copyright holder’s other use or licensing of their own works
to other nonparties was simply not at issue in the fair use determination, and Google and
Texaco do not support a finding of relevance here for the same.
Similarly, Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith does not stand
for the proposition that “the Times’s creation, use and positions on [others’ Gen AI] generally is
directly relevant” to Defendant’s fair use defense. (ECF 236 at 1) (“[T]he technology yields
transformative and productive benefits for the enterprise of journalism specifically.”) (emphasis
added). Whether nonparties’ Gen AI tools confer benefits on the journalism industry is not
relevant to a determination of whether Defendant’s acts, i.e., the alleged copying involving
Defendant’s Gen AI tools, constitute fair use.[3] The fair use factors are concerned with “the
copier’s use of an original work.” See Andy Warhol Foundation for the Visual Arts, Inc. v.
Goldsmith, 598 U.S. 508, 528 (2023).
[3] OpenAI seems to suggest that if the Times’s journalists use any form of Gen AI tools in their work, that Gen AI
then “benefits” journalism, and if Gen AI tools “benefit” journalism, that “benefit” would be relevant to OpenAI’s
fair use defense. But the Supreme Court specifically states that a discussion of “public benefits” must relate to the
benefits from the copying. Oracle, 593 U.S. at 35.
IV.
CONCLUSION
This case is about whether Defendant trained their LLMs using Plaintiff’s copyrighted
material, and whether that use constitutes copyright infringement. (ECF 170, ¶¶ 158-168). It is
not a referendum on the benefits of Gen AI, on Plaintiff’s business practices, or about whether
any of Plaintiff’s employees use Gen AI at work. The broad scope of document production
sought here is simply not relevant to Defendant’s purported fair use defense. For example, if a
copyright holder sued a video game manufacturer for copyright infringement, the copyright
holder might be required to produce documents relating to their interactions with that video
game manufacturer, but the video game manufacturer would not be entitled to wide-ranging
discovery concerning the copyright holder’s employees’ gaming history, statements about
video games generally, or even their licensing of different content to other video game
manufacturers.
Accordingly, because Defendant has failed to demonstrate the relevance of the
information sought, Defendant’s motion to compel is DENIED.
The Clerk of Court is respectfully directed to close ECF 236.
SO ORDERED.
s/ Ona T. Wang
Ona T. Wang
United States Magistrate Judge
Dated: November 22, 2024
New York, New York