Kumandan et al v. Google LLC et al
Filing 453
ORDER RE MODIFIED SAMPLING PROTOCOL. Signed by Judge Susan van Keulen on 10/17/2024. (svklc1, COURT STAFF) (Filed on 10/17/2024)
UNITED STATES DISTRICT COURT
NORTHERN DISTRICT OF CALIFORNIA
IN RE GOOGLE ASSISTANT PRIVACY LITIGATION.
Case No. 19-cv-04286-BLF (SVK)
ORDER RE MODIFIED SAMPLING PROTOCOL
Pursuant to Judge Freeman’s Order Regarding Motion for Relief from Non-Dispositive Pretrial Order (Dkt. No. 370), the Court has revisited the scope of the October 20, 2022 Sampling Order (Dkt. No. 331) in light of the Class Certification Order (Dkt. 360). Given that only the Purchaser Class was certified to pursue breach of contract and Unfair Competition Law claims predicated on breach of contract and a violation of California Business and Professions Code § 22576, the proportionality analysis has shifted dramatically since the Court issued the initial Sampling Order.
As a result of these developments, the undersigned requested updated briefing regarding an appropriate sampling protocol and held hearings on August 27 and October 3, 2024. See Dkt. 438, 443, 451, 452. Through this process, certain parameters on sampling were revised; however, the Parties remained in vastly different universes as to sample size of queries, with Plaintiffs arguing for a sample in excess of 20 million queries (see, e.g., Dkt. 452 at 1) and Google offering to sample merely 12,500 queries. Dkt. 450 at 16:2-9. At the October 3 hearing, the Court admonished both sides that substantial compromise would be required and, if after one final effort the parties could not agree, the Court would determine the sample size. Id. at 44:15-19. Both sides were amenable to a final effort and, if they could not agree, accepting a number set by the Court. Id. at 43:15-44:3. The Parties’ further submissions reflected some, but not enough, movement, with Plaintiffs reluctantly suggesting 2.6 million (Dkt. 452 at 2) and Google moving to 104,000 (Dkt. 451 at 1). Having reviewed the transcripts of the hearings on this issue, the Parties’ final submissions, and the declaration of Google’s expert Jonathan Borck (Dkt. 451-2), the Court determines that a sample size of 260,000 queries is appropriate and proportional to the needs of this case.
Accordingly, the Court ORDERS as follows:
1. Google will produce a sample of speech log data within 45 days of the date of this Order.
2. Over the 26 quarters that the Parties have agreed are relevant, Google will perform sampling for 1 randomly selected day per quarter.
3. For each sample day, Google will randomly select 10,000 Google Assistant queries that were (1) hotword-initiated; and (2) made on Google Assistant Enabled Devices manufactured by Google. The total number of queries to be sampled across all 26 quarters is 260,000.
4. For the set of randomly sampled queries for each sample day, Google will provide (1) the number of queries that did not contain a hotword in the “top hypothesis” field, and (2) the number of those queries that were sent to a human reviewer.
5. Google shall also provide Plaintiffs with the raw speech log data for all 104,000 queries, with the following modifications:
   a. Google may anonymize the raw data before producing it to Plaintiffs.
   b. Google may redact the content of each query, leaving only the hotword visible. Where there is no hotword, the entire query will be redacted.
SO ORDERED.
Dated: October 17, 2024
SUSAN VAN KEULEN
United States Magistrate Judge
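
The sampling steps in paragraphs 2 through 4 of the Order can be sketched in code. The following Python is a minimal illustration only and is not part of the Order or of any party's actual method. Every function name, the use of calendar quarters, and the two predicate callbacks are hypothetical. Reading "those queries" in paragraph 4 as the queries lacking a hotword in the top hypothesis is likewise an assumption.

```python
import random
from datetime import date, timedelta

NUM_QUARTERS = 26          # paragraph 2 of the Order
QUERIES_PER_DAY = 10_000   # paragraph 3 of the Order


def quarter_bounds(year, quarter):
    """Return the first and last day of a calendar quarter (1-4).

    Assumption: the agreed quarters are calendar quarters.
    """
    first = date(year, 3 * (quarter - 1) + 1, 1)
    if quarter == 4:
        next_first = date(year + 1, 1, 1)
    else:
        next_first = date(year, 3 * quarter + 1, 1)
    return first, next_first - timedelta(days=1)


def pick_sample_day(year, quarter, rng):
    """Pick one day uniformly at random within the quarter."""
    first, last = quarter_bounds(year, quarter)
    offset = rng.randrange((last - first).days + 1)
    return first + timedelta(days=offset)


def sample_queries(eligible_ids, rng, k=QUERIES_PER_DAY):
    """Sample k eligible queries without replacement.

    Hypothetical fallback: if fewer than k eligible queries exist
    for the day, return all of them.
    """
    eligible_ids = list(eligible_ids)
    return rng.sample(eligible_ids, min(k, len(eligible_ids)))


def summarize(sampled, top_hypothesis_has_hotword, sent_to_reviewer):
    """Return the two counts described in paragraph 4.

    Both predicates are hypothetical callbacks taking one query.
    """
    no_hotword = [q for q in sampled if not top_hypothesis_has_hotword(q)]
    reviewed = [q for q in no_hotword if sent_to_reviewer(q)]
    return len(no_hotword), len(reviewed)


# 26 quarters x 1 day per quarter x 10,000 queries per day
total_queries = NUM_QUARTERS * QUERIES_PER_DAY
```

Passing a seeded `random.Random` instance as `rng` would make the day and query selections reproducible, which may help if the selection ever needs to be re-run or audited.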