Kumandan et al v. Google LLC et al

Filing 453

ORDER RE MODIFIED SAMPLING PROTOCOL. Signed by Judge Susan van Keulen on 10/17/2024. (svklc1, COURT STAFF) (Filed on 10/17/2024)

UNITED STATES DISTRICT COURT
NORTHERN DISTRICT OF CALIFORNIA

IN RE GOOGLE ASSISTANT PRIVACY LITIGATION.
Case No. 19-cv-04286-BLF (SVK)

ORDER RE MODIFIED SAMPLING PROTOCOL

Pursuant to Judge Freeman’s Order Regarding Motion for Relief from Non-Dispositive Pretrial Order (Dkt. No. 370), the Court has revisited the scope of the October 20, 2022 Sampling Order (Dkt. No. 331) in light of the Class Certification Order (Dkt. 360). Given that only the Purchaser Class was certified to pursue breach of contract and Unfair Competition Law claims predicated on breach of contract and a violation of California Business and Professions Code § 22576, the proportionality analysis has shifted dramatically since the Court issued the initial Sampling Order.

As a result of these developments, the undersigned requested updated briefing regarding an appropriate sampling protocol and held hearings on August 27 and October 3, 2024. See Dkt. 438, 443, 451, 452. Through this process, certain parameters on sampling were revised, however, the Parties remained in vastly different universes as to sample size of queries, with Plaintiffs arguing for a sample in excess of 20 million queries (see, e.g., Dkt. 452 at 1) and Google offering to sample merely 12,500 queries. Dkt. 450 at 16:2-9. At the October 3 hearing, the Court admonished both sides that substantial compromise would be required and, if after one final effort the parties could not agree, the Court would determine the sample size. Id. at 44:15-19. Both sides were amenable to a final effort and, if they could not agree, accepting a number set by the Court. Id. at 43:15-44:3. The Parties’ further submissions reflected some, but not enough, movement, with Plaintiffs reluctantly suggesting 2.6 million (Dkt. 452 at 2) and Google moving to 104,000 (Dkt. 451 at 1).
Having reviewed the transcripts of the hearings on this issue, the Parties’ final submissions, and the declaration of Google’s expert Jonathan Borck (Dkt. 451-2), the Court determines that a sample size of 260,000 queries is appropriate and proportional to the needs of this case.

Accordingly, the Court ORDERS as follows:

1. Google will produce a sample of speech log data within 45 days of the date of this Order.

2. Over the 26 quarters that the Parties have agreed are relevant, Google will perform sampling for 1 randomly selected day per quarter.

3. For each sample day, Google will randomly select 10,000 Google Assistant queries that were (1) hotword-initiated; and (2) made on Google Assistant Enabled Devices manufactured by Google. The total number of queries to be sampled across all 26 quarters is 260,000.

4. For the set of randomly sampled queries for each sample day, Google will provide (1) the number of queries that did not contain a hot word in the “top hypothesis” field, and (2) the number of those queries that were sent to a human reviewer.

5. Google shall also provide Plaintiffs with the raw speech log data for all 104,000 queries, with the following modifications:

   a. Google may anonymize the raw data before producing it to Plaintiffs.

   b. Google may redact the content of each query, leaving only the hotword visible. Where there is no hotword, the entire query will be redacted.

SO ORDERED.

Dated: October 17, 2024

SUSAN VAN KEULEN
United States Magistrate Judge
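The sampling steps in paragraphs 2 through 4 of the Order can be illustrated with a minimal Python sketch. It is not Google's implementation and is not part of the Order. The `Query` record, its field names, and both helper functions are hypothetical. The sketch also makes three assumptions the Order does not state: that the sample day is drawn uniformly from each quarter, that a day with fewer than 10,000 eligible queries yields all of them, and that "those queries" in paragraph 4 means the queries lacking a hotword in the top hypothesis.

```python
import random
from dataclasses import dataclass
from datetime import date, timedelta

QUARTERS = 26              # relevant quarters, per the Order
QUERIES_PER_DAY = 10_000   # queries sampled per sample day, per the Order
TOTAL_QUERIES = QUARTERS * QUERIES_PER_DAY  # 260,000, matching paragraph 3


@dataclass(frozen=True)
class Query:
    """Hypothetical query record; field names are illustrative only."""
    hotword_initiated: bool
    google_made_device: bool
    hotword_in_top_hypothesis: bool
    sent_to_human_reviewer: bool


def pick_sample_day(quarter_start: date, quarter_end: date,
                    rng: random.Random) -> date:
    """Pick one day uniformly at random from a quarter (bounds inclusive)."""
    span_days = (quarter_end - quarter_start).days
    return quarter_start + timedelta(days=rng.randint(0, span_days))


def summarize_sample_day(queries, rng: random.Random,
                         n: int = QUERIES_PER_DAY) -> dict:
    """Sample up to n eligible queries and compute the paragraph 4 counts."""
    # Paragraph 3: only hotword-initiated queries on Google-made devices.
    eligible = [q for q in queries
                if q.hotword_initiated and q.google_made_device]
    # Assumption: if fewer than n queries are eligible, take all of them.
    sampled = rng.sample(eligible, min(n, len(eligible)))
    # Paragraph 4(1): sampled queries with no hotword in the top hypothesis.
    no_hotword = [q for q in sampled if not q.hotword_in_top_hypothesis]
    # Paragraph 4(2): of those, the ones sent to a human reviewer.
    reviewed = sum(1 for q in no_hotword if q.sent_to_human_reviewer)
    return {
        "sampled": len(sampled),
        "no_hotword": len(no_hotword),
        "no_hotword_reviewed": reviewed,
    }
```

Passing a seeded `random.Random` instance to both helpers makes a given draw reproducible, which may be useful if the selection ever needs to be audited.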


