Bradburn et al v. North Central Regional Library District
Filing
41
STATEMENT OF Material FACTS re 39 MOTION for Summary Judgment filed by all plaintiffs. (Attachments: # 1 Exhibit A-F, # 2 Exhibit G-I, # 3 Exhibit J-T, # 4 Exhibit U Part 1, # 5 Exhibit U Part 2, # 6 Exhibit U Part 3, # 7 Exhibit V-Z, # 8 Exhibit AA-KK)(Caplan, Aaron)
Doc. 41 Att. 8
Exhibit AA
Exhibit BB
Exhibit CC
Exhibit DD
Exhibit EE
Exhibit FF
Exhibit GG
Introduction
I was asked by attorneys representing the North Central Regional Library District ("NCRL") to write this report and present testimony concerning it to the extent necessary. I agreed. Although I do not engage in litigation consulting as my primary means of earning income, I am being paid for this engagement at the rate of $400 per hour.
I was asked to explain how the NCRL filtering software works. I was also asked to assess the methods used in the study, reported by Mr. Haselton, of error rates in the filtering software NCRL uses. I was also asked to conduct a study of my own if I thought it would yield greater insight into whether the NCRL filters block more than the content they are intended to block. I did conduct such a study, and I report on its methods and results in this document.
I have also reviewed the report prepared by Bennett Haselton for purposes of the NCRL litigation. I had two main concerns with the study reported by Mr. Haselton. First, I was concerned that the set of URLs he tested might not be representative of those that NCRL library patrons actually visit. He selected a random sample of all possible domains. It seemed to me likely that library patrons would tend to visit more popular destinations than a random sample would contain. For example, neither google.com nor yahoo.com appears in his random sample, but both would almost certainly be viewed by library patrons. It also seemed to me likely that the blocking software would make fewer errors on more popular sites, because Fortinet, the company that NCRL contracts with to provide the blocking service, would invest more effort in correctly classifying more popular sites. Both these intuitions were borne out in the results of the study I conducted.
Second, I was concerned with the reliability of Mr. Haselton's classification of URLs. He did the classification based on loosely defined criteria, without any check on the reliability of his assessments (no second rater, inter-rater reliability check, or test-retest reliability check). This is contrary to the accepted practice in the social sciences when trying to turn subjective human assessments into an objective, repeatable measure. In our test, using more rigorous methods, we were still not able to achieve perfect reliability in our categorization of URLs, suggesting that Mr. Haselton's classification may have been even more error-prone.
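To illustrate what an inter-rater reliability check measures, the following is a minimal sketch of Cohen's kappa, one commonly used agreement statistic for two raters. This report does not state which statistic was computed, so the choice of kappa is an assumption, and the labels and ratings below are hypothetical values invented only for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Fraction of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's own label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of eight URLs by two independent raters.
rater_a = ["porn", "porn", "not", "not", "not", "porn", "not", "not"]
rater_b = ["porn", "not", "not", "not", "not", "porn", "not", "porn"]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance. A single rater working alone, as described above, produces no second set of labels from which any such figure could be computed.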
Personal Background
I am a Professor at the University of Michigan School of Information. In 2002, I conducted an assessment of the error rates on health-related websites of several commercial Internet filters. That study was published in JAMA, the flagship peer-reviewed journal of the American Medical Association. Appendix 1. A subsequent paper abstracting what we had learned about methods for conducting tests of filtering software was published in the Communications of the ACM, the flagship publication distributed to all members of the Association for Computing Machinery. Appendix 2. More details of my qualifications can be found in Appendix 3.
Dr. Derek Hansen is an Assistant Professor at the University of Maryland. He served as a research assistant on the 2002 study of Internet filters, where he was responsible for rating a large number of web sites based on whether they contained health information or not, and whether they contained pornography or not. Subsequently, as part of his Ph.D. project at the University of Michigan, he again had to develop a classification system for a corpus of texts (email messages and web pages). Though unrelated to filtering or pornography, that project gave him additional experience in creating a reliable categorization scheme and instructions for raters. More details of his qualifications can be found in Appendix 4.
Michael Hess has a master's degree from the University of Michigan School of Information. He has significant experience as a database and system administrator. He wrote the scripts for processing log files and created a web-based tool that allowed the raters to look at a large number of URLs and enter their assessments of them through a web-based form. More details of his qualifications can be found in Appendix 5.
How NCRL's Filters Work
I have been told by NCRL staff that NCRL has installed a FortiGate firewall/proxy unit, sold by the Fortinet company, in each of its branch libraries. The FortiGate is a small piece of hardware, smaller than a typical laptop computer. My understanding is that all computers in all of the branches access the Internet by connecting through these FortiGate units, in the manner that I describe below.
What Happens When a Patron Fetches a Page
To understand how the FortiGate affects the Internet activity of an NCRL patron, it is helpful to consider the sequence of steps that occur behind the scenes, invisible to an NCRL patron, each time a patron tries to visit a web page. A visit may be initiated either by directly entering a URL into the toolbar, by selecting a bookmarked favorite from a menu, or by following a link from another page. Regardless of how a visit to a web site is initiated, the same sequence of events occurs in the background. Fortinet provides the following diagram on its website to explain how its filtering works. My explanation is based on the explanation Fortinet provides to accompany the diagram on its website, with significant elaboration to explain terms that may not be familiar to non-technologists.
[Diagram from Fortinet's website, not reproduced legibly here. Labeled elements: End User, FortiGate, Internet, FortiGuard Web Filtering Service, and Requested Web Site, connected by numbered steps including 1, 3b, 4a, and 4b.]
Figure 1. The sequence of events in a potentially blocked request to visit a URL.
Even before the first step shown in the diagram, the patron's computer does a little work to decode the URL that the patron has requested to view. Consider a URL such as http://www.yahoo.com/nfl. The first part, the letters "http" occurring before the //, constitutes the service (or, more technically, the protocol). The http protocol is for connecting to a server to retrieve a web page. Other common protocols include https (for connecting to a server with an encrypted connection to securely retrieve a web page) and ftp (for downloading files).
The next portion, www.yahoo.com, including one or more periods and ending at the next /, is called the hostname or domain name. The Internet has a domain name system (sometimes called DNS) that allows the patron's computer to look up an address like www.yahoo.com in order to find out a corresponding IP address such as 69.147.114.210. That numeric IP address uniquely identifies the destination host (the web server).
The third portion, /nfl in this example, which begins after the domain name, is sometimes called the filename or path or urlpath.
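The three-part decomposition just described can be sketched in a few lines of Python using the standard library's `urllib.parse.urlsplit`. This is an illustration only, not a description of how the patron's browser or the FortiGate is implemented. The function name `split_url` and the dictionary keys are hypothetical, chosen to match the terms used in this report.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Break a URL into the three portions described in the text."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,    # the service, e.g. "http"
        "hostname": parts.hostname,  # the domain name, e.g. "www.yahoo.com"
        "urlpath": parts.path,       # everything from the next "/" onward
    }


pieces = split_url("http://www.yahoo.com/nfl")
print(pieces)
```

The hostname is the portion that would then be handed to the domain name system to obtain a numeric IP address. That lookup is left out of the sketch because it requires a live network connection and its result can change over time.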
Step 1. The patron's computer attempts to establish a connection to the IP address of the destination host. It tries to send a message that it would like to "GET" whatever the server provides in response to this urlpath, such as an HTML document or an image file. Because the patron's computer accesses the Internet through the FortiGate unit, however, a few other things happen along the way, and the patron's computer may not receive the same response it would have received had it been connected directly to the Internet, without going through a filtering proxy/firewall like a FortiGate.
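The "GET" message mentioned in Step 1 is a short piece of text. The sketch below builds a plausible version of that text for the example URL without sending anything over a network. The HTTP/1.1 version string and the `Connection` header are assumptions made for illustration, since the report does not specify what the patron's browser sends, and `build_get_request` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return the text of a minimal HTTP GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty urlpath is requested as "/"
    lines = [
        f"GET {path} HTTP/1.1",     # request line: method, urlpath, version (assumed)
        f"Host: {parts.hostname}",  # names the destination host
        "Connection: close",        # assumed header, for illustration only
        "",                         # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


request_text = build_get_request("http://www.yahoo.com/nfl")
print(request_text)
```

Because the request names both the host and the urlpath in plain text, a device sitting between the patron's computer and the Internet, such as a filtering proxy, is in a position to read which URL is being requested before deciding how to respond.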
Step 2. If the URL has been requested recently, by this patron or another patron in the same branch, the FortiGate may already have a cached copy of Fortinet's rating of the
Exhibit HH
Exhibit II
Exhibit JJ
Exhibit KK