The Authors Guild, Inc. et al v. Hathitrust et al
DECLARATION of Joseph Petersen (redacted) in Support re: 100 MOTION for Summary Judgment. Document filed by Hathitrust. (Petersen, Joseph)
KILPATRICK TOWNSEND & STOCKTON LLP
Joseph Petersen (JP 9071)
Robert Potter (RP 5757)
1114 Avenue of the Americas
New York, NY 10036
Telephone: (212) 775-8700
Facsimile: (212) 775-8800
Joseph M. Beck (admitted pro hac vice)
W. Andrew Pequignot (admitted pro hac vice)
Allison Scott Roach (admitted pro hac vice)
1100 Peachtree Street, Suite 2800
Atlanta, Georgia 30309-4530
Telephone: (404) 815-6500
Facsimile: (404) 815-6555
Attorneys for Defendants
UNITED STATES DISTRICT COURT
SOUTHERN DISTRICT OF NEW YORK
THE AUTHORS GUILD, INC., ET AL.,
Case No. 11 Civ. 6351 (HB)
HATHITRUST, ET AL.,
REPLY DECLARATION OF JOSEPH PETERSEN
IN SUPPORT OF THE LIBRARIES’ MOTION FOR SUMMARY JUDGMENT
I, Joseph Petersen, make the following declaration:
I am a member of the Bar of this Court and a partner at the law firm of Kilpatrick
Townsend & Stockton LLP, attorneys for the Defendants in the above-captioned action (the “Libraries”). I
make this Declaration, based on my own personal knowledge, in further support of the Libraries’
Motion for Summary Judgment.
Functional Objectives | HathiTrust Digital Library
November 5, 2010
Functional Objectives – Short-term
• Page turner mechanism: HathiTrust supports an application for reading, downloading, and interacting with (e.g., zooming and rotating) texts and images in
HathiTrust. The page turner application interfaces with mechanisms such as the Rights Database and Shibboleth (a mechanism for inter-institutional
authentication) to provide appropriate access to materials, and integrates with services such as the Collection Builder, full text search, and the bibliographic
• Branding (overall initiative; individual libraries): HathiTrust supports branding in the repository in a number of ways:
◦ The pageturner prominently identifies the HathiTrust initiative;
◦ A watermark on every page identifies the digitizing agent; and
◦ A watermark on every page identifies the source library of the print material.
◦ The source of the print material is included in our feed of bibliographic identifiers so that institutions can import or update records with this information.
◦ The pageturner contains institution-specific branding, identifying to users at partner institutions that their institution is a member of HathiTrust.
• Format validation, migration and error-checking: Format validation and error-checking are performed for all content that enters HathiTrust. Although no
migration of content has been necessary to date, we believe that we have mitigated this need by choosing rich, flexible, standards-based formats.
HathiTrust stores a variety of technical and digital preservation metadata along with each object in order to aid in migration should it become necessary.
Strategies are in place to ensure and validate the integrity of HathiTrust materials on an ongoing basis.
• Development of APIs that will allow partner libraries to access information and integrate it into local systems individually: Several APIs have been
released for this purpose. Two key examples are a bibliographic API (Bib API), which supports lookup and catalog integration, and a data API (Data
API), which provides machine access to the underlying data in a digital object. Information on all modes of content and metadata distribution
(including OAI and tab-delimited metadata files) can be found at http://www.hathitrust.org/data.
• Access mechanisms for persons with disabilities: HathiTrust has deployed an accessible interface that uses descriptive labeling, key tabs, and other
strategies to facilitate navigation and use by users with print disabilities (e.g., optimized for use with screen readers). HathiTrust has also deployed
authorization mechanisms that permit users who are certified as having print disabilities to access the full text of public domain and in-copyright volumes in
HathiTrust. These mechanisms, which have been deployed at the University of Michigan, are sufficiently generalized to provide access at partner institutions
pending agreement on entitlement attributes (to be used in connection with Shibboleth) and institutional policies. A CIC working group chaired by Mark
Sandler has initiated work to help address these needs.
• Public ‘Discovery’ Interface for HathiTrust: HathiTrust released a temporary public version of a comprehensive bibliographic search application (i.e., a
catalog) in April 2009 and has worked through a collective process to define a HathiTrust view in WorldCat. The WorldCat implementation of the HathiTrust
catalog will be released as a pilot in November 2010.
• Ability to publish virtual collections: HathiTrust has created a Collection Builder (http://babel.hathitrust.org/cgi/mb) application that permits individuals to create
public (i.e., shared) and private collections. Collection Builder uses Shibboleth authentication for users at partner institutions, but also permits authentication
through the University of Michigan “friend” system (http://www.itd.umich.edu/itcsdocs/s4316/) so that unaffiliated users can create and maintain collections.
• Mechanism for direct ingest of non-Google content: HathiTrust developed automated ingest mechanisms for book and journal content digitized by the
Internet Archive in April 2010. A technical and policy framework for ingest of other digitized book and journal content (e.g., digitized by partner institutions) is
being finalized currently. When this is complete, routine ingest of partner content will begin.
Functional Objectives – Long-term
• Compliance with required elements in the Trustworthy Repositories Audit and Certification (TRAC) criteria and checklist: The Center for Research
Libraries is conducting an independent assessment of the HathiTrust repository, based largely on the Trustworthy Repositories Audit and Certification (TRAC)
criteria. The assessment is targeted to be complete by the end of 2010. Information about HathiTrust's compliance with TRAC can be found at
http://www.hathitrust.org/standards.
• Robust discovery mechanisms like full-text cross-repository searching: An initial implementation of full-text search of the entire repository was released
on November 19, 2009. The launch of this service represented significant research and development, much of which is documented on the HathiTrust
website at http://www.hathitrust.org/large_scale_search and http://www.hathitrust.org/blogs/large-scale-search.
• Development of an open service definition to make it possible for partner libraries to develop other secure access mechanisms and discovery
tools: HathiTrust has created a number of APIs for this purpose, as well as a collaborative development environment for partners to improve existing
applications and develop new ones.
• Support for formats beyond books and journals: HathiTrust is investigating issues relating to the storage and delivery of electronic publications (in the
ePub format in particular) and digital audio and image files (such as maps). Pilot projects in each of these areas are underway.
• Development of data mining tools for HathiTrust and use by HathiTrust of other analysis tools from other sources: HathiTrust has engaged multiple
strategies to support data mining in HathiTrust:
1. Data Distribution: HathiTrust has made sample datasets available to researchers for computational processing and analysis. The purpose of the
samples is to give researchers an idea of the structure of the repository ahead of broader distribution of the public domain in HathiTrust (planned for early
2011) and strategy 2 below.
2. SEASR integration: The SEASR development team is in the process of integrating SEASR into HathiTrust as a proof of concept.
3. HathiTrust Research Center: HathiTrust plans to create a Research Center equipped with a variety of tools and services to allow a broad variety of
analyses on the repository corpus.