Research Libraries: Digital Intermediaries & Digital Archives -- Stanford’s plan, practice, & application Michael A. Keller University Librarian Director of Academic Information Resources Founder/Publisher of HighWire Press Publisher of Stanford University Press --- CALIS Conference, Chengdu, PRC 15 May 2007

  • Research Libraries: Digital Intermediaries & Digital Archives -- Stanford's plan, practice, & application
    Michael A. Keller
    University Librarian; Director of Academic Information Resources; Founder/Publisher of HighWire Press; Publisher of Stanford University Press
    CALIS Conference, Chengdu, PRC, 15 May 2007

  • The Cycle
    Author (Professors, Teachers)
    Publishers
    Reviewers (Professors)
    Distributors, Booksellers
    Hosts, ISPs (HighWire Press)
    Libraries
    Readers (Professors, Teachers, Students)

  • The Cycle: Intermediaries
    Author (Professors, Teachers)
    Publishers
    Reviewers (Professors)
    Distributors, Booksellers
    Hosts, ISPs (HighWire Press)
    Libraries
    Readers (Professors, Teachers, Students)

  • The Cycle -- Digitization
    Author (Professors, Teachers)
    Publishers
    Reviewers (Professors)
    Distributors, Booksellers
    Hosts, ISPs (HighWire Press)
    Libraries
    Readers (Professors, Teachers, Students)

  • Stanford Strategic Positions
    HighWire Press, a unit of Stanford University Libraries, serves scholarly publishers
    Intersection of professors as authors, publishers, editors and reviewers with librarians as information managers with information technologists as service
    Merging Libraries with Academic Computing
    Undertaking digital archiving (LOCKSS/CLOCKSS, Stanford Digital Repository)
    Including Stanford University Press
    http://sulair.stanford.edu

  • HighWire Press
    Receives digital manuscripts of articles, including data supplements not found in print editions
    Processes, adding features:
    Publishing before print edition
    Several image resolutions
    Hyperlinking citations to cited references
    Alerting services
    PDF & HTML versions
    Citation mapping
    Corresponding with authors
    World-wide instantaneous delivery enabling many researchers to read simultaneously; no waiting for the print edition
    Some publishers abandoning print; more to follow
    http://highwire.stanford.edu

  • Thumbnail image

  • Medium size image

  • Large size image

  • Toll Free Linking

  • Citation linking: destination

  • Prospective citing

  • Citation Map

  • HighWire Press 1

  • HighWire Press 2

  • HighWire Press 3

  • Stanford's HighWire Press: summary of strategic importance
    Enables publication of web versions of many high-impact, highly cited scholarly journals
    Network distribution makes instant distribution possible, making all readers equal
    Provides numerous services making research faster, better, more penetrating
    Links readers and authors
    Embodies collaboration among publishers, librarians, information technologists
    Not for profit, a Stanford enterprise

  • "Science, Scholarship, and Internet Publishing: The HighWire Story" Syllabus Magazine, October 1998

    EXCERPT: "Scientists, scientific editors and publishers, scholarly society officers, and an enterprise unit of the Stanford University Libraries named HighWire Press have worked together over the past three and a half years to publish Internet editions of 70 influential scientific journals. Three significant accomplishments have resulted. First, there has evolved a mode of scholarly communication which serves readers, and facilitates research as much as it supports the clarity and validity of scientific discourse; this model has become a standard in Internet scholarly publishing. Second, an active community of scholarly editors and publishers has intensified the benefits of online scholarly publishing to the scientific, medical and technical communities at large. Third, the products of life sciences research in the advanced economies of Europe and North America are now more widely available than ever before, stimulating scientific and other cultural developments in other parts of the world."

  • The Challenge of Digital Preservation
    Bit rot
    Obsolescence
    Format
    Technology
    Distribution and dissipation
    Migrations and transitions
    People (2-20 years)
    Software (5-10 years)
    Hardware (3-5 years)
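
Bit rot is usually detected by comparing a stored object against a checksum recorded when it was deposited. The following is a minimal sketch of such a fixity check, not a description of any Stanford system; the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def check_fixity(data: bytes, recorded_digest: str) -> bool:
    """True if the content still matches the digest recorded at deposit time."""
    return sha256_digest(data) == recorded_digest


# Hypothetical object and the digest recorded when it was first stored.
original = b"Internet edition of a journal article"
recorded = sha256_digest(original)

# Simulate bit rot by flipping the lowest bit of the first byte.
damaged = bytes([original[0] ^ 0x01]) + original[1:]

print(check_fixity(original, recorded))  # True
print(check_fixity(damaged, recorded))   # False
```

A check like this only reports damage; repairing it requires another intact copy, which is why replication matters for preservation.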

  • Three Major Areas of Preservation Needs
    Digital Library: SULAIR collections & resources; digitization artifacts
    Institutional Repository: research data, publications, dissertations, learning objects, university assets
    External Depositors: online preservation and access; dark archive

  • Design Objectives & Assumptions
    Preservation-focused archive
    Replicated content: multiple copies, geographically distributed
    Secure
    Auditable
    Modular
    Tiered storage environment: online, nearline, offline
    Version rather than delete
    Content-agnostic
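
One way to read "version rather than delete" is that every write appends a new version and earlier versions remain retrievable. The sketch below illustrates that idea only; `VersionedStore` and its methods are hypothetical names, and nothing here reflects how SDR itself is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VersionedStore:
    """Toy store that appends a new version on every write and never deletes."""

    _versions: Dict[str, List[bytes]] = field(default_factory=dict)

    def put(self, object_id: str, content: bytes) -> int:
        """Add content as the next version; return its 1-based version number."""
        history = self._versions.setdefault(object_id, [])
        history.append(content)
        return len(history)

    def get(self, object_id: str, version: Optional[int] = None) -> bytes:
        """Return the requested version, or the latest one if none is given."""
        history = self._versions[object_id]
        if version is None:
            return history[-1]
        if not 1 <= version <= len(history):
            raise KeyError(f"{object_id} has no version {version}")
        return history[version - 1]

    def version_count(self, object_id: str) -> int:
        """Number of versions held for an object (0 if unknown)."""
        return len(self._versions.get(object_id, []))


# Hypothetical deposit that is later corrected; both versions stay available.
store = VersionedStore()
store.put("thesis-001", b"first deposit")
store.put("thesis-001", b"corrected deposit")
print(store.get("thesis-001"))             # latest version
print(store.get("thesis-001", version=1))  # original is still retrievable
```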

  • Core Repository Functionality
    Preserving access to digital information over time, through generations of technology obsolescence and change
    Maintaining integrity of that information over time, through generations of migration and reformatting

    Repository Services Functionality
    All (or almost all) user-facing services
    Enhanced access & delivery through applications
    Data mining, dry research, new indexing, e-science, etc.
    Federation

  • SDR: Core Repository vs. Repository Services

  • SDR Serves As Common Preservation Infrastructure
    Stanford Digital Repository (SDR): content-agnostic preservation repository, while specialty archives and applications provide focused digital content collection, access and value-added services
    National Geospatial Digital Archive (NGDA): geospatial data
    SUL Digital Bookshelves (Google Books, internally digitized, vendors' e-books)
    Digital Library Applications (images, mss, media, Special Collections showcases)
    Institutional Repository (faculty- and student-submitted papers, data, websites, etc.)

  • SDR Workflow
    [Diagram labels: Conversion; Digital Collections; Geospatial Data; External Collections; Ingest; Virus Check; Storage Layer; Access Layer; Luna; Book Reader; DEWI (?); SDR]

  • SDR High-Level Architecture

  • SDR Component Diagram
    Conversion: generates TM from existing MD
    Staging Area: acts as gatekeeper (virus checking, file & format validation)
    Directory Watcher: monitors and validates gatekeeper; transfers files to ingest
    Ingest: validates, packages (with TM), sends to storage
    Storage Manager: directs objects to storage layers
    Archival Disk (Honeycomb)
    Archival Tape (TSM, L700): three tape copies (A, B, C)
    Accessory: watches for access requests, consults storage manager, funnels content from disk & tape to Reconstructor
    Access Director: access request db; checks cache for content; queues & tracks requests for objects
    Reconstructor: reconstructs digital objects from AIPs to DIPs
    Fedora
    Other diagram labels: Person 1; Content; All SDR MD; Some objects + their MD; Disseminators; Applications; Users; Logging; Secure Preservation Environment; Delivery Cache
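
The deposit path in the component diagram (gatekeeper, ingest, storage manager) can be sketched roughly as follows. This is an illustration under stated assumptions, not SDR code: the accepted-format list, the manifest fields, and all class and function names are hypothetical, virus checking is omitted, and storage is reduced to one disk layer plus three tape copies.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical list of accepted formats; the real validation rules are not in the slides.
ALLOWED_SUFFIXES = {".tif", ".pdf", ".xml"}


@dataclass
class Package:
    """A deposited file bundled with its metadata manifest."""
    name: str
    content: bytes
    manifest: Dict[str, str]


def gatekeeper(name: str, content: bytes) -> None:
    """Staging-area check: reject empty files and unaccepted formats."""
    if not content:
        raise ValueError(f"{name}: empty file")
    if not any(name.lower().endswith(s) for s in ALLOWED_SUFFIXES):
        raise ValueError(f"{name}: format not accepted")


def ingest(name: str, content: bytes, metadata: Dict[str, str]) -> Package:
    """Package the content with its metadata and a checksum for later audits."""
    manifest = dict(metadata)
    manifest["sha256"] = hashlib.sha256(content).hexdigest()
    return Package(name, content, manifest)


@dataclass
class StorageManager:
    """Directs packages to an online disk layer and three tape copies."""
    disk: Dict[str, Package] = field(default_factory=dict)
    tapes: List[Dict[str, Package]] = field(default_factory=lambda: [{}, {}, {}])

    def store(self, package: Package) -> None:
        self.disk[package.name] = package
        for tape in self.tapes:
            tape[package.name] = package


def deposit(manager: StorageManager, name: str, content: bytes,
            metadata: Dict[str, str]) -> Package:
    """Run one file through gatekeeper, ingest, and storage in order."""
    gatekeeper(name, content)
    package = ingest(name, content, metadata)
    manager.store(package)
    return package


manager = StorageManager()
pkg = deposit(manager, "map.tif", b"\x00\x01", {"title": "Sample map"})
print(sorted(pkg.manifest))  # ['sha256', 'title']
```

Keeping each stage as a separate function mirrors the diagram's modular layout: a stage can be replaced without touching the others.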

  • SDR Physical Topology, March 2006
    Module(s) | Hardware
    Conversion, Gatekeeper | Sun Fire X4100 Server; 4 TB Nexsan SATA Disk
    Ingest, Storage code, Storage Request Processor | Sun Fire X4100 Server; 4 TB Nexsan SATA Disk
    Online storage | 32 TB Sun Honeycomb Storage System
    Tape Copies | Sun StorEdge L700 Tape Library, with LTO2 drives; IBM Tivoli Storage Manager; Iron Mountain data protection plan
    Access Service, Access Cache | Sun Fire X4100 Server; 8 TB of Nexsan SATA Disk

  • Stanford Digital Repository
    Managed care for digital objects of all genres & formats
    Serves several strategic needs: Digital Library; Institutional Repository; Enterprise Repository
    A strategic development for research, teaching & learning
    Will provide a distinctive, competitive edge

  • What is LOCKSS?
    Lots Of Copies Keep Stuff Safe
    Digital Preservation Infrastructure
    Decentralized, Peer to Peer, Continuous Audit & Repair
    Internet computers chattering away among themselves
    Open Source
    163 LOCKSS Libraries in 18 countries
    http://lockss.stanford.edu
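
The idea behind "continuous audit & repair" among peers can be illustrated with a toy majority vote: each box reports a digest of its copy, and a box that disagrees with the majority takes a copy from a peer that agrees. This is a deliberately simplified sketch, not the LOCKSS polling protocol; the box names, function names, and the strict-majority rule are assumptions.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Tuple


def digest(content: bytes) -> str:
    """SHA-256 hex digest used as each box's vote."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(boxes: Dict[str, bytes]) -> Tuple[Dict[str, bytes], List[str]]:
    """Compare copies across boxes and repair any that lose the vote.

    Returns the repaired copies and the sorted names of the boxes that were repaired.
    """
    if not boxes:
        raise ValueError("no boxes to audit")
    votes = Counter(digest(content) for content in boxes.values())
    winning_digest, count = votes.most_common(1)[0]
    if count * 2 <= len(boxes):
        raise RuntimeError("no strict majority; repair needs human attention")
    good_copy = next(c for c in boxes.values() if digest(c) == winning_digest)
    repaired_names = sorted(
        name for name, content in boxes.items() if digest(content) != winning_digest
    )
    repaired = {name: good_copy for name in boxes}
    return repaired, repaired_names


# Three hypothetical library boxes, one holding a damaged copy.
boxes = {
    "library-a": b"Volume 12, Issue 3",
    "library-b": b"Volume 12, Issue 3",
    "library-c": b"Volume 12, Issue E",
}
repaired, fixed = audit_and_repair(boxes)
print(fixed)  # ['library-c']
```

With more boxes, more simultaneous damage can be outvoted, which is the intuition behind "lots of copies".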

  • LOCKSS Boxes: Collection
    [Diagram labels: Title 1; Title 2; Patron; LOCKSS box; LOCKSS box]

  • LOCKSS Boxes: Preservation
    [Diagram labels: Title 1; Title 2; Patron; LOCKSS box; LOCKSS box]

  • LOCKSS Boxes: Access
    Prevents the publisher from revoking access rights to back content
    [Diagram labels: Title 1; Title 2; Patron; LOCKSS box; LOCKSS box]

  • CLOCKSS
    Controlled LOCKSS
    Limited network of library caches
    LOCKSS technology underlies CLOCKSS
    Shared governance model

  • The CLOCKSS Prototype
    Two-year demonstrator, ending in 2007
    Public reports of progress & outcome
    Demonstration that this solution is credible for the long term
    Proof of scalability for publisher content & library deployment
    Funded first by participants, with recent grant support from NDIIPP (Library of Congress)
    http://www.clockss.org

  • CLOCKSS Participants
    CLOCKSS acting on behalf of wider community of libraries & publishers
    7 libraries distributed across tectonic plates
    12 publishers, commercial & scholarly societies
    Numbers & types sufficient to cover the bases
    Commitment based on stewardship of libraries & responsibility of publishers

  • Libraries
    University of Edinburgh
    New York Public Library
    Indiana University
    Rice University
    University of Virginia
    OCLC
    Stanford University
    NB: more to be added on more tectonic plates

  • Publishers
    Blackwell Publishing
    Elsevier
    Nature Publishing Group
    Oxford University Press
    SAGE Publications
    Springer
    Taylor and Francis
    John Wiley & Sons
    American Chemical Society
    American Medical Association
    American Physiological Society
    Institute of Physics
    NB: aim to add all the rest

  • Equal Partners
    Librarians, with publishers agreeing, retain stewardship role as society's memory institutions
    Publishers have decided to trust & engage libraries, committing to prospect of preservation for continuing access
    Both are exploring social & technical model in a 2-year test, working to build a full-scale production system
    Costs are equally shared, with additional funding from NDIIPP for audit & reporting

  • CLOCKSS Mission
    CLOCKSS is a not-for-profit community partnership between publishers and libraries that is developing a distributed, validated, comprehensive archive that preserves and ensures continuing access to electronic scholarly content.

  • CLOCKSS Governance
    Jointly governed by founding library & publisher partners
    Each partner represents an organization, but collectively sectors are represented:
    University libraries & public libraries
    Commercial publishers & scholarly societies
    No single point of failure or institutional interest will hinder long-term governance
    Consensus driven, united for support of scholarly communication over the long term
    CLOCKSS seen as complementary to national arrangements for legal deposit

  • LOCKSS/CLOCKSS
    Distributed preservation function
    Caches authorized e-content locally
    Empowers libraries
    Inexpensive, easily implemented
    Flexible, open source application
    Expanding community of users
    Expanding community of uses

  • Other SULAIR Strategic Programs
    Google Book Search & other digitization
    Development of Bookless Libraries
    CourseWork: Sakai (open source) course management system
    Media Preservation
    Expanding the East Asia Library
    Expanding the Middle Eastern Collection

  • Thank you
    [email protected]
    This presentation is available at http://china.highwire.org

  • Research data: supplemental data for published works; to meet granting agency requirements; archiving shelved research projects; web citations

    content audits; component audits; security audit; process & procedural (aka PWC-style) audits