Your Scholarship.
Our World.
Preserving The Long Tail
Victoria Reich, Executive Director
LOCKSS Program
Stanford University Libraries
http://www.lockss.org/
1 September 2015
Content Is Web-based
Web size: 1200 PB
Internet Archive: 9 PB
Brewster Kahle
Founder and Director
Internet Archive
CONTENT IN CONTEXT
Scholarly communication is:
author’s words, data, software,
communication, identity,
related works, etc.
50% Preserved?
•  2010 ARL median
research library
receives ~80K serials
•  Keepers Registry
reports 28.5K
preserved; 10K in
progress
–  Not adjusted for risk
–  Not adjusted for
difficulty and cost
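The "50%" in the slide title can be checked with simple arithmetic. The sketch below uses only the figures quoted on the slide and assumes, as the slide's framing does, that the Keepers Registry counts can be compared directly with one library's ~80K serials; the variable and function names are illustrative.

```python
# Figures quoted on the slide, in thousands of serial titles.
serials_received = 80.0  # ~80K serials at the 2010 ARL median research library
preserved = 28.5         # Keepers Registry: preserved
in_progress = 10.0       # Keepers Registry: preservation in progress


def share(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


preserved_share = share(preserved, serials_received)
with_in_progress_share = share(preserved + in_progress, serials_received)

# Roughly a third is preserved today; just under half once
# in-progress titles are counted.
print(f"preserved: {preserved_share:.0f}%")
print(f"preserved + in progress: {with_in_progress_share:.0f}%")
```

Neither figure is adjusted for risk, difficulty, or cost, so both are upper bounds on how much at-risk content is actually covered.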
Biggest Threat to Content?
•  Obsolescence and/or failure
–  Formats
–  Media
–  Hardware
–  Software
•  Economic – national, organizational
•  Natural disasters
•  Humans
Reality
The rate of loss to future
researchers from “never
preserved” will vastly
exceed that from all
other causes.
Dr. David S. H. Rosenthal
http://blog.dshr.org/2014/12/talk-at-fall-cni.html
Philosophy
Preservation is an
active
community effort
LOTS OF COPIES
KEEP STUFF SAFE
LOTS OF COMMUNITIES
KEEP STUFF SAFE
The LOCKSS Program
•  Communities use
LOCKSS open
source software to
preserve & access
their scholarly
record
•  LOCKSS staff
provide services
and software
LOCKSS At Stanford
Preservation Architecture
•  Ingest
•  Preservation
•  Dissemination
•  Management
Formats Preserved
application/eps application/epub+zip application/javascript application/msword
application/octet-stream application/powerpoint application/pdf
application/postscript application/rss+xml application/rtf application/vnd.fdf
application/vnd.ms-excel application/vnd.ms-powerpoint application/vnd.ms-word
application/vnd.openxmlformats-officedocument.wordprocessingml.document
application/vnd.rn-realmedia application/wordperfect5.1 application/xhtml+xml
application/x-javascript application/xml application/x-msexcel
application/x-research-info-systems application/x-troff
application/x-zip-compressed application/zip audio/mpeg audio/x-mp3
audio/x-pn-realaudio chemical/x-mdl-molfile image/bmp image/gif image/jpeg
image/pjpeg image/png image/svg+xml image/tiff image/vnd.microsoft.icon
image/x-icon text/css text/html text/javascript text/plain text/rtf
text/x-bibtex text/x-js text/xml video/avi video/mp4 video/mp4v-es video/mpeg
video/quicktime video/x-msvideo video/x-ms-wmv
•  LOCKSS software migrates formats as needed
Automated
Cooperative Preservation
Identify and preserve authoritative version	
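One way to picture several copies agreeing on an authoritative version is a majority vote over content hashes. The sketch below is an illustration only and is not the LOCKSS polling protocol; the box names, the sample content, and the `authoritative_digest` helper are all hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one stored copy."""
    return hashlib.sha256(content).hexdigest()


def authoritative_digest(copies: dict[str, bytes]) -> tuple[str, list[str]]:
    """Return the majority digest and the boxes whose copy disagrees with it."""
    digests = {box: digest(data) for box, data in copies.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(digests):
        # Without a strict majority there is no safe basis for repair.
        raise ValueError("no strict majority among the copies")
    damaged = sorted(box for box, d in digests.items() if d != winner)
    return winner, damaged


# Hypothetical network of four boxes; one copy has been damaged.
copies = {
    "box-a": b"article text",
    "box-b": b"article text",
    "box-c": b"article t3xt",
    "box-d": b"article text",
}
winner, damaged = authoritative_digest(copies)
print(damaged)  # ['box-c']
```

In this toy model a damaged box would then fetch a fresh copy from any box in the majority; a production system needs far stronger defenses against colluding or compromised peers.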
  
TRAC/ISO16363 Audit
•  1st Ever Perfect Score for Technologies,
Technical Infrastructure, Security
•  Equaled the previous highest overall score (Scholars Portal)
•  Documentation is public
•  Linked from blog.dshr.org
•  Introduction
•  TRAC Audit: Process -
http://blog.dshr.org/2014/08/trac-audit-process.html
•  TRAC Audit: Lessons -
http://blog.dshr.org/2014/08/trac-audit-lessons.html
•  TRAC Audit: Do-It-Yourself Demos -
http://blog.dshr.org/2014/08/trac-audit-do-it-yourself-demos.html
Many LOCKSS Networks
•  Thousands of publishers
–  Subscription, open access, etc.
•  Ingest techniques
–  OAI-PMH, web crawling, file transfer, API, etc.
•  Preserved content types
–  Journals, books, databases, government documents,
theses and dissertations, image collections…
•  Each with an organization & business model
•  Each with an appropriate access policy
A Few Networks
Global LOCKSS Network
Private LOCKSS Networks
U.S. Government Documents
James Jacobs, Stanford
Canadian Government Information
Innovative Technology Award
5 Universities / 3 countries
Master's and Ph.D. theses
Academic publications
Research data
Brazil’s Cariniana
PKP Private LOCKSS Network
Access From A LOCKSS Box
When The Publisher
Is Not Available
Appreciation
The LOCKSS Program’s simple and flexible
technical architecture is particularly well suited
to the rapidly evolving landscape of
e-journal publishing and scholarly practice.
Bernie Reilly, President
Center for Research Libraries,
2014
Research & Development
•  Internet Archive
–  IMLS to build web preservation APIs
–  Web infrastructure and collection building
•  Mellon Foundation
–  Preserving the future web
–  Emulation as a preservation strategy
•  IIPC
–  Preservation and access (Memento, INA’s LAP)
•  Library of Congress
–  Economics of long-term storage
•  Force 11
–  Scholarly communication
•  University of California Santa Cruz
–  Storage technologies; advising PhD students
•  4C
–  Preservation Costs
Thank you
Looking forward to the
conversation!
