I'm David Rosenthal, and this is a place to discuss the work I'm doing in Digital Preservation.
Showing posts with label format obsolescence.
Thursday, April 10, 2025
Vicky and I were invited to contribute to a festschrift celebrating Cliff Lynch's retirement from the Coalition for Networked Information. We decided to focus on his role in the long-running controversy over how digital information was to be preserved for the long haul.
Below the fold is our contribution, before it was copy-edited for portal: Libraries and the Academy.
Monday, April 7, 2025
Paul Evan Peters Award Lecture
At the Spring 2025 Membership Meeting of the Coalition for Networked Information, Vicky and I received the Paul Evan Peters Award.
You can tell this is an extraordinary honor from the list of previous awardees, and the fact that it is the first time it has been awarded in successive years. Part of the award is the opportunity to make an extended presentation to open the meeting. Our talk was entitled Lessons From LOCKSS, and the abstract was:
Vicky and David will look back over their two decades with the LOCKSS Program. Vicky will focus on the Program's initial goals and how they evolved as the landscape of academic communication changed. David will focus on the Program's technology, how it evolved, and how this history reveals a set of seductive, persistent but impractical ideas.
CNI has posted the video of the entire opening plenary to YouTube. Don Waters' generous introduction starts at 14:28 and Vicky starts talking at 20:00.
Below the fold is the text with links to the sources, information that appeared on slides but was not spoken, and much additional information in footnotes.
Tuesday, November 24, 2020
I Rest My Case
Jeff Rothenberg's seminal 1995 Ensuring the Longevity of Digital Documents focused on the threat that the format in which documents were encoded would become obsolete, rendering their content inaccessible. This was understandable; it was a common experience in the preceding decades. Rothenberg described two different approaches to the problem: migrating the document's content from the doomed format to a less doomed one, and emulating the software that accessed the document in a current environment.
The Web has dominated digital content since 1995, and in the Web world formats go obsolete very slowly, if at all, because they are in effect network protocols. The example of IPv6 shows how hard it is to evolve network protocols. But now we are facing the obsolescence of a Web format that was very widely used, as the long effort to kill off Adobe's Flash comes to fruition. Fortunately, Jason Scott's Flash Animations Live Forever at the Internet Archive shows that we were right all along. Below the fold, I go into the details.
Thursday, April 4, 2019
Digitized Historical Documents
Tuesday, December 5, 2017
International Digital Preservation Day
The Digital Preservation Coalition's International Digital Preservation Day was marked by a wide-ranging collection of blog posts. Below the fold are some links to, and comments on, a few of them.
Wednesday, November 1, 2017
Randall Munroe Says It All
The latest XKCD is a succinct summation of the situation, especially the mouse-over.
Thursday, February 16, 2017
Postel's Law again
Eight years ago I wrote:
In RFC 793 (1981) the late, great Jon Postel laid down one of the basic design principles of the Internet, Postel's Law or the Robustness Principle:
"Be conservative in what you do; be liberal in what you accept from others."
It's important not to lose sight of the fact that digital preservation is on the "accept" side of Postel's Law,
Recently, discussion on a mailing list I'm on focused on the downsides of Postel's Law. Below the fold, I try to explain why most of these downsides don't apply to the "accept" side, which is the side that matters for digital preservation.
Thursday, May 26, 2016
Abby Smith Rumsey's "When We Are No More"
Back in March I attended the launch of Abby Smith Rumsey's book When We Are No More. I finally found time to read it from cover to cover, and can recommend it. Below the fold are some notes.
Tuesday, November 3, 2015
Emulation & Virtualization as Preservation Strategies
I'm very grateful that funding from the Mellon Foundation on behalf of themselves, the Sloan Foundation and IMLS allowed me to spend much of the summer researching and writing a report, Emulation and Virtualization as Preservation Strategies (37-page PDF, CC-By-SA). I submitted a draft last month; it has been peer-reviewed and I have addressed the reviewers' comments. It is also available on the LOCKSS web site.
I'm old enough to know better than to give a talk with live demos. Nevertheless, I'll be presenting the report at CNI's Fall membership meeting in December, complete with live demos of a number of emulation frameworks. The TL;DR executive summary of the report is below the fold.
Wednesday, September 16, 2015
"The Prostate Cancer of Preservation" Re-examined
My third post to this blog, more than 8 years ago, was entitled Format Obsolescence: the Prostate Cancer of Preservation. In it I argued that format obsolescence for widely-used formats, such as those on the Web, would be rare. If it ever happened, it would be a very slow process, allowing plenty of time for preservation systems to respond.
Thus devoting a large proportion of the resources available for preservation to obsessively collecting metadata intended to ease eventual format migration was economically unjustifiable, for three reasons. First, the time value of money meant that paying the cost later would allow more content to be preserved. Second, the format might never suffer obsolescence, so the cost of preparing to migrate it would be wasted. Third, if the format ever did suffer obsolescence, the technology available to handle it when obsolescence occurred would be better than when it was ingested.
Below the fold, I ask how well the predictions have held up in the light of subsequent developments.
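The first two of the three reasons above can be made concrete with a small discounting calculation. This is only a sketch: the function names, the 5% discount rate, the 10-year horizon and the 50% chance of obsolescence are hypothetical values chosen for illustration, not figures from the original post.

```python
def present_value(cost, annual_rate, years):
    """Value today of a cost that is paid `years` from now (standard discounting)."""
    return cost / (1 + annual_rate) ** years


def expected_deferred_cost(cost, annual_rate, years, p_obsolete):
    """Expected present value of migrating later, paid only if obsolescence occurs."""
    return p_obsolete * present_value(cost, annual_rate, years)


# Hypothetical numbers: a migration-preparation cost of 100 units paid at ingest,
# versus the same cost deferred 10 years at a 5% discount rate.
pay_now = 100.0
pay_later = present_value(100.0, 0.05, 10)
pay_later_if_needed = expected_deferred_cost(100.0, 0.05, 10, 0.5)

print(f"pay at ingest:           {pay_now:.2f}")
print(f"pay in 10 years:         {pay_later:.2f}")
print(f"pay only if it happens:  {pay_later_if_needed:.2f}")
```

Under these assumed numbers the deferred cost is lower than the up-front cost, and lower again once the chance that the format never goes obsolete is included. The third reason, better technology at migration time, would reduce the future cost further and is not modelled here.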
Tuesday, February 17, 2015
Vint Cerf's talk at AAAS
Vint Cerf gave a talk entitled Digital Vellum at the AAAS meeting last Friday that has received a lot of attention in the media, including follow-up pieces by other writers, and even drew the attention of Dave Farber's famed IP list. I have some doubts about how accurately the press has reported his talk, which isn't available via the AAAS meeting website. I am commenting on the reports, not the talk. But, as The Register points out, Cerf has been making similar points for some time. I did find a TEDx talk he titled Bit Rot on YouTube, uploaded a year ago. Below the fold is my take.
Monday, March 31, 2014
The Half-Empty Archive
Cliff Lynch invited me to give one of UC Berkeley iSchool's "Information Access Seminars" entitled The Half-Empty Archive. It was based on my brief introductory talk at ANADP II last November, an expanded version given as a staff talk at the British Library last January, and the discussions following both. An edited text with links to the sources is below the fold.
Wednesday, March 5, 2014
Windows XP
The idea that format migration is integral to digital preservation was for a long time reinforced by people's experience of format incompatibility in Microsoft's Office suite. Microsoft's business model used to depend on driving the upgrade cycle by introducing gratuitous forward incompatibility, new versions of the software being set up to write formats that older versions could not render. But what matters for digital preservation is backwards incompatibility; newer versions of the software being unable to render content written by older versions. Six years ago the limits of Microsoft's ability to introduce backwards incompatibility were dramatically illustrated when they tried to remove support for some really old formats.
The reason for this fiasco was that Microsoft greatly over-estimated its ability to impose the costs of migrating old content on its customers, and under-estimated those customers' ability to resist. Old habits die hard. Microsoft is trying to end support of Windows XP and Office 2003 on April 8 but it isn't providing cost-effective upgrade paths for what is now Microsoft's fastest-growing installed base. Joel Hruska writes:
Microsoft has come under serious fire for some significant missteps in this process, including a total lack of actual upgrade options. What Microsoft calls an upgrade involves completely wiping the PC and reinstalling a fresh OS copy on it — or ideally, buying a new device. Microsoft has misjudged how strong its relationship is with consumers and failed to acknowledge its own shortcomings. Not providing an upgrade utility is one example — but so is the general lack of attractive upgrade prices or even the most basic understanding of why users haven't upgraded.
This resistance to change has obvious implications for digital preservation.
Saturday, April 27, 2013
Software obsolescence doesn't imply format obsolescence
Tim Anderson at The Register celebrates the 20th anniversary of Mosaic:
Using the DOSBox emulator (the Megabuild version which has network connectivity via an emulated NE2000 NIC) I ran up Windows 3.11 with Trumpet Winsock and got Mosaic 1.0 running.
This illustrates two important points:
- Tim had no trouble resuscitating a 20-year-old software environment using off-the-shelf emulation.
- The 20-year-old browser struggled to make sense of today's web. But today's browsers have no difficulty at all with vintage web pages.
Thursday, April 4, 2013
Talk at Spring 2013 CNI
Kris Carpenter Negulescu and I gave talks at the Spring 2013 CNI meeting in a project briefing entitled "It's Not Your Grandfather's Web Any Longer". They were based on the workshop we ran at the 2012 IIPC meeting at the Library of Congress looking at the problems of harvesting and preserving the future Web. I talked about the problems the workshop identified and Kris talked about the solutions people are working on. Below the fold is an edited text of my part of the talk with links to the sources.
Tuesday, February 12, 2013
Rothenberg still wrong
Last March Jeff Rothenberg gave a keynote entitled Digital Preservation in Perspective: How far have we come, and what's next? to the Future Perfect 2012 conference at the wonderful, must-visit Te Papa Tongarewa museum in Wellington, New Zealand. The video is here. The talk only recently came to my attention, for which I apologize.
I have long argued, for example in my 2009 CNI keynote, that while Jeff correctly diagnosed the problems of digital preservation in the pre-Web era, the transition to the Web that started in the mid-90s made those problems largely irrelevant. Jeff's presentation is frustrating, in that it shows how little his thinking has evolved to grapple with the most significant problems facing digital preservation today. Below the fold is my critique of Jeff's keynote.
Thursday, November 8, 2012
Format Obsolescence In The Wild?
The Register has a report that, at a glance, looks like one of the long-sought instances of format obsolescence in the wild:
Andrew Brown asked to see the echocardiogram of his ticker, which was taken eight years ago. He was told that although the scan is still on file in the Worcestershire Royal hospital, it will cost a couple of grand to recreate the data as an image because it is stored in a format that can no longer be read by the hospital's computers.
But looked at more closely below the fold we see that it isn't so simple.
Saturday, October 13, 2012
Cleaning up the "Formats through time" mess
As I said in this comment on my post Formats through time, time pressure meant that I made enough of a mess of it to need a whole new post to clean up. Below the fold is my attempt to remedy the situation.
Tuesday, October 9, 2012
Formats through time
Two interesting and important recent studies provide support for the case I've been making for at least the last 5 years that Jeff Rothenberg's pre-Web analysis of format obsolescence is itself obsolete. Details below the fold.
Labels: format migration, format obsolescence, normalization
Monday, April 11, 2011
Technologies Don't Die
Kevin Kelly met the same incredulity when he pointed out that physical technologies do not die as I did when I pointed out that digital formats are not becoming obsolete. Robert Krulwich of NPR challenged Kelly, but had to retire defeated when he and the NPR listeners failed to find any but trivial examples of dead technology.
And, in related news, The Register has two articles on a working 28-year-old Seagate ST-412 disk drive from an IBM 5156 PC expansion box. They point out, as I have, that disk drives are not getting faster as fast as they are getting bigger:
The 3TB Barracuda still has one read/write head per platter surface and each head now has 300,000MB to look after, whereas the old ST-412 heads each have just 5MB to look after.
The Barracuda will take longer today to read or write an entire platter surface's capacity than the 28-year-old ST-412 will. We have increased capacity markedly but disk I/O has become a bottleneck at the platter surface level, and is set to remain that way. The Register
Revised 4/12/11 to make clear that the disk drive still works.
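The claim in the quote can be checked with back-of-the-envelope arithmetic. In the sketch below the per-head capacities are taken from the quote, but the sustained transfer rates are my assumptions for illustration: roughly 0.625MB/s for the ST-412 (its 5Mbit/s interface rate) and roughly 150MB/s for the Barracuda. The function name is hypothetical, and seeks, rotational latency and controller overhead are ignored.

```python
def full_surface_seconds(capacity_mb, transfer_mb_per_s):
    """Time to stream one platter surface end to end, ignoring seeks and latency."""
    return capacity_mb / transfer_mb_per_s


# Capacities per head are from the quoted article; transfer rates are assumed.
st412_seconds = full_surface_seconds(5, 0.625)
barracuda_seconds = full_surface_seconds(300_000, 150)

print(f"ST-412 surface:    {st412_seconds:.0f} s")
print(f"Barracuda surface: {barracuda_seconds:.0f} s")
print(f"ratio:             {barracuda_seconds / st412_seconds:.0f}x")
```

With these assumed rates, capacity per head has grown by a factor of 60,000 while bandwidth has grown by a factor of only 240, so reading a full surface takes much longer on the newer drive, which is the point the article is making.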