Showing posts with label format obsolescence. Show all posts

Thursday, April 10, 2025

Cliff Lynch's festschrift

Vicky and I were invited to contribute to a festschrift celebrating Cliff Lynch's retirement from the Coalition for Networked Information. We decided to focus on his role in the long-running controversy over how digital information was to be preserved for the long haul.

Below the fold is our contribution, before it was copy-edited for portal: Libraries and the Academy.

Monday, April 7, 2025

Paul Evan Peters Award Lecture

At the Spring 2025 Membership Meeting of the Coalition for Networked Information, Vicky and I received the Paul Evan Peters Award.

You can tell this is an extraordinary honor from the list of previous awardees, and the fact that it is the first time it has been awarded in successive years. Part of the award is the opportunity to make an extended presentation to open the meeting. Our talk was entitled Lessons From LOCKSS, and the abstract was:
Vicky and David will look back over their two decades with the LOCKSS Program. Vicky will focus on the Program's initial goals and how they evolved as the landscape of academic communication changed. David will focus on the Program's technology, how it evolved, and how this history reveals a set of seductive, persistent but impractical ideas.
CNI has posted the video of the entire opening plenary to YouTube. Don Waters' generous introduction starts at 14:28 and Vicky starts talking at 20:00.

Below the fold is the text with links to the sources, information that appeared on slides but was not spoken, and much additional information in footnotes.

Tuesday, November 24, 2020

I Rest My Case

Jeff Rothenberg's seminal 1995 Ensuring the Longevity of Digital Documents focused on the threat that the format in which documents were encoded would become obsolete, rendering their content inaccessible. This was understandable; it had been a common experience in the preceding decades. Rothenberg described two different approaches to the problem: migrating the document's content from the doomed format to a less doomed one, and emulating the software that accessed the document in a current environment.

The Web has dominated digital content since 1995, and in the Web world formats go obsolete very slowly, if at all, because they are in effect network protocols. The example of IPv6 shows how hard it is to evolve network protocols. But now we are facing the obsolescence of a Web format that was very widely used, as the long effort to kill off Adobe's Flash comes to fruition. Fortunately, Jason Scott's Flash Animations Live Forever at the Internet Archive shows that we were right all along. Below the fold, I go into the details.

Thursday, April 4, 2019

Digitized Historical Documents

Josh Marshall of Talking Points Memo trained as a historian. From that perspective, he has a great post entitled Navigating the Deep Riches of the Web about the way digitization and the Web have transformed our access to historical documents. Below the fold, I bestow both praise and criticism.

Tuesday, December 5, 2017

International Digital Preservation Day

The Digital Preservation Coalition's International Digital Preservation Day was marked by a wide-ranging collection of blog posts. Below the fold are links to, and comments on, a few of them.

Wednesday, November 1, 2017

Randall Munroe Says It All

The latest XKCD is a succinct summation of the situation, especially the mouse-over.

Thursday, February 16, 2017

Postel's Law again

Eight years ago I wrote:
In RFC 793 (1981) the late, great Jon Postel laid down one of the basic design principles of the Internet, Postel's Law or the Robustness Principle:
"Be conservative in what you do; be liberal in what you accept from others."
It's important not to lose sight of the fact that digital preservation is on the "accept" side of Postel's Law,
Recently, discussion on a mailing list I'm on focused on the downsides of Postel's Law. Below the fold, I try to explain why most of these downsides don't apply to the "accept" side, which is the side that matters for digital preservation.

Thursday, May 26, 2016

Abby Smith Rumsey's "When We Are No More"

Back in March I attended the launch of Abby Smith Rumsey's book When We Are No More. I finally found time to read it from cover to cover, and can recommend it. Below the fold are some notes.

Tuesday, November 3, 2015

Emulation & Virtualization as Preservation Strategies

I'm very grateful that funding from the Mellon Foundation on behalf of themselves, the Sloan Foundation and IMLS allowed me to spend much of the summer researching and writing a report, Emulation and Virtualization as Preservation Strategies (37-page PDF, CC-By-SA). I submitted a draft last month; it has been peer-reviewed and I have addressed the reviewers' comments. It is also available on the LOCKSS web site.

I'm old enough to know better than to give a talk with live demos. Nevertheless, I'll be presenting the report at CNI's Fall membership meeting in December complete with live demos of a number of emulation frameworks. The TL;DR executive summary of the report is below the fold.

Wednesday, September 16, 2015

"The Prostate Cancer of Preservation" Re-examined

My third post to this blog, more than 8 years ago, was entitled Format Obsolescence: the Prostate Cancer of Preservation. In it I argued that format obsolescence for widely-used formats, such as those on the Web, would be rare. If it ever happened, it would be a very slow process, allowing plenty of time for preservation systems to respond.

Thus devoting a large proportion of the resources available for preservation to obsessively collecting metadata intended to ease eventual format migration was economically unjustifiable, for three reasons. First, the time value of money meant that paying the cost later would allow more content to be preserved. Second, the format might never suffer obsolescence, so the cost of preparing to migrate it would be wasted. Third, if the format ever did suffer obsolescence, the technology available to handle it when obsolescence occurred would be better than when it was ingested.
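The first two reasons can be illustrated with a little arithmetic. This is only a sketch: the discount rate, time horizon, cost and probability of obsolescence below are hypothetical numbers chosen for illustration, not figures from the original post, and the function names are mine.

```python
def present_value(cost: float, annual_rate: float, years: float) -> float:
    """Discount a cost paid `years` from now back to today's money."""
    return cost / (1.0 + annual_rate) ** years


def expected_deferred_cost(cost: float, annual_rate: float,
                           years: float, p_obsolete: float) -> float:
    """Expected present value of paying for migration only if, and when,
    the format actually goes obsolete (probability `p_obsolete`)."""
    return p_obsolete * present_value(cost, annual_rate, years)


# Hypothetical inputs, for illustration only.
cost_per_item = 100.0   # cost of migration work per item, in today's money
rate = 0.05             # assumed annual discount rate
horizon = 10            # assumed years until obsolescence, if it happens
p = 0.5                 # assumed probability the format ever goes obsolete

pay_now = cost_per_item
pay_later = present_value(cost_per_item, rate, horizon)
pay_later_expected = expected_deferred_cost(cost_per_item, rate, horizon, p)

print(f"pay now:               {pay_now:.2f}")
print(f"pay later (certain):   {pay_later:.2f}")
print(f"pay later (expected):  {pay_later_expected:.2f}")
```

Under these assumed numbers, deferring the work costs less in present-value terms than doing it at ingest, and less again once the chance that the format never goes obsolete is taken into account. The third reason, better technology later, would reduce the deferred cost further but is not modeled here.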

Below the fold, I ask how well the predictions have held up in the light of subsequent developments.

Tuesday, February 17, 2015

Vint Cerf's talk at AAAS

Vint Cerf gave a talk entitled Digital Vellum at the AAAS meeting last Friday that has received a lot of attention in the media, including follow-up pieces by other writers, and even drew the attention of Dave Farber's famed IP list. I have some doubts about how accurately the press has reported his talk, which isn't available via the AAAS meeting website. I am commenting on the reports, not the talk. But, as The Register points out, Cerf has been making similar points for some time. I did find a TEDx talk he titled Bit Rot on YouTube, uploaded a year ago. Below the fold is my take.

Monday, March 31, 2014

The Half-Empty Archive

Cliff Lynch invited me to give one of UC Berkeley iSchool's "Information Access Seminars" entitled The Half-Empty Archive. It was based on my brief introductory talk at ANADP II last November, an expanded version given as a staff talk at the British Library last January, and the discussions following both. An edited text with links to the sources is below the fold.

Wednesday, March 5, 2014

Windows XP

The idea that format migration is integral to digital preservation was for a long time reinforced by people's experience of format incompatibility in Microsoft's Office suite. Microsoft's business model used to depend on driving the upgrade cycle by introducing gratuitous forward incompatibility, new versions of the software being set up to write formats that older versions could not render. But what matters for digital preservation is backwards incompatibility; newer versions of the software being unable to render content written by older versions. Six years ago the limits of Microsoft's ability to introduce backwards incompatibility were dramatically illustrated when they tried to remove support for some really old formats.

The reason for this fiasco was that Microsoft greatly over-estimated its ability to impose the costs of migrating old content on its customers, and under-estimated the customers' ability to resist. Old habits die hard. Microsoft is trying to end support of Windows XP and Office 2003 on April 8 but it isn't providing cost-effective upgrade paths for what is now Microsoft's fastest-growing installed base. Joel Hruska writes:
Microsoft has come under serious fire for some significant missteps in this process, including a total lack of actual upgrade options. What Microsoft calls an upgrade involves completely wiping the PC and reinstalling a fresh OS copy on it — or ideally, buying a new device. Microsoft has misjudged how strong its relationship is with consumers and failed to acknowledge its own shortcomings. Not providing an upgrade utility is one example — but so is the general lack of attractive upgrade prices or even the most basic understanding of why users haven't upgraded.
This resistance to change has obvious implications for digital preservation.

Saturday, April 27, 2013

Software obsolescence doesn't imply format obsolescence

Tim Anderson at The Register celebrates the 20th anniversary of Mosaic:
Using the DOSBox emulator (the Megabuild version which has network connectivity via an emulated NE2000 NIC) I ran up Windows 3.11 with Trumpet Winsock and got Mosaic 1.0 running.
This illustrates two important points:
  • Tim had no trouble resuscitating a 20-year-old software environment using off-the-shelf emulation.
  • The 20-year-old browser struggled to make sense of today's web. But today's browsers have no difficulty at all with vintage web pages.
The fact that the software that originally interpreted the content is obsolete (a) does not mean that there is significant difficulty in running it, and (b) does not mean that you need to use emulation to run it in order to interpret the content, because the obsolescence of the software does not imply the obsolescence of the format. Backwards compatibility is a feature of the Web, for reasons I have been pointing out for many years.

Thursday, April 4, 2013

Talk at Spring 2013 CNI

Kris Carpenter Negulescu and I gave talks at the Spring 2013 CNI meeting in a project briefing entitled "Its Not Your Grandfather's Web Any Longer". They were based on the workshop we ran at the 2012 IIPC meeting at the Library of Congress looking at the problems of harvesting and preserving the future Web. I talked about the problems the workshop identified and Kris talked about the solutions people are working on. Below the fold is an edited text of my part of the talk with links to the sources.

Tuesday, February 12, 2013

Rothenberg still wrong

Last March Jeff Rothenberg gave a keynote entitled Digital Preservation in Perspective: How far have we come, and what's next? to the Future Perfect 2012 conference at the wonderful, must-visit Te Papa Tongarewa museum in Wellington, New Zealand. The video is here. The talk only recently came to my attention, for which I apologize.

I have long argued, for example in my 2009 CNI keynote, that while Jeff correctly diagnosed the problems of digital preservation in the pre-Web era, the transition to the Web that started in the mid-90s made those problems largely irrelevant. Jeff's presentation is frustrating, in that it shows how little his thinking has evolved to grapple with the most significant problems facing digital preservation today. Below the fold is my critique of Jeff's keynote.

Thursday, November 8, 2012

Format Obsolescence In The Wild?

The Register has a report that, at a glance, looks like one of the long-sought instances of format obsolescence in the wild:
Andrew Brown asked to see the echocardiogram of his ticker, which was taken eight years ago. He was told that although the scan is still on file in the Worcestershire Royal hospital, it will cost a couple of grand to recreate the data as an image because it is stored in a format that can no longer be read by the hospital's computers.
But a closer look, below the fold, shows that it isn't so simple.

Saturday, October 13, 2012

Cleaning up the "Formats through time" mess

As I said in this comment on my post Formats through time, time pressure meant that I made enough of a mess of it to need a whole new post to clean up. Below the fold is my attempt to remedy the situation.

Tuesday, October 9, 2012

Formats through time

Two interesting and important recent studies provide support for the case I've been making for at least the last 5 years that Jeff Rothenberg's pre-Web analysis of format obsolescence is itself obsolete. Details below the fold.

Monday, April 11, 2011

Technologies Don't Die

Kevin Kelly met the same reaction of incredulity when he pointed out that physical technologies do not die as I did when I pointed out that digital formats are not becoming obsolete. Robert Krulwich of NPR challenged Kelly, but had to retire defeated when he and the NPR listeners failed to find any but trivial examples of dead technology.

And, in related news, The Register has two articles on a working 28-year-old Seagate ST-412 disk drive from an IBM 5156 PC expansion box. They point out, as I have, that disk drives are not getting faster as fast as they are getting bigger:
The 3TB Barracuda still has one read/write head per platter surface and each head now has 300,000MB to look after, whereas the old ST-412 heads each have just 5MB to look after.

The Barracuda will take longer today to read or write an entire platter surface's capacity than the 28-year-old ST-412 will. We have increased capacity markedly but disk I/O has become a bottleneck at the platter surface level, and is set to remain that way. The Register
Revised 4/12/11 to make clear that the disk drive still works.