I started blogging some time ago about the transition the Web is undergoing from a document model to a programming model, from static to dynamic content. This transition has fundamental implications for Web archiving; what exactly does it mean to preserve something that is different every time you look at it? Not to mention the vastly increased cost of ingest, because executing a program takes far more computation, potentially an unlimited amount, than simply parsing a document.
The transition has big implications for search engines too; they also have to execute rather than parse. Web developers have a strong incentive to make their pages search engine friendly, so although they have enthusiastically embraced Javascript they have often retained a parse-able path for search engine crawlers to follow. We have watched academic journals adopt Javascript, but so far very few have forced us to execute it to find their content.
Adam Audette and his collaborators at Merkle | RKG have an interesting post entitled We Tested How Googlebot Crawls Javascript And Here’s What We Learned. It is aimed at the SEO (Search Engine Optimization) world but it contains a lot of useful information for Web archiving. The TL;DR is that Google (but not yet other search engines) is now executing Javascript in ways that make providing an alternate, parse-able path largely irrelevant to a site's ranking. Over time, this will mean that the alternate paths will disappear, forcing Web archives to execute the content.
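To make the parse-versus-execute distinction concrete, here is a minimal sketch of what a parse-only crawler sees. Everything in it is hypothetical: the page, its URLs, and the helper names are invented for illustration and are not taken from any real crawler or journal site. It uses only Python's standard html.parser module, which parses markup but never runs scripts.

```python
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collects href values from <a> tags present in the static markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the links a parse-only crawler would find (no script execution)."""
    parser = LinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page: one link is in the markup, a second is created by script.
PAGE = """
<html><body>
<a href="/issue/1">Issue 1</a>
<script>
  var a = document.createElement("a");
  a.href = "/issue/" + (1 + 1);
  document.body.appendChild(a);
</script>
</body></html>
"""

print(extract_links(PAGE))  # prints ['/issue/1']
```

The link to /issue/2 exists only after the script runs, so the parser never finds it. A crawler that wants it has to execute the page in something like a headless browser, which is where the extra and potentially unbounded ingest cost comes from.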
I'm David Rosenthal, and this is a place to discuss the work I'm doing in Digital Preservation.
Showing posts with label CNI2009spring. Show all posts
Tuesday, May 19, 2015
Tuesday, November 26, 2013
In-browser emulation
Jeff Rothenberg's ground-breaking 1995 article Ensuring the Longevity of Digital Documents described and compared two techniques to combat format obsolescence, format migration and emulation, concluding that emulation was the preferred approach. As time went by and successive digital preservation systems went into production it became clear that almost all of them rejected Jeff's conclusion, planning to use format migration as their preferred response to format obsolescence. Follow me below the fold for a discussion of why this happened and whether it still makes sense.
Monday, July 13, 2009
Spring CNI Plenary: The Video
CNI has now posted the video of Cliff Lynch's introduction, my plenary presentation, and the questions.
I gave a significantly shortened version of this talk at the Sun PASIG meeting in Malta on June 26.
How Are We Ensuring the Longevity of Digital Documents? from CNI Video Editor on Vimeo.
Wednesday, May 6, 2009
Sheila Morrissey's comment
Portico's Sheila Morrissey posted a valuable comment on the post that provided the background and sources for my CNI plenary. It set out the conventional wisdom against which I was arguing, but at such length that I felt it was inhibiting discussion. It was also difficult to respond to by adding a comment, among other reasons because there was no easy way to connect my responses to their targets in the comment. I therefore saved the text of Sheila's comment, deleted it from the original post, and reproduced it below the fold, together with my responses. Portico has posted a version of her comment here.
Friday, April 10, 2009
Spring CNI Plenary: The Remix
This post provides the text of the slides, sources and commentary for the opening plenary that I just gave at the CNI Spring Task Force meeting. The actual slides are available here (PDF). Follow me below the fold for the full details.
Monday, March 30, 2009
Spring CNI Plenary
I can finally reveal the mysterious talk I referred to in this comment; it is the opening plenary at CNI's Spring Task Force meeting one week from today. In essence, the talk looks back at Jeff Rothenberg's 1995 Scientific American article "Ensuring the Longevity of Digital Documents" and asks:
- What led Jeff to his dire predictions?
- Would one make the same dire predictions now?
- If not, what dire predictions would one make, and why?
CNI will post the slides after the talk, and I plan to post here a commentary on them providing links to sources and additional details. You will be able to see how the discussions here were a very valuable resource. Thank you all.