Posts

Showing posts with the label link rot

2023-09-05: Paper Summary: "Gone, Gone, but Not Really, and Gone, But Not Forgotten: A Typology of Website Recoverability" (Reyes Ayala TempWeb '23)

Brenda Reyes Ayala, "Gone, Gone, but Not Really, and Gone, But Not Forgotten: A Typology of Website Recoverability", 13th Temporal Web Analytics Workshop (TempWeb '23) in Companion Proceedings of the Web Conference 2023 (WWW '23), Apr. 2023 (Texas, USA), pp. 1208-1213, doi: 10.1145/3543873.3587671.

We often come across web pages that show "Error 404", which means the server could not find the requested page. We also encounter web pages whose content changes significantly over time, moving away from the originally referenced content. Such disappearance of web resources is a common phenomenon on the web. Web resources can disappear or change for a variety of reasons, such as server crashes, expired domains, hacking, creators abandoning websites, and web resources being moved to a different location. The disappearance of resources from the web is broadly termed reference rot, which has two components: link rot and content drift. Link r...

2016-10-24: Fun with Fictional Web Sites and the Internet Archive

As we celebrate the 20th anniversary of the Internet Archive, I realize that using Memento and the Wayback Machine has become second nature when solving certain problems, not only in my research, but also in my life. Those who have read my Master's Thesis, Avoiding Spoilers on Mediawiki Fan Sites Using Memento, know that I am a fan of many fictional television shows and movies. URIs are discussed in these fictional worlds, and sometimes the people making the fiction actually register these URIs, as seen in the example below, creating an additional vector for fans to find information on their favorite characters and worlds.

Real web site at http://www.piedpiper.com/ for the fictional company Pied Piper from HBO's TV series Silicon Valley

Unfortunately, interest in maintaining these URIs fades once the television show is cancelled or the movie is no longer showing. As noted in my thesis, the advent of services like Netflix and Hulu allows fans to watch old televisio...

2016-09-09: Summer Fellowship at the Harvard Library Innovation Lab Trip Report

Myself standing at the main entrance of Langdell Hall

I was honored with the great opportunity of collaborating with the Harvard Library Innovation Lab (LIL) as a Fellow this summer. Located in Langdell Hall, Harvard Law School, the Library Innovation Lab develops solutions to serious problems facing libraries. It consists of an eclectic group of lawyers, librarians, and software developers engaged in projects such as Perma.cc, the Caselaw Access Project (CAP), and The Nuremberg Project, among many others.

The LIL Team

To help prevent link rot, Perma.cc creates permanent, reliable links for web resources. The Caselaw Access Project is an ambitious project that strives to make all US case law freely accessible online. The current collection to be digitized stands at over 42,000 volumes (nearly 40 million pages). The Nuremberg Project is concerned with the digitization of LIL's collection about the Nuremberg trials.

How Harvard digitized nearl...

2015-02-17: Fixing Links on the Live Web, Breaking Them in the Archive

On February 2nd, 2015, Rene Voorburg announced the JavaScript utility robustify.js. The robustify.js code, when embedded in the HTML of a web page, helps address the challenge of link rot by detecting when a clicked link will return an HTTP 404 and using the Memento Time Travel Service to discover mementos of the URI-R. Robustify.js assigns an onclick event to each anchor tag in the HTML. When the event occurs, robustify.js makes an Ajax call to a service to test the HTTP response code of the target URI. When robustify.js detects an HTTP 404 response code, it uses Ajax to call a remote server, uses the Memento Time Travel Service to find mementos of the URI-R, and uses a JavaScript alert to let the user know that JavaScript will redirect the user to the memento. Our recent studies have shown that JavaScript, particularly Ajax, normally makes preservation more difficult, but robustify.js is a useful utility that is easily implemented to solve an importan...
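
The decision flow described in this excerpt (test the target's HTTP response code, and fall back to a memento only on a 404) can be sketched roughly as follows. This is a minimal Python illustration, not robustify.js's actual code, which is JavaScript and runs in the browser. The function names `fetch_status`, `timegate_uri`, and `choose_target` are hypothetical, and the use of the Time Travel Service's TimeGate endpoint is an assumption about how one might look up a memento.

```python
import urllib.error
import urllib.request

# Assumed lookup endpoint: the Memento Time Travel Service TimeGate, which
# redirects to a memento of the URI-R appended after this prefix.
TIMEGATE_PREFIX = "http://timetravel.mementoweb.org/timegate/"


def fetch_status(uri: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a HEAD request to `uri`.

    Network failures (urllib.error.URLError) are not handled here and
    propagate to the caller.
    """
    request = urllib.request.Request(uri, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still available.
        return error.code


def timegate_uri(uri_r: str) -> str:
    """Build the TimeGate URI for an original resource (URI-R)."""
    return TIMEGATE_PREFIX + uri_r


def choose_target(uri_r: str, status_code: int) -> str:
    """Pick where a click should go: the live page, or a memento on a 404."""
    if status_code == 404:
        return timegate_uri(uri_r)
    return uri_r
```

A caller would combine the two steps as `choose_target(uri, fetch_status(uri))`. Keeping the status check separate from the decision makes the fallback rule easy to test without network access.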