Posts

Showing posts with the label web archive

2023-10-10: In appreciation of the "ridiculous and unworkable" projects that make the Internet great and research possible

  https://xkcd.com/2085/ The Internet Archive is hosting their annual celebration this week (October 12, 2023), and I wanted to take this opportunity to both 1) encourage your attendance (virtual for most of us, but if you're in San Francisco, you can attend in person), and 2) express my appreciation and gratitude for the continued existence of the Internet Archive, their evolving products and services, and their support of the research community. The ongoing devolvement of Twitter into 4chan has caused me to reflect on the platforms, services, and corpuses on which I have built a research program over the last 20+ years. The Twitter situation will be the topic of a future post, but here I want to laud the Internet Archive, specifically the Wayback Machine, and by extension, the suite of other public web archives, such as Archive.Today, Arquivo.pt, and the many members of IIPC. In the past I've referred to the Internet Archive as th...

2023-05-25: Generative Archive Restoration

  Rise of the Machines! Machine learning just cannot seem to keep itself out of news cycles. The third version of OpenAI's generative dialogue language model, ChatGPT, had tech giants all around scrambling in quite a circus to push out their own versions. Google's size and stagnation in recent years had it seeing red and bringing Sergey Brin back into the fold to help rush out its own chatbot, Bard. Microsoft, by comparison, has been humming along for a while now with its own research in the AI agent space, but various headlines hint that its past and present efforts might not be paying off as well as it had hoped. You.com is a relatively new search engine leveraging machine learning in its own chat assistant, YouChat, and other services in an attempt to push the frontiers of a search engine through multi-modal search with integrated artificial intelligence enhancements. These efforts are all seeking to shake up how we seek and retrieve in...