We interrupt our regularly scheduled blogging for this special announcement. Go read Michael Nelson's post Why we need multiple web archives: the case of blog.reidreport.com right now! Its a detailed account in several updates of the forensic analysis of Joy-Ann Reid claim that either her blog or the Internet Archive was hacked. Michael's work landed him a spot on CNN at 0930 April 29th. He did an excellent job of explanation. Half an hour later Reid walked back her claims.
Michael is right about the importance of multiple independent Web archives; once again the Lots Of Copies Keep Stuff Safe principle. But the economics of this multiplicity are problematic.
I'm David Rosenthal, and this is a place to discuss the work I'm doing in Digital Preservation.
Monday, April 30, 2018
Thursday, April 26, 2018
Cryptographers On Blockchains
David Gerard's April 21st blog post is a real linkfest. Below the fold, commentary on four of the links.
Tuesday, April 24, 2018
All Your Tweets Are Belong To Kannada
![]() |
| Gerd Badur CC BY-SA 3.0, Source |
47% of mementos of Barack Obama's Twitter page were in non-English languages, almost half of which were in Kannada alone. While language diversity in web archives is generally a good thing, in this case though, it is disconcerting and counter-intuitive.Kannada is an Indian language spoken by only about 38 million people. Below the fold, some commentary.
Thursday, April 12, 2018
Your Tax Dollars At Work
When I was writing Pre-publication Peer Review Subtracts Value, Springer wanted to charge me $39.95 for access to Comparing Published Scientific Journal Articles to Their Pre-print Versions by Martin Klein et al. This despite the fact that the copyright notice said:
This is a U.S. government work and its text is not subject to copyright protection in the United StatesFortunately, you can now follow the link to the final version at arXiv.org. I'm not the only one annoyed by the publishers charging for access to papers not subject to copyright. Below the fold, some more on this scam.
Tuesday, April 10, 2018
Natural Redundancy
Most uncompressed files contain significant redundancy, which is why they can be made smaller by a compression algorithm; they work by reducing redundancy. The better the algorithm, the less redundancy left in the output. If the files are then stored for the long term, they need to be protected, for example by erasure coding, which adds some redundancy back. In Exploiting Source Redundancy to Improve the Rate of Polar Codes, Ying Wang, Krishna R. Narayanan and Anxiao (Andrew) Jiang of Texas A&M explore using the original redundancy to reduce the amount of protection redundancy needed for a given level of reliability. Below the fold, some commentary.
Monday, April 9, 2018
John Perry Barlow RIP
![]() |
| By Mohamed Nanabhay from Qatar CC BY 2.0 |
The Economist, The Guardian and the New York Times had good obituaries, but they mentioned only his Declaration of the Independence of Cyberspace among his writings. It was undoubtedly an important rallying-cry at the time, but it should not be allowed to overshadow his other cyberspace-related writings, thankfully collected by the EFF in the John Perry Barlow Library. Below the fold, the one I would have chosen.
Thursday, April 5, 2018
Emulating Stephen Hawking's Voice
Jason Fagone at the San Francisco Chronicle has a fascinating story of heroic, successful (and timely) emulation in The Silicon Valley quest to preserve Stephen Hawking’s voice. It's the story of a small team which started work in 2009 trying to replace Hawking's voice synthesizer with more modern technology. Below the fold, some details to get you to read the whole article
Subscribe to:
Comments (Atom)

