I'm David Rosenthal, and this is a place to discuss the work I'm doing in Digital Preservation.

Showing posts with label object storage. Show all posts

Thursday, April 13, 2023

Tobias Mann's Los Alamos Taps Seagate To Put Compute On Spinning Rust describes progress in the concept of computational storage. I first discussed this in my 2010 JCDL keynote, based on 2009's FAWN, the Fast Array of Wimpy Nodes by David Andersen et al from Carnegie Mellon. Their work started from the observation that moving data from storage to memory for processing by a CPU takes time and energy, and the faster you do it the more energy it takes. So the less of it you do, the better. Below the fold I start from FAWN and end up with the work under way at Los Alamos.
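The observation driving computational storage — that shipping data to the CPU costs more energy than processing it in place — can be illustrated with a back-of-the-envelope calculation. The energy-per-bit figures below are illustrative assumptions, not measurements of any real system:

```python
# Back-of-the-envelope comparison of energy spent moving data to a CPU
# versus filtering it at the storage device. The energy-per-bit figures
# are illustrative assumptions, not measured values.

PJ_PER_BIT_INTERCONNECT = 10.0   # assumed cost to move one bit over the fabric
PJ_PER_BIT_LOCAL = 0.5           # assumed cost to scan one bit in place

def energy_joules(bytes_scanned: float, pj_per_bit: float) -> float:
    """Energy in joules to handle bytes_scanned at pj_per_bit."""
    return bytes_scanned * 8 * pj_per_bit * 1e-12

dataset = 1e15          # 1PB scanned by a query
selectivity = 0.001     # fraction of the data the query actually needs

# Conventional: ship the whole dataset across the interconnect.
ship_all = energy_joules(dataset, PJ_PER_BIT_INTERCONNECT)

# Computational storage: scan in place, ship only the matches.
in_situ = (energy_joules(dataset, PJ_PER_BIT_LOCAL) +
           energy_joules(dataset * selectivity, PJ_PER_BIT_INTERCONNECT))

print(f"ship everything: {ship_all / 1e3:.1f} kJ")
print(f"filter in place: {in_situ / 1e3:.1f} kJ")
```

With these assumed numbers the in-place scan uses roughly a twentieth of the energy; the exact ratio depends entirely on the selectivity and the per-bit costs, but the direction of the advantage does not.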
Tuesday, September 11, 2018
What Does Data "Durability" Mean?
There is one very important and inconvenient truth about reliability: Two-thirds of all data loss has nothing to do with hardware failure. The real culprits are a combination of human error, viruses, bugs in application software, and malicious employees or intruders. Almost everyone has accidentally erased or overwritten a file. Even if your cloud storage had one million nines of durability, it can't protect you from human error. No amount of nines can prevent data loss.

Friend may be right that these are the top 5 causes of data loss, but over the timescale of preservation as opposed to storage they are far from the only ones. In Requirements for Digital Preservation Systems: A Bottom-Up Approach we listed 13 of them. Below the fold, some discussion of the meaning and usefulness of durability claims.
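The arithmetic behind the point is simple. Here is a sketch of what eleven nines of annual durability buys against hardware loss once even a tiny rate of human error is in play; the 0.1%-per-year error rate is a purely hypothetical assumption for illustration:

```python
# How much do the nines matter once human error is in play?
# The 0.1%/year human-error rate is a hypothetical assumption,
# chosen only to illustrate the disparity.

nines = 11
annual_hw_loss = 10 ** -nines   # per-object chance of hardware loss per year
annual_human_error = 1e-3       # assumed per-object chance of accidental
                                # deletion or overwrite per year
years = 20

def loss_probability(annual_loss: float, years: int) -> float:
    """Chance an object is lost at least once over the period."""
    return 1 - (1 - annual_loss) ** years

p_hw = loss_probability(annual_hw_loss, years)
p_total = loss_probability(annual_hw_loss + annual_human_error, years)

print(f"20-year loss probability, hardware only:     {p_hw:.2e}")
print(f"20-year loss probability, with human error:  {p_total:.2e}")
```

Under these assumptions human error dominates the hardware loss rate by about eight orders of magnitude; adding more nines to the durability figure changes the outcome not at all.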
Friday, March 3, 2017
Notes from FAST17
As usual, I attended Usenix's File and Storage Technologies conference. Below the fold, my comments on the presentations I found interesting.
Tuesday, December 13, 2016
The Medium-Term Prospects for Long-Term Storage Systems
Back in May I posted The Future of Storage, a brief talk written for a DARPA workshop of the same name. The participants were experts in one or another area of storage technology, so the talk left out a lot of background that a more general audience would have needed. Below the fold, I try to cover the same ground but with this background included, which makes for a long post.
This is an enhanced version of a journal article that has been accepted for publication in Library Hi Tech, with images that didn't meet the journal's criteria, and additional material reflecting developments since submission. Storage technology evolution can't be slowed down to the pace of peer review.
Tuesday, May 3, 2016
Talk at Seagate
I gave a talk at Seagate as part of a meeting to prepare myself for an upcoming workshop on The Future of Storage. It pulls together ideas from many previous posts. Below the fold, a text of the talk with links to the sources, edited to reflect some of what I learnt from the discussions.
Tuesday, March 22, 2016
The Dawn of DAWN?
At the 2009 SOSP David Andersen and co-authors from C-MU presented FAWN, the Fast Array of Wimpy Nodes. It inspired me to suggest, in my 2010 JCDL keynote, that the cost savings FAWN realized without performance penalty by distributing computation across a very large number of very low-power nodes might also apply to storage.
The following year Ian Adams and Ethan Miller of UC Santa Cruz's Storage Systems Research Center and I looked at this possibility more closely in a Technical Report entitled Using Storage Class Memory for Archives with DAWN, a Durable Array of Wimpy Nodes. We showed that it was indeed plausible that, even at then current flash prices, the total cost of ownership over the long term of a storage system built from very low-power system-on-chip technology and flash memory would be competitive with disk while providing high performance and enabling self-healing.
Although flash remains more expensive than hard disk, since 2011 the gap has narrowed from a factor of about 12 to about 6. Pure Storage recently announced FlashBlade, an object storage fabric composed of large numbers of blades, each equipped with:
- Compute – 8-core Xeon system-on-a-chip – and Elastic Fabric Connector for external, off-blade, 40GbitE networking,
- Storage – NAND storage with 8TB or 52TB of raw capacity and on-board NV-RAM with a super-capacitor-backed write buffer plus a pair of ARM CPU cores and an FPGA,
- On-blade networking – PCIe card to link compute and storage cards via a proprietary protocol.
FlashBlade clearly isn't DAWN. Each blade is much bigger, much more powerful and much more expensive than a DAWN node. No-one could call a node with an 8-core Xeon, 2 ARMs, and 52TB of flash "wimpy", and it'll clearly be too expensive for long-term bulk storage. But it is a big step in the direction of the DAWN architecture.
DAWN exploits two separate sets of synergies:
- Like FlashBlade, it moves the computation to where the data is, rather than moving the data to where the computation is, reducing both latency and power consumption. The further data moves on wires from the storage medium, the more power and time it takes. This is why Berkeley's Aspire project's architecture is based on optical interconnect technology, which when it becomes mainstream will be both faster and lower-power than wires. In the meantime, we have to use wires.
- Unlike FlashBlade, it divides the object storage fabric into a much larger number of much smaller nodes, implemented using the very low-power ARM chips used in cellphones. Because the power a CPU needs tends to grow faster than linearly with performance, the additional parallelism provides comparable performance at lower power.
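The second synergy can be sketched quantitatively. If CPU power grows superlinearly with single-core performance — the exponent of 1.7 below is an illustrative assumption, not a measured value — then many slow cores delivering the same aggregate throughput draw less power than one fast one:

```python
# Sketch of the wimpy-node power argument: if power grows superlinearly
# with per-core performance, many slow cores beat one fast core at equal
# aggregate throughput. The exponent 1.7 is an illustrative assumption.

ALPHA = 1.7  # assumed exponent: power ~ performance ** ALPHA

def power(perf: float) -> float:
    return perf ** ALPHA

target_throughput = 16.0

# One "brawny" core delivering all the throughput by itself.
brawny = power(target_throughput)

# Sixteen "wimpy" cores each delivering 1/16 of it.
wimpy = 16 * power(target_throughput / 16)

print(f"brawny power: {brawny:.1f} (arbitrary units)")
print(f"wimpy  power: {wimpy:.1f} (arbitrary units)")
```

With these assumptions the wimpy fabric draws roughly a seventh of the power; any exponent above 1.0 gives the parallel configuration the advantage, which is the heart of both FAWN's and DAWN's economics.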
Tuesday, April 21, 2015
The Ontario Library Research Cloud
One of the most interesting sessions at the recent CNI was on the Ontario Library Research Cloud (OLRC). It is a collaboration between universities in Ontario to provide a low-cost, distributed, mutually owned private storage cloud with adequate compute capacity for uses such as text-mining. Below the fold, my commentary on their presentations.
Friday, November 21, 2014
Steve Hetzler's "Touch Rate" Metric
Steve Hetzler of IBM gave a talk at the recent Storage Valley Supper Club on a new, scale-free metric for evaluating storage performance that he calls "Touch Rate". He defines this as the proportion of the store's total content that can be accessed per unit time. This leads to some very illuminating graphs that I discuss below the fold.
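Because touch rate divides bandwidth by capacity, it is dimensionless in capacity and lets very different media be compared directly. A sketch with round illustrative numbers — not vendor specifications — shows why it is illuminating:

```python
# Hetzler's "touch rate": the proportion of a store's total content that
# can be accessed per unit time, i.e. bandwidth / capacity. The drive
# figures below are round illustrative numbers, not vendor specs.

SECONDS_PER_YEAR = 365 * 24 * 3600

def touch_rate_per_year(bandwidth_bytes_s: float, capacity_bytes: float) -> float:
    """Multiples of the full store that can be read per year."""
    return bandwidth_bytes_s * SECONDS_PER_YEAR / capacity_bytes

# A nominal hard drive: 4TB at 150MB/s sustained.
hdd = touch_rate_per_year(150e6, 4e12)

# A nominal tape cartridge: 6TB at 300MB/s, but assume it is mounted
# and streaming only 1% of the time.
tape = touch_rate_per_year(300e6 * 0.01, 6e12)

print(f"HDD : can touch its full contents {hdd:.0f}x per year")
print(f"Tape: can touch its full contents {tape:.1f}x per year")
```

The metric makes the access-latency character of a store visible independently of its size: the nominal disk can read itself over a thousand times a year, the rarely-mounted tape only a handful.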
Thursday, August 21, 2014
Is This The Dawn of DAWN?
More than three years ago, Ian Adams, Ethan Miller and I were inspired by a 2009 paper FAWN: A Fast Array of Wimpy Nodes from David Andersen et al at C-MU. They showed how a fabric of nodes, each with a small amount of flash memory and a very low-power processor, could process key-value queries as fast as a network of beefy servers using two orders of magnitude less power.
We put forward a storage architecture called DAWN: Durable Array of Wimpy Nodes, similar hardware but optimized for long-term storage. Its advantages were small form factor, durability, and very low running costs. We argued that these would outweigh the price premium for flash over disk. Recent developments are starting to make us look prophetic - details below the fold.
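FAWN's headline metric was queries per joule. A sketch with purely illustrative numbers — not the paper's measurements — shows the shape of the argument:

```python
# FAWN's headline metric was queries per joule. The throughput and
# power figures below are illustrative assumptions, not the paper's
# measurements: a brawny disk-based server versus one wimpy node.

def queries_per_joule(qps: float, watts: float) -> float:
    return qps / watts

server = queries_per_joule(qps=5_000, watts=500)  # assumed brawny server
wimpy = queries_per_joule(qps=1_400, watts=4)     # assumed wimpy node

print(f"server: {server:.0f} queries/joule")
print(f"wimpy : {wimpy:.0f} queries/joule")
```

The wimpy node wins on efficiency despite far lower absolute throughput; a fabric of such nodes matches the server's aggregate throughput at a fraction of the power, which is what FAWN demonstrated experimentally.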
Tuesday, August 19, 2014
TRAC Audit: Do-It-Yourself Demos
In my post TRAC Audit: Process I explained how we demonstrated the LOCKSS Polling and Repair Protocol to the auditors, and linked to the annotated logs we showed them. These demos have been included in the latest release of the LOCKSS software. Below the fold, and now in the documentation, are step-by-step instructions allowing you to replicate this demo.
Wednesday, May 21, 2014
DAWN is breaking
I posted last October on Seagate's announcement of Kinetic, their object storage architecture for Ethernet-connected hard drives (and ultimately other forms of storage). This is a conservative approach to up-levelling the interface to storage media, providing an object storage architecture with a fixed but generally useful set of operations. In that way it is similar to, but less ambitious than, our proposed DAWN architecture.
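The shape of such an up-levelled interface — key-value operations in place of block reads and writes — can be sketched as follows. This is a hypothetical illustration of the style of API, not the actual Kinetic protocol:

```python
# A sketch of the kind of key-value interface a Kinetic-style drive
# exposes in place of block reads and writes. This is a hypothetical
# illustration of the style of API, not the actual Kinetic protocol.

class KeyValueDrive:
    """In-memory stand-in for an Ethernet-connected key-value drive."""

    def __init__(self) -> None:
        self._store: dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: bytes) -> bytes:
        return self._store[key]

    def delete(self, key: bytes) -> None:
        del self._store[key]

    def get_key_range(self, start: bytes, end: bytes) -> list[bytes]:
        """Range query over keys, the kind of fixed operation Kinetic offers."""
        return sorted(k for k in self._store if start <= k <= end)

drive = KeyValueDrive()
drive.put(b"object/0001", b"payload")
drive.put(b"object/0002", b"payload")
print(drive.get_key_range(b"object/0000", b"object/9999"))
```

The point of the fixed operation set is that the drive, not the host, owns the mapping from keys to media, so the host's file system and volume manager layers drop out entirely.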
The other half of the disk drive industry has now responded with a much more radical approach. Western Digital's HGST unit has announced Ethernet connected drives that run Linux. This approach has some significant advantages:
- It sounds great as a marketing pitch.
- It gets computing as close as possible to the data, which is the architecturally correct direction to be moving. This is something that DAWN does but Kinetic doesn't.
- It will be easy to make HGST's drives compatible with Seagate's by running an implementation of the Kinetic protocol on them.
- It provides a great deal of scope for researching and developing suitable protocols for communicating with storage media over IP.
But the approach also has significant drawbacks:

- In many cases manufacturers find disks returned under warranty work fine; the cause of the failure was an unrepeatable bug in the disk firmware. Running Linux on the drive will provide a vastly increased scope for such failures, and make diagnosing them much harder for the manufacturer.
- If the interface between the Linux and the drive hardware emulates the existing SATA or other interface, the benefits of the architecture will be limited to some extent. On the other hand, to the extent it exposes more of the hardware it will increase the risk that applications will screw up the hardware.
- Kinetic's approach takes security of the communication with the drives seriously. HGST's "anything goes" approach leaves this up to the application.
Wednesday, October 30, 2013
Seagate's Ethernet Hard Drives
A week ago Seagate made an extraordinarily interesting announcement for long-term storage, their Kinetic Open Storage Platform, including disk drives with Ethernet connectivity. Below the fold, the details.