Blosc
Sending data from memory to CPU (and back)	

faster than memcpy()
Francesc Alted

Software Architect

PyData London 2014	

February 22, 2014
About Me
• I am the creator of tools like PyTables,
Blosc, and BLZ, and the maintainer of Numexpr.

• I learnt the hard way that ‘premature
optimization is the root of all evil’.	

• Now I only humbly try to optimize if I
really need to, and I just hope that Blosc is
not an example of ‘premature optimization’.
About Continuum Analytics
• Develop new ways of storing, computing,
and visualizing data.

• Provide open technologies for data
integration on a massive scale.	

• Provide software tools, training, and
integration/consulting services to
corporate, government, and educational
clients worldwide.
Overview
• Compressing faster than memcpy(). Really?	

• How can that be?

(The ‘Starving CPU’ problem)	

• How Blosc works.	

• Does being faster than memcpy() mean that my
programs will actually run faster?
Compressing Faster
than memcpy()
Interactive Session Starts
• If you want to experiment with Blosc on
your own machine:

http://www.blosc.org/materials/PyData-London-2014.tar.gz

• blosc (and blz, for later on) is required; both
are included in the conda repository.
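
As a warm-up for the session, here is a minimal timing sketch (an addition of mine, not part of the original materials); it assumes the python-blosc package and uses only its documented compress()/decompress() calls:

import time
import numpy as np
import blosc  # python-blosc binding

# ~80 MB of float64 data; scale N up or down to taste.
N = 10_000_000
a = np.linspace(0, 1, N)
buf = a.tobytes()

t0 = time.time()
copied = a.copy()  # a plain in-memory copy (a memcpy() under the hood)
t_copy = time.time() - t0

t0 = time.time()
packed = blosc.compress(buf, typesize=8, cname='blosclz', shuffle=blosc.SHUFFLE)
restored = blosc.decompress(packed)
t_blosc = time.time() - t0

print("copy: %.4f s | compress+decompress: %.4f s | ratio: %.1fx"
      % (t_copy, t_blosc, len(buf) / len(packed)))

Whether the round trip beats the plain copy depends on the data, the codec, and the machine, which is exactly what the materials above let you explore.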
Open Questions
We have seen that, sometimes, Blosc can actually
be faster than memcpy(). Now:	

1. If compression takes way more CPU than
memcpy(), how can Blosc beat it?

2. Does this mean that Blosc can actually
accelerate computations in real
scenarios?
The Starving CPU
Problem
“Across the industry, today’s chips are largely
able to execute code faster than we can feed
them with instructions and data.”	

– Richard Sites, after his article
“It’s the Memory, Stupid!”,
Microprocessor Report, 10(10), 1996
Memory Access Time
vs CPU Cycle Time
(Figure from a 2009 book comparing memory access time with CPU cycle time.)
The Status of CPU
Starvation in 2014
• Memory latency (~10 ns) is much slower than
processors (by a factor of 100x to 250x).

• Memory bandwidth (~15 GB/s) is improving at a
better rate than memory latency, but it is still
slower than processors (by a factor of 30x to 100x).
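
To make those figures concrete, a tiny back-of-the-envelope sketch (my addition, reusing the ~15 GB/s bandwidth number above):

# Copying an 800 MB buffer streams it through memory twice (read + write):
size_gb = 0.8
bandwidth_gb_s = 15.0
print("minimum copy time: %.0f ms" % (2 * size_gb / bandwidth_gb_s * 1e3))  # ~107 ms

During those ~107 ms a modern CPU could have executed hundreds of millions of instructions, which is precisely the gap Blosc tries to exploit.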
Blosc Goals and
Implementation
Blosc: (de)compressing
faster than memcpy()
Transmission + decompression faster than direct transfer?
Taking Advantage of
Memory-CPU Gap
• Blosc is meant to discover redundancy in
data as fast as possible.	

• It comes with a series of fast compressors:
BloscLZ, LZ4, Snappy, LZ4HC, and Zlib.

• Blosc is meant for speed, not for high
compression ratios.
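
For illustration only, a hedged sketch (my addition, assuming the python-blosc binding) that runs each bundled codec over the same buffer and prints its compression ratio:

import numpy as np
import blosc

data = np.arange(10_000_000, dtype=np.int64).tobytes()  # highly compressible

# Codec names as registered in python-blosc; availability can vary by build.
for cname in ('blosclz', 'lz4', 'lz4hc', 'snappy', 'zlib'):
    packed = blosc.compress(data, typesize=8, cname=cname)
    print("%-8s -> %6.1fx" % (cname, len(data) / len(packed)))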
Blosc Is All About
Efficiency
• Uses data blocks that fit in L1 or L2 caches
(better speed, lower compression ratios).

• Uses multithreading by default.	

• The shuffle filter uses SSE2 instructions in
modern Intel and AMD processors.
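
As a small hedged illustration of the multithreading point (my addition, using documented python-blosc calls):

import blosc

print("cores detected:", blosc.detect_number_of_cores())
prev = blosc.set_nthreads(4)  # returns the previous setting
print("threads: was %d, now using 4" % prev)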
Blocking: Divide and
Conquer
Shuffling: Improving the
Compression Ratio
The shuffling algorithm does not actually
compress the data; rather, it reorders the bytes in
the data stream so that same-position bytes from
consecutive elements end up next to each other:
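
To illustrate the idea (my addition; this is plain NumPy, not Blosc's SSE2 implementation), a byte shuffle is essentially a transpose of an elements-by-typesize byte matrix:

import numpy as np

a = np.arange(8, dtype=np.int32)            # 8 elements of 4 bytes each
as_bytes = a.view(np.uint8).reshape(-1, 4)  # one row of bytes per element

shuffled = as_bytes.T.copy()                # byte 0 of every element, then byte 1, ...
print(shuffled.ravel())                     # long runs of similar bytes compress well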
Shuffling Caveat
• Shuffling usually produces better
compression ratios with numerical data,
except when it does not.	

• If you care about the compression ratio, it
is worth deactivating it and checking (it is
active by default).

• We will see an example with real data later on.
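
A hedged sketch of that check (my addition, again assuming the python-blosc binding and its SHUFFLE/NOSHUFFLE constants):

import numpy as np
import blosc

data = np.linspace(0, 1, 1_000_000).tobytes()  # float64, so typesize=8

for label, flag in (("shuffle", blosc.SHUFFLE), ("no shuffle", blosc.NOSHUFFLE)):
    packed = blosc.compress(data, typesize=8, shuffle=flag)
    print("%-10s ratio: %.1fx" % (label, len(data) / len(packed)))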
Blosc Performance:
Laptop back in 2005
Blosc Performance:
Desktop Computer in 2012
First Answer for Open
Questions
• Blosc data blocking optimizes the cache
behavior during memory access.	

• Additionally, it uses multithreading and
SIMD instructions.	

• Add these to the Starving CPU problem and
you now have a good hint of why Blosc can
beat memcpy().
How Does Compression
Work With Real Data?
The Need for
Compression
• Compression allows storing more data in
the same storage capacity.

• Sure, it uses more CPU time to compress/
decompress data.

• But does that actually mean using more wall
clock time?
The Need for a
Compressed Container
• A compressed container is meant to store
data in compressed state and transparently
deliver it uncompressed.	

• That means that the user only perceives
that her dataset takes less memory.	

• Only less space? What about data access
speed?
Example of How Blosc Accelerates Genomics I/O:
SeqDB (backed by Blosc)

Source: Howison, M. High-throughput compression of FASTQ data with SeqDB.
IEEE Transactions on Computational Biology and Bioinformatics.
Bloscpack (I)
• Command-line interface and serialization
format for Blosc:

$ blpk c data.dat # compress
$ blpk d data.dat.blp # decompress
Bloscpack (II)
• Very convenient for easily serializing your
in-memory NumPy datasets:	

>>> import numpy as np
>>> import bloscpack as bp
>>> a = np.linspace(0, 1, 3e8)
>>> print a.size, a.dtype
300000000 float64
>>> bp.pack_ndarray_file(a, 'a.blp')
>>> b = bp.unpack_ndarray_file('a.blp')
>>> (a == b).all()
True
Yet Another Example: BLZ

• BLZ is both a format and a library designed
as an efficient data container for Big Data.

• Blosc and Bloscpack are at the heart of it in
order to achieve high-speed compression/
decompression.	

• BLZ is one of the backends supported by
our nascent Blaze library.
Appending Data in
Large NumPy Objects
(Diagram: to enlarge a NumPy array, a new, larger memory area is allocated
and both the old array and the new data are copied into it.)

• Normally a realloc() call will not be able to extend the buffer in place

• Both memory areas have to exist simultaneously
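
For concreteness, a small sketch (my addition) showing that enlarging a NumPy array really does allocate a fresh buffer and copy everything:

import numpy as np

a = np.arange(10_000_000)
addr_before = a.__array_interface__['data'][0]

a = np.append(a, np.arange(1_000_000))            # builds a brand-new array
addr_after = a.__array_interface__['data'][0]

print("same buffer?", addr_before == addr_after)  # False: all the data was copied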
Contiguous vs Chunked

(Diagram: a NumPy container is one contiguous block of memory, while a BLZ
container is a set of discontiguous chunks: chunk 1, chunk 2, ..., chunk N.)
Appending data in BLZ

(Diagram: the new data is compressed into a new chunk that is attached to the
existing chunks; only a small amount of data has to be compressed, and the
existing chunks are left untouched.)
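
A heavily hedged sketch of the append path (my addition; it assumes the blz package exposes a carray-style barray with append(), cbytes and nbytes attributes, so treat the exact names as assumptions):

import numpy as np
import blz  # assumed import name for the BLZ package

b = blz.barray(np.arange(10_000_000))  # compressed, chunked container
b.append(np.arange(1_000_000))         # only the new chunk needs compressing
print(len(b), b.cbytes, b.nbytes)      # compressed vs uncompressed footprint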
The btable object in BLZ

(Diagram: a btable stores each column in its own chunked container; appending
a new row adds one element to every column.)

• Columns are contiguous in memory

• Chunks follow column order

• Very efficient for querying (especially with a
large number of columns)
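
And a similarly hedged btable sketch (my addition; the constructor, append() and where() calls follow the carray/bcolz ctable API that blz's btable descends from, so treat them as assumptions too):

import numpy as np
import blz  # assumed import name

N = 1_000_000
t = blz.btable((np.arange(N), np.random.rand(N)), names=['i', 'x'])

t.append((N, 0.5))                            # one row appended across all columns
hits = [row for row in t.where('x > 0.999')]  # columnar chunks make such scans cheap
print(len(t), len(hits))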
Second Interactive
Session: BLZ and Blosc
on a Real Dataset
Second Hint for Open
Questions	

Blosc usage in BLZ means not only less storage
(a ~15x-40x reduction for the real-life data
shown), but also access times that stay reasonably
close to those of uncompressed data (a ~2x-10x slowdown).

(We still need to address some implementation details
to get better performance.)
Summary
• Blosc, being able to transfer data faster than
memcpy(), has enormous implications for
data management.

• It is well suited not only for saving memory,
but also for achieving performance close to that of
typical uncompressed data containers.

• It works well not only for synthetic data,
but also for real-life datasets.
References
• Blosc: http://www.blosc.org

• Bloscpack: https://github.com/Blosc/bloscpack

• BLZ: http://blz.pydata.org
“Across the industry, today’s chips are largely able to execute code
faster than we can feed them with instructions and data. There are no
longer performance bottlenecks in the floating-point multiplier or in
having only a single integer unit. The real design action is in memory
subsystems — caches, buses, bandwidth, and latency.”

“Over the coming decade, memory subsystem design will be the only
important design issue for microprocessors.”

– Richard Sites, after his article “It’s the Memory, Stupid!”,
Microprocessor Report, 10(10), 1996
“Over this decade (2010-2020), memory subsystem optimization
will be (almost) the only important design issue for improving
performance.”	

– Me :)
Thank you!
