Streaming Huge Databases
using Logical Decoding
Adventures of a naive programmer
Oleksandr “Alex” Shulgin, Zalando SE
Overview
● Introduction
● Problem Statement
● Approach
● Problems
● Some Numbers
● Conclusion
Introduction
What is Logical Decoding all about?
● Available in PostgreSQL since version 9.4.
● Allows streaming database changes in a custom format.
● Requires an Output Plugin to be written (yes, in C).
● Consistent snapshot before all the changes?
Logical Decoding
CREATE_REPLICATION_SLOT "slot1" LOGICAL <plugin_name>;
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT 'XXXXXXXX-N';
SELECT pg_export_snapshot();
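One way to read the statements on this slide: the slot is created over a replication connection, which reports a snapshot name, and each worker then opens a REPEATABLE READ transaction and adopts that snapshot. The sketch below covers only the worker side and assumes a DB-API style cursor. The helper name `attach_to_snapshot` and the identifier check are illustrative and not taken from the talk.

```python
import re

# Snapshot ids on the slides look like '51E426C2-1'. This pattern is an
# assumption, used only to reject obviously unsafe input.
_SNAPSHOT_ID = re.compile(r"^[0-9A-Fa-f]+(-[0-9A-Fa-f]+)+$")


def attach_to_snapshot(cursor, snapshot_id):
    """Start a REPEATABLE READ transaction that adopts an exported snapshot."""
    if not _SNAPSHOT_ID.match(snapshot_id):
        raise ValueError("unexpected snapshot id: %r" % (snapshot_id,))
    cursor.execute("BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ")
    # The id is interpolated as a string literal, so it is validated first.
    cursor.execute("SET TRANSACTION SNAPSHOT '%s'" % snapshot_id)
```

Every worker that calls this with the same id should see the same consistent view of the data, as long as the exporting transaction is still open when the snapshot is adopted.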
Logical Decoding Output
{
"action": "I", /* INSERT */
"relation": [
"myschema", /* INTO myschema.mytable(id) */
"mytable"
],
"newtuple": {
"id": 1 /* VALUES(1) */
}
}
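To make the mapping in the comments concrete, here is a small sketch that turns a message of this shape back into SQL-like text. It assumes the message arrives as plain JSON without the explanatory comments, and it handles only the insert case shown on the slide. The function name `describe_change` is hypothetical.

```python
import json


def describe_change(message):
    """Render an insert message in the slide's JSON shape as SQL-like text."""
    change = json.loads(message)
    if change["action"] != "I":
        raise ValueError("only INSERT ('I') is handled in this sketch")
    schema, table = change["relation"]
    columns = list(change["newtuple"])
    # repr() is a stand-in for proper SQL literal quoting.
    values = [repr(change["newtuple"][name]) for name in columns]
    return "INSERT INTO %s.%s (%s) VALUES (%s)" % (
        schema, table, ", ".join(columns), ", ".join(values))


example = ('{"action": "I", "relation": ["myschema", "mytable"], '
           '"newtuple": {"id": 1}}')
print(describe_change(example))  # INSERT INTO myschema.mytable (id) VALUES (1)
```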
Problem Statement
?
Approach
?
https://github.com/zalando/saiki-tawon
Command Line and Dashboard
Long Live the Snapshot
{
"w-8ee20b3": {
"snapshot_id": "51E426C2-1",
"ts_start": "2016-01-14 11:02:40 UTC",
"heartbeat": "2016-01-15 07:10:27 UTC",
"pid": 1,
"backend_pid": 58793
},
"w-fbfb655": {
"snapshot_id": "51E426C4-1",
"ts_start": "2016-01-14 11:02:41 UTC",
"heartbeat": "2016-01-15 07:10:28 UTC",
"pid": 1,
"backend_pid": 58794
},
...
The Source System
● Ubuntu precise (12.04) 3.2.0-xx-generic
● CPU: Xeon @ 2.50 GHz, 24 cores
● RAM: 125.88 GB
● 6x HDDs (spinning drives) in a RAID 1+0, 5 TB total capacity
● Data size: 3.0 TB / 17 B rows (+ 1.8 TB indexes)
● PostgreSQL 9.6devel
The Target System
“Things are working amazingly fast  
when you write to /dev/null.”
– proverbial wisdom
Problems?
Problems!
● OpenVPN quickly becomes the bottleneck on the laptop
Obvious solution: deploy workers closer to the database.
Docker + Mesosphere DCOS
https://zalando-techmonkeys.github.io/
Problems
● Workers quickly run out of memory
The (problematic) code:
cursor.execute("SELECT * FROM mytable")
● Invokes PQexec().
● Async. connection doesn’t help.
● psycopg2 is not designed to stream results.
Problems
● Invoke COPY protocol!
Corrected code:
cursor.copy_expert(
"COPY (SELECT * FROM mytable) TO STDOUT",
...)
● Tried 32 MB, then 64 MB per worker: it was not enough...
● One of the values was around 80 MB(!). Not much we can do.
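The second argument elided in the `copy_expert` call is a file-like object with a `write()` method. psycopg2 passes the COPY output to it chunk by chunk, so the worker does not have to hold the whole result set in memory. A minimal sketch of such a sink follows. The class and function names are hypothetical, and the chunks are assumed to arrive as bytes, which is what a non-text sink should receive on Python 3.

```python
class CountingSink:
    """File-like object that forwards COPY chunks and tracks the byte count."""

    def __init__(self, target):
        self.target = target
        self.bytes_written = 0

    def write(self, chunk):
        self.bytes_written += len(chunk)
        self.target.write(chunk)


def export_query(cursor, query, sink):
    """Stream the result of `query` through COPY ... TO STDOUT into `sink`."""
    cursor.copy_expert("COPY (%s) TO STDOUT" % query, sink)
```

A call such as `export_query(cursor, "SELECT * FROM mytable", CountingSink(open("/dev/null", "wb")))` would match the target system described earlier. A single very large value still has to fit in memory, which is consistent with the 80 MB observation above.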
More Problems?
● More problems with this code
The correct(?) code:
cursor.copy_expert(
"COPY (SELECT * FROM mytable) TO STDOUT",
...)
● SELECT * FROM [ONLY] myschema.mytable
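The last bullet appears to point at two gaps in the earlier query. An unqualified table name depends on the search path, and without ONLY a parent table also returns the rows of its child tables, so exporting every table separately would emit those rows twice. A sketch of a statement builder that addresses both. The helper names are illustrative.

```python
def quote_ident(name):
    """Double-quote an identifier, doubling any embedded quotes."""
    return '"%s"' % name.replace('"', '""')


def build_copy_statement(schema, table, only=True):
    """Build COPY (SELECT * FROM [ONLY] schema.table) TO STDOUT."""
    relation = "%s.%s" % (quote_ident(schema), quote_ident(table))
    prefix = "ONLY " if only else ""
    return "COPY (SELECT * FROM %s%s) TO STDOUT" % (prefix, relation)


print(build_copy_statement("myschema", "mytable"))
```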
NoSQL?
● How about some JSON for comparison?
SELECT row_to_json(x.*) FROM mytable AS x
● Slows down the export 2-3 times.
● Not 100% equivalent to what output plugin emits.
● Have to write a C function for every plugin.
What if we wrote a generic function...
CREATE FUNCTION pg_logical_slot_stream_relation(
IN slot_name NAME,
IN relnamespace NAME,
IN relname NAME,
IN nochildren BOOL DEFAULT FALSE,
VARIADIC options TEXT[] DEFAULT '{}',
OUT data TEXT
)
RETURNS SETOF TEXT ...
The Final Code
cursor.copy_expert(
"COPY (SELECT pg_logical_slot_stream_relation(...)) TO STDOUT",
...)
● Do not use SELECT … FROM pg_logical_slot_… – it caches the result in the backend.
● Requires changes to core PostgreSQL.
● Ideally should not require a slot, only a snapshot.
● Slots cannot be used concurrently (yet).
At Last: Some Numbers
6 client processes, SSL (no compression), 1 Gbit/s network interface
Query                 | Run Time           | Volume | Notes
SELECT * FROM …       | 7.5 h              | 2.7 TB | 105 MB/s
pglogical / JSON      | 17.5 h             | 6.7 TB | 112 MB/s
pglogical / native    | 30+ h (incomplete) | 11+ TB | 106 MB/s
Bottled Water / Avro  | 13.5 h             | 5.0 TB | 108 MB/s
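The throughput figures can be recomputed from the run time and volume columns. The sketch below assumes the volumes are in binary units (TiB) and the rates in MiB/s. That is an assumption, but it reproduces the three completed rows. With decimal units the first row would come out at 100 MB/s instead.

```python
def throughput_mib_per_s(volume_tib, hours):
    """Average throughput in MiB/s for a volume in TiB moved in `hours`."""
    mib = volume_tib * 1024 * 1024
    return mib / (hours * 3600)


runs = [
    ("SELECT * FROM ...", 2.7, 7.5),
    ("pglogical / JSON", 6.7, 17.5),
    ("Bottled Water / Avro", 5.0, 13.5),
]
for name, volume, hours in runs:
    print("%-22s %.0f MiB/s" % (name, throughput_mib_per_s(volume, hours)))
```

For comparison, a 1 Gbit/s link carries at most 125 MB/s, or about 119 MiB/s, before protocol overhead, so all runs are close to line rate.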
Space for Improvement
In native protocol format pglogical_output emits metadata for every tuple.
● Metadata overhead: 5.0 TB (167%)
○ nspname + relname + attnames
● Protocol overhead: 1.5 TB (50%)
○ message type + lengths
Set plugin option relmeta_cache_size to -1.
A Common Number: ~110 MB/s
● Network is apparently the bottleneck.
● What if we enable SSL compression?
sslcompression=1?
● Nowadays this is really hard to do because of the CRIME vulnerability.
● Older distro versions: set the environment variable OPENSSL_DEFAULT_ZLIB.
● Newer distro versions: OpenSSL is compiled without zlib. Good luck!
● TLSv1.3 will remove support for compression.
● HINT: your streaming replication is likely to be running uncompressed.
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-4929
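On the client side, `sslcompression` is a libpq connection parameter, so it can be requested in the connection string. The sketch below only builds that string. The host and database names are placeholders, and values containing spaces would need quoting, which this helper does not handle. Whether compression is actually negotiated still depends on the OpenSSL builds on both ends, as the bullets above explain.

```python
def build_dsn(host, dbname, compression=True):
    """Build a libpq connection string that requests SSL compression."""
    parts = [
        "host=%s" % host,
        "dbname=%s" % dbname,
        "sslmode=require",
        "sslcompression=%d" % (1 if compression else 0),
    ]
    return " ".join(parts)


# Hypothetical host and database names.
print(build_dsn("db.example.org", "mydb"))
```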
A Much Better Picture
Compression FTW!
24 client processes, SSL (with compression)
Query                 | Run Time             | Volume               | Notes
SELECT * FROM …       | 3 h (vs. 7.5 h)      | 2.7 TB               |
pglogical / JSON      | 7-8* h (vs. 17.5 h)  | 6.7 TB               | *ordering
pglogical / native    | 8-9* h (vs. 30+ h)   | 7.2 TB (vs. 11+ TB)  |
Bottled Water / Avro  | 10 h (vs. 13.5 h)    | 5.0 TB               |
In Conclusion
● Set relmeta_cache_size when using the pglogical_output native format.
● Run a benchmark to see if you need compression.
● Order tables from largest to smallest.
● Do start consuming from the replication slot once the export is finished.
● Help needed: review the proposed streaming interface!
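The advice to order tables from largest to smallest can be illustrated with a toy scheduling model. It assumes that export time is proportional to table size and that each worker picks up the next table as soon as it is free. The sizes are made up, and the function name `makespan` is not from the talk.

```python
import heapq


def makespan(sizes, workers):
    """Finish time when each free worker takes the next table in `sizes` order."""
    finish = [0.0] * workers
    heapq.heapify(finish)
    for size in sizes:
        # The worker that becomes free first picks up the next table.
        heapq.heappush(finish, heapq.heappop(finish) + size)
    return max(finish)


sizes = [1, 1, 1, 1, 8]  # hypothetical table sizes, arbitrary units
print(makespan(sizes, workers=2))                        # smallest first: 10.0
print(makespan(sorted(sizes, reverse=True), workers=2))  # largest first: 8.0
```

Starting the largest table last leaves one worker busy with it while the others sit idle. Starting it first lets the small tables fill the remaining time.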
References
● https://github.com/zalando/saiki-tawon
● https://github.com/2ndQuadrant/postgres/tree/dev/pglogical-output
● https://github.com/confluentinc/bottledwater-pg/
● https://zalando-techmonkeys.github.io
● https://github.com/zalando/pg_view
● http://www.slideshare.net/AlexanderShulgin3/adding-replication-protocol-support-for-psycopg2
● http://www.postgresql.org/message-id/flat/CACACo5RNZ0OB8K...
Thank you!
Questions?
