Getting Started with
      PL/Proxy
          Peter Eisentraut
        peter@eisentraut.org

          F-Secure Corporation



  PostgreSQL Conference East 2011



                                    CC-BY
Concept



  • a database partitioning system implemented as a
   procedural language
  • “sharding”/horizontal partitioning
  • PostgreSQL’s No(t-only)SQL solution
Concept


          application   application        application   application



                                frontend



          partition 1   partition 2        partition 3   partition 4
Areas of Application


   • high write load
   • (high read load)
   • allow for some “eventual consistency”
   • have reasonable partitioning keys
   • use/plan to use server-side functions
Example
 Have (dellstore2 example database):
 CREATE TABLE products (
     prod_id serial PRIMARY KEY,
     category integer NOT NULL,
     title varchar(50) NOT NULL,
     actor varchar(50) NOT NULL,
     price numeric(12,2) NOT NULL,
     special smallint,
     common_prod_id integer NOT NULL
 );

 INSERT INTO products VALUES (...);
 UPDATE products SET ... WHERE ...;
 DELETE FROM products WHERE ...;
 plus various queries
Installation



   • Download: http://plproxy.projects.postgresql.org,
     Deb, RPM, ...
   • Create language: psql -d dellstore2 -f
     ...../plproxy.sql
Backend Functions I
  CREATE FUNCTION insert_product(p_category int,
       p_title varchar, p_actor varchar,
       p_price numeric, p_special smallint,
       p_common_prod_id int) RETURNS int
  LANGUAGE plpgsql
  AS $$
  DECLARE
        cnt int;
  BEGIN
        INSERT INTO products (category, title,
           actor, price, special, common_prod_id)
           VALUES (p_category, p_title, p_actor,
           p_price, p_special, p_common_prod_id);
        GET DIAGNOSTICS cnt = ROW_COUNT;
        RETURN cnt;
  END;
  $$;
Backend Functions II
  CREATE FUNCTION update_product_price(p_prod_id int,
       p_price numeric) RETURNS int
  LANGUAGE plpgsql
  AS $$
  DECLARE
        cnt int;
  BEGIN
        UPDATE products SET price = p_price
            WHERE prod_id = p_prod_id;
        GET DIAGNOSTICS cnt = ROW_COUNT;
        RETURN cnt;
  END;
  $$;
Backend Functions III

  CREATE FUNCTION delete_product_by_title(p_title
       varchar) RETURNS int
  LANGUAGE plpgsql
  AS $$
  DECLARE
        cnt int;
  BEGIN
        DELETE FROM products WHERE title = p_title;
        GET DIAGNOSTICS cnt = ROW_COUNT;
        RETURN cnt;
  END;
  $$;
Frontend Functions I
  CREATE FUNCTION insert_product(p_category int,
       p_title varchar, p_actor varchar,
       p_price numeric, p_special smallint,
       p_common_prod_id int) RETURNS SETOF int
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON hashtext(p_title);
  $$;

  CREATE FUNCTION update_product_price(p_prod_id int,
       p_price numeric) RETURNS SETOF int
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON ALL;
  $$;
Frontend Functions II


  CREATE FUNCTION delete_product_by_title(p_title
       varchar) RETURNS int
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON hashtext(p_title);
  $$;
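
Calling the frontend functions from an application might look like the sketch below. The argument values are hypothetical sample data; only the function signatures come from the slides. Note the explicit cast for the smallint parameter, since an integer literal is not implicitly narrowed.

```sql
-- Hypothetical sample values; run these on the frontend database.
SELECT insert_product(1, 'SAMPLE TITLE', 'SAMPLE ACTOR', 9.99, 0::smallint, 1);

-- Routed by the same hash of the title, so it reaches the partition
-- that received the insert above.
SELECT delete_product_by_title('SAMPLE TITLE');

-- RUN ON ALL: one result row per partition, most of them reporting 0 rows.
SELECT * FROM update_product_price(42, 12.99);
```

The update has to go to all partitions because prod_id is not the hash key, so the frontend cannot tell which partition holds the row.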
Frontend Query Functions I


  CREATE FUNCTION get_product_price(p_prod_id int)
       RETURNS SETOF numeric
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON ALL;
  SELECT price FROM products WHERE prod_id =
       p_prod_id;
  $$;
Frontend Query Functions II

  CREATE FUNCTION
       get_products_by_category(p_category int)
       RETURNS SETOF products
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON ALL;
  SELECT * FROM products WHERE category =
       p_category;
  $$;
Unpartitioned Small Tables


  CREATE FUNCTION insert_category(p_categoryname varchar)
       RETURNS SETOF int
  LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON 0;
  $$;
Which Hash Key?



   • natural keys (names, descriptions, UUIDs)
   • not serials (Consider using fewer “ID” fields.)
   • single columns
   • group sensibly to allow joins on backend
Set Basic Parameters

   • number of partitions (2^n), e.g. 8
   • host names, e.g.
       • frontend: dbfe
       • backends: dbbe1, ..., dbbe8 (or start at 0?)
   • database names, e.g.
       • frontend: dellstore2
       • backends: store01, ..., store08 (or start at 0?)
   • user names, e.g. storeapp
   • hardware:
       • frontend: lots of memory, normal disk
       • backends: full-sized database server
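
The partition count is a power of two because PL/Proxy selects the partition from the low bits of the hash value, that is, the hash ANDed with the partition count minus one. The queries below are a minimal sketch of that mapping for 8 partitions, assuming they are run against a database that still holds the complete, unpartitioned products table.

```sql
-- With 8 partitions, the partition index is the low 3 bits of the hash.
SELECT title,
       hashtext(title::text) & 7 AS partition_no   -- 7 = 8 - 1
FROM products
ORDER BY partition_no, title
LIMIT 20;

-- How evenly would the existing rows spread over the 8 partitions?
SELECT hashtext(title::text) & 7 AS partition_no, count(*) AS row_count
FROM products
GROUP BY 1
ORDER BY 1;
```

Partition indexes computed this way start at 0, which is one argument for numbering hosts and databases from 0 as well.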
Configuration
  CREATE FUNCTION
     plproxy.get_cluster_partitions(cluster_name text)
     RETURNS SETOF text LANGUAGE plpgsql AS
     $$ ... $$;

  CREATE FUNCTION
     plproxy.get_cluster_version(cluster_name text)
     RETURNS int LANGUAGE plpgsql AS
     $$ ... $$;

  CREATE FUNCTION plproxy.get_cluster_config(IN
     cluster_name text, OUT key text, OUT val text)
     RETURNS SETOF record LANGUAGE plpgsql
     AS $$ ... $$;
get_cluster_partitions
  Simplistic approach:
  CREATE FUNCTION
       plproxy.get_cluster_partitions(cluster_name text)
       RETURNS SETOF text
  LANGUAGE plpgsql
  AS $$
  BEGIN
        IF cluster_name = 'dellstore_cluster' THEN
             RETURN NEXT 'dbname=store01 host=dbbe1';
             RETURN NEXT 'dbname=store02 host=dbbe2';
             ...
             RETURN NEXT 'dbname=store08 host=dbbe8';
             RETURN;
        END IF;
        RAISE EXCEPTION 'Unknown cluster';
  END;
  $$;
get_cluster_version
  Simplistic approach:
  CREATE FUNCTION
      plproxy.get_cluster_version(cluster_name text)
      RETURNS int
  LANGUAGE plpgsql
  AS $$
  BEGIN
        IF cluster_name = 'dellstore_cluster' THEN
            RETURN 1;
        END IF;
        RAISE EXCEPTION 'Unknown cluster';
  END;
  $$;
get_cluster_config
  CREATE OR REPLACE FUNCTION
       plproxy.get_cluster_config(IN cluster_name
       text, OUT key text, OUT val text) RETURNS
       SETOF record
  LANGUAGE plpgsql
  AS $$
  BEGIN
        -- same config for all clusters
        key := 'connection_lifetime';
        val := 30*60; -- 30 minutes, in seconds
        RETURN NEXT;
        RETURN;
  END;
  $$;
Table-Driven Configuration I
  CREATE TABLE plproxy.partitions (
      cluster_name text NOT NULL,
      host text NOT NULL,
      port text NOT NULL,
      dbname text NOT NULL,
      PRIMARY KEY (cluster_name, dbname)
  );

  INSERT INTO plproxy.partitions VALUES
  ('dellstore_cluster', 'dbbe1', '5432', 'store01'),
  ('dellstore_cluster', 'dbbe2', '5432', 'store02'),
  ...
  ('dellstore_cluster', 'dbbe8', '5432', 'store08');
Table-Driven Configuration II

  CREATE TABLE plproxy.cluster_users (
      cluster_name text NOT NULL,
      remote_user text NOT NULL,
      local_user text NOT NULL,
      PRIMARY KEY (cluster_name, remote_user,
         local_user)
  );

  INSERT INTO plproxy.cluster_users VALUES
  ('dellstore_cluster', 'storeapp', 'storeapp');
Table-Driven Configuration III
  CREATE TABLE plproxy.remote_passwords (
      host text NOT NULL,
      port text NOT NULL,
      dbname text NOT NULL,
      remote_user text NOT NULL,
      password text,
      PRIMARY KEY (host, port, dbname,
         remote_user)
  );

  INSERT INTO plproxy.remote_passwords VALUES
  ('dbbe1', '5432', 'store01', 'storeapp',
       'Thu1Ued0'),
  ...

  -- or use .pgpass?
Table-Driven Configuration IV

  CREATE TABLE plproxy.cluster_version (
      id int PRIMARY KEY
  );

  INSERT INTO plproxy.cluster_version VALUES (1);

  GRANT SELECT ON plproxy.cluster_version TO
     PUBLIC;

  /* extra credit: write trigger that changes the
     version when one of the other tables changes
     */
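
One possible answer to the extra-credit exercise is sketched below. It assumes the three configuration tables from the preceding slides and a single-row plproxy.cluster_version table; the function and trigger names are hypothetical.

```sql
-- Bump the version once per modifying statement on a configuration table.
CREATE FUNCTION plproxy.bump_cluster_version() RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    UPDATE plproxy.cluster_version SET id = id + 1;
    RETURN NULL;  -- return value is ignored for AFTER statement-level triggers
END;
$$;

CREATE TRIGGER partitions_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.partitions
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();

CREATE TRIGGER cluster_users_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.cluster_users
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();

CREATE TRIGGER remote_passwords_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.remote_passwords
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();
```

Statement-level triggers are used so that a multi-row change bumps the version only once.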
Table-Driven Configuration V
  CREATE OR REPLACE FUNCTION plproxy.get_cluster_partitions(p_cluster_name text)
         RETURNS SETOF text
  LANGUAGE plpgsql
  SECURITY DEFINER
  AS $$
  DECLARE
        r record;
  BEGIN
        FOR r IN
             SELECT 'host=' || host || ' port=' || port || ' dbname=' || dbname ||
                   ' user=' || remote_user || ' password=' || password AS dsn
             FROM plproxy.partitions NATURAL JOIN plproxy.cluster_users NATURAL JOIN
                   plproxy.remote_passwords
             WHERE cluster_name = p_cluster_name
             AND local_user = session_user
             ORDER BY dbname      -- important: fixes the partition order
        LOOP
             RETURN NEXT r.dsn;
        END LOOP;
        IF NOT found THEN
             RAISE EXCEPTION 'no such cluster: %', p_cluster_name;
        END IF;
        RETURN;
  END;
  $$;
Table-Driven Configuration VI
  CREATE FUNCTION
       plproxy.get_cluster_version(p_cluster_name text)
       RETURNS int
  LANGUAGE plpgsql
  AS $$
  DECLARE
        ret int;
  BEGIN
        SELECT INTO ret id FROM
            plproxy.cluster_version;
        RETURN ret;
  END;
  $$;
SQL/MED Configuration
 CREATE SERVER dellstore_cluster FOREIGN DATA
    WRAPPER plproxy
 OPTIONS (
     connection_lifetime '1800',
     p0 'dbname=store01 host=dbbe1',
     p1 'dbname=store02 host=dbbe2',
     ...
     p7 'dbname=store08 host=dbbe8'
 );

 CREATE USER MAPPING FOR storeapp SERVER
    dellstore_cluster
       OPTIONS (user 'storeapp', password
          'sekret');

 GRANT USAGE ON SERVER dellstore_cluster TO
    storeapp;
Hash Functions


  RUN ON hashtext(somecolumn);

    • want a fast, uniform hash function
    • typically use hashtext
    • problem: implementation might change
    • possible solution: https://github.com/petere/pgvihash
Sequences


 shard 1:
 ALTER SEQUENCE products_prod_id_seq MINVALUE 1
    MAXVALUE 100000000 START 1;
 shard 2:
 ALTER SEQUENCE products_prod_id_seq MINVALUE
    100000001 MAXVALUE 200000000 START 100000001
    RESTART WITH 100000001;  -- START alone does not move the current value
 etc.
Aggregates
 Example: count all products
 Backend:
 CREATE FUNCTION count_products() RETURNS bigint
    LANGUAGE SQL STABLE AS $$SELECT count(*)
    FROM products$$;
 Frontend:
 CREATE FUNCTION count_products() RETURNS SETOF
      bigint LANGUAGE plproxy AS $$
 CLUSTER 'dellstore_cluster';
 RUN ON ALL;
 $$;

 SELECT sum(x) AS count FROM count_products() AS
    t(x);
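
Counts can simply be summed, but an average cannot be computed by averaging per-shard averages when the shards differ in size. The sketch below has each backend return a partial sum and a row count, and the frontend caller combines them. The function name price_stats is hypothetical, and the sketch assumes PL/Proxy passes OUT parameters through to the backend function of the same name.

```sql
-- Backend (on every partition): partial sum and row count.
CREATE FUNCTION price_stats(OUT total numeric, OUT n bigint)
    RETURNS record LANGUAGE SQL STABLE AS $$
    SELECT sum(price), count(*) FROM products
$$;

-- Frontend: collect one (total, n) row per partition.
CREATE FUNCTION price_stats(OUT total numeric, OUT n bigint)
    RETURNS SETOF record LANGUAGE plproxy AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
$$;

-- Combine the partial results; NULLIF avoids division by zero on empty tables.
SELECT sum(total) / NULLIF(sum(n), 0) AS avg_price FROM price_stats();
```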
Dynamic Queries I
 a.k.a. “cheating” ;-)
  Frontend:
  CREATE FUNCTION execute_query(sql text) RETURNS
       SETOF RECORD LANGUAGE plproxy
  AS $$
  CLUSTER 'dellstore_cluster';
  RUN ON ALL;
  $$;

  Backend:
  CREATE FUNCTION execute_query(sql text) RETURNS
       SETOF RECORD LANGUAGE plpgsql
  AS $$
  BEGIN
        RETURN QUERY EXECUTE sql;
  END;
  $$;
Dynamic Queries II

  SELECT * FROM execute_query('SELECT title,
     price FROM products') AS (title varchar,
     price numeric);

  SELECT category, sum(sum_price) FROM
     execute_query('SELECT category, sum(price)
     FROM products GROUP BY category') AS
     (category int, sum_price numeric) GROUP BY
     category;
Repartitioning

   • changing partitioning key is extremely cumbersome
   • adding partitions is somewhat cumbersome, e.g., to split
    shard 0:
     COPY (SELECT * FROM products WHERE
        (hashtext(title::text) & 15) <> 0) TO
        'somewhere';
     DELETE FROM products WHERE
        (hashtext(title::text) & 15) <> 0;
    Better start out with enough partitions!
PgBouncer

          application   application        application   application



                                frontend



          PgBouncer     PgBouncer          PgBouncer     PgBouncer



          partition 1   partition 2        partition 3   partition 4




 Use
 pool_mode = statement
Development Issues



   • foreign keys
   • notifications
   • hash key check constraints
   • testing (pgTAP), no validator
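
A hash key check constraint might look like the following sketch, which guards against rows landing on the wrong backend. It assumes 8 partitions with title as the hash key and is written for the backend that serves partition 0; the constraint name is hypothetical, and every other backend would compare against its own partition index.

```sql
-- On the backend holding partition 0 of 8:
ALTER TABLE products
    ADD CONSTRAINT products_hash_key_check
    CHECK ((hashtext(title::text) & 7) = 0);
```

The constraint has to be dropped and recreated with the new mask whenever the number of partitions changes.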
Administration


   • centralized logging
   • distributed shell (dsh)
   • query canceling/timeouts
   • access control, firewalling
   • deployment
High Availability

  Frontend:
    • multiple frontends (DNS, load balancer?)
    • replicate partition configuration (Slony, Bucardo, WAL)
    • Heartbeat, UCARP, etc.
  Backend:
    • replicate backend shards individually (Slony, WAL, DRBD)
    • use partition configuration to configure load spreading or
      failover
Advanced Topics

   • generic insert, update, delete functions
   • frontend joins
   • backend joins
   • finding balance between function interface and dynamic
    queries
   • arrays, SPLIT BY
   • use for remote database calls
   • cross-shard calls
   • SQL/MED (foreign table) integration
The End
