Federated PostgreSQL
Who Am I?
● Jim Mlodgenski
  – jimm@openscg.com
  – @jim_mlodgenski
● Co-organizer of
  – NYC PUG (www.nycpug.org)
  – Philly PUG (www.phlpug.org)
● CTO, OpenSCG
  – www.openscg.com
http://nyc.pgconf.us
What is a federated database?
“A federated database system is a type of meta-database
management system (DBMS), which transparently maps
multiple autonomous database systems into a single federated
database. The constituent databases are interconnected via a
computer network and may be geographically decentralized. ...
There is no actual data integration in the constituent disparate
databases as a result of data federation.”
-Wikipedia
How does PostgreSQL do it?
● Uses Foreign Data Wrappers (FDW)
● Used with SQL/MED
  – Management of External Data
  – New ANSI SQL:2003 extension
  – Standard way of handling remote objects in SQL databases
● Wrappers are used by SQL/MED to access remote data sources
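
One way to see the SQL/MED objects (wrappers, servers, user mappings, foreign tables) on a running server is to query the catalogs. This is a minimal sketch that only reads standard system catalogs and the information schema; it assumes at least one wrapper is already installed.

```sql
-- Installed foreign data wrappers
SELECT fdwname FROM pg_foreign_data_wrapper;

-- Foreign servers and the wrapper each one uses
SELECT s.srvname, w.fdwname, s.srvoptions
FROM pg_foreign_server s
JOIN pg_foreign_data_wrapper w ON w.oid = s.srvfdw;

-- User mappings (options are hidden from unprivileged users)
SELECT srvname, usename, umoptions FROM pg_user_mappings;

-- Foreign tables and the server behind each one
SELECT foreign_table_schema, foreign_table_name, foreign_server_name
FROM information_schema.foreign_tables;
```

In psql, the \dew, \des, \deu and \det commands show the same objects.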
Types of Foreign Data Wrappers
● SQL
● NoSQL
● File
● Miscellaneous
● PostgreSQL
SQL Wrappers
● Oracle
● SQLite
● MySQL
● JDBC
● Informix
● ODBC
● Firebird
SQL Wrappers
CREATE SERVER oracle_server FOREIGN DATA WRAPPER oracle_fdw
    OPTIONS (dbserver 'ORACLE_DBNAME');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER oracle_server
    OPTIONS (user 'scott', password 'tiger');

CREATE FOREIGN TABLE fdw_test (
    userid   numeric,
    username text,
    email    text
)
SERVER oracle_server
OPTIONS (schema 'scott', table 'fdw_test');

postgres=# select * from fdw_test;
 userid | username |      email
--------+----------+------------------
      1 | scott    | scott@oracle.com
(1 row)
NoSQL Wrappers
● MongoDB
● Redis
● CouchDB
● Neo4j
● MonetDB
● Tycoon
NoSQL Wrappers
CREATE SERVER mongo_server FOREIGN DATA WRAPPER
mongo_fdw OPTIONS (address '192.168.122.47', port '27017');
CREATE FOREIGN TABLE databases (
_id NAME,
name TEXT
)
SERVER mongo_server
OPTIONS (database 'mydb', collection 'pgData');
test=# select * from databases ;
           _id            |    name
--------------------------+------------
 52fd49bfba3ae4ea54afc459 | mongo
 52fd49bfba3ae4ea54afc45a | postgresql
 52fd49bfba3ae4ea54afc45b | oracle
 52fd49bfba3ae4ea54afc45c | mysql
 52fd49bfba3ae4ea54afc45d | redis
 52fd49bfba3ae4ea54afc45e | db2
(6 rows)
File Wrappers
● Delimited files
● Fixed length files
● JSON files
File Wrappers
CREATE SERVER pg_load FOREIGN DATA WRAPPER file_fdw;
CREATE FOREIGN TABLE leads (
first_name text, last_name text,
company_name text, address text,
city text, county text,
state text, zip text,
phone1 text, phone2 text,
email text, web text
) SERVER pg_load
OPTIONS ( filename '/tmp/us-500.csv', format 'csv', header 'TRUE' );
test=# select first_name || ' ' || last_name as full_name, email from leads limit 3;
     full_name     |             email
-------------------+-------------------------------
 James Butt        | jbutt@gmail.com
 Josephine Darakjy | josephine_darakjy@darakjy.org
 Art Venere        | art@venere.org
(3 rows)
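
The CSV example above relies on file_fdw already being enabled in the database. A minimal sketch of that step, plus a second table over a pipe-delimited file with no header row; the server name, table name, column list, and file path here are hypothetical and not from the original slides.

```sql
-- file_fdw ships with PostgreSQL but must be enabled per database
CREATE EXTENSION IF NOT EXISTS file_fdw;

-- Hypothetical server name, kept separate from pg_load above
CREATE SERVER pg_load_files FOREIGN DATA WRAPPER file_fdw;

-- Hypothetical pipe-delimited file; empty fields are read as NULL
CREATE FOREIGN TABLE leads_pipe (
    first_name text,
    last_name  text,
    email      text
) SERVER pg_load_files
OPTIONS (filename '/tmp/leads.txt', format 'text', delimiter '|', null '');
```

The file is read on every query, so edits to the file show up without reloading anything.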
Miscellaneous Wrappers
● Hadoop
● LDAP
● S3
● WWW
● PG-Strom
Hadoop Wrapper
CREATE SERVER hive_server FOREIGN DATA WRAPPER
hive_fdw OPTIONS (address '127.0.0.1', port '10000');
CREATE USER MAPPING FOR PUBLIC SERVER hive_server;

CREATE FOREIGN TABLE order_line (
    ol_w_id        integer,
    ol_d_id        integer,
    ol_o_id        integer,
    ol_number      integer,
    ol_i_id        integer,
    ol_delivery_d  timestamp,
    ol_amount      decimal(6,2),
    ol_supply_w_id integer,
    ol_quantity    decimal(2,0),
    ol_dist_info   varchar(24)
) SERVER hive_server OPTIONS (table 'order_line');

INSERT INTO item_sale_month
SELECT ol_i_id as i_id,
EXTRACT(YEAR FROM ol_delivery_d) as year,
EXTRACT(MONTH FROM ol_delivery_d) as month,
sum(ol_amount) as amount
FROM order_line
GROUP BY 1, 2, 3;
Hadoop Wrapper
● Hadoop foreign tables can also be writable
CREATE FOREIGN TABLE audit (
    audit_id bigint,
    event_d  timestamp,
    "table"  varchar,
    action   varchar,
    "user"   varchar
) SERVER hive_server
OPTIONS (table 'audit',
         flume_port '44444');
INSERT INTO audit
VALUES (nextval('audit_id_seq'), now(), 'users', 'SELECT', 'scott');
Hadoop Wrapper
● It also works with HBase tables
CREATE FOREIGN TABLE hive_hbase_table (
    key   varchar,
    value varchar
) SERVER localhive
OPTIONS (table 'hbase_table', hbase_address 'localhost',
hbase_port '9090', hbase_mapping ':key,cf:val');
INSERT INTO hive_hbase_table VALUES ('key1', 'value1');
INSERT INTO hive_hbase_table VALUES ('key2', 'value2');
UPDATE hive_hbase_table SET value = 'update' WHERE key = 'key2';
DELETE FROM hive_hbase_table WHERE key='key1';
SELECT * from hive_hbase_table;
WWW Wrapper
CREATE SERVER www_fdw_server_google_search FOREIGN DATA WRAPPER www_fdw
OPTIONS (uri 'https://ajax.googleapis.com/ajax/services/search/web?v=1.0');
CREATE USER MAPPING FOR current_user SERVER www_fdw_server_google_search;
CREATE FOREIGN TABLE www_fdw_google_search (
q text, GsearchResultClass text, unescapedUrl text, url text,
visibleUrl text, cacheUrl text, title text, titleNoFormatting text, content text
) SERVER www_fdw_server_google_search;
select url,substring(title,1,25)||'...',substring(content,1,25)||'...'
from www_fdw_google_search where q='postgresql fdw';
                             url                             |           ?column?           |           ?column?
-------------------------------------------------------------+------------------------------+------------------------------
 http://wiki.postgresql.org/wiki/Foreign_data_wrappers       | Foreign data wrappers - <... | Jan 24, 2014 <b>...</b> 1...
 http://www.postgresql.org/docs/9.3/static/postgres-fdw.html | <b>PostgreSQL</b>: Docume... | F.31.1. <b>FDW</b> Option...
 http://www.postgresql.org/docs/9.3/static/fdwhandler.html   | <b>PostgreSQL</b>: Docume... | Foreign Data Wrapper Call...
 http://www.craigkerstiens.com/2013/08/05/a-look-at-FDWs/    | A look at Foreign Data Wr... | Aug 5, 2013 <b>...</b> An...
(4 rows)
PostgreSQL Wrapper
● The most functional FDW by far
● Replaces much of the functionality of dblink
● Shipped as a contrib module
PostgreSQL Wrapper
CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', port '5432', dbname 'test2');
CREATE USER MAPPING FOR PUBLIC SERVER postgres_server;
CREATE FOREIGN TABLE bird_strikes (
aircraft_type varchar, airport varchar, altitude varchar, aircraft_model varchar,
num_wildlife_struck varchar, impact_to_flight varchar, effect varchar,
location varchar, flight_num varchar, flight_date timestamp,
record_id int, indicated_damage varchar, freeform_en_route varchar, num_engines varchar,
airline varchar, origin_state varchar, phase_of_flight varchar, precipitation varchar,
wildlife_collected boolean, wildlife_sent_to_smithsonian boolean, remarks varchar,
reported_date timestamp, wildlife_size varchar, sky_conditions varchar, wildlife_species varchar,
when_time_hhmm varchar, time_of_day varchar, pilot_warned varchar,
cost_out_of_service varchar, cost_other varchar, cost_repair varchar, cost_total varchar,
miles_from_airport varchar, feet_above_ground varchar, num_human_fatalities integer,
num_injured integer, speed_knots varchar
) SERVER postgres_server OPTIONS (table_name 'bird_strikes');
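
The setup above assumes the postgres_fdw contrib module is already installed in the local database. A minimal sketch of that step, plus a per-user mapping that carries credentials; the role name and password are placeholders, not values from the original deck.

```sql
-- Install the contrib module in the local database (once per database)
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Confirm it is present
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'postgres_fdw';

-- Hypothetical credentials for the current local user;
-- a user-specific mapping is used in preference to the PUBLIC one
CREATE USER MAPPING FOR CURRENT_USER
    SERVER postgres_server
    OPTIONS (user 'app_reader', password 'secret');
```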
PostgreSQL Wrapper
● Only requests columns that are needed
test=# explain verbose select airport, flight_date from bird_strikes;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Foreign Scan on public.bird_strikes  (cost=100.00..148.40 rows=1280 width=40)
   Output: airport, flight_date
   Remote SQL: SELECT airport, flight_date FROM public.bird_strikes
(3 rows)
PostgreSQL Wrapper
● Sends a WHERE clause
test=# explain verbose select airport, flight_date from
bird_strikes where flight_date > '2011-01-01';
                                  QUERY PLAN
------------------------------------------------------------------------------
 Foreign Scan on public.bird_strikes  (cost=100.00..134.54 rows=427 width=40)
   Output: airport, flight_date
   Remote SQL: SELECT airport, flight_date FROM public.bird_strikes WHERE ((flight_date > '2011-01-01 00:00:00'::timestamp without time zone))
(3 rows)
PostgreSQL Wrapper
● Sends built-in immutable functions
test=# explain verbose select airport, flight_date from bird_strikes where flight_date
> '2011-01-01' and length(airport) < 10;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Foreign Scan on public.bird_strikes  (cost=100.00..135.24 rows=142 width=40)
   Output: airport, flight_date
   Remote SQL: SELECT airport, flight_date FROM public.bird_strikes WHERE ((flight_date > '2011-01-01 00:00:00'::timestamp without time zone)) AND ((length(airport) < 10))
(3 rows)
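
For contrast, a user-defined function is a case the wrapper would not be expected to ship, since the remote server may not have it. The sketch below uses a hypothetical PL/pgSQL function (PL/pgSQL so the planner cannot inline it into a plain length() call) and leaves the plan for the reader to inspect.

```sql
-- Hypothetical helper, not part of the original deck
CREATE FUNCTION short_airport_name(a varchar) RETURNS boolean
LANGUAGE plpgsql IMMUTABLE AS $$
BEGIN
    RETURN length(a) < 10;
END
$$;

-- Compare the Remote SQL line with the built-in length() example above
EXPLAIN VERBOSE
SELECT airport, flight_date
FROM bird_strikes
WHERE flight_date > '2011-01-01'
  AND short_airport_name(airport);
```

The date comparison should still appear in the Remote SQL, while the function call would be expected to show up as a local Filter on the Foreign Scan.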
PostgreSQL Wrapper
● Writable (INSERT, UPDATE, DELETE)
test=# explain verbose update bird_strikes set airport = 'Unknown' where record_id = 313339;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Update on public.bird_strikes  (cost=100.00..111.05 rows=1 width=964)
   Remote SQL: UPDATE public.bird_strikes SET airport = $2 WHERE ctid = $1
   ->  Foreign Scan on public.bird_strikes  (cost=100.00..111.05 rows=1 width=964)
         Output: aircraft_type, 'Unknown'::character varying, altitude, aircraft_model, num_wildlife_struck, impact_to_flight, effect, location, flight_num, flight_date, record_id, indicated_damage, freeform_en_route, num_engines, airline, origin_state, phase_of_flight, precipitation, wildlife_collected, wildlife_sent_to_smithsonian, remarks, reported_date, wildlife_size, sky_conditions, wildlife_species, when_time_hhmm, time_of_day, pilot_warned, cost_out_of_service, cost_other, cost_repair, cost_total, miles_from_airport, feet_above_ground, num_human_fatalities, num_injured, speed_knots, ctid
         Remote SQL: SELECT aircraft_type, altitude, aircraft_model, num_wildlife_struck, impact_to_flight, effect, location, flight_num, flight_date, record_id, indicated_damage, freeform_en_route, num_engines, airline, origin_state, phase_of_flight, precipitation, wildlife_collected, wildlife_sent_to_smithsonian, remarks, reported_date, wildlife_size, sky_conditions, wildlife_species, when_time_hhmm, time_of_day, pilot_warned, cost_out_of_service, cost_other, cost_repair, cost_total, miles_from_airport, feet_above_ground, num_human_fatalities, num_injured, speed_knots, ctid FROM public.bird_strikes WHERE ((record_id = 313339)) FOR UPDATE
(5 rows)
PostgreSQL Wrapper
● Writes are transactional
test=# select airport from bird_strikes where record_id = 313339;
 airport
---------
 Unknown
(1 row)

test=# BEGIN;
BEGIN
test=# update bird_strikes set airport = 'UNKNOWN' where record_id = 313339;
UPDATE 1
test=# ROLLBACK;
ROLLBACK
test=# select airport from bird_strikes where record_id = 313339;
 airport
---------
 Unknown
(1 row)
Limitations
● Aggregates are not pushed down
test=# explain verbose select count(*) from bird_strikes;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=220.92..220.93 rows=1 width=0)
   Output: count(*)
   ->  Foreign Scan on public.bird_strikes  (cost=100.00..212.39 rows=3413 width=0)
         Output: aircraft_type, airport, altitude, aircraft_model, num_wildlife_struck, impact_to_flight, effect, location, flight_num, flight_date, record_id, indicated_damage, freeform_en_route, num_engines, airline, origin_state, phase_of_flight, precipitation, wildlife_collected, wildlife_sent_to_smithsonian, remarks, reported_date, wildlife_size, sky_conditions, wildlife_species, when_time_hhmm, time_of_day, pilot_warned, cost_out_of_service, cost_other, cost_repair, cost_total, miles_from_airport, feet_above_ground, num_human_fatalities, num_injured, speed_knots
         Remote SQL: SELECT NULL FROM public.bird_strikes
(5 rows)
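
One common workaround on servers that behave as shown above is to do the aggregation in a view on the remote side and point a foreign table at that view. This is a sketch under that assumption; the view name and its columns are hypothetical. Later PostgreSQL releases (10 and up) let postgres_fdw push many aggregates down by itself.

```sql
-- On the remote database (test2): aggregate where the data lives
CREATE VIEW bird_strikes_by_airport AS
SELECT airport, count(*) AS strikes
FROM bird_strikes
GROUP BY airport;

-- On the local database: expose the remote view as a foreign table
CREATE FOREIGN TABLE bird_strikes_by_airport (
    airport varchar,
    strikes bigint
) SERVER postgres_server
OPTIONS (table_name 'bird_strikes_by_airport');

-- Only one row per airport crosses the network
SELECT * FROM bird_strikes_by_airport ORDER BY strikes DESC LIMIT 5;
```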
Limitations
● ORDER BY, GROUP BY, LIMIT not pushed down
test=# explain verbose select flight_num from bird_strikes order by flight_date limit 5;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=169.66..169.67 rows=5 width=40)
   Output: flight_num, flight_date
   ->  Sort  (cost=169.66..172.86 rows=1280 width=40)
         Output: flight_num, flight_date
         Sort Key: bird_strikes.flight_date
         ->  Foreign Scan on public.bird_strikes  (cost=100.00..148.40 rows=1280 width=40)
               Output: flight_num, flight_date
               Remote SQL: SELECT flight_num, flight_date FROM public.bird_strikes
(8 rows)
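
When the sort and limit really need to run remotely, dblink can still send the whole statement as-is. A minimal sketch, assuming the dblink contrib module is available and reusing the connection settings from the postgres_server definition; authentication details are left out.

```sql
CREATE EXTENSION IF NOT EXISTS dblink;

-- The remote server sorts and limits; only 5 rows come back
SELECT *
FROM dblink('host=localhost port=5432 dbname=test2',
            'SELECT flight_num, flight_date
               FROM bird_strikes
              ORDER BY flight_date
              LIMIT 5')
     AS t(flight_num varchar, flight_date timestamp);
```

The trade-off is that the column list has to be spelled out in the AS clause, and the planner knows nothing about the remote query.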
Limitations
● Joins not pushed down
test=# explain verbose select s.name, b.flight_date
test-# from bird_strikes b, state_code s
test-# where b.location = s.abbreviation and flight_date > '2011-01-01';
                                  QUERY PLAN
------------------------------------------------------------------------------
 Hash Join  (cost=239.88..349.95 rows=1986 width=40)
   Output: s.name, b.flight_date
   Hash Cond: ((s.abbreviation)::text = (b.location)::text)
   ->  Foreign Scan on public.state_code s  (cost=100.00..137.90 rows=930 width=64)
         Output: s.id, s.name, s.abbreviation, s.country, s.type, s.sort, s.status, s.occupied, s.notes, s.fips_state, s.assoc_press, s.standard_federal_region, s.census_region, s.census_region_name, s.census_division, s.census_devision_name, s.circuit_court
         Remote SQL: SELECT name, abbreviation FROM public.state_code
   ->  Hash  (cost=134.54..134.54 rows=427 width=40)
         Output: b.flight_date, b.location
         ->  Foreign Scan on public.bird_strikes b  (cost=100.00..134.54 rows=427 width=40)
               Output: b.flight_date, b.location
               Remote SQL: SELECT location, flight_date FROM public.bird_strikes WHERE ((flight_date > '2011-01-01 00:00:00'::timestamp without time zone))
(11 rows)
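
Since the join runs locally here, the planner's row estimates for each foreign scan matter. A sketch of two ways to improve them with postgres_fdw: asking the remote server for estimates at plan time, or collecting local statistics with ANALYZE. Neither pushes the join down; PostgreSQL 9.6 and later can do that for tables on the same foreign server.

```sql
-- Ask the remote server for row estimates during planning
ALTER SERVER postgres_server OPTIONS (ADD use_remote_estimate 'true');

-- Alternatively, sample the remote tables and keep statistics locally
ANALYZE bird_strikes;
ANALYZE state_code;

-- Re-check the plan and its row estimates
EXPLAIN VERBOSE
SELECT s.name, b.flight_date
FROM bird_strikes b
JOIN state_code s ON b.location = s.abbreviation
WHERE b.flight_date > '2011-01-01';
```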
Limitations (Gotcha)
● Sometimes the foreign tables don't act like tables
test=# SELECT l.*, w.lat, w.lng
FROM leads l, www_fdw_geocoder_google w
WHERE w.address = l.address || ',' || l.city || ',' || l.state;
 first_name | last_name | company_name | address | city | county | state | zip | phone1 | phone2 | email | web | lat | lng
------------+-----------+--------------+---------+------+--------+-------+-----+--------+--------+-------+-----+-----+-----
(0 rows)
Limitations (Gotcha)
                                  QUERY PLAN
------------------------------------------------------------------------------
 Merge Join  (cost=187.47..215.47 rows=1000 width=448)
   Output: l.first_name, l.last_name, l.company_name, l.address, l.city, l.county, l.state, l.zip, l.phone1, l.phone2, l.email, l.web, w.lat, w.lng
   Merge Cond: ((((((l.address || ','::text) || l.city) || ','::text) || l.state)) = w.address)
   ->  Sort  (cost=37.64..38.14 rows=200 width=384)
         Output: l.first_name, l.last_name, l.company_name, l.address, l.city, l.county, l.state, l.zip, l.phone1, l.phone2, l.email, l.web, (((((l.address || ','::text) || l.city) || ','::text) || l.state))
         Sort Key: (((((l.address || ','::text) || l.city) || ','::text) || l.state))
         ->  Foreign Scan on public.leads l  (cost=0.00..30.00 rows=200 width=384)
               Output: l.first_name, l.last_name, l.company_name, l.address, l.city, l.county, l.state, l.zip, l.phone1, l.phone2, l.email, l.web, ((((l.address || ','::text) || l.city) || ','::text) || l.state)
               Foreign File: /tmp/us-500.csv
               Foreign File Size: 81485
   ->  Sort  (cost=149.83..152.33 rows=1000 width=96)
         Output: w.lat, w.lng, w.address
         Sort Key: w.address
         ->  Foreign Scan on public.www_fdw_geocoder_google w  (cost=0.00..100.00 rows=1000 width=96)
               Output: w.lat, w.lng, w.address
               WWW API: Request
(16 rows)
Limitations (Gotcha)
CREATE OR REPLACE FUNCTION google_geocode(
    OUT first_name text, OUT last_name text, OUT company_name text,
    OUT address text, OUT city text, OUT county text,
    OUT state text, OUT zip text, OUT phone1 text, OUT phone2 text,
    OUT email text, OUT web text, OUT lat text, OUT lng text)
RETURNS SETOF RECORD AS $$
DECLARE
    r     record;
    f_adr text;
    l_lat text;
    l_lng text;
BEGIN
    -- Walk the leads one row at a time
    FOR r IN SELECT * FROM leads LOOP
        f_adr := r.address || ',' || r.city || ',' || r.state;
        -- One geocoder lookup per lead
        EXECUTE 'SELECT lat, lng FROM www_fdw_geocoder_google WHERE address = $1'
            INTO l_lat, l_lng
            USING f_adr;
        -- Copy the lead plus its coordinates into the OUT parameters
        SELECT r.first_name, r.last_name, r.company_name, r.address, r.city, r.county, r.state, r.zip,
               r.phone1, r.phone2, r.email, r.web, l_lat, l_lng
          INTO first_name, last_name, company_name, address, city, county, state, zip,
               phone1, phone2, email, web, lat, lng;
        RETURN NEXT;
    END LOOP;
END $$ LANGUAGE plpgsql;
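
A short usage sketch for the function above, called like a table in the FROM clause. The filter value is only an illustration.

```sql
-- Each lead comes back with the coordinates looked up for its address
SELECT first_name, last_name, city, state, lat, lng
FROM google_geocode()
WHERE state = 'NY';
```

PL/pgSQL builds the full result set before returning it, so the function still issues one geocoder request per row in leads no matter what WHERE or LIMIT the caller adds. Filtering inside the loop's SELECT would be the place to cut down on requests.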
Writing a new FDW
● Might not need to write one if there is an HTTP interface
● Use blackhole_fdw as a template
  – https://bitbucket.org/adunstan/blackhole_fdw
Writing a new FDW
Datum blackhole_fdw_handler(PG_FUNCTION_ARGS){
...
/* these are required */
fdwroutine->GetForeignRelSize = blackholeGetForeignRelSize;
fdwroutine->GetForeignPaths = blackholeGetForeignPaths;
fdwroutine->GetForeignPlan = blackholeGetForeignPlan;
fdwroutine->BeginForeignScan = blackholeBeginForeignScan;
fdwroutine->IterateForeignScan = blackholeIterateForeignScan;
fdwroutine->ReScanForeignScan = blackholeReScanForeignScan;
fdwroutine->EndForeignScan = blackholeEndForeignScan;
/* remainder are optional - use NULL if not required */
/* support for insert / update / delete */
fdwroutine->AddForeignUpdateTargets = blackholeAddForeignUpdateTargets;
fdwroutine->PlanForeignModify = blackholePlanForeignModify;
fdwroutine->BeginForeignModify = blackholeBeginForeignModify;
fdwroutine->ExecForeignInsert = blackholeExecForeignInsert;
fdwroutine->ExecForeignUpdate = blackholeExecForeignUpdate;
fdwroutine->ExecForeignDelete = blackholeExecForeignDelete;
fdwroutine->EndForeignModify = blackholeEndForeignModify;
/* support for EXPLAIN */
fdwroutine->ExplainForeignScan = blackholeExplainForeignScan;
fdwroutine->ExplainForeignModify = blackholeExplainForeignModify;
/* support for ANALYSE */
fdwroutine->AnalyzeForeignTable = blackholeAnalyzeForeignTable;
PG_RETURN_POINTER(fdwroutine);
}
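
The handler function above still has to be registered on the SQL side before a wrapper can use it. A minimal sketch of that registration, assuming the compiled library is installed as blackhole_fdw in PostgreSQL's library directory; the server and table names are hypothetical.

```sql
-- Assumes the shared library is installed as $libdir/blackhole_fdw
CREATE FUNCTION blackhole_fdw_handler()
RETURNS fdw_handler
AS '$libdir/blackhole_fdw'
LANGUAGE C STRICT;

-- Tie the handler to a wrapper name
CREATE FOREIGN DATA WRAPPER blackhole_fdw
    HANDLER blackhole_fdw_handler;

-- Hypothetical server and table to exercise the callbacks
CREATE SERVER blackhole_server FOREIGN DATA WRAPPER blackhole_fdw;

CREATE FOREIGN TABLE nowhere (
    id      integer,
    payload text
) SERVER blackhole_server;
```

When the wrapper is packaged as an extension, these statements normally live in the extension's install script so that CREATE EXTENSION runs them.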
Future
● Even more Wrappers
● Check Constraints on Foreign Tables
  – Allows partitioning
● Joins
  – Custom Scan API
    ● Probably will not be the way to do this, but progress being made
Questions?
jimm@openscg.com
@jim_mlodgenski

Postgresql Federation

  • 2. Who Am I? ● Jim Mlodgenski – – ● [email protected] @jim_mlodgenski Co-organizer of – – ● NYC PUG (www.nycpug.org) Philly PUG (www.phlpug.org) CTO, OpenSCG – www.openscg.com
  • 4. What is a federated database? “A federated database system is a type of meta-database management system (DBMS), which transparently maps multiple autonomous database systems into a single federated database. The constituent databases are interconnected via a computer network and may be geographically decentralized. ... There is no actual data integration in the constituent disparate databases as a result of data federation.” -Wikipedia
  • 5. How does PostgreSQL do it? ● Uses Foreign Table Wrappers (FDW) ● Used with SQL/MED – – Management of External Data – ● New ANIS SQL 2003 Extension Standard way of handling remote objects in SQL databases Wrappers used by SQL/MED to access remotes data sources
  • 6. Types of Foreign Data Wrappers ● SQL ● NoSQL ● File ● Miscellaneous ● PostgreSQL
  • 8. SQL Wrappers CREATE SERVER oracle_server FOREIGN DATA WRAPPER oracle_fdw OPTIONS (dbserver 'ORACLE_DBNAME'); CREATE USER MAPPING FOR CURRENT_USER SERVER oracle_server OPTIONS (user 'scott', password 'tiger'); CREATE FOREIGN TABLE fdw_test ( userid numeric, username text, email text ) SERVER oracle_server OPTIONS ( schema 'scott', table 'fdw_test'); postgres=# select * from fdw_test; userid | username | email --------+----------+------------------1 | scott (1 row) | [email protected]
  • 10. NoSQL Wrappers CREATE SERVER mongo_server FOREIGN DATA WRAPPER mongo_fdw OPTIONS (address '192.168.122.47', port '27017'); CREATE FOREIGN TABLE databases ( _id NAME, name TEXT ) SERVER mongo_server OPTIONS (database 'mydb', collection 'pgData'); test=# select * from databases ; _id | name --------------------------+-----------52fd49bfba3ae4ea54afc459 | mongo 52fd49bfba3ae4ea54afc45a | postgresql 52fd49bfba3ae4ea54afc45b | oracle 52fd49bfba3ae4ea54afc45c | mysql 52fd49bfba3ae4ea54afc45d | redis 52fd49bfba3ae4ea54afc45e | db2 (6 rows)
  • 11. File Wrappers ● Delimited files ● Fixed length files ● JSON files
  • 12. File Wrappers CREATE SERVER pg_load FOREIGN DATA WRAPPER file_fdw; CREATE FOREIGN TABLE leads ( first_name text, last_name text, company_name text, address text, city text, county text, state text, zip text, phone1 text, phone2 text, email text, web text ) SERVER pg_load OPTIONS ( filename '/tmp/us-500.csv', format 'csv', header 'TRUE' ); test=# select first_name || ' ' || last_name as full_name, email from leads limit 3; full_name | email -------------------+------------------------------James Butt | [email protected] Josephine Darakjy | [email protected] Art Venere (3 rows) | [email protected]
  • 14. Hadoop Wrapper CREATE SERVER hive_server FOREIGN DATA WRAPPER hive_fdw OPTIONS (address '127.0.0.1', port '10000'); CREATE USER MAPPING FOR PUBLIC SERVER hive_server; CREATE FOREIGN TABLE order_line ( ol_w_id integer, ol_d_id integer, ol_o_id integer, ol_number integer, ol_i_id integer, ol_delivery_d timestamp, ol_amount decimal(6,2), ol_supply_w_id integer, ol_quantity decimal(2,0), ol_dist_info varchar(24) ) SERVER hive_server OPTIONS (table 'order_line'); INSERT INTO item_sale_month SELECT ol_i_id as i_id, EXTRACT(YEAR FROM ol_delivery_d) as year, EXTRACT(MONTH FROM ol_delivery_d) as month, sum(ol_amount) as amount FROM order_line GROUP BY 1, 2, 3;
  • 15. Hadoop Wrapper ● Hadoop foreign tables can also be writable CREATE FORIEGN TABLE audit ( audit_id bigint, event_d timestamp, table varchar, action varchar, user varchar, ) SERVER hive_server OPTIONS (table 'audit', flume_port '44444'); INSERT INTO audit VALUES (nextval('audit_id_seq'), now(), 'users', 'SELECT', 'scott');
  • 16. Hadoop Wrapper ● It also works with HBase tables CREATE FOREIGN TABLE hive_hbase_table ( key varchar, value varchar ) SERVER localhive OPTIONS (table 'hbase_table', hbase_address 'localhost', hbase_port '9090', hbase_mapping ':key,cf:val'); INSERT INTO hive_hbase_table VALUES ('key1', 'value1'); INSERT INTO hive_hbase_table VALUES ('key2', 'value2'); UPDATE hive_hbase_table SET value = 'update' WHERE key = 'key2'; DELETE FROM hive_hbase_table WHERE key='key1'; SELECT * from hive_hbase_table;
  • 17. WWW Wrapper CREATE SERVER www_fdw_server_google_search FOREIGN DATA WRAPPER www_fdw OPTIONS (uri 'https://ajax.googleapis.com/ajax/services/search/web?v=1.0'); CREATE USER MAPPING FOR current_user SERVER www_fdw_server_google_search; CREATE FOREIGN TABLE www_fdw_google_search ( q text, GsearchResultClass text, unescapedUrl text, url text, visibleUrl text, cacheUrl text, title text, titleNoFormatting text, content text ) SERVER www_fdw_server_google_search; select url,substring(title,1,25)||'...',substring(content,1,25)||'...' from www_fdw_google_search where q='postgresql fdw'; url | ?column? | ?column? -------------------------------------------------------------+------------------------------+-----------------------------http://wiki.postgresql.org/wiki/Foreign_data_wrappers | Foreign data wrappers - <... | Jan 24, 2014 <b>...</b> 1... http://www.postgresql.org/docs/9.3/static/postgres-fdw.html | <b>PostgreSQL</b>: Docume... | F.31.1. <b>FDW</b> Option... http://www.postgresql.org/docs/9.3/static/fdwhandler.html | <b>PostgreSQL</b>: Docume... | Foreign Data Wrapper Call... http://www.craigkerstiens.com/2013/08/05/a-look-at-FDWs/ | A look at Foreign Data Wr... | Aug 5, 2013 <b>...</b> An... (4 rows)
  • 18. PostgreSQL Wrapper ● The most functional FDW by far ● Replaces much of the functionality of dblink ● Shipped as a contrib module
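Because postgres_fdw ships as a contrib module, it has to be installed in each database before a server can be defined against it. A minimal sketch, assuming the contrib packages are present on the host and the current role is allowed to create extensions:

```sql
-- Install the wrapper in the current database (does nothing if already installed).
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Confirm the wrapper is now registered in the system catalog.
SELECT fdwname FROM pg_foreign_data_wrapper;
```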
  • 19. PostgreSQL Wrapper CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', port '5432', dbname 'test2'); CREATE USER MAPPING FOR PUBLIC SERVER postgres_server; CREATE FOREIGN TABLE bird_strikes ( aircraft_type varchar, airport varchar, altitude varchar, aircraft_model varchar, num_wildlife_struck varchar, impact_to_flight varchar, effect varchar, location varchar, flight_num varchar, flight_date timestamp, record_id int, indicated_damage varchar, freeform_en_route varchar, num_engines varchar, airline varchar, origin_state varchar, phase_of_flight varchar, precipitation varchar, wildlife_collected boolean, wildlife_sent_to_smithsonian boolean, remarks varchar, reported_date timestamp, wildlife_size varchar, sky_conditions varchar, wildlife_species varchar, when_time_hhmm varchar, time_of_day varchar, pilot_warned varchar, cost_out_of_service varchar, cost_other varchar, cost_repair varchar, cost_total varchar, miles_from_airport varchar, feet_above_ground varchar, num_human_fatalities integer, num_injured integer, speed_knots varchar ) SERVER postgres_server OPTIONS (table_name 'bird_strikes');
  • 20. PostgreSQL Wrapper ● Only requests columns that are needed test=# explain verbose select airport, flight_date from bird_strikes; QUERY PLAN ------------------------------------------------------------------------------Foreign Scan on public.bird_strikes (cost=100.00..148.40 rows=1280 width=40) Output: airport, flight_date Remote SQL: SELECT airport, flight_date FROM public.bird_strikes (3 rows)
  • 21. PostgreSQL Wrapper ● Sends a WHERE clause test=# explain verbose select airport, flight_date from bird_strikes where flight_date > '2011-01-01'; QUERY PLAN ------------------------------------------------------------------ Foreign Scan on public.bird_strikes (cost=100.00..134.54 rows=427 width=40) Output: airport, flight_date Remote SQL: SELECT airport, flight_date FROM public.bird_strikes WHERE ((flight_date > '2011-01-01 00:00:00'::timestamp without time zone)) (3 rows)
  • 22. PostgreSQL Wrapper ● Sends built-in immutable functions test=# explain verbose select airport, flight_date from bird_strikes where flight_date > '2011-01-01' and length(airport) < 10; QUERY PLAN ------------------------------------------------------------------------------Foreign Scan on public.bird_strikes (cost=100.00..135.24 rows=142 width=40) Output: airport, flight_date Remote SQL: SELECT airport, flight_date FROM public.bird_strikes WHERE ((flight_date > '2011-01-01 00:00:00'::timestamp without time zone)) AND ((length(airport) < 10)) (3 rows)
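By default the row and cost figures in these plans are estimated locally, without consulting the remote server. postgres_fdw can be told to ask the remote side for estimates, or local statistics can be gathered with ANALYZE. A hedged sketch using the server and table names from the slides above; whether the extra planning round trips pay off depends on the workload:

```sql
-- Ask the remote server for row and cost estimates at plan time.
ALTER SERVER postgres_server OPTIONS (ADD use_remote_estimate 'true');

-- Alternatively, sample the remote table and keep statistics locally.
ANALYZE bird_strikes;
```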
  • 23. PostgreSQL Wrapper ● Writable (INSERT, UPDATE, DELETE) test=# explain verbose update bird_strikes set airport = 'Unknown' where record_id = 313339; QUERY PLAN ------------------------------------------------------------------------------ Update on public.bird_strikes (cost=100.00..111.05 rows=1 width=964) Remote SQL: UPDATE public.bird_strikes SET airport = $2 WHERE ctid = $1 -> Foreign Scan on public.bird_strikes (cost=100.00..111.05 rows=1 width=964) Output: aircraft_type, 'Unknown'::character varying, altitude, aircraft_model, num_wildlife_struck, impact_to_flight, effect, location, flight_num, flight_date, record_id, indicated_damage, freeform_en_route, num_engines, airline, origin_state, phase_of_flight, precipitation, wildlife_collected, wildlife_sent_to_smithsonian, remarks, reported_date, wildlife_size, sky_conditions, wildlife_species, when_time_hhmm, time_of_day, pilot_warned, cost_out_of_service, cost_other, cost_repair, cost_total, miles_from_airport, feet_above_ground, num_human_fatalities, num_injured, speed_knots, ctid Remote SQL: SELECT aircraft_type, altitude, aircraft_model, num_wildlife_struck, impact_to_flight, effect, location, flight_num, flight_date, record_id, indicated_damage, freeform_en_route, num_engines, airline, origin_state, phase_of_flight, precipitation, wildlife_collected, wildlife_sent_to_smithsonian, remarks, reported_date, wildlife_size, sky_conditions, wildlife_species, when_time_hhmm, time_of_day, pilot_warned, cost_out_of_service, cost_other, cost_repair, cost_total, miles_from_airport, feet_above_ground, num_human_fatalities, num_injured, speed_knots, ctid FROM public.bird_strikes WHERE ((record_id = 313339)) FOR UPDATE (5 rows)
  • 24. PostgreSQL Wrapper ● Writes are transactional test=# select airport from bird_strikes where record_id = 313339; airport --------Unknown (1 row) test=# BEGIN; BEGIN test=# update bird_strikes set airport = 'UNKNOWN' where record_id = 313339; UPDATE 1 test=# ROLLBACK; ROLLBACK test=# select airport from bird_strikes where record_id = 313339; airport --------Unknown (1 row)
  • 25. Limitations ● Aggregates are not pushed down test=# explain verbose select count(*) from bird_strikes; QUERY PLAN --------------------------------------------------------------------------------------------------------- Aggregate (cost=220.92..220.93 rows=1 width=0) Output: count(*) -> Foreign Scan on public.bird_strikes (cost=100.00..212.39 rows=3413 width=0) Output: aircraft_type, airport, altitude, aircraft_model, num_wildlife_struck, impact_to_flight, effect, location, flight_num, flight_date, record_id, indicated_damage, freeform_en_route, num_engines, airline, origin_state, phase_of_flight, precipitation, wildlife_collected, wildlife_sent_to_smithsonian, remarks, reported_date, wildlife_size, sky_conditions, wildlife_species, when_time_hhmm, time_of_day, pilot_warned, cost_out_of_service, cost_other, cost_repair, cost_total, miles_from_airport, feet_above_ground, num_human_fatalities, num_injured, speed_knots Remote SQL: SELECT NULL FROM public.bird_strikes (5 rows)
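One possible workaround, when the aggregate is known in advance, is to compute it on the remote side behind a view and expose that view as its own foreign table. This is only a sketch: the view name bird_strikes_count and its column n are hypothetical, and it assumes you can create objects in the remote database (test2 in the earlier slide).

```sql
-- Run on the REMOTE server: wrap the aggregate in a view.
CREATE VIEW bird_strikes_count AS
  SELECT count(*) AS n FROM bird_strikes;

-- Run on the LOCAL server: point a foreign table at that view.
CREATE FOREIGN TABLE bird_strikes_count (
  n bigint
)
SERVER postgres_server
OPTIONS (table_name 'bird_strikes_count');

-- Only the single aggregated row crosses the network.
SELECT n FROM bird_strikes_count;
```

The trade-off is that each aggregate needs its own remote view, so this suits fixed reports better than ad hoc queries.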
  • 26. Limitations ● ORDER BY, GROUP BY, LIMIT not pushed down test=# explain verbose select flight_num from bird_strikes order by flight_date limit 5; QUERY PLAN ------------------------------------------------------------------------------------------Limit (cost=169.66..169.67 rows=5 width=40) Output: flight_num, flight_date -> Sort (cost=169.66..172.86 rows=1280 width=40) Output: flight_num, flight_date Sort Key: bird_strikes.flight_date -> Foreign Scan on public.bird_strikes (cost=100.00..148.40 rows=1280 width=40) Output: flight_num, flight_date Remote SQL: SELECT flight_num, flight_date FROM public.bird_strikes (8 rows)
  • 27. Limitations ● Joins not pushed down test=# explain verbose select s.name, b.flight_date test-# from bird_strikes b, state_code s test-# where b.location = s.abbreviation and flight_date > '2011-01-01'; QUERY PLAN ------------------------------------------------------------------------------ Hash Join (cost=239.88..349.95 rows=1986 width=40) Output: s.name, b.flight_date Hash Cond: ((s.abbreviation)::text = (b.location)::text) -> Foreign Scan on public.state_code s (cost=100.00..137.90 rows=930 width=64) Output: s.id, s.name, s.abbreviation, s.country, s.type, s.sort, s.status, s.occupied, s.notes, s.fips_state, s.assoc_press, s.standard_federal_region, s.census_region, s.census_region_name, s.census_division, s.census_devision_name, s.circuit_court Remote SQL: SELECT name, abbreviation FROM public.state_code -> Hash (cost=134.54..134.54 rows=427 width=40) Output: b.flight_date, b.location -> Foreign Scan on public.bird_strikes b (cost=100.00..134.54 rows=427 width=40) Output: b.flight_date, b.location Remote SQL: SELECT location, flight_date FROM public.bird_strikes WHERE ((flight_date > '2011-01-01 00:00:00'::timestamp without time zone)) (11 rows)
  • 28. Limitations (Gotcha) ● Sometimes the foreign tables don't act like tables test=# SELECT l.*, w.lat, w.lng FROM leads l, www_fdw_geocoder_google w WHERE w.address = l.address || ',' || l.city || ',' || l.state; first_name | last_name | company_name | address | city | county | state | zip | phone1 | phone2 | email | web | lat | lng ------------+-----------+--------------+---------+------+-------+-------+-----+--------+--------+-------+-----+-----+----(0 rows)
  • 29. Limitations (Gotcha) QUERY PLAN ------------------------------------------------------------------------------------------- Merge Join (cost=187.47..215.47 rows=1000 width=448) Output: l.first_name, l.last_name, l.company_name, l.address, l.city, l.county, l.state, l.zip, l.phone1, l.phone2, l.email, l.web, w.lat, w.lng Merge Cond: ((((((l.address || ','::text) || l.city) || ','::text) || l.state)) = w.address) -> Sort (cost=37.64..38.14 rows=200 width=384) Output: l.first_name, l.last_name, l.company_name, l.address, l.city, l.county, l.state, l.zip, l.phone1, l.phone2, l.email, l.web, (((((l.address || ','::text) || l.city) || ','::text) || l.state)) Sort Key: (((((l.address || ','::text) || l.city) || ','::text) || l.state)) -> Foreign Scan on public.leads l (cost=0.00..30.00 rows=200 width=384) Output: l.first_name, l.last_name, l.company_name, l.address, l.city, l.county, l.state, l.zip, l.phone1, l.phone2, l.email, l.web, ((((l.address || ','::text) || l.city) || ','::text) || l.state) Foreign File: /tmp/us-500.csv Foreign File Size: 81485 -> Sort (cost=149.83..152.33 rows=1000 width=96) Output: w.lat, w.lng, w.address Sort Key: w.address -> Foreign Scan on public.www_fdw_geocoder_google w (cost=0.00..100.00 rows=1000 width=96) Output: w.lat, w.lng, w.address WWW API: Request (16 rows)
  • 30. Limitations (Gotcha) CREATE OR REPLACE FUNCTION google_geocode( OUT first_name text, OUT last_name text, OUT company_name text, OUT address text, OUT city text, OUT county text, OUT state text, OUT zip text, OUT phone1 text, OUT phone2 text, OUT email text, OUT web text, OUT lat text, OUT lng text) RETURNS SETOF RECORD AS $$ DECLARE r record; f_adr text; l_lat text; l_lng text; BEGIN FOR r IN SELECT * FROM leads LOOP f_adr := r.address || ',' || r.city || ',' || r.state; EXECUTE 'SELECT lat, lng FROM www_fdw_geocoder_google WHERE address = $1' INTO l_lat, l_lng USING f_adr; SELECT r.first_name, r.last_name, r.company_name, r.address, r.city, r.county, r.state, r.zip, r.phone1, r.phone2, r.email, r.web, l_lat, l_lng INTO first_name, last_name, company_name, address, city, county, state, zip, phone1, phone2, email, web, lat, lng; RETURN NEXT; END LOOP; END $$ LANGUAGE plpgsql;
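The function above issues one geocoder lookup per lead, each with an explicit address = $1 qualifier, instead of relying on the planner to pass the join condition to the wrapper. Assuming the leads and www_fdw_geocoder_google foreign tables from the previous slides exist, it can be queried like a table:

```sql
-- Each returned row is one lead plus the coordinates fetched for its address.
SELECT first_name, last_name, city, state, lat, lng
FROM google_geocode();
```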
  • 31. Writing a new FDW ● You might not need to write one if the data source has an HTTP interface ● Use blackhole_fdw as a template – https://bitbucket.org/adunstan/blackhole_fdw
  • 32. Writing a new FDW Datum blackhole_fdw_handler(PG_FUNCTION_ARGS){ ... /* these are required */ fdwroutine->GetForeignRelSize = blackholeGetForeignRelSize; fdwroutine->GetForeignPaths = blackholeGetForeignPaths; fdwroutine->GetForeignPlan = blackholeGetForeignPlan; fdwroutine->BeginForeignScan = blackholeBeginForeignScan; fdwroutine->IterateForeignScan = blackholeIterateForeignScan; fdwroutine->ReScanForeignScan = blackholeReScanForeignScan; fdwroutine->EndForeignScan = blackholeEndForeignScan; /* remainder are optional - use NULL if not required */ /* support for insert / update / delete */ fdwroutine->AddForeignUpdateTargets = blackholeAddForeignUpdateTargets; fdwroutine->PlanForeignModify = blackholePlanForeignModify; fdwroutine->BeginForeignModify = blackholeBeginForeignModify; fdwroutine->ExecForeignInsert = blackholeExecForeignInsert; fdwroutine->ExecForeignUpdate = blackholeExecForeignUpdate; fdwroutine->ExecForeignDelete = blackholeExecForeignDelete; fdwroutine->EndForeignModify = blackholeEndForeignModify; /* support for EXPLAIN */ fdwroutine->ExplainForeignScan = blackholeExplainForeignScan; fdwroutine->ExplainForeignModify = blackholeExplainForeignModify; /* support for ANALYSE */ fdwroutine->AnalyzeForeignTable = blackholeAnalyzeForeignTable; PG_RETURN_POINTER(fdwroutine); }
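Once the handler is compiled into a shared library, it still has to be registered in SQL before any server or foreign table can use it. A minimal sketch; the library path '$libdir/blackhole_fdw' and the server name blackhole_server are assumptions for illustration, and a packaged extension would normally do the first two steps inside its install script:

```sql
-- Expose the C handler function to SQL (library name is an assumption).
CREATE FUNCTION blackhole_fdw_handler()
RETURNS fdw_handler
AS '$libdir/blackhole_fdw'
LANGUAGE C STRICT;

-- Register the wrapper and bind it to the handler.
CREATE FOREIGN DATA WRAPPER blackhole_fdw
  HANDLER blackhole_fdw_handler;

-- A server definition is required before foreign tables can be created.
CREATE SERVER blackhole_server FOREIGN DATA WRAPPER blackhole_fdw;
```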
  • 33. Future ● Even more wrappers ● Check constraints on foreign tables – Allows partitioning ● Joins – Custom Scan API – Probably will not be the way to do this, but progress is being made