PostgreSQL Database Administrator - Part 4
By Poguttu
Database Maintenance & Performance and Concurrency
Day-4
• PostgreSQL Tuning and Performance
• Find and Tune Slow Running Queries
• Collecting regular statistics from pg_stat* views
• Finding out what makes SQL slow
• Speeding up queries without rewriting them
• Discovering why a query is not using an index
• Forcing a query to use an index
• EXPLAIN and SQL Execution
• Workload Analysis
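The index and EXPLAIN items on this agenda can be tried directly in psql. The following is a minimal sketch, assuming a hypothetical table orders with an index on customer_id (neither appears in the deck). It compares the plan the planner picks on its own with the plan it picks when sequential scans are discouraged for a single transaction.

```sql
-- Hypothetical table and column; substitute your own.
-- 1. See the plan the planner chooses, with real timings and buffer usage.
--    Note that EXPLAIN ANALYZE actually executes the statement.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;

-- 2. Discourage sequential scans for this transaction only, then compare.
BEGIN;
SET LOCAL enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;
ROLLBACK;
```

enable_seqscan = off does not forbid sequential scans, it only makes them look very expensive to the planner. If the second plan uses the index and runs slower, the planner's original choice was probably right.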
database_info.sql
SELECT db.datname, au.rolname as datdba,
pg_encoding_to_char(db.encoding) as encoding,
db.datallowconn, db.datconnlimit, db.datfrozenxid,
tb.spcname as tblspc,
-- db.datconfig,
db.datacl
FROM pg_database db
JOIN pg_authid au ON au.oid = db.datdba
JOIN pg_tablespace tb ON tb.oid = db.dattablespace
ORDER BY 1;
database_sizes.sql
SELECT datname,
pg_size_pretty(pg_database_size(datname))as size_pretty,
pg_database_size(datname) as size,
(SELECT pg_size_pretty (SUM( pg_database_size(datname))::bigint)
FROM pg_database) AS total,
((pg_database_size(datname) / (SELECT SUM( pg_database_size(datname))
FROM pg_database) ) *
100)::numeric(6,3) AS pct
FROM pg_database ORDER BY datname;
blocked_transactions.sql
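/* Requires PostgreSQL 9.2 through 9.5: uses pid and query (9.2 or greater) and the waiting column, which 9.6 replaced with wait_event_type and wait_event */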
SELECT
w.query as waiting_query, w.pid as w_pid, w.usename as w_user, l.query as locking_query, l.pid as
l_pid,
l.usename as l_user, t.schemaname || '.' || t.relname as tablename
FROM pg_stat_activity w
JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
JOIN pg_stat_activity l ON (l2.pid = l.pid)
JOIN pg_stat_user_tables t ON (l1.relation = t.relid) WHERE w.waiting;
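On PostgreSQL 9.6 or greater the waiting column no longer exists, so the script above does not run there. A hedged alternative for those versions, assuming only the built-in pg_blocking_pids() function, might look like this:

```sql
-- PostgreSQL 9.6 or greater: list each blocked session and the pids blocking it.
SELECT a.pid      AS w_pid,
       a.usename  AS w_user,
       a.query    AS waiting_query,
       pg_blocking_pids(a.pid) AS blocked_by
FROM pg_stat_activity a
WHERE cardinality(pg_blocking_pids(a.pid)) > 0;
```

blocked_by is an integer array. Join it back to pg_stat_activity on pid to see what the blocking sessions are running.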
cache_hit_ratio.sql
SELECT pg_stat_database.datname, pg_stat_database.blks_read, pg_stat_database.blks_hit,
round((pg_stat_database.blks_hit::double precision / (pg_stat_database.blks_read
+ pg_stat_database.blks_hit
+1)::double precision * 100::double precision)::numeric, 2) AS
cachehitratio
FROM pg_stat_database
WHERE pg_stat_database.datname !~ '^(template(0|1)|postgres)$'::text
ORDER BY round((pg_stat_database.blks_hit::double precision
/ (pg_stat_database.blks_read
+ pg_stat_database.blks_hit
+ 1)::double precision * 100::double precision)::numeric, 2) DESC;
connection_counts.sql
SELECT COUNT(*) FROM pg_stat_activity;
SELECT usename, count(*) FROM pg_stat_activity GROUP BY 1 ORDER BY 1;
SELECT datname, usename, count(*) FROM pg_stat_activity GROUP BY 1, 2 ORDER BY 1, 2;
SELECT usename, datname, count(*) FROM pg_stat_activity GROUP BY 1, 2 ORDER BY 1, 2;
current_locks.sql
/* For PostgreSQL 9.1 or earlier; on 9.2 or greater use a.pid instead of a.procpid */
SELECT l.database, l.relation, n.nspname, c.relname, l.pid, a.usename,
l.locktype, l.mode,
l.granted, l.tuple FROM pg_locks l
JOIN pg_class c ON (c.oid = l.relation)
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_stat_activity a ON (a.procpid = l.pid)
ORDER BY l.database, l.relation, l.pid;
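current_queries.sql
/* For PostgreSQL 9.1 or earlier; on 9.2 or greater use pid and query instead of procpid and current_query */
SELECT a.datname, a.procpid as pid,
CASE WHEN a.client_addr IS NULL
THEN 'local'
ELSE a.client_addr::text
END as client_addr,
a.usename as user, a.waiting, l.procpid as blocking_pid, l.usename as blocking_user,
a.current_query, a.query_start, current_timestamp - a.query_start as duration
FROM pg_stat_activity a
/* Pair only ungranted lock requests with the granted locks they wait on; without these filters every session sharing a relation lock shows up as "blocking" */
LEFT JOIN pg_locks l1 ON (a.procpid = l1.pid and not l1.granted)
LEFT JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)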
LEFT JOIN pg_stat_activity l ON (l2.pid = l.procpid)
LEFT JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
WHERE pg_backend_pid() <> a.procpid
ORDER BY a.datname,
a.query_start;
SELECT w.current_query as waiting_query, w.procpid as w_pid, w.usename as w_user,
l.current_query as locking_query, l.procpid as l_pid, l.usename as l_user,
t.schemaname || '.' || t.relname as tablename FROM pg_stat_activity w
JOIN pg_locks l1 ON (w.procpid = l1.pid and not l1.granted)
JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
JOIN pg_stat_activity l ON (l2.pid = l.procpid)
JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
WHERE w.waiting;
SELECT datname, procpid as pid, client_addr, usename as user, current_query,
CASE WHEN waiting = TRUE
THEN 'BLOCKED'
ELSE 'no'
END as waiting,
query_start, current_timestamp - query_start as duration
FROM pg_stat_activity WHERE pg_backend_pid() <> procpid
ORDER BY datname, query_start;
current_queries_blocked.sql
/* Use for PostgreSQL 9.2 through 9.5; the waiting column was removed in 9.6 */
SELECT c.datname, c.pid as pid, c.client_addr, c.usename as user, c.query,
CASE WHEN c.waiting = TRUE
THEN 'BLOCKED'
ELSE 'no'
END as waiting,
l.pid as blocked_by,
current_timestamp - c.query_start as duration
FROM pg_stat_activity c
LEFT JOIN pg_locks l1 ON (c.pid = l1.pid and not l1.granted)
LEFT JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
LEFT JOIN pg_stat_activity l ON (l2.pid = l.pid)
LEFT JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
WHERE pg_backend_pid() <> c.pid
ORDER BY c.datname, c.query_start;
most_active_tables.sql
SELECT schemaname, relname, coalesce(idx_tup_fetch, 0) + seq_tup_read as TotalReads
FROM pg_stat_all_tables WHERE coalesce(idx_tup_fetch, 0) + seq_tup_read != 0
AND schemaname NOT IN ( 'pg_catalog', 'pg_toast' )
order by TotalReads desc LIMIT 10;
pg_runtime.sql
SELECT pg_postmaster_start_time() as pg_start, current_timestamp - pg_postmaster_start_time()
as runtime;
table_row_counts.sql
SELECT s.nspname, c.relname as table, c.reltuples::bigint as rows FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace s ON (c.relnamespace = s.oid) WHERE relkind = 'r'
AND c.reltuples::bigint > 0 ORDER BY rows DESC;
pg_stat_all_tables.sql
SELECT n.nspname, s.relname, c.reltuples::bigint,-- n_live_tup,
n_tup_ins, n_tup_upd, n_tup_del,
date_trunc('second', last_vacuum) as last_vacuum,
date_trunc('second', last_autovacuum) as last_autovacuum, date_trunc('second',
last_analyze) as last_analyze, date_trunc('second', last_autoanalyze) as last_autoanalyze
, round( current_setting('autovacuum_vacuum_threshold')::integer +
current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples) AS av_threshold
/* ,CASE WHEN reltuples > 0
THEN round(100.0 * n_dead_tup / (reltuples))
ELSE 0
END AS pct_dead,
CASE WHEN n_dead_tup > round( current_setting('autovacuum_vacuum_threshold')::integer +
current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
THEN 'VACUUM'
ELSE 'ok'
END AS "av_needed"
*/
FROM pg_stat_all_tables s
JOIN pg_class c ON c.oid = s.relid
JOIN pg_namespace n ON (n.oid = c.relnamespace)
WHERE s.relname NOT LIKE 'pg_%'
AND s.relname NOT LIKE 'sql_%'
-- AND s.relname LIKE '%TBL%'
ORDER by 1, 2;
pg_stat_all_indexes.sql
SELECT n.nspname as schema, i.relname as table, i.indexrelname as index,
i.idx_scan, i.idx_tup_read, i.idx_tup_fetch,
CASE WHEN idx.indisprimary
THEN 'pkey'
WHEN idx.indisunique
THEN 'uidx'
ELSE 'idx'
END AS type,
CASE WHEN idx.indisvalid
THEN 'valid'
ELSE 'INVALID'
END as status,
pg_relation_size(i.indexrelid) as size_in_bytes,
pg_size_pretty(pg_relation_size(i.indexrelid)) as size
FROM pg_stat_all_indexes i
JOIN pg_class c ON (c.oid = i.relid)
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_index idx ON (idx.indexrelid = i.indexrelid )
WHERE i.relname LIKE '%%'
AND n.nspname NOT LIKE 'pg_%'
-- AND idx.indisunique = TRUE
-- AND NOT idx.indisprimary
-- AND i.indexrelname LIKE 'tmp%'
-- AND idx.indisvalid IS false
/* AND NOT idx.indisprimary
AND NOT idx.indisunique
AND idx_scan = 0
*/ ORDER BY 1, 2, 3;
Standard Statistics Views
pg_stat_activity - One row per server process, showing information related to the current activity of that process, such as state and current query.
pg_stat_archiver - One row only, showing statistics about the WAL archiver process's activity.
pg_stat_bgwriter - One row only, showing statistics about the background writer process's activity.
pg_stat_database - One row per database, showing database-wide statistics.
pg_stat_all_tables - One row for each table in the current database, showing statistics about accesses to that specific table.
pg_stat_sys_tables - Same as pg_stat_all_tables, except that only system tables are shown.
pg_stat_user_tables - Same as pg_stat_all_tables, except that only user tables are shown.
pg_stat_xact_all_tables - Similar to pg_stat_all_tables, but counts actions taken so far within the current transaction (which are not yet included in pg_stat_all_tables and related views). The columns for numbers of live and dead rows and vacuum and analyze actions are not present in this view.
pg_stat_xact_sys_tables - Same as pg_stat_xact_all_tables, except that only system tables are shown.
pg_stat_xact_user_tables - Same as pg_stat_xact_all_tables, except that only user tables are shown.
pg_statio_user_sequences - Same as pg_statio_all_sequences, except that only user sequences are shown.
pg_stat_user_functions - One row for each tracked function, showing statistics about executions of that function.
pg_stat_xact_user_functions - Similar to pg_stat_user_functions, but counts only calls during the current transaction (which are not yet included in pg_stat_user_functions).
pg_stat_all_indexes - One row for each index in the current database, showing statistics about accesses to that specific index.
pg_stat_sys_indexes - Same as pg_stat_all_indexes, except that only indexes on system tables are shown.
pg_stat_user_indexes - Same as pg_stat_all_indexes, except that only indexes on user tables are shown.
pg_statio_all_tables - One row for each table in the current database, showing statistics about I/O on that specific table.
pg_statio_sys_tables - Same as pg_statio_all_tables, except that only system tables are shown.
pg_statio_user_tables - Same as pg_statio_all_tables, except that only user tables are shown.
pg_statio_all_indexes - One row for each index in the current database, showing statistics about I/O on that specific index.
pg_statio_sys_indexes - Same as pg_statio_all_indexes, except that only indexes on system tables are shown.
pg_statio_user_indexes - Same as pg_statio_all_indexes, except that only indexes on user tables are shown.
pg_statio_all_sequences - One row for each sequence in the current database, showing statistics about I/O on that specific sequence.
pg_statio_sys_sequences - Same as pg_statio_all_sequences, except that only system sequences are shown. (Presently, no system sequences are defined, so this view is always empty.)
pg_stat_replication - One row per WAL sender process, showing statistics about replication to that sender's connected standby server.
pg_stat_database_conflicts - One row per database, showing database-wide statistics about query cancels due to conflict with recovery on standby servers.
gen_table_compare_counts.sql
SELECT 'SELECT ''' || c.relname || ''' as table, ' || c.reltuples::bigint || ' as rows, ' ||
'(SELECT COUNT(*) FROM ' || s.nspname || '."' || c.relname || '") as cnt;'
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace s ON (c.relnamespace = s.oid)
WHERE c.relkind = 'r';
What is Vacuum?
• Vacuum does the following:
1. Gathers table and index statistics (VACUUM ANALYZE)
2. Reorganizes the table (VACUUM FULL)
3. Cleans up dead rows in tables and indexes
4. Freezes old row XIDs to prevent XID wraparound
• #1 and #2 are needed in any DBMS. #3 and #4 are necessary because of PostgreSQL's MVCC design.
VACUUM
• Restructures pages and reclaims space taken by dead rows (rows deleted before the oldest transaction that is still running started)
• Removes dead rows from indexes and TOAST tables
• Long-running transactions hold all of this back (including long transactions on a replica if hot_standby_feedback = on)
• Truncates the table if possible (empty pages at the end of the file are returned to the operating system)
• Updates the free space map
• Done regularly, it avoids the need for VACUUM FULL
NOT NEEDED:
• On a replica
• After TRUNCATE
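The tasks listed above map onto a few command variants. A short sketch, assuming a hypothetical table named orders:

```sql
-- "orders" is a hypothetical table name; substitute your own.
VACUUM orders;                      -- reclaim dead rows and update the free space map
VACUUM (VERBOSE, ANALYZE) orders;   -- also refresh optimizer statistics and report what was done
VACUUM (FREEZE) orders;             -- aggressively freeze row XIDs
```

None of these block reads or writes on the table, which is the main difference from VACUUM FULL.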
VACUUM FULL
• Shrinks table size (rewrites all "alive" tuples into a new file as compactly as possible)
• Can only be launched manually (not by autovacuum)
• OID of the relation stays the same, relfilenode (on-disk name) changes
Cons:
• ACCESS EXCLUSIVE LOCK (no reading or writing allowed)
• table size ≤ disk space needed during the rewrite ≤ table size * 2
• Rebuilds all indexes as well (a separate REINDEX was only needed before PostgreSQL 9.0)
• Takes a long time
Alternative:
• pg_repack - does allow reads and writes, but needs more space (≥ table size * 2)
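The point about the OID staying the same while the relfilenode changes can be observed directly. A minimal sketch, assuming a hypothetical table named orders on a system where a short exclusive lock is acceptable:

```sql
-- "orders" is a hypothetical table; VACUUM FULL takes an ACCESS EXCLUSIVE lock.
SELECT oid, relfilenode, pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class WHERE oid = 'orders'::regclass;

VACUUM FULL orders;

-- Same oid, different relfilenode; total_size shrinks if the table was bloated.
SELECT oid, relfilenode, pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class WHERE oid = 'orders'::regclass;
```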
The following postgresql.conf changes make autovacuum run more frequently.
First, enable logging for the autovacuum process:
log_autovacuum_min_duration = 0
Increase the number of workers and check tables more often:
autovacuum_max_workers = 6
autovacuum_naptime = 15s
Lower the vacuum and analyze thresholds so that both trigger sooner:
autovacuum_vacuum_threshold = 25
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_threshold = 10
autovacuum_analyze_scale_factor = 0.05
Throttle autovacuum less, so that each run finishes sooner:
autovacuum_vacuum_cost_delay = 10ms
autovacuum_vacuum_cost_limit = 1000
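The settings above apply to every table. Where only a few tables are heavily updated, the same thresholds can be set per table instead. A sketch, assuming a hypothetical table named orders and example values that are not taken from the deck:

```sql
-- "orders" is a hypothetical, heavily updated table; the values are examples only.
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold = 1000
);

-- Review which tables carry per-table overrides.
SELECT relname, reloptions
FROM pg_class
WHERE reloptions IS NOT NULL;
```

Most of the global parameters take effect after SELECT pg_reload_conf(); autovacuum_max_workers is read only at server start, so it needs a restart.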
Script to check the status of AutoVacuum for all Tables
SELECT
schemaname
,relname
,n_live_tup
,n_dead_tup
,last_autovacuum
FROM pg_stat_all_tables
ORDER BY n_dead_tup
/(n_live_tup
* current_setting('autovacuum_vacuum_scale_factor')::float8
+ current_setting('autovacuum_vacuum_threshold')::float8)
DESC;
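The ORDER BY expression above is the ratio of dead rows to the autovacuum trigger point, threshold + scale_factor * rows, so tables at or above 1.0 are due for a vacuum. (Autovacuum itself uses pg_class.reltuples for the row count; n_live_tup is a close stand-in.) As a worked example under an assumed table size of 1,000,000 rows:

```sql
-- Dead rows needed to trigger autovacuum on a 1,000,000-row table:
-- threshold + scale_factor * rows
SELECT 25 + 0.1 * 1000000 AS with_slide_settings,  -- 100,025 dead rows
       50 + 0.2 * 1000000 AS with_defaults;        -- 200,050 dead rows
```

The second column uses the PostgreSQL defaults (threshold 50, scale factor 0.2), so the settings shown earlier roughly halve the amount of garbage tolerated before a vacuum starts.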
Controlling automatic database maintenance
Autovacuum is enabled by default in PostgreSQL and mostly does a great job of maintaining your PostgreSQL database. We say mostly because
it doesn't know everything you do about the database, such as the best time to perform maintenance actions. Let's explore the settings that
can be tuned so that you can use vacuums efficiently.
Exercising control requires some thinking about what you actually want:
What are the best times of day to do things? When are system resources more available?
Which days are quiet, and which are not?
Which tables are critical to the application, and which are not?
Perform the following steps:
The first thing to do is make sure that autovacuum is switched on, which is the default. Check that you have the following parameters
enabled in your postgresql.conf file:
autovacuum = on
track_counts = on
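The running values can also be checked from a session, which avoids reading the wrong configuration file. A minimal sketch using the standard pg_settings view:

```sql
-- Both rows should show setting = 'on'; source tells where the value came from.
SELECT name, setting, source
FROM pg_settings
WHERE name IN ('autovacuum', 'track_counts');
```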
PostgreSQL controls autovacuum with more than 40 individually tunable parameters that provide a wide range of...
Removing issues that cause bloat
Bloat can be caused by long-running queries or long-running write transactions that execute alongside write-heavy workloads.
Resolving that is mostly down to understanding the workloads running on the server.
Look at the age of the oldest snapshots that are running, like this:
postgres=# SELECT now() -
CASE
WHEN backend_xid IS NOT NULL
THEN xact_start
ELSE query_start END
AS age
, pid, backend_xid AS xid, backend_xmin AS xmin, state
FROM pg_stat_activity WHERE backend_type = 'client backend'
ORDER BY 1 DESC;
age | pid | xid | xmin | state
----------------+-------+----------+----------+------------------
00:00:25.791098 | 27624 | | 10671262 | active
00:00:08.018103 | 27591 | | | idle in transaction
00:00:00.002444 | 27630 | 10703641 | 10703639 | active
00:00:00.001506 | 27631 | 10703642 | 10703640 | active
00:00:00.000324 | 27632 | 10703643 | 10703641 | active
00:00:00...
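Sessions in the idle in transaction state, like pid 27591 in the output above, are a common source of old snapshots. A hedged sketch for listing the ones that have been open too long; the 5-minute cutoff is an arbitrary example value:

```sql
-- The 5-minute cutoff is an example; pick one that suits the workload.
SELECT pid, usename, state, now() - xact_start AS xact_age, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - xact_start > interval '5 minutes'
ORDER BY xact_age DESC;
```

Once the owner of such a session is known, fixing the application is usually better than ending the session with pg_terminate_backend(pid).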
Identifying and fixing bloated tables and indexes
PostgreSQL implements Multiversion Concurrency Control (MVCC), which allows users to read data at the same
time as writers make changes.
This is an important feature for concurrency in database applications, as it can allow the following:
Better performance because of fewer locks
Greatly reduced deadlocking
Simplified application design and management
Bloated tables and indexes are a natural consequence of MVCC design in PostgreSQL. Bloat is caused mainly by
updates, as we must retain both the old and new row versions for a certain period of time.
Bloating results in increased disk consumption, as well as performance loss—if a table is twice as big as it
should be, scanning it takes twice as long. VACUUM is one of the best ways of removing bloat.
Many users execute VACUUM far too frequently, while at the same time complaining about the cost of doing
so. This recipe is all about understanding when you need to run VACUUM by estimating the amount of bloat...
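A rough first estimate can be taken from the statistics views alone. The sketch below reports the share of dead rows per table; it is only a proxy for bloat, since space that has already been vacuumed but not reused is not counted.

```sql
-- Share of dead rows per user table, largest offenders first.
SELECT schemaname, relname, n_live_tup, n_dead_tup,
       round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS pct_dead,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```

A table with a high pct_dead and a recent last_autovacuum is worth a closer look. One with a low pct_dead probably does not need a manual VACUUM.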
Monitoring and tuning a vacuum
If you're currently waiting for a long-running vacuum (or autovacuum) to finish, go straight to the How to do it... section.
If you've just had a long-running vacuum complete, then you may want to think about setting a few parameters.
autovacuum_max_workers should always be set to more than 2. Setting it too high may not be very useful, and so you need to be careful.
Setting vacuum_cost_delay too high is counterproductive. VACUUM is your friend, not your enemy, so delaying it until it doesn't happen at all just makes
things worse.
maintenance_work_mem should be set to anything up to 1 GB, according to how much memory you can allocate to this task at this time.
Let's watch what happens when we run a large VACUUM. Don't run VACUUM FULL, because it runs for a long time while holding an AccessExclusiveLock
on the table.
First, locate which process is running the VACUUM by using the pg_stat_activity view to identify the specific pid (34399 is just an example...
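One way to do this lookup is sketched below. The second query assumes PostgreSQL 9.6 or greater, where the pg_stat_progress_vacuum view is available.

```sql
-- Sessions currently running VACUUM (manual runs or autovacuum workers).
SELECT pid, state, query_start, query
FROM pg_stat_activity
WHERE query ILIKE '%vacuum%'
  AND pid <> pg_backend_pid();

-- PostgreSQL 9.6 or greater: progress of each running VACUUM.
SELECT pid, relid::regclass AS table_name, phase,
       heap_blks_scanned, heap_blks_total
FROM pg_stat_progress_vacuum;
```

Comparing heap_blks_scanned with heap_blks_total gives a rough idea of how far the heap scan has progressed.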
test=# SELECT oid::regclass::text AS table,
age(relfrozenxid) AS xid_age,
mxid_age(relminmxid) AS mxid_age,
least(
(SELECT setting::int FROM pg_settings WHERE name = 'autovacuum_freeze_max_age') - age(relfrozenxid),
(SELECT setting::int FROM pg_settings WHERE name = 'autovacuum_multixact_freeze_max_age') - mxid_age(relminmxid)
) AS tx_before_wraparound_vacuum,
pg_size_pretty(pg_total_relation_size(oid)) AS size,
pg_stat_get_last_autovacuum_time(oid) AS last_autovacuum
FROM pg_class
WHERE relfrozenxid != 0 AND oid > 16384
ORDER BY tx_before_wraparound_vacuum;
Database Maintenance
Maintenance Tools
Optimizer Statistics
Demo - Optimizer Statistics
Example - Updating Statistics
Data Fragmentation and Bloat
Routine Vacuuming
Vacuuming Commands
Vacuum and Vacuum Full
Demo - Vacuum Command
Vacuumdb Utility
Autovacuuming
Autovacuuming Parameters
Per-Table Thresholds
Routine Reindexing
When to Reindex
Demo - Reindexing

More Related Content

What's hot (19)

ODP
Postgresql Federation
Jim Mlodgenski
 
PDF
MongoDB Database Replication
Mehdi Valikhani
 
PPT
Oracle Tracing
Merin Mathew
 
PDF
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Ontico
 
PPTX
PostgreSQL- An Introduction
Smita Prasad
 
PDF
Postgres 12 Cluster Database operations.
Vijay Kumar N
 
PDF
Full Text Search In PostgreSQL
Karwin Software Solutions LLC
 
PDF
Postgresql search demystified
javier ramirez
 
PPTX
Comparing SAS Files
Laura A Schild
 
DOCX
Backup and Recovery
Anar Godjaev
 
PDF
Top 10 Mistakes When Migrating From Oracle to PostgreSQL
Jim Mlodgenski
 
PDF
Unix commands in etl testing
Garuda Trainings
 
ODP
Pro PostgreSQL, OSCon 2008
Robert Treat
 
DOCX
Exadata - BULK DATA LOAD Testing on Database Machine
Monowar Mukul
 
PDF
hadoop
longhao
 
PDF
Introduction to scoop and its functions
Rupak Roy
 
PDF
Import and Export Big Data using R Studio
Rupak Roy
 
PDF
Test Dml With Nologging
N/A
 
PPTX
Read, store and create xml and json
Kim Berg Hansen
 
Postgresql Federation
Jim Mlodgenski
 
MongoDB Database Replication
Mehdi Valikhani
 
Oracle Tracing
Merin Mathew
 
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Ontico
 
PostgreSQL- An Introduction
Smita Prasad
 
Postgres 12 Cluster Database operations.
Vijay Kumar N
 
Full Text Search In PostgreSQL
Karwin Software Solutions LLC
 
Postgresql search demystified
javier ramirez
 
Comparing SAS Files
Laura A Schild
 
Backup and Recovery
Anar Godjaev
 
Top 10 Mistakes When Migrating From Oracle to PostgreSQL
Jim Mlodgenski
 
Unix commands in etl testing
Garuda Trainings
 
Pro PostgreSQL, OSCon 2008
Robert Treat
 
Exadata - BULK DATA LOAD Testing on Database Machine
Monowar Mukul
 
hadoop
longhao
 
Introduction to scoop and its functions
Rupak Roy
 
Import and Export Big Data using R Studio
Rupak Roy
 
Test Dml With Nologging
N/A
 
Read, store and create xml and json
Kim Berg Hansen
 

Similar to Postgresql Database Administration- Day4 (20)

PDF
Notes for SQLite3 Usage
William Lee
 
PDF
ETL Patterns with Postgres
Martin Loetzsch
 
PDF
pg_proctab: Accessing System Stats in PostgreSQL
Command Prompt., Inc
 
PDF
pg_proctab: Accessing System Stats in PostgreSQL
Mark Wong
 
PDF
Introduction to source{d} Engine and source{d} Lookout
source{d}
 
PDF
Parquet performance tuning: the missing guide
Ryan Blue
 
PDF
PerlApp2Postgresql (2)
Jerome Eteve
 
PPTX
Know your SQL Server - DMVs
Comunidade NetPonto
 
PDF
When to NoSQL and when to know SQL
Simon Elliston Ball
 
PDF
Aplicações 10x a 100x mais rápida com o postgre sql
Fabio Telles Rodriguez
 
PDF
Advanced pg_stat_statements: Filtering, Regression Testing & more
Lukas Fittl
 
PDF
2018 db-rainer schuettengruber-beating-oracles_optimizer_at_its_own_game-pres...
Rainer Schuettengruber
 
PDF
Introduction to Spark Datasets - Functional and relational together at last
Holden Karau
 
PDF
Explain this!
Fabio Telles Rodriguez
 
PDF
Dok Talks #115 - What More Can I Learn From My OpenTelemetry Traces?
DoKC
 
PPTX
AWS Hadoop and PIG and overview
Dan Morrill
 
PDF
Sql for dbaspresentation
oracle documents
 
PDF
Introducing Apache Spark's Data Frames and Dataset APIs workshop series
Holden Karau
 
PDF
Easing the Complex with SPBench framework
adriano1mg
 
PDF
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
NoSQLmatters
 
Notes for SQLite3 Usage
William Lee
 
ETL Patterns with Postgres
Martin Loetzsch
 
pg_proctab: Accessing System Stats in PostgreSQL
Command Prompt., Inc
 
pg_proctab: Accessing System Stats in PostgreSQL
Mark Wong
 
Introduction to source{d} Engine and source{d} Lookout
source{d}
 
Parquet performance tuning: the missing guide
Ryan Blue
 
PerlApp2Postgresql (2)
Jerome Eteve
 
Know your SQL Server - DMVs
Comunidade NetPonto
 
When to NoSQL and when to know SQL
Simon Elliston Ball
 
Aplicações 10x a 100x mais rápida com o postgre sql
Fabio Telles Rodriguez
 
Advanced pg_stat_statements: Filtering, Regression Testing & more
Lukas Fittl
 
2018 db-rainer schuettengruber-beating-oracles_optimizer_at_its_own_game-pres...
Rainer Schuettengruber
 
Introduction to Spark Datasets - Functional and relational together at last
Holden Karau
 
Postgresql Database Administration- Day4

  • 2. Database Maintenance & Performance and Concurrency Day-4
      • PostgreSQL Tuning and Performance
      • Find and Tune Slow Running Queries
      • Collecting regular statistics from pg_stat* views
      • Finding out what makes SQL slow
      • Speeding up queries without rewriting them
      • Discovering why a query is not using an index
      • Forcing a query to use an index
      • EXPLAIN and SQL Execution
      • Workload Analysis
  • 3. database_info.sql

        SELECT db.datname,
               au.rolname as datdba,
               pg_encoding_to_char(db.encoding) as encoding,
               db.datallowconn,
               db.datconnlimit,
               db.datfrozenxid,
               tb.spcname as tblspc,
               -- db.datconfig,
               db.datacl
          FROM pg_database db
          JOIN pg_authid au ON au.oid = db.datdba
          JOIN pg_tablespace tb ON tb.oid = db.dattablespace
         ORDER BY 1;

    database_sizes.sql

        SELECT datname,
               pg_size_pretty(pg_database_size(datname)) as size_pretty,
               pg_database_size(datname) as size,
               (SELECT pg_size_pretty(SUM(pg_database_size(datname))::bigint)
                  FROM pg_database) AS total,
               ((pg_database_size(datname) / (SELECT SUM(pg_database_size(datname))
                                                FROM pg_database)) * 100)::numeric(6,3) AS pct
          FROM pg_database
         ORDER BY datname;

    blocked_transactions.sql

        /* Requires PostgreSQL 9.2 or greater. Relies on the waiting column,
           which was removed from pg_stat_activity in 9.6. */
        SELECT w.query as waiting_query,
               w.pid as w_pid,
               w.usename as w_user,
               l.query as locking_query,
               l.pid as l_pid,
               l.usename as l_user,
               t.schemaname || '.' || t.relname as tablename
          FROM pg_stat_activity w
          JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
          JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
          JOIN pg_stat_activity l ON (l2.pid = l.pid)
          JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
         WHERE w.waiting;

    cache_hit_ratio.sql

        SELECT pg_stat_database.datname,
               pg_stat_database.blks_read,
               pg_stat_database.blks_hit,
               round((pg_stat_database.blks_hit::double precision
                      / (pg_stat_database.blks_read + pg_stat_database.blks_hit + 1)::double precision
                      * 100::double precision)::numeric, 2) AS cachehitratio
          FROM pg_stat_database
         WHERE pg_stat_database.datname !~ '^(template(0|1)|postgres)$'::text
         ORDER BY round((pg_stat_database.blks_hit::double precision
                         / (pg_stat_database.blks_read + pg_stat_database.blks_hit + 1)::double precision
                         * 100::double precision)::numeric, 2) DESC;

    connection_counts.sql

        SELECT COUNT(*) FROM pg_stat_activity;

        SELECT usename, count(*)
          FROM pg_stat_activity
         GROUP BY 1
         ORDER BY 1;

        SELECT datname, usename, count(*)
          FROM pg_stat_activity
         GROUP BY 1, 2
         ORDER BY 1, 2;

        SELECT usename, datname, count(*)
          FROM pg_stat_activity
         GROUP BY 1, 2
         ORDER BY 1, 2;

    current_locks.sql

        /* Uses procpid, so this form is for PostgreSQL 9.1 or earlier
           (the column was renamed to pid in 9.2). */
        SELECT database, relation, n.nspname, c.relname, pid, a.usename,
               locktype, mode, granted, tuple
          FROM pg_locks l
          JOIN pg_class c ON (c.oid = l.relation)
          JOIN pg_namespace n ON (n.oid = c.relnamespace)
          JOIN pg_stat_activity a ON (a.procpid = l.pid)
         ORDER BY database, relation, pid;
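The arithmetic inside cache_hit_ratio.sql can be checked by hand. The following is a minimal Python sketch of that formula only, not part of the original deck; the function name and sample counter values are made up for illustration. It shows why the query adds 1 to the denominator: a database with no block activity yet yields 0 instead of a division-by-zero error.

```python
def cache_hit_ratio(blks_hit: int, blks_read: int) -> float:
    """Mirror the expression in cache_hit_ratio.sql.

    blks_hit  : blocks found in shared buffers (pg_stat_database.blks_hit)
    blks_read : blocks requested from outside shared buffers (blks_read)
    The +1 keeps the denominator non-zero for an idle database.
    """
    return round(blks_hit / (blks_read + blks_hit + 1) * 100, 2)


# Hypothetical counters, chosen only to exercise the formula.
samples = {
    "idle_db": (0, 0),
    "warm_db": (9900, 99),
    "cold_db": (100, 899),
}

for name, (hit, read) in samples.items():
    print(f"{name}: {cache_hit_ratio(hit, read)} %")
```

Because of the +1, the result is always slightly below the true ratio, which only matters for databases with very few block accesses.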
  • 4. current_queries.sql

        /* The three queries below use procpid and current_query, so they are
           for PostgreSQL 9.1 or earlier. */
        SELECT a.datname,
               a.procpid as pid,
               CASE WHEN a.client_addr IS NULL
                    THEN 'local'
                    ELSE a.client_addr::text
                END as client_addr,
               a.usename as user,
               a.waiting,
               l.procpid as blocking_pid,
               l.usename as blocking_user,
               a.current_query,
               a.query_start,
               current_timestamp - a.query_start as duration
          FROM pg_stat_activity a
          LEFT JOIN pg_locks l1 ON (a.procpid = l1.pid)
          LEFT JOIN pg_locks l2 on (l1.relation = l2.relation)
          LEFT JOIN pg_stat_activity l ON (l2.pid = l.procpid)
          LEFT JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
         WHERE pg_backend_pid() <> a.procpid
         ORDER BY a.datname, a.query_start;

        SELECT w.current_query as waiting_query,
               w.procpid as w_pid,
               w.usename as w_user,
               l.current_query as locking_query,
               l.procpid as l_pid,
               l.usename as l_user,
               t.schemaname || '.' || t.relname as tablename
          FROM pg_stat_activity w
          JOIN pg_locks l1 ON (w.procpid = l1.pid and not l1.granted)
          JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
          JOIN pg_stat_activity l ON (l2.pid = l.procpid)
          JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
         WHERE w.waiting;

        SELECT datname,
               procpid as pid,
               client_addr,
               usename as user,
               current_query,
               CASE WHEN waiting = TRUE
                    THEN 'BLOCKED'
                    ELSE 'no'
                END as waiting,
               query_start,
               current_timestamp - query_start as duration
          FROM pg_stat_activity
         WHERE pg_backend_pid() <> procpid
         ORDER BY datname, query_start;

    current_queries_blocked.sql

        /* Use for PostgreSQL 9.2 or greater */
        SELECT c.datname,
               c.pid as pid,
               c.client_addr,
               c.usename as user,
               c.query,
               CASE WHEN c.waiting = TRUE
                    THEN 'BLOCKED'
                    ELSE 'no'
                END as waiting,
               l.pid as blocked_by,
               current_timestamp - c.query_start as duration
          FROM pg_stat_activity c
          LEFT JOIN pg_locks l1 ON (c.pid = l1.pid and not l1.granted)
          LEFT JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
          LEFT JOIN pg_stat_activity l ON (l2.pid = l.pid)
          LEFT JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
         WHERE pg_backend_pid() <> c.pid
         ORDER BY datname, query_start;

    most_active_tables.sql

        SELECT schemaname,
               relname,
               idx_tup_fetch + seq_tup_read as TotalReads
          FROM pg_stat_all_tables
         WHERE idx_tup_fetch + seq_tup_read != 0
           AND schemaname NOT IN ('pg_catalog', 'pg_toast')
         ORDER BY TotalReads desc
         LIMIT 10;

    pg_runtime.sql

        SELECT pg_postmaster_start_time() as pg_start,
               current_timestamp - pg_postmaster_start_time() as runtime;

    table_row_counts.sql

        SELECT s.nspname,
               c.relname as table,
               c.reltuples::int4 as rows
          FROM pg_catalog.pg_class c
          JOIN pg_catalog.pg_namespace s ON (c.relnamespace = s.oid)
         WHERE relkind = 'r'
           AND c.reltuples::int4 > 0
         ORDER BY rows DESC;
  • 5. pg_stat_all_tables.sql

        SELECT n.nspname,
               s.relname,
               c.reltuples::bigint,
               -- n_live_tup,
               n_tup_ins,
               n_tup_upd,
               n_tup_del,
               date_trunc('second', last_vacuum) as last_vacuum,
               date_trunc('second', last_autovacuum) as last_autovacuum,
               date_trunc('second', last_analyze) as last_analyze,
               date_trunc('second', last_autoanalyze) as last_autoanalyze,
               round(current_setting('autovacuum_vacuum_threshold')::integer
                     + current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples) AS av_threshold
               /* ,CASE WHEN reltuples > 0
                        THEN round(100.0 * n_dead_tup / (reltuples))
                        ELSE 0
                    END AS pct_dead,
                  CASE WHEN n_dead_tup > round(current_setting('autovacuum_vacuum_threshold')::integer
                                               + current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
                       THEN 'VACUUM'
                       ELSE 'ok'
                   END AS "av_needed" */
          FROM pg_stat_all_tables s
          JOIN pg_class c ON c.oid = s.relid
          JOIN pg_namespace n ON (n.oid = c.relnamespace)
         WHERE s.relname NOT LIKE 'pg_%'
           AND s.relname NOT LIKE 'sql_%'
           -- AND s.relname LIKE '%TBL%'
         ORDER by 1, 2;

    pg_stat_all_indexes.sql

        SELECT n.nspname as schema,
               i.relname as table,
               i.indexrelname as index,
               i.idx_scan,
               i.idx_tup_read,
               i.idx_tup_fetch,
               CASE WHEN idx.indisprimary THEN 'pkey'
                    WHEN idx.indisunique THEN 'uidx'
                    ELSE 'idx'
                END AS type,
               CASE WHEN idx.indisvalid THEN 'valid'
                    ELSE 'INVALID'
                END as statusi,
               pg_relation_size(quote_ident(n.nspname) || '.' || quote_ident(i.relname)) as size_in_bytes,
               pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' || quote_ident(i.relname))) as size
          FROM pg_stat_all_indexes i
          JOIN pg_class c ON (c.oid = i.relid)
          JOIN pg_namespace n ON (n.oid = c.relnamespace)
          JOIN pg_index idx ON (idx.indexrelid = i.indexrelid)
         WHERE i.relname LIKE '%%'
           AND n.nspname NOT LIKE 'pg_%'
           -- AND idx.indisunique = TRUE
           -- AND NOT idx.indisprimary
           -- AND i.indexrelname LIKE 'tmp%'
           -- AND idx.indisvalid IS false
           /* AND NOT idx.indisprimary
              AND NOT idx.indisunique
              AND idx_scan = 0 */
         ORDER BY 1, 2, 3;
  • 8. Standard Statistics Views
      • pg_stat_activity: One row per server process, showing information related to the current activity of that process, such as state and current query.
      • pg_stat_archiver: One row only, showing statistics about the WAL archiver process's activity.
      • pg_stat_bgwriter: One row only, showing statistics about the background writer process's activity.
      • pg_stat_database: One row per database, showing database-wide statistics.
      • pg_stat_all_tables: One row for each table in the current database, showing statistics about accesses to that specific table.
      • pg_stat_sys_tables: Same as pg_stat_all_tables, except that only system tables are shown.
      • pg_stat_user_tables: Same as pg_stat_all_tables, except that only user tables are shown.
      • pg_stat_xact_all_tables: Similar to pg_stat_all_tables, but counts actions taken so far within the current transaction (which are not yet included in pg_stat_all_tables and related views). The columns for numbers of live and dead rows and vacuum and analyze actions are not present in this view.
      • pg_stat_xact_sys_tables: Same as pg_stat_xact_all_tables, except that only system tables are shown.
      • pg_stat_xact_user_tables: Same as pg_stat_xact_all_tables, except that only user tables are shown.
      • pg_statio_user_sequences: Same as pg_statio_all_sequences, except that only user sequences are shown.
      • pg_stat_user_functions: One row for each tracked function, showing statistics about executions of that function.
      • pg_stat_xact_user_functions: Similar to pg_stat_user_functions, but counts only calls during the current transaction (which are not yet included in pg_stat_user_functions).
  • 9. Standard Statistics Views
      • pg_statio_sys_indexes: Same as pg_statio_all_indexes, except that only indexes on system tables are shown.
      • pg_statio_user_indexes: Same as pg_statio_all_indexes, except that only indexes on user tables are shown.
      • pg_statio_all_sequences: One row for each sequence in the current database, showing statistics about I/O on that specific sequence.
      • pg_statio_sys_sequences: Same as pg_statio_all_sequences, except that only system sequences are shown. (Presently, no system sequences are defined, so this view is always empty.)
      • pg_stat_replication: One row per WAL sender process, showing statistics about replication to that sender's connected standby server.
      • pg_stat_database_conflicts: One row per database, showing database-wide statistics about query cancels due to conflict with recovery on standby servers.
      • pg_stat_all_indexes: One row for each index in the current database, showing statistics about accesses to that specific index.
      • pg_stat_sys_indexes: Same as pg_stat_all_indexes, except that only indexes on system tables are shown.
      • pg_stat_user_indexes: Same as pg_stat_all_indexes, except that only indexes on user tables are shown.
      • pg_statio_all_tables: One row for each table in the current database, showing statistics about I/O on that specific table.
      • pg_statio_sys_tables: Same as pg_statio_all_tables, except that only system tables are shown.
      • pg_statio_user_tables: Same as pg_statio_all_tables, except that only user tables are shown.
      • pg_statio_all_indexes: One row for each index in the current database, showing statistics about I/O on that specific index.
  • 10. gen_table_compare_counts.sql

        SELECT 'SELECT ''' || c.relname || ''' as table, ' || c.reltuples::int4 || ' as rows, ' ||
               '(SELECT COUNT(*) FROM ' || s.nspname || '."' || c.relname || '") as cnt;'
          FROM pg_catalog.pg_class c
  • 11. What is Vacuum?
    Vacuum does the following:
      1. Gathers table and index statistics
      2. Reorganizes the table
      3. Cleans up dead blocks in tables and indexes
      4. Freezes old row XIDs to prevent XID wraparound
    #1 and #2 are generally required for DBMS management, but #3 and #4 are necessary because of the PostgreSQL MVCC feature.

    VACUUM
      • Restructures pages and reclaims space taken by dead rows (rows that were deleted BEFORE any of the current transactions started)
      • Removes dead rows from indexes and TOAST tables
      • Long-running transactions can hold this cleanup back (including long transactions on a replica if hot_standby_feedback = on)
      • Truncates empty pages at the end of the table if possible
      • Updates the free space map
      • Done to avoid needing VACUUM FULL
    NOT NEEDED:
      • On a replica
      • After TRUNCATE
  • 12. VACUUM FULL
      • Shrinks table size (rewrites all "alive" tuples into a new file as compactly as possible)
      • Can only be launched manually (not by autovacuum)
      • OID of the relation stays the same, relfilenode (on-disk name) changes
    Cons:
      • ACCESS EXCLUSIVE lock (no reading or writing allowed)
      • table size ≤ needed space ≤ table size * 2
      • Indexes are rebuilt as part of the rewrite, which adds to the time and space needed (before PostgreSQL 9.0 a separate REINDEX was required)
      • Takes a long time
    Alternative: pg_repack does allow reads and writes, but needs more space (≥ table size * 2)
  • 15. The following changes make autovacuum run more frequently.
    First, enable logging for the autovacuum process:
        log_autovacuum_min_duration = 0
    Increase the number of workers and check tables more often:
        autovacuum_max_workers = 6
        autovacuum_naptime = 15s
    Lower the vacuum and analyze thresholds so that both trigger sooner:
        autovacuum_vacuum_threshold = 25
        autovacuum_vacuum_scale_factor = 0.1
        autovacuum_analyze_threshold = 10
        autovacuum_analyze_scale_factor = 0.05
    Throttle autovacuum less, so that it finishes sooner:
        autovacuum_vacuum_cost_delay = 10ms
        autovacuum_vacuum_cost_limit = 1000
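To see what the lowered thresholds mean in practice, here is a small Python sketch of the documented trigger rule (base threshold + scale factor * number of rows). It is illustrative only: the function names and the one-million-row table are assumptions, the "slide" values come from the settings above, and the "default" values are PostgreSQL's stock defaults (50 / 0.2 for vacuum, 50 / 0.1 for analyze).

```python
def trigger_point(reltuples: float, base_threshold: int, scale_factor: float) -> float:
    """Number of changed rows after which autovacuum acts on a table."""
    return base_threshold + scale_factor * reltuples


# (base threshold, scale factor) per action.
DEFAULT = {"vacuum": (50, 0.2), "analyze": (50, 0.1)}   # stock PostgreSQL defaults
SLIDE = {"vacuum": (25, 0.1), "analyze": (10, 0.05)}    # values from the slide


def compare(reltuples: float) -> dict:
    """Return {action: (default trigger, tuned trigger)} for one table size."""
    return {
        action: (
            trigger_point(reltuples, *DEFAULT[action]),
            trigger_point(reltuples, *SLIDE[action]),
        )
        for action in ("vacuum", "analyze")
    }


# Hypothetical table with one million rows.
for action, (default, tuned) in compare(1_000_000).items():
    print(f"{action}: default after {default:,.0f} rows, tuned after {tuned:,.0f} rows")
```

On large tables the scale factor dominates the result, so halving it roughly halves the number of changed rows autovacuum waits for; the base threshold matters mainly for small tables.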
  • 16. Script to check the status of AutoVacuum for all Tables

        SELECT schemaname,
               relname,
               n_live_tup,
               n_dead_tup,
               last_autovacuum
          FROM pg_stat_all_tables
         ORDER BY n_dead_tup
                  / (n_live_tup * current_setting('autovacuum_vacuum_scale_factor')::float8
                     + current_setting('autovacuum_vacuum_threshold')::float8) DESC;
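The ORDER BY expression in that script divides each table's dead rows by its autovacuum trigger point, so tables closest to (or past) a vacuum sort first. The Python sketch below reproduces only that ranking logic; the class, function names, and sample tables are hypothetical, and the settings default to PostgreSQL's stock values of 0.2 and 50.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    name: str
    n_live_tup: int
    n_dead_tup: int


def vacuum_pressure(t: TableStats, scale_factor: float = 0.2, threshold: int = 50) -> float:
    """Dead rows relative to the autovacuum trigger point.

    A value of 1.0 or more means the table has crossed the trigger point.
    """
    return t.n_dead_tup / (t.n_live_tup * scale_factor + threshold)


def rank(tables, **settings):
    """Sort tables the way the script's ORDER BY ... DESC does."""
    return sorted(tables, key=lambda t: vacuum_pressure(t, **settings), reverse=True)


# Made-up statistics for three tables.
tables = [
    TableStats("audit_log", n_live_tup=1_000_000, n_dead_tup=1_000),
    TableStats("orders", n_live_tup=10_000, n_dead_tup=4_000),
    TableStats("sessions", n_live_tup=500, n_dead_tup=300),
]

for t in rank(tables):
    print(f"{t.name}: pressure {vacuum_pressure(t):.3f}")
```

Ranking by the ratio rather than by raw n_dead_tup keeps a huge table with proportionally few dead rows from hiding a small table that is already overdue.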
  • 17. Controlling automatic database maintenance
    Autovacuum is enabled by default in PostgreSQL and mostly does a great job of maintaining your PostgreSQL database. We say mostly because it doesn't know everything you do about the database, such as the best time to perform maintenance actions. Let's explore the settings that can be tuned so that you can use vacuums efficiently.
    Exercising control requires some thinking about what you actually want:
      • What are the best times of day to do things?
      • When are system resources more available?
      • Which days are quiet, and which are not?
      • Which tables are critical to the application, and which are not?
    Perform the following steps:
    The first thing to do is make sure that autovacuum is switched on, which is the default. Check that you have the following parameters enabled in your postgresql.conf file:
        autovacuum = on
        track_counts = on
    PostgreSQL controls autovacuum with more than 40 individually tunable parameters that provide a wide range of...
  • 18. Removing issues that cause bloat
    Bloat can be caused by long-running queries or long-running write transactions that execute alongside write-heavy workloads. Resolving that is mostly down to understanding the workloads running on the server.
    Look at the age of the oldest snapshots that are running, like this:

        postgres=# SELECT now() -
                     CASE WHEN backend_xid IS NOT NULL
                          THEN xact_start
                          ELSE query_start
                      END AS age,
                     pid,
                     backend_xid AS xid,
                     backend_xmin AS xmin,
                     state
                   FROM pg_stat_activity
                   WHERE backend_type = 'client backend'
                   ORDER BY 1 DESC;

               age       |  pid  |   xid    |   xmin   |        state
        -----------------+-------+----------+----------+---------------------
         00:00:25.791098 | 27624 |          | 10671262 | active
         00:00:08.018103 | 27591 |          |          | idle in transaction
         00:00:00.002444 | 27630 | 10703641 | 10703639 | active
         00:00:00.001506 | 27631 | 10703642 | 10703640 | active
         00:00:00.000324 | 27632 | 10703643 | 10703641 | active
         00:00:00...
  • 19. Identifying and fixing bloated tables and indexes
    PostgreSQL implements Multiversion Concurrency Control (MVCC), which allows users to read data at the same time as writers make changes. This is an important feature for concurrency in database applications, as it can allow the following:
      • Better performance because of fewer locks
      • Greatly reduced deadlocking
      • Simplified application design and management
    Bloated tables and indexes are a natural consequence of the MVCC design in PostgreSQL. Bloat is caused mainly by updates, as both the old and the new row versions must be retained for a certain period of time. Bloat results in increased disk consumption as well as performance loss: if a table is twice as big as it should be, scanning it takes twice as long. VACUUM is one of the best ways of removing bloat.
    Many users execute VACUUM far too frequently, while at the same time complaining about the cost of doing so. This recipe is all about understanding when you need to run VACUUM by estimating the amount of bloat...
  • 20. Monitoring and tuning a vacuum
    If you're currently waiting for a long-running vacuum (or autovacuum) to finish, go straight to the How to do it... section. If you've just had a long-running vacuum complete, then you may want to think about setting a few parameters.
      • autovacuum_max_workers should always be set to more than 2. Setting it too high may not be very useful, and so you need to be careful.
      • Setting vacuum_cost_delay too high is counterproductive. VACUUM is your friend, not your enemy, so delaying it until it doesn't happen at all just makes things worse.
      • maintenance_work_mem should be set to anything up to 1 GB, according to how much memory you can allocate to this task at this time.
    Let's watch what happens when we run a large VACUUM. Don't run VACUUM FULL, because it runs for a long time while holding an AccessExclusiveLock on the table.
    First, locate which process is running the VACUUM by using the pg_stat_activity view to identify the specific pid (34399 is just an example...

        test=# SELECT oid::regclass::text AS table,
                      age(relfrozenxid) AS xid_age,
                      mxid_age(relminmxid) AS mxid_age,
                      least(
                        (SELECT setting::int FROM pg_settings
                          WHERE name = 'autovacuum_freeze_max_age') - age(relfrozenxid),
                        (SELECT setting::int FROM pg_settings
                          WHERE name = 'autovacuum_multixact_freeze_max_age') - mxid_age(relminmxid)
                      ) AS tx_before_wraparound_vacuum,
                      pg_size_pretty(pg_total_relation_size(oid)) AS size,
                      pg_stat_get_last_autovacuum_time(oid) AS last_autovacuum
                 FROM pg_class
                WHERE relfrozenxid != 0
                  AND oid > 16384
                ORDER BY tx_before_wraparound_vacuum;
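The tx_before_wraparound_vacuum column in that query is the smaller of two headroom values: how far the table's XID age and its multixact age are from the settings that force an anti-wraparound autovacuum. A minimal Python sketch of that calculation follows; the function name and sample ages are illustrative assumptions, and the limits default to PostgreSQL's stock values of 200 million (autovacuum_freeze_max_age) and 400 million (autovacuum_multixact_freeze_max_age).

```python
def tx_before_wraparound_vacuum(
    xid_age: int,
    mxid_age: int,
    freeze_max_age: int = 200_000_000,
    multixact_freeze_max_age: int = 400_000_000,
) -> int:
    """Remaining headroom before a forced anti-wraparound autovacuum.

    Mirrors least(freeze_max_age - age(relfrozenxid),
                  multixact_freeze_max_age - mxid_age(relminmxid)).
    Zero or a negative value means the table is already due.
    """
    return min(freeze_max_age - xid_age, multixact_freeze_max_age - mxid_age)


# Hypothetical tables: (xid_age, mxid_age).
samples = {
    "xid_bound": (150_000_000, 10_000_000),
    "mxid_bound": (10_000_000, 395_000_000),
    "overdue": (210_000_000, 0),
}

for name, (xid_age, mxid_age) in samples.items():
    print(f"{name}: {tx_before_wraparound_vacuum(xid_age, mxid_age):,}")
```

Sorting ascending on this value, as the query does, puts the tables that will be vacuumed for wraparound protection soonest at the top.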
  • 24. Demo - Optimizer Statistics
  • 25. Example - Updating Statistics
  • 30. Demo - Vacuum Command