JSON by example
FOSDEM PostgreSQL Developer Room
January 2016
Stefanie Janine Stölting
@sjstoelting
JSON
JavaScript Object Notation
No need to care about encoding: it is always
Unicode, and most implementations use UTF-8
Used for data exchange in web applications
Currently two standards: RFC 7159 (edited by Tim
Bray, successor to Douglas Crockford's RFC 4627) and ECMA-404
The PostgreSQL implementation follows RFC 7159
JSON Datatypes
JSON
Available since 9.2
BSON
Available as extension on GitHub since 2013
JSONB
Available since 9.4
Binary JSON, stored in a decomposed format
Fully transactional
Up to 1 GB (uses TOAST)
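A minimal sketch of the visible difference between the two built-in types, using only literals (no tables are assumed). `json` keeps the input text as it was written, while `jsonb` stores a parsed form, so whitespace, key order and duplicate keys are not preserved:

```sql
-- The same literal cast to both types
SELECT '{"b": 1, "a": 2, "a": 3}'::json  AS as_json   -- text kept verbatim
     , '{"b": 1, "a": 2, "a": 3}'::jsonb AS as_jsonb  -- {"a": 3, "b": 1}
;
```

For duplicate keys `jsonb` keeps the last value, which matters when documents are produced by concatenating fragments.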
Performance
Tests done by EnterpriseDB; see the article by Marc Linster
JSON Functions
row_to_json({row})
Returns the row as JSON
array_to_json({array})
Returns the array as JSON
jsonb_to_recordset
Returns a recordset from JSONB
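The three functions above can be tried without any table. This is only a sketch: the column names and values are made up for illustration and are not taken from the Chinook data.

```sql
-- row_to_json: one JSON object per row, column aliases become keys
SELECT row_to_json(t) AS track
FROM (VALUES (1, 'Track A')) AS t(track_id, track_name)
;
-- array_to_json: a PostgreSQL array becomes a JSON array
SELECT array_to_json(ARRAY[1, 2, 3]) AS ids
;
-- jsonb_to_recordset: a JSONB array of objects becomes typed rows;
-- the column definition list after AS is required
SELECT *
FROM jsonb_to_recordset(
    '[{"track_id": 1, "track_name": "Track A"}, {"track_id": 2, "track_name": "Track B"}]'::jsonb
) AS x(track_id int, track_name text)
;
```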
JSON Operators
Array element by index
-> {int}
Object field by key
-> {text}
Object field by key, as text
->> {text}
Value at path
#> {text[]}
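A short sketch of these operators on a literal document. The document content is invented for illustration; `#>>` (value at path, as text) is not listed above but is the text-returning counterpart of `#>`.

```sql
WITH doc(data) AS (
    VALUES ('{"artist": "AC/DC", "albums": [{"title": "First"}, {"title": "Second"}]}'::jsonb)
)
SELECT data->'albums'->0 AS first_album          -- array element by index (zero-based), as jsonb
, data->'artist' AS artist_jsonb                 -- object field as jsonb: "AC/DC"
, data->>'artist' AS artist_text                 -- object field as text: AC/DC
, data#>'{albums,1,title}' AS title_jsonb        -- value at path as jsonb: "Second"
, data#>>'{albums,1,title}' AS title_text        -- value at path as text: Second
FROM doc
;
```

The `->` and `#>` variants return `jsonb`, so they can be chained; the `->>` and `#>>` variants return `text` and are the ones to cast, for example `(data->>'id')::int`.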
Index on JSON
Index JSONB content for faster access
GIN index on the whole document
CREATE INDEX idx_1 ON jsonb.actor USING GIN (jsondata);
Even unique B-Tree indexes are possible
CREATE UNIQUE INDEX actor_id_2 ON jsonb.actor ((CAST(jsondata->>'actor_id' AS INTEGER)));
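To try both index types without the `jsonb.actor` table, the following sketch uses a hypothetical temporary table `actor_demo` with two invented documents:

```sql
-- Hypothetical table and data, for illustration only
CREATE TEMPORARY TABLE actor_demo (jsondata jsonb);
INSERT INTO actor_demo VALUES
    ('{"actor_id": 1, "name": "A"}')
  , ('{"actor_id": 2, "name": "B"}')
;
-- GIN index over the whole document
CREATE INDEX actor_demo_gin ON actor_demo USING GIN (jsondata);
-- Unique B-Tree index on an expression extracted from the document
CREATE UNIQUE INDEX actor_demo_id ON actor_demo ((CAST(jsondata->>'actor_id' AS INTEGER)));
-- Containment (@>) is one of the operators a default GIN index on jsonb supports
SELECT jsondata
FROM actor_demo
WHERE jsondata @> '{"name": "B"}'
;
-- The unique index rejects a second document with the same actor_id:
-- INSERT INTO actor_demo VALUES ('{"actor_id": 1, "name": "C"}');  -- unique_violation
```

On a table this small the planner will most likely scan sequentially; the point here is only that both statements are valid and the uniqueness is enforced.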
New JSON functions
PostgreSQL 9.5 new JSONB functions:
jsonb_pretty: Formats JSONB in a human-readable way
jsonb_set: Update or add values
PostgreSQL 9.5 new JSONB operators:
||: Concatenate two JSONB
-: Delete key
Available as an extension for 9.4 at PGXN: jsonbx
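A small sketch of the 9.5 additions on literal values (the documents are invented, PostgreSQL 9.5 or later is assumed):

```sql
SELECT jsonb_set('{"artist": "Metallica", "albums": []}'::jsonb
               , '{artist}'
               , '"NEW Metallica"'::jsonb) AS replaced      -- value at path replaced
, '{"artist": "Metallica"}'::jsonb
  || '{"genre": "Metal"}'::jsonb AS concatenated            -- keys of both objects merged
, '{"artist": "Metallica", "albums": []}'::jsonb
  - 'albums' AS key_deleted                                 -- {"artist": "Metallica"}
;
-- jsonb_pretty returns indented text, one key per line
SELECT jsonb_pretty('{"artist": "Metallica", "albums": []}'::jsonb)
;
```

When both sides of `||` contain the same key, the value from the right-hand side wins, which is why the operator can also be used to overwrite a value.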
Data sources
The Chinook database is available
at chinookdatabase.codeplex.com
Amazon book reviews of 1998 are
available at
examples.citusdata.com/customer_review
Chinook Tables
CTE
Common Table Expressions will be used in examples
● Example:
WITH RECURSIVE t(n) AS (
VALUES (1)
UNION ALL
SELECT n+1 FROM t WHERE n < 100
)
SELECT sum(n), min(n), max(n) FROM t;
● Result: sum = 5050, min = 1, max = 100
Live Examples
Let's see how it works.
Live with Chinook data
-- Step 1: Tracks as JSON with the album identifier
WITH tracks AS
(
SELECT "AlbumId" AS album_id
, "TrackId" AS track_id
, "Name" AS track_name
FROM "Track"
)
SELECT row_to_json(tracks) AS tracks
FROM tracks
;
Live with Chinook data
-- Step 2: Albums including tracks, with the artist identifier
WITH tracks AS
(
SELECT "AlbumId" AS album_id
, "TrackId" AS track_id
, "Name" AS track_name
FROM "Track"
)
, json_tracks AS
(
SELECT row_to_json(tracks) AS tracks
FROM tracks
)
, albums AS
(
SELECT a."ArtistId" AS artist_id
, a."AlbumId" AS album_id
, a."Title" AS album_title
, array_agg(t.tracks) AS album_tracks
FROM "Album" AS a
INNER JOIN json_tracks AS t
ON a."AlbumId" = (t.tracks->>'album_id')::int
GROUP BY a."ArtistId"
, a."AlbumId"
, a."Title"
)
SELECT artist_id
, array_agg(row_to_json(albums)) AS album
FROM albums
GROUP BY artist_id
;
Live with Chinook data
-- Step 3: Return one row per artist with all albums, as a VIEW
CREATE OR REPLACE VIEW v_json_artist_data AS
WITH tracks AS
(
SELECT "AlbumId" AS album_id
, "TrackId" AS track_id
, "Name" AS track_name
, "MediaTypeId" AS media_type_id
, "Milliseconds" AS milliseconds
, "UnitPrice" AS unit_price
FROM "Track"
)
, json_tracks AS
(
SELECT row_to_json(tracks) AS tracks
FROM tracks
)
, albums AS
(
SELECT a."ArtistId" AS artist_id
, a."AlbumId" AS album_id
, a."Title" AS album_title
, array_agg(t.tracks) AS album_tracks
FROM "Album" AS a
INNER JOIN json_tracks AS t
ON a."AlbumId" = (t.tracks->>'album_id')::int
GROUP BY a."ArtistId"
, a."AlbumId"
, a."Title"
)
, json_albums AS
(
SELECT artist_id
, array_agg(row_to_json(albums)) AS album
FROM albums
GROUP BY artist_id
)
, artists AS
(
SELECT a."ArtistId" AS artist_id
, a."Name" AS artist
, jsa.album AS albums
FROM "Artist" AS a
INNER JOIN json_albums AS jsa
ON a."ArtistId" = jsa.artist_id
)
SELECT (row_to_json(artists))::jsonb AS artist_data
FROM artists
;
Live with Chinook data
-- Select data from the view
SELECT *
FROM v_json_artist_data
;
Live with Chinook data
-- SELECT data from that VIEW, filtering on a value inside the JSON
SELECT jsonb_pretty(artist_data)
FROM v_json_artist_data
WHERE artist_data->>'artist' IN ('Miles Davis', 'AC/DC')
;
Live with Chinook data
-- SELECT some data from that VIEW using JSON operators
SELECT artist_data->>'artist' AS artist
, artist_data#>'{albums, 1, album_title}' AS album_title
, jsonb_pretty(artist_data#>'{albums, 1, album_tracks}') AS album_tracks
FROM v_json_artist_data
WHERE artist_data->'albums' @> '[{"album_title":"Miles Ahead"}]'
;
Live with Chinook data
-- Array to records
-- LATERAL joins keep every track paired with its own album
SELECT artist_data->>'artist_id' AS artist_id
, artist_data->>'artist' AS artist
, album->>'album_title' AS album_title
, track->>'track_name' AS song_titles
, (track->>'track_id')::int AS song_id
FROM v_json_artist_data
CROSS JOIN LATERAL jsonb_array_elements(artist_data->'albums') AS album
CROSS JOIN LATERAL jsonb_array_elements(album->'album_tracks') AS track
WHERE artist_data->>'artist' = 'Metallica'
ORDER BY album_title
, song_id
;
Live with Chinook data
-- Convert albums to a recordset
SELECT *
FROM jsonb_to_recordset(
(
SELECT artist_data->'albums'
FROM v_json_artist_data
WHERE (artist_data->>'artist_id')::int = 50
)
) AS x(album_id int, artist_id int, album_title text, album_tracks jsonb)
;
Live with Chinook data
-- Convert the tracks to a recordset
SELECT album_id
, track_id
, track_name
, media_type_id
, milliseconds
, unit_price
FROM jsonb_to_recordset(
(
SELECT artist_data#>'{albums, 1, album_tracks}'
FROM v_json_artist_data
WHERE (artist_data->>'artist_id')::int = 50
)
) AS x(album_id int, track_id int, track_name text, media_type_id int, milliseconds int, unit_price float)
;
Live with Chinook data
-- Create a function which will be used for UPDATE on the view v_json_artist_data
CREATE OR REPLACE FUNCTION trigger_v_json_artist_data_update()
RETURNS trigger AS
$BODY$
DECLARE
-- Data variables
rec RECORD;
-- Error variables
v_state TEXT;
v_msg TEXT;
v_detail TEXT;
v_hint TEXT;
v_context TEXT;
BEGIN
-- Update table Artist
IF (OLD.artist_data->>'artist')::varchar(120) <> (NEW.artist_data->>'artist')::varchar(120) THEN
UPDATE "Artist"
SET "Name" = (NEW.artist_data->>'artist')::varchar(120)
WHERE "ArtistId" = (OLD.artist_data->>'artist_id')::int;
END IF;
-- Update table Album with an UPSERT
-- Update table Track with an UPSERT
RETURN NEW;
EXCEPTION WHEN unique_violation THEN
RAISE NOTICE 'Sorry, but something went wrong while trying to update artist data';
RETURN OLD;
WHEN others THEN
GET STACKED DIAGNOSTICS
v_state = RETURNED_SQLSTATE,
v_msg = MESSAGE_TEXT,
v_detail = PG_EXCEPTION_DETAIL,
v_hint = PG_EXCEPTION_HINT,
v_context = PG_EXCEPTION_CONTEXT;
RAISE NOTICE '%', v_msg;
RETURN OLD;
END;
$BODY$
LANGUAGE plpgsql;
Live with Chinook data
-- The trigger will be fired instead of an UPDATE statement to save data
CREATE TRIGGER v_json_artist_data_instead_update INSTEAD OF UPDATE
ON v_json_artist_data
FOR EACH ROW
EXECUTE PROCEDURE trigger_v_json_artist_data_update()
;
Live with Chinook data
-- Manipulate data with jsonb_set
SELECT artist_data->>'artist_id' AS artist_id
, artist_data->>'artist' AS artist
, jsonb_set(artist_data, '{artist}', '"Whatever we want, it is just text"'::jsonb)->>'artist' AS new_artist
FROM v_json_artist_data
WHERE (artist_data->>'artist_id')::int = 50
;
Live with Chinook data
-- Update a JSONB column with a jsonb_set result
UPDATE v_json_artist_data
SET artist_data= jsonb_set(artist_data, '{artist}', '"NEW Metallica"'::jsonb)
WHERE (artist_data->>'artist_id')::int = 50
;
Live with Chinook data
-- View the changes done by the UPDATE statement
SELECT artist_data->>'artist_id' AS artist_id
, artist_data->>'artist' AS artist
FROM v_json_artist_data
WHERE (artist_data->>'artist_id')::int = 50
;
Live with Chinook data
-- Let's have a look at the explain plans
-- SELECT the data from the view
Live with Chinook data
-- View the changes in the table instead of the JSONB view
-- The result should be the same, only the column names differ
SELECT *
FROM "Artist"
WHERE "ArtistId" = 50
;
Live with Chinook data
-- Let's have a look at the explain plans
-- SELECT the data from table Artist
Live with Chinook data
-- Manipulate data with the concatenating / overwrite operator
SELECT artist_data->>'artist_id' AS artist_id
, artist_data->>'artist' AS artist
, jsonb_set(artist_data, '{artist}', '"Whatever we want, it is just text"'::jsonb)->>'artist' AS new_artist
, (artist_data || '{"artist":"Metallica"}'::jsonb)->>'artist' AS correct_name
FROM v_json_artist_data
WHERE (artist_data->>'artist_id')::int = 50
;
Live with Chinook data
-- Revert the name change of Metallica in a different way: with the concatenation operator
UPDATE v_json_artist_data
SET artist_data = artist_data || '{"artist":"Metallica"}'::jsonb
WHERE (artist_data->>'artist_id')::int = 50
;
Live with Chinook data
-- View the changes done by the UPDATE statement with the replace operator
SELECT artist_data->>'artist_id' AS artist_id
, artist_data->>'artist' AS artist
FROM v_json_artist_data
WHERE (artist_data->>'artist_id')::int = 50
;
Live with Chinook data
-- Remove some data with the - operator
SELECT jsonb_pretty(artist_data) AS complete
, jsonb_pretty(artist_data - 'albums') AS minus_albums
, jsonb_pretty(artist_data) <> jsonb_pretty(artist_data - 'albums') AS is_different
FROM v_json_artist_data
WHERE artist_data->>'artist' IN ('Miles Davis', 'AC/DC')
;
Live Amazon reviews
-- Create a table for JSON data with 1998 Amazon reviews
CREATE TABLE reviews(review_jsonb jsonb);
Live Amazon reviews
-- Import customer reviews from a file
COPY reviews
FROM '/var/tmp/customer_reviews_nested_1998.json'
;
Live Amazon reviews
-- There should be 589,859 records imported into the table
SELECT count(*)
FROM reviews
;
Live Amazon reviews
SELECT jsonb_pretty(review_jsonb)
FROM reviews
LIMIT 1
;
Live Amazon reviews
-- Select data with JSON
SELECT
review_jsonb#>> '{product,title}' AS title
, avg((review_jsonb#>> '{review,rating}')::int) AS average_rating
FROM reviews
WHERE review_jsonb@>'{"product": {"category": "Sheet Music & Scores"}}'
GROUP BY title
ORDER BY average_rating DESC
;
Without an index: 248 ms
Live Amazon reviews
-- Create a GIN index
CREATE INDEX review_review_jsonb ON reviews USING GIN (review_jsonb);
Live Amazon reviews
-- Select data with JSON
SELECT review_jsonb#>> '{product,title}' AS title
, avg((review_jsonb#>> '{review,rating}')::int) AS average_rating
FROM reviews
WHERE review_jsonb@>'{"product": {"category": "Sheet Music & Scores"}}'
GROUP BY title
ORDER BY average_rating DESC
;
The same query as before, with the previously created GIN index: 7 ms
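Timings like these can be reproduced with `EXPLAIN (ANALYZE)`. The sketch below does not use the Amazon data: `reviews_demo` is a hypothetical miniature table filled with generated rows, so the absolute numbers will differ from the ones on the slides and the planner may choose a different plan on such a small table.

```sql
-- Hypothetical miniature of the reviews table (10,000 generated documents)
CREATE TEMPORARY TABLE reviews_demo AS
SELECT jsonb_build_object(
           'product', jsonb_build_object('category', 'Category ' || (n % 100))
         , 'review', jsonb_build_object('rating', n % 5 + 1)
       ) AS review_jsonb
FROM generate_series(1, 10000) AS n
;
CREATE INDEX reviews_demo_gin ON reviews_demo USING GIN (review_jsonb);
ANALYZE reviews_demo;
-- Shows the chosen plan and the measured execution time
EXPLAIN (ANALYZE)
SELECT count(*)
FROM reviews_demo
WHERE review_jsonb @> '{"product": {"category": "Category 7"}}'
;
```

Running the `EXPLAIN (ANALYZE)` statement once before and once after `CREATE INDEX` gives a comparison of the same kind as the one shown above.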
Live Amazon reviews
-- SELECT some statistics from the JSON data
SELECT review_jsonb#>>'{product,category}' AS category
, avg((review_jsonb#>>'{review,rating}')::int) AS average_rating
, count((review_jsonb#>>'{review,rating}')::int) AS count_rating
FROM reviews
GROUP BY category
;
Without an index: 9747 ms
Live Amazon reviews
-- Create a B-Tree index on a JSON expression
CREATE INDEX reviews_product_category ON reviews ((review_jsonb#>>'{product,category}'));
Live Amazon reviews
-- SELECT some statistics from the JSON data
SELECT review_jsonb#>>'{product,category}' AS category
, avg((review_jsonb#>>'{review,rating}')::int) AS average_rating
, count((review_jsonb#>>'{review,rating}')::int) AS count_rating
FROM reviews
GROUP BY category
;
The same query as before, with the previously created B-Tree index: 1605 ms
JSON by example
This document by Stefanie Janine Stölting is covered by the
Creative Commons Attribution 4.0 International License
