Copyright 2018 FUJITSU LIMITED
Road to a Multi-model Database
-- making PostgreSQL the most popular and versatile database
September 20, 2018
Aya Iwata
Fujitsu Limited
Who am I?
AYA IWATA
• Develops FUJITSU Software Enterprise Postgres (a PostgreSQL-based product)
• Supports open source PostgreSQL in various products
• Steering Committee member of PGConf.ASIA 2018
Development in Npgsql Community
• Npgsql: a driver that enables Windows applications to connect to PostgreSQL
• Contributed some Visual Studio integration functions (Npgsql 3.2.8)
• http://www.npgsql.org/
[Diagram: Visual Studio, the Visual Studio plugin, and the Npgsql library in front of PostgreSQL. Company and community activity: enhance functions, follow PostgreSQL, gather developers.]
Evolving PostgreSQL with extensions
• PostgreSQL has many extensions (over 50)
• Developers around the world make up communities
PostgreSQL can meet various needs!
Example needs:
• Easy database operation with a GUI
• Developing Windows applications
• Flood warning alerts using location information
Tools and extensions around PostgreSQL:
• Operation tool (pgAdmin)
• Linux API (JDBC, ODBC)
• Windows API (Npgsql)
• Audit logging (pgaudit)
• GIS support (PostGIS)
• High-speed document search (pg_bigm)
• Execution plan control (pg_hint_plan)
• Online table reorganization (pg_repack)
Agenda
Why is multi-model necessary? (background)
What is a multi-model database?
How should we implement it?
Why is multi-model necessary?
Big Data
Volume, Velocity, Variety
Can PostgreSQL Handle Big Data?

Aspect    Capability                      Support in PostgreSQL
Variety   Key-value model                 hstore type
Variety   Document model                  jsonb type
Volume    Partitioning                    PostgreSQL 10 and later
Volume    Scale-out                       Postgres-XL (fork), Citus (extension)
Velocity  GPU                             PG-Strom (extension)
Velocity  Streaming                       PipelineDB (fork)
Velocity  In-memory columnar              Under development
Velocity  Persistent memory, FPGA, SIMD   N/A
Why Does NoSQL Attract Attention?
• Developer productivity with a flexible data model
  - Can handle various data types as-is (array, list, object, graph, etc.)
  - No need to map to the relational model (eliminates ORM)
• High scalability
  - Can store and process voluminous data
  - Can handle many requests simultaneously
  - Fault tolerance
[Diagram: an application stores arrays, lists, objects, graphs, and so on as voluminous data; mapping to the relational model is not needed.]
Data Models

Data model               Representative DBMSs
Relational               Oracle, MySQL, SQL Server, PostgreSQL
Key-value                Redis, Memcached
Document                 MongoDB, Couchbase, MarkLogic
Graph                    Neo4j
Wide columnar            Cassandra, HBase
RDF                      MarkLogic, Virtuoso, Oracle
Text search              Elasticsearch, Apache Solr
Time series              InfluxDB
Multi-dimensional array  rasdaman, SciDB
Event                    Event Store, NEventStore
Object                   InterSystems Caché
Polyglot Persistence
• Use multiple DBMSs in one system/application
• Popularized by Martin Fowler
[Diagram: data models in an online shopping application. One application sits on RDB, key-value, document, graph, and wide columnar stores that hold Web sessions, shopping carts, user profiles, customers, orders, recommendations, the product catalog, and Web access logs.]
Multiple DBMSs Use
• Leading tech companies use many DBMSs (e.g., Netflix)

Data model     DBMSs
Relational     MySQL, Redshift
Key-value      Memcached, Redis, Hollow (developed by Netflix)
Text search    Elasticsearch
Wide columnar  Cassandra
Time series    Atlas (developed by Netflix)
Event          Druid
Problems (1/2)
• Data silos prevent cross-sectional data analysis
  - Time-consuming and laborious ETL
  - Complex logic in the application (fetch, join, aggregation, sort)
• Data consistency among DBMSs
  - Distributed transactions are not available in all DBMSs
• Infrastructure cost increases due to duplication of data
[Diagram: separate key-value, graph, RDB, and document stores, and so on.]
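To make the "complex logic in the application" problem concrete, here is a minimal Python sketch. It assumes two hypothetical in-memory stand-ins for separate DBMSs (a dict acting as a key-value store of user names, and a list of tuples acting as relational order rows); none of these names or values come from the talk. The only point is that fetch, join, aggregation, and sort end up in application code once the data lives in different stores.

```python
from collections import defaultdict

# Hypothetical stand-ins for two separate DBMSs (illustration only).
kv_store = {                      # key-value store: user id -> display name
    "u1": "Taro",
    "u2": "Hanako",
    "u3": "Jiro",
}
order_rows = [                    # relational store: (order_id, user_id, amount)
    (1, "u1", 1200),
    (2, "u2", 800),
    (3, "u1", 300),
]


def total_spend_by_user(kv, orders):
    """Fetch from both stores, then join, aggregate, and sort in the app."""
    totals = defaultdict(int)
    for _order_id, user_id, amount in orders:       # fetch + aggregate
        totals[user_id] += amount
    joined = [
        (kv[user_id], total)                         # join on user id
        for user_id, total in totals.items()
        if user_id in kv
    ]
    return sorted(joined, key=lambda row: row[1], reverse=True)  # sort


report = total_spend_by_user(kv_store, order_rows)
```

In a single DBMS the same result would typically be one query with a join, GROUP BY, and ORDER BY, which is the convenience the multi-model argument builds on.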
Problems (2/2)
• Operational complexity
  - Product/OSS software management, support/service contracts
  - Infrastructure provisioning (server, storage, network)
  - Deployment, patching, testing, configuration, version control
  - Security: user management, access control, encryption, auditing
  - Monitoring and diagnosis, performance tuning, troubleshooting
  - HA: backup/recovery, local failover, disaster recovery
• Steep learning curve for developers
  - DBMS-specific non-SQL APIs and SQL-like query languages
  - Transaction control, consistency model, application tuning
• Lack of skilled personnel
What is a multi-model database?
Overview
• Support multiple data models in one DBMS
[Diagram: one application uses a single DBMS that holds RDB, graph, key-value, document, and other models. "Very smart!"]
Merits
• Smooth data utilization with less data integration
• Higher developer productivity
• Lower cost for infrastructure and DBAs
"All-in-one" is convenient, just like a smartphone.
Multi-model Database Examples

DBMS                     Supported data models
ArangoDB                 key-value, document, graph
Cosmos DB                key-value, document, graph
Couchbase                key-value, document
DataStax (on Cassandra)  key-value, wide column, graph
MarkLogic                document, text/binary, RDF
OrientDB                 key-value, document, graph, text/binary
Trends of Major DBMSs
• Major RDBMSs are adding data models
• NoSQL DBMSs are also adding data models
Data model support in top 5 popular DBMSs:
DBMS Key-value Document Wide column Graph
Oracle ++ +
MySQL ++ +
SQL Server + +
MongoDB + ++ +
PostgreSQL + +
PostgreSQL as a Multi-model Database
• Why base it on an RDBMS? An RDBMS has:
  - Mature storage engine and transaction management
  - Smart optimizer
  - Wide adoption, which gives more people the chance to use it
• Why base it on PostgreSQL? PostgreSQL has:
  - Extensibility as a data platform
  - Liberal community open to niche data models
How should we implement a multi-model database?
What Is a Data Model?
Data model = Structure + Constraint + Operation

Relational
  Structure:  table, row, column
  Constraint: unique, referential, check, not null, ...
  Operation:  scan, join, restriction, projection, aggregation, set arithmetic, sort, insert/delete/update, ...
Key-value
  Structure:  key, value
  Constraint: unique
  Operation:  get, put
Graph
  Structure:  node, relationship, property, label
  Constraint: unique, node existence
  Operation:  scan, pattern match, join, restriction, projection, aggregation, set arithmetic, sort, add/delete/modify, ...
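The "structure + constraint + operation" decomposition can be sketched in a few lines of Python for the key-value row of the table. This is only an illustration: the class name `KeyValueModel` and the `overwrite` flag are assumptions made for the example, not anything defined in the talk or in PostgreSQL.

```python
class KeyValueModel:
    """Toy key-value data model: structure + constraint + operation."""

    def __init__(self):
        # Structure: a collection of (key, value) pairs.
        self._pairs = {}

    def put(self, key, value, *, overwrite=True):
        # Constraint: keys are unique. With overwrite=False a duplicate
        # key is rejected instead of being replaced.
        if not overwrite and key in self._pairs:
            raise KeyError(f"duplicate key: {key!r}")
        self._pairs[key] = value

    def get(self, key, default=None):
        # Operation: point lookup by key.
        return self._pairs.get(key, default)


store = KeyValueModel()
store.put("session:42", "cart=3")
```

The relational and graph rows differ mainly in how much richer each of the three parts is, which is why they need a planner and executor while the key-value model gets by with `get` and `put`.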
Query Language and API
• Adopt standard and well-known languages/APIs for each data model
  - Developer productivity: leverage existing skills, know-how, and assets
  - Rich information for learning
  - Standards compliance and popularity help the ecosystem
• Examples

Data model  Languages/APIs
Key-value   Redis API, Memcached API
Document    SQL/JSON path (SQL standard), MongoDB API
Graph       Cypher, Gremlin
RDF         SPARQL (W3C standard)
Array       SQL/MDA (Multi-Dimensional Array) (future SQL standard)
Multi-model Approach 1
• Flexible Schema Data (FSD)
  - Leverage the RDBMS's user-defined data types, functions, and indexes
  - Store and access data in a table column through functions in SQL
  - Used for XML, JSON, and geospatial data
Reference: http://cidrdb.org/cidr2015/Papers/CIDR15_Paper5.pdf
[Diagram: an application uses SQL and a NoSQL API against one RDBMS that holds relational data and Flexible Schema Data (FSD), built from user-defined data types, functions, and indexes.]
Multi-model Approach 2
• Independent data model components
  - Query language and API for each data model
  - Data is optionally separated from relational data
  - Used for graph, RDF, time series, event, ...
  - Independence ensures performance for each data model
[Diagram: an application issues SQL, Cypher/Gremlin, and SPARQL queries to separate relational, graph, and RDF components. Each component has its own query processing stages (parser, transformer, planner, executor, or a combined query processor) on top of the storage engines.]
Examples Based on Approach 2
• Graph model: AgensGraph (fork)
  - https://github.com/bitnine-oss/agensgraph
• Time series model: TimescaleDB (extension)
  - https://github.com/timescale/timescaledb
Pluggable Data Model
• Goal: facilitate data model development
• Introduce 3 pluggable objects
  - Query language: generates a parse tree from a query string
  - Data model: generates a query plan from the parse tree and runs it
  - Region: a combination of a query language and a data model
[Diagram: graph, RDF, and relational regions, each pairing a query language with a data model; a data model is provided as an extension.]
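One way to picture the three pluggable objects is a small registry, sketched here in Python. Everything in it is hypothetical: the `Region` dataclass, `register_region`, the toy `GET <key>` language, and the `kv_toy` region are invented for illustration, and the Python `in_region` only mirrors the name of the function proposed in the talk. It is not how a PostgreSQL extension would actually be written.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

ParseTree = Any
Rows = List[tuple]


@dataclass(frozen=True)
class Region:
    """A region pairs a query language with a data model."""
    parse: Callable[[str], ParseTree]           # query language: string -> parse tree
    plan_and_run: Callable[[ParseTree], Rows]   # data model: parse tree -> rows


REGIONS: Dict[str, Region] = {}


def register_region(name: str, region: Region) -> None:
    """Plug a new region in under a name."""
    REGIONS[name] = region


def in_region(name: str, query: str) -> Rows:
    """Dispatch a query string to the named region and return its rows."""
    region = REGIONS[name]
    return region.plan_and_run(region.parse(query))


# Toy key-value region: the "language" is just `GET <key>`.
_kv_data = {"a": 1, "b": 2}


def kv_parse(query: str) -> ParseTree:
    verb, key = query.split()
    if verb.upper() != "GET":
        raise ValueError(f"unsupported verb: {verb}")
    return ("get", key)


def kv_plan_and_run(tree: ParseTree) -> Rows:
    _op, key = tree
    return [(key, _kv_data[key])] if key in _kv_data else []


register_region("kv_toy", Region(kv_parse, kv_plan_and_run))
```

With this shape, adding a data model means registering one more `Region`; the dispatcher itself does not change.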
Multi-model Query
• Mix queries for multiple data models in one query string
• Execute a query in a specified region:
    in_region(region_name, query string)
• Convert data across regions:
    cast_region(source data, dest region name, dest container, dest schema)

    -- Among Chinese restaurants in Tokyo,
    -- list up to 5 top ones liked by friends and friends of friends
    SELECT r.name, g.num_likers
    FROM restaurant r,
         cast_region(
           in_region('graph_cypher',
             'MATCH (:Person {name:"Taro"})-[:IS_FRIEND_OF*1..2]-(friend),
                    (friend)-[:LIKES]->(restaurant:Restaurant)
              RETURN restaurant.name, count(*)'),
           'relational', 'g', '(name text, num_likers int)')
    WHERE r.name = g.name AND r.city = 'Tokyo' AND r.cuisine = 'chinese'
    ORDER BY g.num_likers DESC LIMIT 5;
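The intent of the mixed query can be approximated in plain Python. The sketch below uses invented sample data and hypothetical helper names (`friends_within`, `top_liked`); it also simplifies the graph semantics by counting each friend reachable within two hops once, whereas a Cypher engine counts one row per matched path. Treat it as a reading aid for the query, not as its definition.

```python
from collections import Counter

# Hypothetical sample data (not from the talk).
friend_edges = [("Taro", "Jiro"), ("Jiro", "Hanako"), ("Hanako", "Ken")]
likes = [
    ("Jiro", "Ryuen"), ("Hanako", "Ryuen"), ("Hanako", "Koraku"),
    ("Ken", "Ryuen"), ("Jiro", "Sushi Ichi"),
]
restaurant_rows = [  # (name, city, cuisine)
    ("Ryuen", "Tokyo", "chinese"),
    ("Koraku", "Tokyo", "chinese"),
    ("Sushi Ichi", "Tokyo", "japanese"),
    ("Shanghai Garden", "Osaka", "chinese"),
]


def friends_within(start, edges, max_hops=2):
    """People reachable from `start` in 1..max_hops undirected hops."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, frontier = {start}, {start}
    for _ in range(max_hops):
        frontier = {n for p in frontier for n in neighbors.get(p, ())} - seen
        seen |= frontier
    return seen - {start}


def top_liked(start, city, cuisine, limit=5):
    friends = friends_within(start, friend_edges)
    # Graph side: count likers per restaurant among those friends.
    num_likers = Counter(r for person, r in likes if person in friends)
    # Relational side: filter restaurants, then join on name.
    joined = [
        (name, num_likers[name])
        for name, c, k in restaurant_rows
        if c == city and k == cuisine and name in num_likers
    ]
    joined.sort(key=lambda row: row[1], reverse=True)
    return joined[:limit]


result = top_liked("Taro", "Tokyo", "chinese")
```

Ken is three hops away from Taro, so his like is ignored, matching the `*1..2` bound in the pattern.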
Mixed-model Query Execution
Multi-model query plan:
    relation:sort
      relation:join
        relation:table/index scan (restaurant)
        graph:pattern match (IS_FRIEND_OF)
          graph:node scan (Person)
Document Model
• PostgreSQL has supported JSON since 2012, but...
• A different SQL/JSON was standardized in SQL:2016
  - Store JSON data in a character/binary column
  - Intuitive functions and the SQL/JSON path language
  - Powerful JSON_TABLE function to map JSON to relational data
• Support for SQL/JSON is being developed in the community

Query in current PostgreSQL:
    SELECT
      jcol ->> 'name' AS name,
      jcol -> 'skills' AS skills
    FROM emp
    WHERE
      jcol @> '{ "projects": [{ "category": "IoT" }] }';

Query in SQL/JSON:
    SELECT
      JSON_VALUE(jcol, '$.name') AS name,
      JSON_QUERY(jcol, '$.skills') AS skills
    FROM emp
    WHERE
      JSON_EXISTS(jcol, '$.projects[*] ? (@.category == "IoT")');
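As a rough illustration of what both queries on this slide compute, the following Python sketch filters JSON documents for employees with at least one project in the "IoT" category and returns their name and skills. The sample rows and the function name `iot_employees` are made up for the example, and the filter assumes every element of `projects` is a JSON object, which is a simplification of the full `@>` containment and SQL/JSON path rules.

```python
import json

# Hypothetical contents of emp.jcol (illustration only).
emp_rows = [
    '{"name": "Taro", "skills": ["SQL", "C"], "projects": [{"category": "IoT"}]}',
    '{"name": "Ken", "skills": ["Java"], "projects": [{"category": "Web"}]}',
    '{"name": "Mio", "skills": [], "projects": []}',
]


def iot_employees(rows):
    """Return (name, skills) for documents with an IoT project."""
    out = []
    for raw in rows:
        doc = json.loads(raw)
        projects = doc.get("projects", [])
        if any(p.get("category") == "IoT" for p in projects):
            out.append((doc.get("name"), doc.get("skills")))
    return out


matches = iot_employees(emp_rows)
```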
Graph Model
• The key is performance in the storage engine
  - An RDB is slow to traverse a graph because each hop needs an index scan
  - Eliminate index scans by using direct pointers between records
  - Node traversal cost drops from O(n) to O(1)
[Diagram: "Graph in RDBMS" shows a Friend table of name pairs (Jill-Jack, John-Jack, John-Jill, Jack-Jill, Jack-John) reached through an index. "Native graph" shows John, Jack, and Jill nodes linked directly by Friend pointers.]
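The difference between the two layouts can be sketched in Python using the Friend pairs from the diagram. This is an assumption-laden toy: the sorted list with binary search stands in for a B-tree index (so a lookup here costs O(log n) rather than a constant), and the `Node` class with a `friends` list stands in for direct record pointers. Function and class names are invented for the example.

```python
from bisect import bisect_left

# Edge table as an RDBMS might hold it, kept sorted to act as an index
# on the source column.
friend_table = sorted([
    ("Jill", "Jack"), ("John", "Jack"), ("John", "Jill"),
    ("Jack", "Jill"), ("Jack", "John"),
])


def neighbors_via_index(name):
    """Find a node's friends by binary search over the sorted edge table."""
    i = bisect_left(friend_table, (name,))
    out = []
    while i < len(friend_table) and friend_table[i][0] == name:
        out.append(friend_table[i][1])
        i += 1
    return out


class Node:
    """Native-graph style record holding direct references to neighbors."""

    def __init__(self, name):
        self.name = name
        self.friends = []


def build_native_graph(edges):
    nodes = {}
    for src, dst in edges:
        a = nodes.setdefault(src, Node(src))
        b = nodes.setdefault(dst, Node(dst))
        a.friends.append(b)
    return nodes


def neighbors_via_pointers(node):
    """Follow the stored references; no search is needed per hop."""
    return [friend.name for friend in node.friends]


graph = build_native_graph(friend_table)
```

Once a traversal holds a `Node`, each further hop is a pointer follow, while the table layout repeats the index search at every hop.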
Key-value Model
• PostgreSQL has the hstore data type, but:
  - Less performant than expected
  - Unfamiliar API
• Solution: Redis in a background worker
  - Maximal performance by bypassing the SQL processor
  - Familiar, developer-friendly Redis API
[Diagram: the application calls the Redis API (get/put), which bypasses the SQL processor and goes straight to the storage engine and the table on disk.]
Conclusion
• Multi-model is necessary for broader use of PostgreSQL
PostgreSQL 11
PostgreSQL 12
  • Build pluggable data model infrastructure
  • Add/improve popular data models: key-value, SQL/JSON, graph
PostgreSQL 13
  • Add other (niche?) data models
We want your ideas!
• I would like to discuss the implementation of a multi-model database with everyone at this venue
  - Specialists in various databases are gathered here
• Any ideas, wishes, or comments from a user's point of view are welcome
• Contact me by email if talking here is inconvenient (Japanese/English OK)
  iwata.aya@jp.fujitsu.com
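FUJITSU Software Enterprise Postgres 10
[Diagram: feature map of Enterprise Postgres, listed in source order]
Enterprise Postgres
Oracle integration/compatibility
PostgreSQL 10
Application interface
ECOBPG (COBOL preprocessor for embedded SQL)
Legend: PostgreSQL peripheral tools (OSS)
JDBC Driver
psqlODBC
Npgsql
Legend: Enterprise Postgres enhanced features
Reliability, Performance
Partitioning
Enhanced parallel query
Logical replication
Legend: PostgreSQL core and contrib modules
Security
Transparent data encryption
Data masking
Integration/compatibility with existing systems
NCHAR
Syntax compatible with other vendors' databases
Operations management
WebAdmin
pg_statsinfo
pg_rman
Reliability
Database mirroring
Interoperability
Performance
In-memory feature
Full-text search: pg_bigm
External data integration: file_fdw
External data integration: postgres_fdw
Quorum-based replication
Disaster recovery, high-speed backup
High-speed loader
Audit log
orafce
pgAdmin
oracle_fdw
Full-text search: pg_trgm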
Questions?


[db tech showcase Tokyo 2018] #dbts2018 #C25 "Road to a Multi-model Database: Making PostgreSQL the Most Popular and Versatile Database"