Real-time Analytics with Trino and Apache Pinot

Real-time Analytics with
Trino and Apache Pinot
Xiang Fu, Elon Azoulay
Oct 22, 2021

Speaker Intro
Xiang Fu
● Co-founder - StarTree
● Previously: Streaming Platform @ Uber
● PMC & Committer: Apache Pinot
Elon Azoulay
● Software Engineer - Stealth Mode Startup
● Previously: Data @ Facebook

Agenda
● Today’s Compromises: Latency vs. Flexibility
● Not trade-off using Apache Pinot
● Trino Pinot Connector
● Benchmark

Today’s Compromises: Latency vs Flexibility

Flexibility takes time: Join on the Fly
customers
orders
customers.state =
‘California’ AND
customers.gender
= ‘Female’
JOIN customers ON
(customers.customer_id
= orders.customer_id)
Group By
customers.city,
Month(orders.date)
sum(orders.amount)
FILTER JOIN GROUP BY AGGREGATION
- Flexible to do any computation
- High query cost: disk & network
I/O, Data Partitioning, Data Serde

ETL Trade-offs: Pre-joined Table
customers
orders
state =
‘California’
AND gender
= ‘Female’
JOIN customers ON
(customers.customer_id
= orders.customer_id)
Group By city,
Month(orders.
date)
sum(amount)
FILTER
JOIN GROUP BY AGGREGATION
user_orders
_joined
Pre-Joined
Table
- Flexible to explore user dimensions
- Query time is still proportional to the
data scan, not predictable

ETL Trade-offs: Pre-aggregated Table
state =
‘California’
AND gender
= ‘Female’
Group By city,
Month(orders.
date)
sum(sum_amount)
FILTER GROUP BY AGGREGATION
user_orders
_joined
Pre-Joined
Table
user_orders_
aggregated
Pre-Aggregated
Table
SELECT
sum(amount) as
sum_amount, date,
city GROUP BY
date, city
Aggregation
+ GroupBy
- Reduced query runtime workload
- Query time is still proportional to the
multiplication of non-groupBy columns

ETL Trade-offs: Pre-cubed Table
state =
‘California’
AND gender
= ‘Female’
sum_amount,
month, city
FILTER
user_orders
_joined
Pre-Joined
Table
user_orders
_cubed
Pre-Aggregated
Table
SELECT
sum(amount) as
sum_amount, date,
Month(date) as
month, city
GROUP BY CUBE
(date, city,
Month(date))
Cubing
PROJECTION
- Predictable query runtime
- Storage overhead: one raw record
translates to multiple records
- Dimension explosion

Fact Table
Dimension Table Pre-Join Pre-Aggregation Pre-Cube
Latency
Flexibility
low
high
low
high
Not to Trade-off Using Apache Pinot
Throughput high
low

User Facing Applications Business Facing Metrics
Apache Pinot Overview
Anomaly Detection
- Ingestion: Millions of events/sec
- Workload: Thousands of queries/sec
- Performance: Millisecond
- Operation: Thousands Nodes Cluster
ADLS
GCS
Real-Time Offline

BI Visualization Data Products Anomaly Detection
ADLS
GCS
Real-Time Offline
OLTP
Server1 Server2 Server3
Zookeeper
Broker 1 Broker 2
Controller

Secrets Behind Apache Pinot
Scan
Aggregation
Filter
Storage
Bloom
Filter
Inverted Index
Columnar Store
Byte
Encoding
Sorted
Index ❏Common Techniques
❏Pinot
Compression
Star-Tree Pre-aggregation
Star-
Tree
Index
Bit/RLE
Encoding
Per-segment flexible query planning
Range
Index
Text
Index

Apache Pinot - StarTree Index
• Configurable trade-off between latency and space by partial pre-aggregation
technique
• Be able to achieve a hard upper bound for query latencies
No pre-computation
Latency
Storage
Full Pre-Cube
(KV Store)
Partial pre-computation
(Startree Index)
T= 10000
T= 100

Trino Overview
Source: Trino Architecture

BI Visualization Data Products Anomaly Detection Ad hoc Analysis
ADLS
GCS
Trino Pinot Connector

Trino Pinot Connector: Aggregation Pushdown
Chasing the light: Aggregation pushdown
- Issue single Pinot broker request
- Best-effort push down for aggregations like
count/sum/min/max/distinct/approximate_distinct, etc
- 10~100x latency improvement

Passthrough Broker Queries
SELECT CASE WHEN team = ‘Giants’ then ‘BIG’ else ‘SMALL’
END AS size, team, count(*) FROM
pinot.default.”SELECT team
FROM baseball_stats WHERE conference = ‘America East’”
GROUP BY CASE WHEN team = ‘Giants’ then ‘BIG’ else
‘SMALL’ END, 2
Group by Expression Support

Passthrough Broker Queries
SELECT team, count(*) FROM
pinot.default.”SELECT team, player
FROM baseball_stats WHERE conference = ‘America East’”
ORDER BY CASE WHEN team = ‘Giants’ then ‘BIG’ else
‘SMALL’ END, 2
Order by Expression Support

Trino Pinot Connector: Server Query + Pinot Streaming API
Pinot Streaming(gRpc) Connector
- Distributed workload in parallel among Trino workers
- Configurable memory footprint for data pulling from Pinot
- Open the gate of queries requires full table scan or join

Ongoing and Future Work on the Connector
● Data Insertion
○ Push segments to the controller
○ Adds or replaces segments.

Ongoing and Future Work on the Connector
● Pinot Segments Deletion
● Table & Column Creation/Alter/Drop
CREATE TABLE DIMTABLE
(LONG_COL bigint, STRING_COL varchar)
WITH (
PRIMARY_KEY_COLUMNS = ARRAY['long_col'],
OFFLINE_CONFIG = '{
"tableName": "dimtable",
"tableType": "OFFLINE",
"isDimTable": true,
"segmentsConfig": {
"segmentPushType": "REFRESH",
"replication": "1"
},
"tenants": {
"broker": "DefaultTenant",
"server": "DefaultTenant"
},
"tableIndexConfig": {
"loadMode": "MMAP"
}
}’
);

Perf Benchmark
Benchmark Config:
- 1 Pinot Controllers (4 cores/8GB)
- 1 Pinot Brokers (4 cores/8GB)
- 3 Pinot Servers (4 cores/8GB)
- 1 Trino Coordinator (4 cores/8GB)
- 1 Trino Workers (4 cores/8GB)
Data Set:
- 40 Million rows data set
Query:
- Aggregation GroupBy + Predicate
Trino:
- Aggregation Pushdown enable/disable

Perf Benchmark
Query Type: Aggregation Group By + Predicate pushdown

Thank you
- Getting Started: https://tinyurl.com/trinoPinotTutorial
- Run Trino in Kubernetes: https://github.com/trinodb/charts
- StarTree: https://www.startree.ai/
- Apache Pinot: https://pinot.apache.org/
- Pinot on github: https://github.com/apache/pinot
- Pinot slack: https://tinyurl.com/pinotSlackChannel
- Apache Pinot Twitter: https://twitter.com/ApachePinot
- Apache Pinot Meetup: https://www.meetup.com/apache-pinot
- Starburst: https://www.starburst.io/
- Trino: https://trino.io/
- Trino on github: https://github.com/trinodb/trino
- Trino slack: https://trino.io/slack.html

Real-time Analytics with Trino and Apache Pinot

More Related Content

What's hot (20)

Similar to Real-time Analytics with Trino and Apache Pinot (20)

Recently uploaded (20)

Real-time Analytics with Trino and Apache Pinot

Editor's Notes