From Trill to Quill: Pushing the Envelope of Functionality and Scale
Badrish Chandramouli @ DEBS 2016
• Real-time: raise alerts
• Real-time with historical: correlate
• Offline: develop initial monitoring query, back-test
• Progressive: non-temporal analysis
[Figure: Engine + Fabric, with Interactive Query Authoring and a Real-Time Dashboard]
• Performance
• Fabric & language integration
• Query model
Scenarios
• monitor telemetry & raise alerts
• correlate real-time with logs
• develop initial monitoring query
• back-test over historical logs
• offline analysis (BI) with early results
[Figure: Relational Model vs. Tempo-Relational Model. Labels: Q = COUNT(*), 5min Window, snapshots, logical time, Input (T-1, T-2, T-3), Output (1, 2, 3, 2, 1)]
Supports broad & rich analytics scenarios (relational, progressive, time-based)
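The snapshot reading of Q = COUNT(*) over a 5min window can be sketched in plain C#, without Trill. This is a minimal illustration, assuming each input event stays live for one window length starting at its timestamp; the `SnapshotCount` class, the `CountOverTime` method, and the sample timestamps are hypothetical and not part of the talk.

```csharp
using System;
using System.Collections.Generic;

public static class SnapshotCount
{
    // Returns (Time, Count) pairs: the value of COUNT(*) from each change point
    // onward, where every event is live during [t, t + window).
    public static List<(long Time, int Count)> CountOverTime(IEnumerable<long> eventTimes, long window)
    {
        // +1 where an event's lifetime starts, -1 where it ends.
        var deltas = new SortedDictionary<long, int>();
        foreach (var t in eventTimes)
        {
            deltas.TryGetValue(t, out var atStart);
            deltas[t] = atStart + 1;
            deltas.TryGetValue(t + window, out var atEnd);
            deltas[t + window] = atEnd - 1;
        }

        var result = new List<(long Time, int Count)>();
        int count = 0;
        foreach (var kv in deltas)          // keys come back in ascending time order
        {
            if (kv.Value == 0) continue;    // a start and an end cancel: no new snapshot
            count += kv.Value;
            result.Add((kv.Key, count));
        }
        return result;
    }

    public static void Main()
    {
        // Three events (timestamps in minutes) under a 5-minute window.
        foreach (var (time, count) in CountOverTime(new long[] { 0, 2, 4 }, 5))
            Console.WriteLine($"t={time}: count={count}");
    }
}
```

With these sample inputs the count rises 1, 2, 3 as events arrive and falls 2, 1, 0 as their windows expire.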
• Key enabler: performance + fabric & language integration + query model
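// Slide shorthand: 10secs and 5min stand for time spans, not literal C# tokens.
struct ClickEvent { long ClickTime; long User; long AdId; }

// Ingest clicks timestamped by ClickTime, with a latency setting of 10 seconds.
var str = Network.ToStream(e => e.ClickTime, Latency(10secs));

var query =
    str.Where(e => e.User % 100 < 5)     // sample roughly 5% of users by id
       .Select(e => new { e.AdId })      // project down to the ad id
       .GroupApply(e => e.AdId,          // per ad, count clicks in each 5-minute window
                   s => s.Window(5min).Aggregate(w => w.Count()));

query.Subscribe(e => Console.Write(e)); // write results to console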
stream of batches
• More load → larger batches → better throughput
[Figure: operators op1 and op2 connected by a stream of batches]
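class DataBatch {
    long[] SyncTime;       // timestamp column
    ...
    Bitvector BV;          // bitvector marking filtered-out rows
}
class UserData_Gen : DataBatch {
    // payload columns, one array per ClickEvent field
    long[] c_ClickTime;
    long[] c_User;
    long[] c_AdId;
}
[Figure: operators op1 and op2 exchanging batches made of a timestamp column, payload columns, and a bitvector]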
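str.Where(e => e.User % 100 < 5);

[Figure: the Application calls Send(events) into Trill and Receive(results) out of it]

On(Batch b) {
    for i = 0 to b.Size - 1 {
        if !(b.c_User[i] % 100 < 5)
            set b.BV[i]            // mark row i as filtered out; no data is copied
    }
    next-operator.On(b)            // pass the same batch downstream
}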
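The bitvector-based filter above can be tried out in plain C#. The following is a minimal sketch, not Trill's actual batch or operator code: `ClickBatch`, `WhereOperator`, and `LiveRows` are hypothetical names, and `System.Collections.BitArray` stands in for the slide's `Bitvector` type.

```csharp
using System;
using System.Collections;

// Hypothetical columnar batch: one array per field, plus a bitvector of filtered rows.
public sealed class ClickBatch
{
    public readonly int Size;
    public readonly long[] ClickTime, User, AdId;
    public readonly BitArray Filtered;   // true = row i has been filtered out

    public ClickBatch(long[] clickTime, long[] user, long[] adId)
    {
        Size = user.Length;
        ClickTime = clickTime; User = user; AdId = adId;
        Filtered = new BitArray(Size);   // all false: every row starts live
    }
}

public static class WhereOperator
{
    // Applies e => e.User % 100 < 5 by reading only the User column.
    public static void Apply(ClickBatch b)
    {
        for (int i = 0; i < b.Size; i++)
            if (!(b.User[i] % 100 < 5))
                b.Filtered[i] = true;    // mark the row; do not move or copy it
    }

    // Counts the rows that are still visible to downstream operators.
    public static int LiveRows(ClickBatch b)
    {
        int live = 0;
        for (int i = 0; i < b.Size; i++)
            if (!b.Filtered[i]) live++;
        return live;
    }

    public static void Main()
    {
        var batch = new ClickBatch(
            new long[] { 10, 11, 12, 13 },     // ClickTime
            new long[] { 3, 104, 250, 7 },     // User
            new long[] { 1, 2, 1, 2 });        // AdId
        Apply(batch);
        Console.WriteLine(LiveRows(batch));    // users 3 and 104 pass the filter
    }
}
```

The point of the layout is that the filter touches one column and one bit per row, so the other columns are never read or rewritten.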
Aggregate interface:
Func<TState> InitialState();
Func<TState, long, TInput, TState> Accumulate();
Func<TState, long, TInput, TState> Deaccumulate();
Func<TState, TState, TState> Sum();
Func<TState, TState, TState> Difference();
Func<TState, TResult> ComputeResult();

Example (Count):
InitialState: () => 0
Accumulate: (oldCount, timestamp, input) => oldCount + 1
Deaccumulate: (oldCount, timestamp, input) => oldCount - 1
Sum: (leftCount, rightCount) => leftCount + rightCount
Difference: (leftCount, rightCount) => leftCount - rightCount
ComputeResult: count => count
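To see how Accumulate and Deaccumulate cooperate, here is a minimal sketch that drives the Count functions above through a sliding window. It is an illustration under stated assumptions, not Trill's implementation: the `Aggregate` class holds plain delegates, and `AggregateDemo`, `CountAggregate`, and `SlidingWindow` are hypothetical names. Input is assumed to arrive in timestamp order.

```csharp
using System;
using System.Collections.Generic;

// Delegate-based stand-in for the six functions on the slide.
public sealed class Aggregate<TState, TInput, TResult>
{
    public readonly Func<TState> InitialState;
    public readonly Func<TState, long, TInput, TState> Accumulate;
    public readonly Func<TState, long, TInput, TState> Deaccumulate;
    public readonly Func<TState, TState, TState> Sum;
    public readonly Func<TState, TState, TState> Difference;
    public readonly Func<TState, TResult> ComputeResult;

    public Aggregate(
        Func<TState> initialState,
        Func<TState, long, TInput, TState> accumulate,
        Func<TState, long, TInput, TState> deaccumulate,
        Func<TState, TState, TState> sum,
        Func<TState, TState, TState> difference,
        Func<TState, TResult> computeResult)
    {
        InitialState = initialState; Accumulate = accumulate; Deaccumulate = deaccumulate;
        Sum = sum; Difference = difference; ComputeResult = computeResult;
    }
}

public static class AggregateDemo
{
    // The Count aggregate, transcribed from the slide.
    public static readonly Aggregate<long, string, long> CountAggregate =
        new Aggregate<long, string, long>(
            () => 0L,
            (oldCount, timestamp, input) => oldCount + 1,
            (oldCount, timestamp, input) => oldCount - 1,
            (leftCount, rightCount) => leftCount + rightCount,
            (leftCount, rightCount) => leftCount - rightCount,
            count => count);

    // After each arrival, emits the aggregate over events with Time in (now - window, now].
    public static List<TResult> SlidingWindow<TState, TInput, TResult>(
        Aggregate<TState, TInput, TResult> agg,
        IEnumerable<(long Time, TInput Value)> events,
        long window)
    {
        var live = new Queue<(long Time, TInput Value)>();
        var state = agg.InitialState();
        var results = new List<TResult>();
        foreach (var e in events)
        {
            // Retire expired events incrementally instead of recounting the window.
            while (live.Count > 0 && live.Peek().Time <= e.Time - window)
            {
                var old = live.Dequeue();
                state = agg.Deaccumulate(state, old.Time, old.Value);
            }
            live.Enqueue(e);
            state = agg.Accumulate(state, e.Time, e.Value);
            results.Add(agg.ComputeResult(state));
        }
        return results;
    }

    public static void Main()
    {
        var events = new (long Time, string Value)[]
            { (0, "a"), (1, "b"), (2, "c"), (6, "d"), (7, "e") };
        Console.WriteLine(string.Join(", ", SlidingWindow(CountAggregate, events, 5)));
    }
}
```

Sum and Difference are carried along for completeness but are not exercised by this driver; they would matter when merging or subtracting partial states.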
session windows,
http://aka.ms/trill
• Increasing interest in real-time processing over out-of-order streams
[Figure: chart (y-axis 0 to 100), refreshed every second]
Up to 8X faster
use existing high-perf in-order Trill operators unchanged
Low latency vs. completeness: 1 sec, 98%; 1 hour, 100%; ?
[Figure: cloud telemetry log charts (y-axis 0 to 100), refreshed every second; label: 10 seconds]
Impatience framework gives us low latency, high completeness, high throughput, and low memory usage

Configuration                    Latency     Completeness
{1 sec}                          ~ 1 sec     98%
{1 hour}                         ~ 1 hour    100%
{1 sec} + {1 min} + {1 hour}     ~ 1 sec     100%
{1 sec, 1 min, 1 hour}           ~ 1 sec     100%

[Figure: Throughput (million/sec), {1sec, 1min, 1hour} vs. {1sec}+{1min}+{1hour}]
[Figure: Memory usage (MB), log scale, {1sec, 1min, 1hour} vs. {1sec}+{1min}+{1hour}]
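One way to read the latency and completeness columns: an event's delay (arrival time minus event time) decides the smallest reorder latency that can still include it, and completeness at a given latency is the fraction of events whose delay fits within it. The sketch below computes both. It is an assumption-laden illustration, not the Impatience framework's code; `ReorderLatencyDemo`, `BandOf`, `Completeness`, and the sample delays are all hypothetical and do not reproduce the slide's measurements.

```csharp
using System;
using System.Linq;

public static class ReorderLatencyDemo
{
    // Index of the smallest latency bound that covers the delay, or -1 if none does.
    // Bounds are assumed to be sorted in ascending order.
    public static int BandOf(long delaySec, long[] boundsSec)
    {
        for (int i = 0; i < boundsSec.Length; i++)
            if (delaySec <= boundsSec[i]) return i;
        return -1;
    }

    // Fraction of events that have arrived once we wait boundSec past event time.
    public static double Completeness(long[] delaysSec, long boundSec) =>
        delaysSec.Length == 0
            ? 1.0
            : (double)delaysSec.Count(d => d <= boundSec) / delaysSec.Length;

    public static void Main()
    {
        long[] bounds = { 1, 60, 3600 };                        // 1 sec, 1 min, 1 hour
        long[] delays = { 0, 0, 1, 0, 45, 1, 0, 3000, 0, 1 };   // made-up sample delays

        foreach (var bound in bounds)
            Console.WriteLine($"wait {bound}s: completeness {Completeness(delays, bound):P0}");

        // Each event is handled by exactly one band, so early results never wait
        // for the slowest stragglers.
        foreach (var d in delays.Distinct())
            Console.WriteLine($"delay {d}s -> band {BandOf(d, bounds)}");
    }
}
```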
no overlapping lifetimes
data streams and operations
arrays of numerical values
rich space
temporal logic
• Transfer
ShardedStreamable
shards
• querying
• data movement
• keying
Operation       Description
Query           Applies unmodified query on each (keyed) shard
Broadcast       Duplicate each shard’s contents on all shards
Multicast       Copy tuples from each input shard to zero or more specific result shards
ReShard         Load balance across shards
ReDistribute    Move tuples so that same key resides in same result shard
ReKey           Changes key associated with each row in each shard
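A few of these operations can be mimicked on in-memory lists to make the descriptions concrete. This is a minimal sketch, assuming a shard is simply a `List<T>`; the `ShardOps` class and its method signatures are hypothetical and are not Quill's API. Multicast, ReShard, and ReKey are left out to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShardOps
{
    // Query: apply the same unmodified query to every shard independently.
    public static List<TOut> Query<T, TOut>(List<List<T>> shards, Func<List<T>, TOut> query) =>
        shards.Select(query).ToList();

    // Broadcast: every result shard holds the contents of all input shards.
    public static List<List<T>> Broadcast<T>(List<List<T>> shards)
    {
        var all = shards.SelectMany(s => s).ToList();
        return shards.Select(_ => new List<T>(all)).ToList();
    }

    // ReDistribute: move rows so that equal keys land in the same result shard.
    public static List<List<T>> ReDistribute<T, TKey>(
        List<List<T>> shards, Func<T, TKey> key, int resultShards) where TKey : notnull
    {
        var result = Enumerable.Range(0, resultShards).Select(_ => new List<T>()).ToList();
        foreach (var row in shards.SelectMany(s => s))
        {
            // Clear the sign bit so the modulo is never negative.
            int target = (key(row).GetHashCode() & int.MaxValue) % resultShards;
            result[target].Add(row);
        }
        return result;
    }

    public static void Main()
    {
        var shards = new List<List<int>>
        {
            new List<int> { 1, 2, 3 },
            new List<int> { 2, 3, 4 },
        };

        var byKey = ReDistribute(shards, x => x, 2);
        Console.WriteLine(string.Join(" | ", byKey.Select(s => string.Join(",", s))));

        // After ReDistribute, a per-shard query sees every row for the keys it owns.
        var sizes = Query(byKey, s => s.Count);
        Console.WriteLine(string.Join(",", sizes));
    }
}
```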
e => e.Count()
Flat re-distribute
e => e.Count()
e => e.Sum()
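One plausible reading of these labels is a two-stage plan: run `e => e.Count()` locally on each shard, re-distribute the small partial counts by key, then combine them with `e => e.Sum()`. The sketch below follows that reading, which is an assumption rather than something the slide text states; `TwoStageCount`, `LocalCounts`, and `SumPartials` are hypothetical names, and integer keys stand in for grouping keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TwoStageCount
{
    // Stage 1: count per key inside each shard, without moving any raw rows.
    public static List<Dictionary<int, long>> LocalCounts(List<List<int>> shards) =>
        shards.Select(s => s.GroupBy(k => k)
                            .ToDictionary(g => g.Key, g => g.LongCount()))
              .ToList();

    // Stage 2: bring partial counts for the same key together, then sum them.
    public static Dictionary<int, long> SumPartials(List<Dictionary<int, long>> partials) =>
        partials.SelectMany(p => p)
                .GroupBy(kv => kv.Key)
                .ToDictionary(g => g.Key, g => g.Sum(kv => kv.Value));

    public static void Main()
    {
        var shards = new List<List<int>>
        {
            new List<int> { 7, 7, 9 },
            new List<int> { 7, 9, 9, 9 },
        };

        var total = SumPartials(LocalCounts(shards));
        foreach (var kv in total.OrderBy(kv => kv.Key))
            Console.WriteLine($"key {kv.Key}: {kv.Value}");
    }
}
```

Only one partial count per key and shard crosses the network in this plan, instead of every input row.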
[Figure: multi-level plans for e => e.Count() followed by e => e.Sum(), drawn as stages of [ReKey], [ReDist], Union, and AGG]
(l,r) => l.Join(r, …)
Flat re-distribute
Flat broadcast
No data movement
str => str.SlidingWindow(Y).Count()
.Where(c => c > threshold)
(l, r) => l.WhereNotExists(y)
str => str.HoppingWindow(Z).Count()
[Figure: Scan (Quill vs. SparkSQL)]
[Figure: Time taken & scheduling overhead]
[Figure: Grouped agg with 40M groups]
[Figure: Hopping window (Github data)]
http://badrish.net/papers/shrink-TR.pdf
https://www.microsoft.com/en-us/research/people/badrishc/
http://aka.ms/streams/