Realtime Classroom Analytics
Powered By Apache Druid
Karthik Deivasigamani, Chief Architect, Noon - The Social Learning Platform
Agenda
● Who We Are
● Live Online Classroom
● Quality Of Experience
● Why Apache Druid
● Realtime Classroom Monitoring
● Key Lessons
● Q & A
Who Are We?
Noon evolved into a ‘Social Learning’
platform three years ago to craft the most
engaging learning experience.
● Our mission is to radically change the
way people learn.
● Make learning more social and fun.
● 10M+ users from over 5 countries
● 1M+ MAU with 50+ mins per active day
per student
Live Online Classroom
Students spend a significant amount of their
time on Noon learning from their teacher
within the online classrooms.
Classroom Features
● Video, Audio, Chat and Whiteboard
● Breakouts, Raise Hand
● Peak 10K students / session
Live Classroom - Challenges
Audio
Voice is broken
● Teacher’s uplink quality
● Issues with microphone
● Student’s downlink
quality
● ISP policies
Whiteboard
Lag in whiteboard
● Loss of drawing events
due to unstable network
● Heavy CPU usage on the
mobile device
● Software Bug
Quality Of Experience
“Quality of experience is a measure
of the delight or annoyance of a
customer's experiences with a
service.” - Wikipedia
Monitoring The Classroom
Metrics
● Uplink/Downlink Network Quality
● Packet Loss
● Remote/Local Audio Quality
● Mic Status
● Jitter Buffer Delay
● frameFrozenRate
● Uplink/Downlink BitRate
Dimensions
● Country
● Region
● City
● Session
● User
● ISP
● Network Type
Aggregations
● Percentile
● Count
● Average
● Distinct Count
● Standard Deviation
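As a rough illustration of how the aggregations above apply to one metric grouped by one dimension, here is a minimal Python sketch. The sample rows, country codes, user ids and the nearest-rank percentile definition are all assumptions made for the example; they are not taken from the talk, and Druid computes these aggregations with its own exact and approximate aggregators.

```python
import math
import statistics
from collections import defaultdict

# Hypothetical samples: (country, user_id, packet_loss_percent).
samples = [
    ("SA", "u1", 1.0),
    ("SA", "u2", 3.0),
    ("SA", "u1", 5.0),
    ("EG", "u3", 2.0),
    ("EG", "u4", 4.0),
    ("EG", "u4", 6.0),
    ("EG", "u5", 8.0),
]


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def aggregate_by_country(rows):
    """Group the metric by the country dimension and compute the five aggregations."""
    values_by_country = defaultdict(list)
    users_by_country = defaultdict(set)
    for country, user, loss in rows:
        values_by_country[country].append(loss)
        users_by_country[country].add(user)
    return {
        country: {
            "count": len(values),
            "average": statistics.fmean(values),
            "p90": percentile(values, 90),
            "distinct_users": len(users_by_country[country]),
            "stddev": statistics.pstdev(values),  # population standard deviation
        }
        for country, values in values_by_country.items()
    }


result = aggregate_by_country(samples)
```

Swapping the grouping key for Session, ISP or Network Type gives the same aggregations along a different dimension.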
System Characteristics
● Real Time Ingestion
● Scale Horizontally
● High Cardinality Data
● Subsecond Query Latency
● Fast Aggregation
● Zoom In & Zoom Out
● Highly Available
Why Apache Druid
● Real Time Ingestion From Kafka Through Spec Files
● Data & Query Nodes Allow For Horizontal Scaling
● Sketches For High Cardinality Columns
● Low-Latency Querying
● Rich Built In Capabilities For Exact & Approx Aggregation
● Data Rollups
● Fault Tolerance At Multiple Levels
Data Collection - Network & Audio
WebRTC Stats
Sent BitRate
Received BitRate
Audio Packet Loss
Audio Level
Bytes Sent/Received
Audio Frame Freeze Rate
Network Quality
Audio Quality
Data Collection - Whiteboard
Whiteboard Stats
Stroke Difference
Drift Percentage
Ingestion
● All ingestions happen via Kafka in real
time
● Flink Topology
● Split & Format to conform with
ingestion spec
● Rollup Enabled At Ingestion Time
● Conditional transformation
● Looking forward to using Lag Based
AutoScaler.
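To make the effect of ingestion-time rollup concrete, the following Python sketch pre-aggregates raw events into one row per (minute, dimension values). It only imitates the idea in plain Python and is not how Druid implements rollup; the field names (`timestamp`, `session_id`, `packet_loss`) and the one-minute granularity are assumptions for illustration.

```python
def truncate_to_minute(ts_ms):
    """Floor a millisecond timestamp to the start of its minute."""
    return ts_ms - ts_ms % 60_000


def rollup(events, dimensions, sum_metrics):
    """Collapse events that share a minute bucket and dimension values into one row."""
    buckets = {}
    for event in events:
        key = (truncate_to_minute(event["timestamp"]),) + tuple(
            event[d] for d in dimensions
        )
        row = buckets.setdefault(key, {"count": 0, **{m: 0 for m in sum_metrics}})
        row["count"] += 1
        for m in sum_metrics:
            row[m] += event[m]
    return buckets


# Hypothetical raw events: three in the first minute, one in the second.
events = [
    {"timestamp": 0, "session_id": "s1", "packet_loss": 1},
    {"timestamp": 10_000, "session_id": "s1", "packet_loss": 2},
    {"timestamp": 59_999, "session_id": "s1", "packet_loss": 3},
    {"timestamp": 60_000, "session_id": "s1", "packet_loss": 4},
]

rolled = rollup(events, dimensions=["session_id"], sum_metrics=["packet_loss"])
```

Four raw events become two stored rows here. The more events share a bucket, the larger the saving, which is why a high-cardinality dimension such as User reduces the benefit.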
Making Ingestion Easy
● Well defined event (ProtoBuf) schema
serialized as JSON.
● Jsonpath based DSL defining
transformers & ingestion spec.
● Parsing & Transformation based on
the configuration file in a flink
topology.
● Ingestion Spec Auto Generated from
JSON configuration file.
● Automated Deployments Via Jenkins
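One way the configuration-driven approach above could look is sketched below in Python. Everything in it is hypothetical: the `CONFIG` layout, the datasource and topic names, and the `resolve` helper, which handles only plain `$.a.b` paths rather than full JSONPath. The generated dictionary follows the general shape of a Druid Kafka supervisor spec as I understand it; the exact fields should be checked against the documentation for the Druid version in use.

```python
import json

# Hypothetical configuration file content, already parsed from JSON.
CONFIG = {
    "dataSource": "classroom_audio",
    "topic": "classroom-audio-events",
    "timestamp": "$.ts",
    "dimensions": {"country": "$.geo.country", "session_id": "$.session.id"},
    "metrics": {"packet_loss": "$.audio.packetLoss"},
}


def resolve(path, event):
    """Resolve a '$.a.b' style path; only plain dotted keys are supported here."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    value = event
    for key in path[2:].split("."):
        value = value[key]
    return value


def flatten(event, config):
    """Turn a nested event into the flat row the ingestion spec expects."""
    row = {"timestamp": resolve(config["timestamp"], event)}
    for name, path in {**config["dimensions"], **config["metrics"]}.items():
        row[name] = resolve(path, event)
    return row


def build_ingestion_spec(config, bootstrap_servers):
    """Generate a supervisor spec dictionary from the same configuration."""
    metrics_spec = [{"type": "count", "name": "event_count"}]
    for name in config["metrics"]:
        metrics_spec.append(
            {"type": "doubleSum", "name": f"{name}_sum", "fieldName": name}
        )
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": config["dataSource"],
                "timestampSpec": {"column": "timestamp", "format": "millis"},
                "dimensionsSpec": {"dimensions": list(config["dimensions"])},
                "metricsSpec": metrics_spec,
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "MINUTE",
                    "rollup": True,
                },
            },
            "ioConfig": {
                "topic": config["topic"],
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


event = {
    "ts": 1_600_000_000_000,
    "geo": {"country": "SA"},
    "session": {"id": "s1"},
    "audio": {"packetLoss": 2.5},
}
row = flatten(event, CONFIG)
spec = build_ingestion_spec(CONFIG, "kafka:9092")
spec_json = json.dumps(spec, indent=2)
```

Driving both the transformer and the spec from a single file keeps the flattened field names and the spec's dimension and metric names from drifting apart.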
Schema Design
● Always start from your use-cases.
● Identify Dimensions & Metrics
● Aggregations & Approximation (hyperloglog,
quantiles sketches)
● Query Granularity
● Partitions
● Deep Storage
● Data Retention
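To show why approximation makes distinct counts cheap on high-cardinality columns, here is a small Python sketch of a K-minimum-values (KMV) estimator. This is deliberately a simpler technique than HyperLogLog and is not the algorithm used by Druid or the DataSketches library; the class name, the choice of `k` and the user id format are assumptions for illustration.

```python
import hashlib
import heapq


class KMVSketch:
    """Estimate distinct counts by keeping only the k smallest hash values seen."""

    def __init__(self, k=1024):
        self.k = k
        self._heap = []        # max-heap (negated values) of the k smallest hashes
        self._members = set()  # hashes currently held, used to skip duplicates

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64  # roughly uniform in [0, 1)

    def add(self, value):
        h = self._hash(value)
        if h in self._members:
            return
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # Replace the largest retained hash with the new, smaller one.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        if len(self._heap) < self.k:
            return float(len(self._heap))  # fewer than k distinct values: exact
        kth_smallest = -self._heap[0]
        return (self.k - 1) / kth_smallest


sketch = KMVSketch(k=1024)
for repeat in range(2):          # every user appears twice
    for i in range(20_000):
        sketch.add(f"user-{i}")
estimated_users = sketch.estimate()
```

The sketch stores at most 1,024 hashes however many users it sees, trading a few percent of error for bounded memory. The same trade-off motivates sketch-based aggregators for columns such as User.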
Self Serve Dashboard - Zoom Out & Zoom In
Country Level View
Sessions Inside A Country
Session Level View
Students Inside A Session View
Student Session Level View
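The zoom-out and zoom-in views could be backed by one query template whose grouping column and filter change per level. The Python sketch below builds a Druid SQL request body for each level. The datasource name `classroom_audio`, the column names, the one-hour window and the `LEVELS` mapping are assumptions for illustration, and the body is only constructed here, not sent to a cluster.

```python
# Hypothetical mapping from dashboard level to grouping column and parent filter.
LEVELS = {
    "country": {"group_by": "country", "filter_on": None},
    "session": {"group_by": "session_id", "filter_on": "country"},
    "student": {"group_by": "user_id", "filter_on": "session_id"},
}


def build_drilldown_query(level, filter_value=None):
    """Build a Druid SQL request body for one zoom level of the dashboard."""
    spec = LEVELS[level]
    where = ["__time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR"]
    parameters = []
    if spec["filter_on"] is not None:
        if filter_value is None:
            raise ValueError(f"level '{level}' needs a {spec['filter_on']} value")
        where.append(f"{spec['filter_on']} = ?")
        parameters.append({"type": "VARCHAR", "value": filter_value})
    sql = (
        f"SELECT {spec['group_by']}, "
        "APPROX_COUNT_DISTINCT(user_id) AS students, "
        "SUM(packet_loss_sum) / SUM(event_count) AS avg_packet_loss "
        "FROM classroom_audio "
        f"WHERE {' AND '.join(where)} "
        f"GROUP BY {spec['group_by']}"
    )
    return {"query": sql, "parameters": parameters}


country_view = build_drilldown_query("country")
sessions_in_country = build_drilldown_query("session", "SA")
students_in_session = build_drilldown_query("student", "s1")
```

Because the sketch assumes a rolled-up datasource, the average is derived from a summed metric divided by the rolled-up event count rather than from `AVG` over raw rows.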
Our Druid Cluster
Topology
● Master (m5.2xl)
● Data Node (i3.2xl)
○ Tiered
○ 24 slots
● Query Node (m5.2xl)
● External ZK, MySQL, S3 Deep Storage
Monitoring
● Datadog-Druid
● System Resources
● Ingestion Lag
● Number of Segments
● Query Time
● JVM Memory Usage
Numbers
● 15+ dims, 50+ metrics
● 105 M events per day
● 2B rows @ Avg Row Size 1K
● 4k-5k Segments
● p90 latency ~ 850 ms
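For a sense of scale, the figures above can be turned into rough averages. This back-of-the-envelope Python sketch assumes that "1K" means about 1,000 bytes per row and that "4k-5k" is the total segment count; both are my readings of the slide, not statements from the talk.

```python
EVENTS_PER_DAY = 105_000_000
TOTAL_ROWS = 2_000_000_000
AVG_ROW_BYTES = 1_000                      # assumption: "1K" is roughly 1 KB
SEGMENTS_LOW, SEGMENTS_HIGH = 4_000, 5_000  # assumption: total segment count

SECONDS_PER_DAY = 24 * 60 * 60

# Average ingest rate; peak rates during live sessions will be higher.
avg_events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY

# Approximate data volume, in terabytes (10**12 bytes).
approx_total_tb = TOTAL_ROWS * AVG_ROW_BYTES / 10**12

# Average rows per segment at each end of the segment-count range.
rows_per_segment_max = TOTAL_ROWS // SEGMENTS_LOW
rows_per_segment_min = TOTAL_ROWS // SEGMENTS_HIGH

print(f"{avg_events_per_second:.0f} events/s on average")
print(f"about {approx_total_tb:.1f} TB of rows")
print(f"{rows_per_segment_min:,} to {rows_per_segment_max:,} rows per segment")
```

Under these assumptions the cluster averages roughly 1,200 events per second and holds about 2 TB of row data.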
Putting Together
Business Impact
● Quickly Identify Problems
● Validation of fixes put in to improve quality
● Self Serve Tool, reducing burden on
developers
● Improved transparency & trust between
OPS and developers
● Student NPS score improved
Challenges & Key Lessons
● Rollups are your best friend
● Ingestion Time Transformation > Query Time
Transformation
● Approximation - Hyperloglog, Data Sketches
● Late Arrival Of Messages & Compaction
● Query Performance depends on your data model
● Setup takes time to stabilize.
● druid-user group is super helpful!
Questions?
Thank you
Contact: karthik@noonacademy.com
