SlideShare a Scribd company logo
6
Most read
15
Most read
16
Most read
High Concurrency
Architecture
09-2019. TIKI.VN
Contributors
Bùi Anh Dũng
Principal Engineer
Nguyễn Hoàng Bách
Senior Principal
Engineer
Phan Công Huân
Senior Software
Engineer
Lê Minh Nghĩa
Senior Architect
Trần Nguyên Bản
Principal Engineer
Agenda
- Principles
- Pegasus - Highest throughput API
- Arcturus - High concurrency inventory API
- Conclusion
Principles
- Use local memory to handle high concurrency transaction
- Non blocking architecture
- Trade-offs consistency vs eventual consistency
- Reliable replication
- Authority: each service owns its problems.
Pegasus - Highest Throughput API at TIKI
Problems:
- Handle the most of traffic to fetch product information
- Has to handle at least 10k request/s, tp95 < 5ms
Pegasus - Architecture
Pegasus - Architecture
Practises:
- Cache product data in local memory
- Subscribe product change event to
invalid cache
- Non-Blocking http web server
- Compress to reduce payload size
Technology:
- Java
- Guava - In Memory Cache
- Gridgo - Async IO & Event driven
framework
Pegasus - Compression
Two cases:
- Get single product by id
- Get multiple product by a list id
Solutions:
- Using gzip to compress product data to store in
cache. Reduce from 200kb text to 3Kb
- Handle single product request: return
compressed gzip bytes to client directly
- Handle multiple product request: store plain
product data and merge them to build a list of
products, then use Snappy format to compress
data and return to client
- Gzip format compress better than Snappy and
is supported natively by most http client, but
Snappy compress faster and use less CPU
- Output cache can be enabled in hot-deal
situation
Pegasus - Technology
- Java
- Gridgo
- Guava
- Kafka
Pegasus - Benchmark
Benchmark use WRK tools, 4VM,
L4 LB, GCP:
- 90%<6.6ms
- 96k request/s
Production
- 200k request/minutes (all
platforms)
- <2ms
Cache hit: 85%. Increased to 95% with Consistent Hashing With Bounded Load.
Average payload: Single (3KB), Multi (60KB)
Pegasus - Benchmark/Replication
Arcturus - High concurrency Inventory API
Problems:
- Handle inventory transaction when customers place orders
- Handle high concurrency transaction for extreme hot deals with very cheap
and low quantity. Exp: Only ten 1đ iPhone XS, only in ten minutes from 9AM
to 9h10 AM.
- Guarantee eventual consistency of inventory data between many systems
Arcturus - Context Diagram
Arcturus - Architecture
Arcturus - Problems and Solutions
Problems:
- Many customers may place order at the same time, especially for hot skus.
- System has to be deterministic and can be recovered after crashing
Solutions:
- All write requests are published to Kafka with only one partition to guarantee
the ordering of message and recover if system crashes.
- Non Blocking In Memory Cache for both read and write
- All changes are flushed to database asynchronously
- Recovery based on the offset of write command
Arcturus - In Memory Cache Data Structure
Problem:
- There will be a race condition when many
customer place the same skus at the
same time
- Building an in memory cache structure
that can buffer data changes and flush to
database asynchronously
Approach:
- Each record have a key (string, bytes…)
and a value (8 bytes long number).
- A transaction is a set of record updates.
- Transactions must be ordered (e.g. key_1
must +3 before can be -1).
- It can be millions of keys and process upto
10k TPS.
- Using ring buffer data structure to
guarantee the ordering of changes
* The hardest thing is to keep
everything ordered when make it
fast enough.
Arcturus - Old solution
- Multi threading application
- DB transaction, row locking
Pros
- Strong consistency
- Simple implementation
Cons
- Slow
- Deadlock/timeout implicit risk
Arcturus - LMax Architecture
- Non Blocking architecture using single
thread business logic processor
- Use Ring Buffer data structure to
communicate between processors
- Handle million transaction per seconds
Arcturus - Batching marshaller
vBatching: use multi thread marshaller
● Pros
- Fast (400k-600k ops/s*)
- Avoid multi key locking
- Write only most updated value
● Cons
- Inconsistent transaction offset
→ use when you just want to make your code as fast
as possible.
* Macbook pro 2016 15’’ 2.7GHz Core i7, 16G ram. Without I/O, single update per
transaction
*** Test without I/O: [Single ops vbatch] Total time running for 1000000 transactions: 1085 ms; at rate: 921,658.99 ops/s
Arcturus - Batching marshaller
hBatching: use single marshaller thread
● Pros
- Ensure transaction offset
- Ordered db updating
● Cons
- A little bit slower than vBatching.
→ use when you need to re-create application state
after fail.
*** Test without I/O: [Single ops hbatch] Total time running for 1000000 transactions: 1056 ms; at rate: 946,969.70 ops/s
Arcturus - Recovery/Replay
Arcturus - Technology
- Java
- Kafka
- ZeroMQ
- Gridgo
- MySQL
- LMax/Ring Buffer
Conclusion
- Has to trade-off between consistency and eventual consistency
- In Memory is a great way to improve the performance
- Reliable replication is the key to split and scale system
- Non Blocking architecture is a great way to utilize hardware resources
efficiently.

More Related Content

What's hot (20)

PDF
Domain Driven Design và Event Driven Architecture
IT Expert Club
 
PPTX
Grokking Techtalk #37: Software design and refactoring
Grokking VN
 
PPTX
Toi uu hoa he thong 30 trieu nguoi dung
IT Expert Club
 
PDF
Event Driven-Architecture from a Scalability perspective
Jonas Bonér
 
PDF
Grokking Techtalk: Problem solving for sw engineers
9diov
 
PDF
Apache Zookeeper
Nguyen Quang
 
PDF
Introduction to the Disruptor
Trisha Gee
 
PDF
Sapo Microservices Architecture
Khôi Nguyễn Minh
 
PPTX
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking VN
 
PDF
Kinh nghiệm triển khai Microservices tại Sapo.vn
Dotnet Open Group
 
PDF
Bizweb Microservices Architecture
Khôi Nguyễn Minh
 
PPSX
LMAX Disruptor - High Performance Inter-Thread Messaging Library
Sebastian Andrasoni
 
PDF
LMAX Architecture
Stephan Schmidt
 
PPTX
The RabbitMQ Message Broker
Martin Toshev
 
PPSX
Event Sourcing & CQRS, Kafka, Rabbit MQ
Araf Karsh Hamid
 
PPTX
Microservices Part 3 Service Mesh and Kafka
Araf Karsh Hamid
 
PPTX
My sql failover test using orchestrator
YoungHeon (Roy) Kim
 
PPSX
LMAX Disruptor as real-life example
Guy Nir
 
PPTX
Asynchronous processing in big system
Nghia Minh
 
PDF
Introduction to AMQP Messaging with RabbitMQ
Dmitriy Samovskiy
 
Domain Driven Design và Event Driven Architecture
IT Expert Club
 
Grokking Techtalk #37: Software design and refactoring
Grokking VN
 
Toi uu hoa he thong 30 trieu nguoi dung
IT Expert Club
 
Event Driven-Architecture from a Scalability perspective
Jonas Bonér
 
Grokking Techtalk: Problem solving for sw engineers
9diov
 
Apache Zookeeper
Nguyen Quang
 
Introduction to the Disruptor
Trisha Gee
 
Sapo Microservices Architecture
Khôi Nguyễn Minh
 
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking VN
 
Kinh nghiệm triển khai Microservices tại Sapo.vn
Dotnet Open Group
 
Bizweb Microservices Architecture
Khôi Nguyễn Minh
 
LMAX Disruptor - High Performance Inter-Thread Messaging Library
Sebastian Andrasoni
 
LMAX Architecture
Stephan Schmidt
 
The RabbitMQ Message Broker
Martin Toshev
 
Event Sourcing & CQRS, Kafka, Rabbit MQ
Araf Karsh Hamid
 
Microservices Part 3 Service Mesh and Kafka
Araf Karsh Hamid
 
My sql failover test using orchestrator
YoungHeon (Roy) Kim
 
LMAX Disruptor as real-life example
Guy Nir
 
Asynchronous processing in big system
Nghia Minh
 
Introduction to AMQP Messaging with RabbitMQ
Dmitriy Samovskiy
 

Similar to Grokking TechTalk #33: High Concurrency Architecture at TIKI (20)

PPTX
Mykhailo Hryhorash: Архітектура IT-рішень (Частина 1) (UA)
Lviv Startup Club
 
PDF
Tweaking performance on high-load projects
Dmitriy Dumanskiy
 
PDF
Service-Oriented Design and Implement with Rails3
Wen-Tien Chang
 
PPTX
Mykhailo Hryhorash: Архітектура IT-рішень (Частина 1) (UA)
content75
 
PPTX
Black Friday and Cyber Monday- Best Practices for Your E-Commerce Database
Tim Vaillancourt
 
PDF
Big Trouble in Little Networks, new and improved
Stacy Devino
 
PDF
Scalability broad strokes
Gagan Bajpai
 
KEY
High performance network programming on the jvm oscon 2012
Erik Onnen
 
PPTX
Flashback: QCon San Francisco 2012
Sergejus Barinovas
 
PPTX
Skeuomorphs, Databases, and Mobile Performance
Sam Ramji
 
PDF
Architecture Patterns - Open Discussion
Nguyen Tung
 
PPTX
Skeuomorphs, Databases, and Mobile Performance
Apigee | Google Cloud
 
PDF
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
TechWell
 
PDF
Preparing your web services for Android and your Android app for web services...
Droidcon Eastern Europe
 
PDF
Rest on steroids
fds2
 
PPTX
First spring
Didac Montero
 
PPTX
Leapfrogging with legacy
clive boulton
 
KEY
Synchronous Reads Asynchronous Writes RubyConf 2009
pauldix
 
PDF
Scalability Considerations
Navid Malek
 
PDF
Voldemort Nosql
elliando dias
 
Mykhailo Hryhorash: Архітектура IT-рішень (Частина 1) (UA)
Lviv Startup Club
 
Tweaking performance on high-load projects
Dmitriy Dumanskiy
 
Service-Oriented Design and Implement with Rails3
Wen-Tien Chang
 
Mykhailo Hryhorash: Архітектура IT-рішень (Частина 1) (UA)
content75
 
Black Friday and Cyber Monday- Best Practices for Your E-Commerce Database
Tim Vaillancourt
 
Big Trouble in Little Networks, new and improved
Stacy Devino
 
Scalability broad strokes
Gagan Bajpai
 
High performance network programming on the jvm oscon 2012
Erik Onnen
 
Flashback: QCon San Francisco 2012
Sergejus Barinovas
 
Skeuomorphs, Databases, and Mobile Performance
Sam Ramji
 
Architecture Patterns - Open Discussion
Nguyen Tung
 
Skeuomorphs, Databases, and Mobile Performance
Apigee | Google Cloud
 
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
TechWell
 
Preparing your web services for Android and your Android app for web services...
Droidcon Eastern Europe
 
Rest on steroids
fds2
 
First spring
Didac Montero
 
Leapfrogging with legacy
clive boulton
 
Synchronous Reads Asynchronous Writes RubyConf 2009
pauldix
 
Scalability Considerations
Navid Malek
 
Voldemort Nosql
elliando dias
 
Ad

More from Grokking VN (20)

PDF
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking VN
 
PDF
Grokking Techtalk #45: First Principles Thinking
Grokking VN
 
PDF
Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking VN
 
PDF
Grokking Techtalk #43: Payment gateway demystified
Grokking VN
 
PPTX
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking VN
 
PDF
Grokking Techtalk #39: Gossip protocol and applications
Grokking VN
 
PDF
Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking VN
 
PDF
Grokking TechTalk #35: Efficient spellchecking
Grokking VN
 
PDF
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking VN
 
PDF
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking VN
 
PDF
Grokking TechTalk #31: Asynchronous Communications
Grokking VN
 
PDF
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking VN
 
PDF
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking VN
 
PDF
Grokking TechTalk #27: Optimal Binary Search Tree
Grokking VN
 
PDF
Grokking TechTalk #26: Kotlin, Understand the Magic
Grokking VN
 
PDF
Grokking TechTalk #26: Compare ios and android platform
Grokking VN
 
PPTX
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
Grokking VN
 
PDF
Grokking TechTalk #24: Kafka's principles and protocols
Grokking VN
 
PDF
Grokking TechTalk #21: Deep Learning in Computer Vision
Grokking VN
 
PDF
Grokking TechTalk #20: PostgreSQL Internals 101
Grokking VN
 
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking VN
 
Grokking Techtalk #45: First Principles Thinking
Grokking VN
 
Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking VN
 
Grokking Techtalk #43: Payment gateway demystified
Grokking VN
 
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking VN
 
Grokking Techtalk #39: Gossip protocol and applications
Grokking VN
 
Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking VN
 
Grokking TechTalk #35: Efficient spellchecking
Grokking VN
 
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking VN
 
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking VN
 
Grokking TechTalk #31: Asynchronous Communications
Grokking VN
 
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking VN
 
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking VN
 
Grokking TechTalk #27: Optimal Binary Search Tree
Grokking VN
 
Grokking TechTalk #26: Kotlin, Understand the Magic
Grokking VN
 
Grokking TechTalk #26: Compare ios and android platform
Grokking VN
 
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
Grokking VN
 
Grokking TechTalk #24: Kafka's principles and protocols
Grokking VN
 
Grokking TechTalk #21: Deep Learning in Computer Vision
Grokking VN
 
Grokking TechTalk #20: PostgreSQL Internals 101
Grokking VN
 
Ad

Recently uploaded (20)

PPTX
Q2 FY26 Tableau User Group Leader Quarterly Call
lward7
 
PDF
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
PPTX
Future Tech Innovations 2025 – A TechLists Insight
TechLists
 
PDF
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
PDF
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
PDF
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
DOCX
Python coding for beginners !! Start now!#
Rajni Bhardwaj Grover
 
PDF
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
PDF
Kit-Works Team Study_20250627_한달만에만든사내서비스키링(양다윗).pdf
Wonjun Hwang
 
DOCX
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
PDF
Future-Proof or Fall Behind? 10 Tech Trends You Can’t Afford to Ignore in 2025
DIGITALCONFEX
 
PDF
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
PDF
NLJUG Speaker academy 2025 - first session
Bert Jan Schrijver
 
PDF
LOOPS in C Programming Language - Technology
RishabhDwivedi43
 
PDF
Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
PPTX
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 
PDF
“NPU IP Hardware Shaped Through Software and Use-case Analysis,” a Presentati...
Edge AI and Vision Alliance
 
PDF
The 2025 InfraRed Report - Redpoint Ventures
Razin Mustafiz
 
PDF
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
PDF
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
Q2 FY26 Tableau User Group Leader Quarterly Call
lward7
 
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
Future Tech Innovations 2025 – A TechLists Insight
TechLists
 
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
Python coding for beginners !! Start now!#
Rajni Bhardwaj Grover
 
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
Kit-Works Team Study_20250627_한달만에만든사내서비스키링(양다윗).pdf
Wonjun Hwang
 
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
Future-Proof or Fall Behind? 10 Tech Trends You Can’t Afford to Ignore in 2025
DIGITALCONFEX
 
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
NLJUG Speaker academy 2025 - first session
Bert Jan Schrijver
 
LOOPS in C Programming Language - Technology
RishabhDwivedi43
 
Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 
“NPU IP Hardware Shaped Through Software and Use-case Analysis,” a Presentati...
Edge AI and Vision Alliance
 
The 2025 InfraRed Report - Redpoint Ventures
Razin Mustafiz
 
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 

Grokking TechTalk #33: High Concurrency Architecture at TIKI

  • 2. Contributors Bùi Anh Dũng Principal Engineer Nguyễn Hoàng Bách Senior Principal Engineer Phan Công Huân Senior Software Engineer Lê Minh Nghĩa Senior Architect Trần Nguyên Bản Principal Engineer
  • 3. Agenda - Principles - Pegasus - Highest throughput API - Arcturus - High concurrency inventory API - Conclusion
  • 4. Principles - Use local memory to handle high concurrency transaction - Non blocking architecture - Trade-offs consistency vs eventual consistency - Reliable replication - Authority: each service owns its problems.
  • 5. Pegasus - Highest Throughput API at TIKI Problems: - Handle the most of traffic to fetch product information - Has to handle at least 10k request/s, tp95 < 5ms
  • 7. Pegasus - Architecture Practises: - Cache product data in local memory - Subscribe product change event to invalid cache - Non-Blocking http web server - Compress to reduce payload size Technology: - Java - Guava - In Memory Cache - Gridgo - Async IO & Event driven framework
  • 8. Pegasus - Compression Two cases: - Get single product by id - Get multiple product by a list id Solutions: - Using gzip to compress product data to store in cache. Reduce from 200kb text to 3Kb - Handle single product request: return compressed gzip bytes to client directly - Handle multiple product request: store plain product data and merge them to build a list of products, then use Snappy format to compress data and return to client - Gzip format compress better than Snappy and is supported natively by most http client, but Snappy compress faster and use less CPU - Output cache can be enabled in hot-deal situation
  • 9. Pegasus - Technology - Java - Gridgo - Guava - Kafka
  • 10. Pegasus - Benchmark Benchmark use WRK tools, 4VM, L4 LB, GCP: - 90%<6.6ms - 96k request/s Production - 200k request/minutes (all platforms) - <2ms Cache hit: 85%. Increased to 95% with Consistent Hashing With Bounded Load. Average payload: Single (3KB), Multi (60KB)
  • 12. Arcturus - High concurrency Inventory API Problems: - Handle inventory transaction when customers place orders - Handle high concurrency transaction for extreme hot deals with very cheap and low quantity. Exp: Only ten 1đ iPhone XS, only in ten minutes from 9AM to 9h10 AM. - Guarantee eventual consistency of inventory data between many systems
  • 15. Arcturus - Problems and Solutions Problems: - Many customers may place order at the same time, especially for hot skus. - System has to be deterministic and can be recovered after crashing Solutions: - All write requests are published to Kafka with only one partition to guarantee the ordering of message and recover if system crashes. - Non Blocking In Memory Cache for both read and write - All changes are flushed to database asynchronously - Recovery based on the offset of write command
  • 16. Arcturus - In Memory Cache Data Structure Problem: - There will be a race condition when many customer place the same skus at the same time - Building an in memory cache structure that can buffer data changes and flush to database asynchronously Approach: - Each record have a key (string, bytes…) and a value (8 bytes long number). - A transaction is a set of record updates. - Transactions must be ordered (e.g. key_1 must +3 before can be -1). - It can be millions of keys and process upto 10k TPS. - Using ring buffer data structure to guarantee the ordering of changes * The hardest thing is to keep everything ordered when make it fast enough.
  • 17. Arcturus - Old solution - Multi threading application - DB transaction, row locking Pros - Strong consistency - Simple implementation Cons - Slow - Deadlock/timeout implicit risk
  • 18. Arcturus - LMax Architecture - Non Blocking architecture using single thread business logic processor - Use Ring Buffer data structure to communicate between processors - Handle million transaction per seconds
  • 19. Arcturus - Batching marshaller vBatching: use multi thread marshaller ● Pros - Fast (400k-600k ops/s*) - Avoid multi key locking - Write only most updated value ● Cons - Inconsistent transaction offset → use when you just want to make your code as fast as possible. * Macbook pro 2016 15’’ 2.7GHz Core i7, 16G ram. Without I/O, single update per transaction *** Test without I/O: [Single ops vbatch] Total time running for 1000000 transactions: 1085 ms; at rate: 921,658.99 ops/s
  • 20. Arcturus - Batching marshaller hBatching: use single marshaller thread ● Pros - Ensure transaction offset - Ordered db updating ● Cons - A little bit slower than vBatching. → use when you need to re-create application state after fail. *** Test without I/O: [Single ops hbatch] Total time running for 1000000 transactions: 1056 ms; at rate: 946,969.70 ops/s
  • 22. Arcturus - Technology - Java - Kafka - ZeroMQ - Gridgo - MySQL - LMax/Ring Buffer
  • 23. Conclusion - Has to trade-off between consistency and eventual consistency - In Memory is a great way to improve the performance - Reliable replication is the key to split and scale system - Non Blocking architecture is a great way to utilize hardware resources efficiently.