Call Data Analysis
for Asterisk & FreeSWITCH
      with MongoDB

     Arezqui Belaid @areskib
     <info@star2billing.com>
Problems to solve

             - Millions of Call records
             - Multiple sources
             - Multiple data formats
             - Replication
             - Fast Analytics
             - Multi-Tenant
             - Realtime
             - Fraud detection
Why MongoDB
- NoSQL - Schema-Less
- Capacity / Sharding
- Upserts
- Replication : Increase read capacity
- Async writes : Millions of entries / acceptable losses
- Compared to CouchDB - native drivers
What does it look like?   Dashboard
Hourly / Daily / Monthly reporting
Compare call traffic
World Map
Realtime
Under the hood
- FreeSWITCH (freeswitch.org)
- Asterisk (asterisk.org)
- Django (djangoproject.com)
- Celery (celeryproject.org)
- RabbitMQ (rabbitmq.com)
- Socket.IO (socket.io)
- MongoDB (mongodb.org)
- PyMongo (api.mongodb.org)
- and more...
Our Data - Call Detail Record (CDR)
1) Call info :



2) BSON :
    CDR = {
        ...
        'callflow':{
            'caller_profile':{
                'username':'1000',
                'destination_number':'5578193435',
                'ani':'71737224',
                'caller_id_name':'71737224',
                ...
            },
            ...
        },
        'variables':{
            'mduration':'12960',
            'effective_caller_id_name':'Extension 1000',
            'outbound_caller_id_number':'0000000000',
            'duration':'3',
            'end_stamp':'2012-05-23 15:45:12.856527',
            'answer_uepoch':'1327521953952257',
            'billmsec':'12960',
            ...
            'hangup_cause_q850':'20',
            'hangup_cause':'NORMAL_CLEARING',
            'sip_received_ip':'192.168.1.21',
            'sip_from_host':'127.0.0.1',
            'tts_voice':'kal',
            'accountcode':'1000',
            'sip_user_agent':'Blink 0.2.8 (Linux)',
            'answerusec':'0',
            'caller_id':'71737224',
            'call_uuid':'adee0934-a51b-11e1-a18c-00231470a30c',
            'answer_stamp':'2012-05-23 15:45:09.856463',
            'outbound_caller_id_name':'FreeSWITCH',
            'billsec':'66',
            'progress_uepoch':'0',
            'answermsec':'0',
            'sip_via_rport':'60536',
            'uduration':'12959984',
            'sip_local_sdp_str':'v=0\no=FreeSWITCH 1327491731\n'
        },
        ...
    }
3) Insert Mongo : db.cdr.insert(CDR);
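Every value in the CDR arrives as a string, so the fields used for aggregation have to be converted before they can be summed or grouped by day. A minimal sketch of that conversion, assuming a `start_uepoch` variable in microseconds (it is not part of the excerpt above, and the sample value is made up) and a hypothetical helper name `extract_call_fields`:

```python
from datetime import datetime, timezone


def extract_call_fields(cdr):
    """Pull the aggregation fields out of a nested CDR dict (hypothetical helper)."""
    variables = cdr['variables']
    caller = cdr['callflow']['caller_profile']
    # *_uepoch values are microseconds since the Unix epoch, stored as strings
    start_seconds = int(variables['start_uepoch']) // 1_000_000
    start = datetime.fromtimestamp(start_seconds, tz=timezone.utc)
    return {
        # Midnight of the call's day (UTC), used as the daily grouping key
        'date_y_m_d': datetime(start.year, start.month, start.day),
        'destination_number': caller['destination_number'],
        'accountcode': variables['accountcode'],
        'duration': int(variables['duration']),
        'billsec': int(variables['billsec']),
    }


sample_cdr = {
    'callflow': {'caller_profile': {'destination_number': '5578193435'}},
    'variables': {
        'start_uepoch': '1337787909856463',  # assumed value, not from the slides
        'accountcode': '1000',
        'duration': '3',
        'billsec': '66',
    },
}
fields = extract_call_fields(sample_cdr)
print(fields)
```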
Pre-Aggregate
Pre-Aggregate - Daily Collection
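Produce data that is easier to manipulate:
    # start_uepoch is assumed to already be a datetime here,
    # so str(start_uepoch)[:10] is the 'YYYY-MM-DD' part
    current_y_m_d = datetime.strptime(str(start_uepoch)[:10], "%Y-%m-%d")
    # One document per (day, destination, hangup cause, account, switch).
    # update() is the legacy PyMongo call; PyMongo 3+ uses update_one().
    CDR_DAILY.update({
        'date_y_m_d': current_y_m_d,
        'destination_number': destination_number,
        'hangup_cause_id': hangup_cause_id,
        'accountcode': accountcode,
        'switch_id': switch.id,
    }, {
        '$inc': {
            'calls': 1,
            'duration': int(cdr['variables']['duration']),
        }
    }, upsert=True)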

Output db.CDR_DAILY.find() :
{ "_id" : ..., "date_y_m_d" : ISODate("2012-04-30T00:00:00Z"), "accountcode" : "1000", "calls" : 1,
  "destination_number" : "0045277522", "duration" : 23, "hangup_cause_id" : 9, "switch_id" : 1 }
...


- Pre-aggregated data is faster to query
- Upsert is your friend: update if the document exists, insert if not
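The upsert with `$inc` can be pictured without a database. The sketch below emulates it on a plain Python dict; `upsert_inc` is a hypothetical helper and the sample calls are made up, so this illustrates the semantics only, not how MongoDB implements them:

```python
def upsert_inc(collection, key_fields, increments):
    """Emulate update(spec, {'$inc': ...}, upsert=True) on a dict (hypothetical helper)."""
    # The matching fields act as the lookup key
    key = tuple(sorted(key_fields.items()))
    # Insert a new document if none matches, otherwise reuse the existing one
    doc = collection.setdefault(key, dict(key_fields))
    # $inc treats a missing field as 0
    for field, amount in increments.items():
        doc[field] = doc.get(field, 0) + amount
    return doc


cdr_daily = {}
calls = [
    {'date_y_m_d': '2012-04-30', 'accountcode': '1000', 'switch_id': 1, 'duration': 23},
    {'date_y_m_d': '2012-04-30', 'accountcode': '1000', 'switch_id': 1, 'duration': 7},
    {'date_y_m_d': '2012-05-01', 'accountcode': '1000', 'switch_id': 1, 'duration': 12},
]
for call in calls:
    key_fields = {k: v for k, v in call.items() if k != 'duration'}
    upsert_inc(cdr_daily, key_fields, {'calls': 1, 'duration': call['duration']})

for doc in cdr_daily.values():
    print(doc)
```

Three calls end up as two daily documents: the two calls on the same day share one document whose counters are incremented in place.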
Map-Reduce - Emit Step
- MapReduce processes data in batches
- Applied to the pre-aggregated daily collection (faster, less data)

    map = mark_safe(u'''
        function() {
            emit(
                {
                    a_Year: this.date_y_m_d.getFullYear(),
                    b_Month: this.date_y_m_d.getMonth() + 1,
                    c_Day: this.date_y_m_d.getDate(),
                    f_Switch: this.switch_id
                },
                {calldate__count: 1, duration__sum: this.duration}
            )
        }''')
Map-Reduce - Reduce Step
The reduce step is trivial: it sums the counts and the durations.

    reduce = mark_safe(u'''
        function(key, vals) {
            var ret = {
                calldate__count: 0,
                duration__sum: 0,
                duration__avg: 0  // never updated here, so it stays 0 in the output
            };
            for (var i = 0; i < vals.length; i++) {
                ret.calldate__count += parseInt(vals[i].calldate__count);
                ret.duration__sum += parseInt(vals[i].duration__sum);
            }
            return ret;
        }
        ''')
Map-Reduce
Query :
    out = 'aggregate_cdr_daily'
    calls_in_day = daily_data.map_reduce(map, reduce, out, query=query_var)

Output db.aggregate_cdr_daily.find() :
{ "_id" : { "a_Year" : 2012, "b_Month" : 5, "c_Day" : 13, "f_Switch" : 1 },
  "value" : { "calldate__count" : 91, "duration__sum" : 5559, "duration__avg" : 0 } }
{ "_id" : { "a_Year" : 2012, "b_Month" : 5, "c_Day" : 14, "f_Switch" : 1 },
  "value" : { "calldate__count" : 284, "duration__sum" : 13318, "duration__avg" : 0 } }
...
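The emit and reduce steps can be emulated in plain Python to see what the job computes. This is a sketch under assumptions: the sample documents are made up, all function names are hypothetical, and the `finalize` step is an addition that the slides do not show. It is included because an average cannot be kept inside reduce: MongoDB may feed reduce its own partial results, so the average has to be derived once, after the last reduce.

```python
from collections import defaultdict
from datetime import datetime


def map_doc(doc):
    """Mirror the emit step: key is (year, month, day, switch)."""
    day = doc['date_y_m_d']
    key = (day.year, day.month, day.day, doc['switch_id'])
    return key, {'calldate__count': 1, 'duration__sum': doc['duration']}


def reduce_values(vals):
    """Mirror the reduce step: sum counts and durations."""
    ret = {'calldate__count': 0, 'duration__sum': 0}
    for val in vals:
        ret['calldate__count'] += val['calldate__count']
        ret['duration__sum'] += val['duration__sum']
    return ret


def finalize(value):
    """Derive the average once, after the last reduce (not in the slides)."""
    avg = value['duration__sum'] / value['calldate__count']
    return dict(value, duration__avg=avg)


def map_reduce(docs):
    grouped = defaultdict(list)
    for doc in docs:
        key, value = map_doc(doc)
        grouped[key].append(value)
    return {key: finalize(reduce_values(vals)) for key, vals in grouped.items()}


daily_docs = [
    {'date_y_m_d': datetime(2012, 5, 13), 'switch_id': 1, 'duration': 20},
    {'date_y_m_d': datetime(2012, 5, 13), 'switch_id': 1, 'duration': 40},
    {'date_y_m_d': datetime(2012, 5, 14), 'switch_id': 1, 'duration': 30},
]
result = map_reduce(daily_docs)

# Reducing partial results gives the same answer as one reduce over everything
emitted = [map_doc(doc)[1] for doc in daily_docs[:2]]
rereduced = reduce_values([reduce_values(emitted[:1]), reduce_values(emitted[1:])])
print(result)
print(rereduced == reduce_values(emitted))
```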
Roadmap

- Quality monitoring
- Audio recording
- Add support for other telecoms switches
- Improve - refactor (Beta)
- Testing
- Listen and Learn
WAT else...?

- Website : http://www.cdr-stats.org

- Code : github.com/star2billing/cdr-stats

- FOSS / Licensed MPLv2

- Get started : Install script
  Try it, it's easy!!!
Questions ?
  Twitter : @areskib
Email : areski@gmail.com

Slides : http://goo.gl/TZLF9
