Petabytes for Peanuts! Making sense of “Ambient Data”
SQL Server StreamInsight
Ing. Eduardo Castro, PhD
Comunidad Windows
ecastro@grupoasesor.net
http://ecastrom.blogspot.com

Key Takeaways…
  • Massive shift in how we process data
  • Incredible data volumes
  • Remaking how we discover
  • Changing the Scientific Method
  • Reducing latency & impedance
  • Extreme Scale Data Processing
  • Stream Processing (Several Views)
  • From “programs” to “queries”
  • What’s up with this “anti-SQL” stuff anyhow?

“Free” Storage Power

  Year   Storage cost   Transfer time
  1982   ~$2000         1 day
  1997   ~$1.00         ½ hour
  2009   ~0.1¢          8 sec.

Ambient Data?
Over 84 percent of Americans have cell phones, according to Steve Largent, president and CEO of CTIA. Two trillion minutes were used in 2007, an 18 percent increase over 2006 talk times. More than 48 billion text messages were sent in the month of December 2007, an average of 1.6 billion messages per day. The rate of text messaging represented a 157 percent increase over December 2006.
Source: http://www.clickz.com/3628985
  • Text message traffic in US: 160 GB / day, or 58 TB / year
  • Voice traffic in US (GSM encoding): 200 PB / year

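The text-message figures can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check, assuming decimal units (1 GB = 10^9 bytes); the average message size it derives is implied by the slide's numbers, not stated on the slide.

```python
# Back-of-the-envelope check of the slide's text-message figures.
# Assumption (not on the slide): decimal units, 1 GB = 10**9 bytes, 1 TB = 10**12 bytes.
MESSAGES_PER_DAY = 1.6e9   # average messages per day, from the slide
BYTES_PER_DAY = 160e9      # 160 GB / day, from the slide

# Average message size implied by the two figures above.
bytes_per_message = BYTES_PER_DAY / MESSAGES_PER_DAY

# Scale the daily volume to a 365-day year.
tb_per_year = BYTES_PER_DAY * 365 / 1e12

print(f"implied size per message: {bytes_per_message:.0f} bytes")
print(f"yearly volume: {tb_per_year:.1f} TB")
```

Under these assumptions the slide's numbers work out to roughly 100 bytes per message, and the yearly total agrees with the 58 TB quoted.
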
The Old World
  • Data volumes constrained by human typing speed
  • App & Data formed a closed system (diagram labels: App, DB)
  • Assume 200M people in the US typing 8 hr / day @ 10K keystrokes / hour: 2 TB/hr, or ~6 PB / year

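The typing-speed estimate can be reproduced in a few lines. This is a sketch under two assumptions that the slide does not state: each keystroke is stored as one byte, and units are decimal (1 TB = 10^12 bytes).

```python
# Reproducing the "human typing speed" estimate from the slide.
# Assumptions (not on the slide): 1 keystroke = 1 byte; decimal units.
PEOPLE = 200e6               # people typing, from the slide
KEYSTROKES_PER_HOUR = 10e3   # per person, from the slide
HOURS_PER_DAY = 8            # from the slide

bytes_per_hour = PEOPLE * KEYSTROKES_PER_HOUR
bytes_per_year = bytes_per_hour * HOURS_PER_DAY * 365

tb_per_hour = bytes_per_hour / 1e12
pb_per_year = bytes_per_year / 1e15

print(f"{tb_per_hour:.0f} TB/hr, {pb_per_year:.2f} PB/year")
```

With these inputs the hourly figure is 2 TB and the yearly figure is a little under 6 PB, which matches the slide's rounded "~6 PB / year".
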
The Old New World
  • Available data exploded
  • Diagram labels: Available Data; Questions to Answer; What data should we throw out?; Design Schema; Design ETL; DW Nirvana!
  • What if we have a new question?

The New World of Abundant Data
  • Save All Available Data
  • Hypothesize → Theorize → Test
  • New Question to Answer
  • Algorithmic Processing: run “query” over data…
  • Exploit Correlation… Correlation is Enough!
  • Analyze reduced data
  • The CMS front end of the Large Hadron Collider records 1 TB/sec! http://blogs.discovermagazine.com/cosmicvariance/2006/09/27/lhc-factoids/
  • Interesting read: “The Petabyte Age: Because More Isn't Just More — More Is Different”, http://www.wired.com/science/discoveries/magazine/16-07/pb_intro

Analyze → Model → Monitor
  1. Event stream both stored and processed
  2. Analysis produces event correlation models
  3. Models installed in event processing engine
  4. Produce real-time alerts and action
  Diagram labels: Event Stream, Event Processing Engine, Correlation Model, Analysis, Alerts & Action

Extreme Scale Data Processing
  1. Traditional Data Warehouse (diagram labels: Source, ETL, DW, Analysis / Reporting): majority of data filtered or discarded
  2. Extreme Scale Data Processing (diagram labels: Non-traditional Sources, Source, DW, Analysis, Analysis / Reporting): all data retained and reprocessed

SQL Server 2008 R2 – StreamInsight Technology
  • Data volumes are exploding with event data streaming from sources such as RFID, sensors and web logs.
  • The size and frequency of the data make it challenging to store for data mining and analysis.
  • The ability to monitor, analyze and act on data in motion supports business decisions in near real time.

SQL Server StreamInsight
SQL Server StreamInsight’s ability to derive insights from data streams and act in near real time provides significant business benefits. Some of the possible scenarios include:
  • Algorithmic trading and fraud detection for financial services
  • Industrial process control (chemicals, oil and gas) for manufacturing
  • Electric grid monitoring and advanced metering for utilities
  • Click stream web analytics
  • Network and data center system monitoring

StreamInsight Platform
  • StreamInsight Application Development: .NET, C#, LINQ
  • StreamInsight Application at Runtime: Event sources, Input Adapters, StreamInsight Engine (Standing Queries, Query Logic), Output Adapters, Event targets
  • Event sources: Devices, Sensors; Web servers; Event stores & Databases; Stock ticker, news feeds
  • Event targets: Pagers & Monitoring devices; KPI Dashboards, SharePoint UI; Trading stations; Event stores & Databases

SQL Server 2008 R2 StreamInsight
Key Concepts
  • Events: represent the user payload along with temporal characteristics
  • Streams: sequence of events; flow into (one or more) standing queries in the StreamInsight engine
  • Queries: operate on event streams; apply desired semantics on events
  • Adapters: convert custom data from event sources to / from StreamInsight events

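The four concepts above can be illustrated with a small plain-Python sketch. This is not the StreamInsight API: the `Event` class, the CSV row layout, and the function names `csv_input_adapter`, `high_value_query` and `list_output_adapter` are all hypothetical, chosen only to show how an adapter turns raw source data into timestamped events, how a standing query consumes the stream, and how an output adapter hands results to a target.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Event:
    """Hypothetical event: a payload plus temporal characteristics."""
    start: datetime
    end: datetime
    payload: dict


def csv_input_adapter(lines: Iterable[str]) -> Iterator[Event]:
    """Input adapter: converts 'ISO timestamp,sensor_id,value' rows into events."""
    for line in lines:
        ts, sensor_id, value = line.strip().split(",")
        start = datetime.fromisoformat(ts)
        # A point event is modelled here as lasting one microsecond (an assumption).
        yield Event(start, start + timedelta(microseconds=1),
                    {"sensor_id": sensor_id, "value": float(value)})


def high_value_query(stream: Iterable[Event], threshold: float) -> Iterator[Event]:
    """Standing query: lazily passes through events whose value exceeds the threshold."""
    return (e for e in stream if e.payload["value"] > threshold)


def list_output_adapter(stream: Iterable[Event]) -> List[str]:
    """Output adapter: converts events into strings for some downstream target."""
    return [f'{e.start.isoformat()} {e.payload["sensor_id"]} {e.payload["value"]}'
            for e in stream]


rows = [
    "2010-01-01T00:00:00,s1,10.5",
    "2010-01-01T00:00:01,s2,99.0",
    "2010-01-01T00:00:02,s1,120.0",
]
alerts = list_output_adapter(high_value_query(csv_input_adapter(rows), 50.0))
print(alerts)
```

Because every stage is a generator, events flow through the query one at a time rather than being collected first, which is the property the "standing query" idea relies on.
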
What is CEP?
Complex Event Processing (CEP) is the continuous and incremental processing of event streams from multiple sources based on declarative query and pattern specifications with near-zero latency.
  Diagram labels: Event, input stream, output stream, request, response

Event Processing Scenarios
  [Chart: Latency versus Aggregate Data Rate (Events/sec)]
  • Plotted application classes: Relational Database Applications; Operational Analytics Applications, Logistics, etc.; Data Warehousing Applications; Web Analytics Applications; Manufacturing Applications; Financial Trading Applications; Monitoring Applications
  • Highlighted region: CEP Target Scenarios

Use Case: Customer Segmentation
  • Analysis of click streams on MSN.com
  • Web server log streamed into StreamInsight
  • Categorizing user behavior based on URL: click targets, search keywords
  • Segmentation of user IDs into markets
  • Adapting navigational structure and ad placement in real time
  • Patterns over time windows: user first clicks PageA, then PageB, then PageC within X seconds
  • High performance requirements: millions of online users, low latency (seconds), possible late events

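The "PageA, then PageB, then PageC within X seconds" pattern can be sketched in plain Python. This is an illustration only, not how StreamInsight implements pattern matching: the function `detect_sequence` and the `(timestamp, user, page)` tuple layout are hypothetical, the pattern is assumed to have at least two pages, other clicks are allowed between the pattern's pages, and events are assumed to arrive in timestamp order (so the "possible late events" case from the slide is not handled).

```python
from collections import defaultdict


def detect_sequence(clicks, pattern, within_seconds):
    """Yield (user, first_ts, last_ts) when a user visits the pages in `pattern`
    in order, with the whole sequence inside `within_seconds`.

    `clicks` is an iterable of (timestamp_seconds, user_id, page) in time order.
    `pattern` must contain at least two pages.
    """
    # Per user: open partial matches as (index of next expected page, ts of first click).
    partial = defaultdict(list)
    for ts, user, page in clicks:
        still_open = []
        for next_idx, first_ts in partial[user]:
            if ts - first_ts > within_seconds:
                continue  # the time window for this partial match has expired
            if page == pattern[next_idx]:
                if next_idx + 1 == len(pattern):
                    yield (user, first_ts, ts)  # full pattern matched
                else:
                    still_open.append((next_idx + 1, first_ts))
            else:
                still_open.append((next_idx, first_ts))  # keep waiting
        if page == pattern[0]:
            still_open.append((1, ts))  # this click starts a new partial match
        partial[user] = still_open


clicks = [
    (0, "u1", "PageA"),
    (1, "u2", "PageA"),
    (2, "u1", "PageB"),
    (3, "u1", "PageC"),   # u1 completes the pattern within 3 seconds
    (20, "u2", "PageB"),  # more than 10 seconds after u2's PageA click
    (21, "u2", "PageC"),
]
matches = list(detect_sequence(clicks, ["PageA", "PageB", "PageC"], 10))
print(matches)
```

State is kept per user, so memory grows with the number of users who have an open partial match; expired matches are dropped the next time that user clicks.
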
SQL Server 2008 R2 StreamInsight
Use Case: NBC Sunday Night Football
  • Diagram (steps numbered 1 to 4): Telemetry Receiver, StreamInsight, Listener Adapter, SQL Adapter, PerfCounter Adapter
  • Queries: GeoTag and group by region; count total events; count session starts; count active sessions

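The three counts named on this slide, grouped by region, can be sketched as follows. The event layout `(region, session_id, kind)` and the kinds `"start"`, `"heartbeat"` and `"end"` are assumptions made for illustration; the slide does not describe the telemetry format, and this is plain Python rather than a StreamInsight query.

```python
from collections import Counter


def summarize(events):
    """Per-region totals for a batch of telemetry events.

    `events` is an iterable of (region, session_id, kind) tuples, where kind is
    one of "start", "heartbeat" or "end" (a hypothetical schema).
    """
    total = Counter()    # all events per region
    starts = Counter()   # session starts per region
    active = {}          # region -> set of session ids currently open
    for region, session_id, kind in events:
        total[region] += 1
        if kind == "start":
            starts[region] += 1
            active.setdefault(region, set()).add(session_id)
        elif kind == "end":
            active.get(region, set()).discard(session_id)
    return {
        region: {
            "total": total[region],
            "starts": starts[region],
            "active": len(active.get(region, ())),
        }
        for region in total
    }


events = [
    ("east", "a", "start"),
    ("east", "a", "heartbeat"),
    ("west", "b", "start"),
    ("east", "c", "start"),
    ("east", "a", "end"),
]
summary = summarize(events)
print(summary)
```
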
Use Case: Data Center Power Consumption
  • Diagram (steps numbered 1 to 3): ETW Input Adapter, Power Meter Input Adapter, Query (three instances)
  • Other labels: Process Information; Complex Aggregations/Correlations; Central time series archive; Visualize

Challenges
How do I …
  • detect interesting patterns?
  • reason about temporal semantics?
  • correlate data?
  • aggregate data?
  • avoid writing custom imperative code?
  • create a runtime environment for continuous and event-driven processing?
As a developer, I need a platform!

Query Expressiveness
  • Selection of events (filter)
  • Calculations on the payload (project)
  • Correlation of streams (join)
  • Stream partitioning (group and apply)
  • Aggregation (sum, count, …) over event windows
  • Ranking over event windows (topK)

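Four of the operators listed above (filter, project, join, topK) can be shown on the contents of a single window. The sketch uses plain Python lists of dictionaries, not StreamInsight; the `readings` and `locations` data and their field names are invented for illustration.

```python
import heapq

# Hypothetical contents of one event window: power readings and a reference stream.
readings = [
    {"meter": "m1", "watts": 120.0},
    {"meter": "m2", "watts": 480.0},
    {"meter": "m3", "watts": 950.0},
]
locations = [
    {"meter": "m1", "rack": "A"},
    {"meter": "m2", "rack": "A"},
    {"meter": "m3", "rack": "B"},
]

# Filter: keep only events that satisfy a predicate.
high = [r for r in readings if r["watts"] > 200.0]

# Project: compute a new payload from the old one.
kilowatts = [{"meter": r["meter"], "kw": r["watts"] / 1000.0} for r in high]

# Join: correlate the two streams on the meter id.
rack_of = {loc["meter"]: loc["rack"] for loc in locations}
joined = [{**r, "rack": rack_of[r["meter"]]} for r in kilowatts if r["meter"] in rack_of]

# TopK: rank events within the window and keep the K largest (here K = 1).
top1 = heapq.nlargest(1, joined, key=lambda r: r["kw"])

print(joined)
print(top1)
```

In a streaming engine each of these steps would run continuously as events arrive; here they run once over a fixed list to keep the example small.
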
Query Expressiveness
  • Projection
  • Filter
  • Correlation (Join)
  • Aggregation over windows
  • Group and Aggregate

    var result = from e in inputStream
                 group e by e.id into eachGroup
                 from win in eachGroup.TumblingWindow(TimeSpan.FromSeconds(10))
                 select new { eachGroup.Key, avg = win.Avg(e => e.W) };

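For readers without a StreamInsight installation, the semantics of the query above (group by id, then average `W` over 10-second tumbling windows) can be approximated in plain Python. This is a batch sketch under stated assumptions, not the StreamInsight implementation: `tumbling_window_avg` and the `(timestamp_seconds, id, w)` tuple layout are hypothetical, windows are aligned to multiples of the window size, and results are computed after all input has been read rather than emitted as each window closes.

```python
from collections import defaultdict


def tumbling_window_avg(events, window_seconds):
    """Average `w` per (window_start, id) over non-overlapping fixed-size windows.

    `events` is an iterable of (timestamp_seconds, id, w) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (window_start, id) -> [sum of w, count]
    for ts, key, w in events:
        # Each timestamp belongs to exactly one window, which is what makes
        # the window "tumbling" rather than hopping or sliding.
        window_start = (ts // window_seconds) * window_seconds
        acc = sums[(window_start, key)]
        acc[0] += w
        acc[1] += 1
    return {k: total / n for k, (total, n) in sorted(sums.items())}


events = [
    (0, "m1", 100.0),
    (4, "m1", 200.0),
    (9, "m2", 50.0),
    (10, "m1", 300.0),
    (15, "m2", 70.0),
    (19, "m2", 90.0),
]
averages = tumbling_window_avg(events, 10)
for (window_start, key), avg in averages.items():
    print(window_start, key, avg)
```
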
Conclusion
  • CEP Platform & API
  • Event-triggered, fast computation
  • API for Adapters, Queries, Applications
  • Declarative LINQ
  • Flexible Adapter API
  • Extensible
  • Supportability

Q&A
Links
  • http://comunidadwindows.org
  • http://ecastrom.blogspot.com
  • http://www.microsoft.com/sql


Editor's Notes

  • #11: Data volumes are exploding with event data streaming from sources such as RFID, sensors and web logs across industries including manufacturing, financial services and utilities. The size and frequency of the data make it challenging to store for data mining and analysis. The ability to monitor, analyze and act on the data in motion provides a significant opportunity to make more informed business decisions in near real time.
  • #19: NBC Sunday Night Football: live streaming through Silverlight. Rich client experience, multiple camera angles. Needed: track, monitor and analyze user behavior, based on Silverlight media analytics.