Webhooks
Near-real time event processing with guaranteed delivery of HTTP callbacks
HBaseCon 2015
Alan Steckley
Principal Software Engineer, Salesforce
2
Poorna Chandra
Software Engineer, Cask
3
Safe harbor statement under the Private Securities Litigation Reform Act of 1995:
This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties
materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results
expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be
deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other
financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any
statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.
The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new
functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our
operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of any
litigation, risks associated with completed and any possible mergers and acquisitions, the immature market in which we operate, our
relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of
our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to
larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is
included in our annual report on Form 10-K for the most recent fiscal year and in our quarterly report on Form 10-Q for the most recent
fiscal quarter. These documents and others containing important disclosures are available on the SEC Filings section of the Investor
Information section of our Web site.
Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently
available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based
upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these
forward-looking statements.
Safe Harbor
4
● Salesforce Marketing Cloud
● Webhooks use case
● Implementation in CDAP
● Q&A
Overview
5
● Connects businesses to their customers through email, social media, and SMS.
● 1+ billion personalized messages per day
● Hundreds of thousands of business units
● Billions of subscribers
● Hosts petabytes of customer data in our data centers
● Handles a wide range of communications
○ Marketing campaigns
○ Purchase confirmations
○ Financial notifications
○ Password resets
What is the Salesforce Marketing Cloud?
6
● Webhooks is a near-real time event delivery platform with guaranteed delivery
○ Subscribers generate events by engaging with messages
○ Deliver events to customers over HTTP within seconds
○ Customers react to events in near real time
What is Webhooks?
7
A purchase receipt email fails to be delivered
A mail bounce event is pushed to a service hosted by the retailer
Retailer’s customer service is immediately aware of the failure
Example use case
8
1. Process a stream of near-real-time events based on customer-defined actions.
2. Guarantee delivery of processed events emitted to third-party systems.
General problem statement
9
High data integrity
Commerce, health, and finance messaging subject to government regulation
Horizontal scalability
Short time to market
Accessible developer experience
Existing Hadoop/YARN/HBase expertise and infrastructure
Open Source
Primary concerns
10
Some events need pieces of information from other event streams
Example: An email click needs the email send event for contextual information
Wait until other events arrive to assemble the final event
Join across streams
Configurable TTL to wait to join (optional)
Implementation concern - Joins
11
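A minimal sketch of the join-with-TTL idea above, in plain Java. The class name `TtlJoinBuffer`, the string payloads, and the in-memory map are assumptions made for illustration; the deck keeps send events in an HBase-backed store, and a real implementation may also need to hold responses that arrive before their send event.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical in-memory join buffer; not the deck's implementation. */
final class TtlJoinBuffer {
    /** A buffered send event plus the time it was stored. */
    private static final class Entry {
        final String sendPayload;
        final long storedAtMillis;

        Entry(String sendPayload, long storedAtMillis) {
            this.sendPayload = sendPayload;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<String, Entry> sends = new HashMap<>();
    private final long ttlMillis;

    TtlJoinBuffer(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Remembers a send event so later responses (clicks, bounces) can be joined to it. */
    void recordSend(String joinKey, String sendPayload, long nowMillis) {
        sends.put(joinKey, new Entry(sendPayload, nowMillis));
    }

    /** Joins a response with its send, or returns empty if the send is missing or older than the TTL. */
    Optional<String> joinResponse(String joinKey, String responsePayload, long nowMillis) {
        Entry entry = sends.get(joinKey);
        if (entry == null) {
            return Optional.empty();
        }
        if (nowMillis - entry.storedAtMillis > ttlMillis) {
            sends.remove(joinKey); // expired: stop waiting for this join
            return Optional.empty();
        }
        return Optional.of(entry.sendPayload + "|" + responsePayload);
    }
}
```

Passing the clock in as `nowMillis` keeps the expiry rule easy to test without sleeping.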
Configurable per customer endpoint
Retry
Throttle
TTL to deliver (optional)
Reporting metrics, SLA compliance
Implementation concern - Delivery guarantees
12
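The retry and TTL-to-deliver settings above could be sketched as follows. This is an illustration under stated assumptions, not the production design: `RetryingDelivery`, its parameters, and the exponential backoff are hypothetical, throttling is left out, and a real guaranteed-delivery system would persist pending events rather than retry in memory.

```java
import java.util.function.Predicate;

/** Hypothetical per-endpoint delivery policy: bounded retries with backoff and a delivery TTL. */
final class RetryingDelivery {
    enum Result { DELIVERED, EXPIRED, RETRIES_EXHAUSTED }

    private final int maxAttempts;
    private final long baseBackoffMillis;
    private final long ttlMillis;

    RetryingDelivery(int maxAttempts, long baseBackoffMillis, long ttlMillis) {
        this.maxAttempts = maxAttempts;
        this.baseBackoffMillis = baseBackoffMillis;
        this.ttlMillis = ttlMillis;
    }

    /**
     * Calls the sender until it succeeds, the attempts run out, or the event's TTL has passed.
     * The sender returns true when the endpoint accepted the event (for example, an HTTP 2xx).
     */
    Result deliver(String event, long createdAtMillis, Predicate<String> sender) {
        long backoff = baseBackoffMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (System.currentTimeMillis() - createdAtMillis > ttlMillis) {
                return Result.EXPIRED; // too old to be useful to the customer
            }
            if (sender.test(event)) {
                return Result.DELIVERED;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoff);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up
                    return Result.RETRIES_EXHAUSTED;
                }
                backoff *= 2; // double the wait between attempts
            }
        }
        return Result.RETRIES_EXHAUSTED;
    }
}
```

Returning an explicit result for each outcome gives the reporting and SLA metrics something concrete to count.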
High level architecture
[Diagram labels: Kafka Source, Ingest, Join, Route, Store, HTTP POST, External System]
13
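public class EventRouter {
    // Routes are looked up by client, so the map is keyed by the type that
    // Event.clientId() returns (called ClientId here as an assumption; the
    // slide declared the key as EventType, which does not match the lookup).
    private Map<ClientId, Route> routesMap;

    public void process(Event e) {
        Route route = routesMap.get(e.clientId());
        if (null != route) {
            httpPost(e, route);
        }
    }
}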
Business logic
14
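The `httpPost` call in the router is not defined on the slide. One possible shape, using the JDK's `java.net.http` client (Java 11 or later), is sketched below. The class name `WebhookPoster`, the JSON content type, the five-second timeout, and the rule that any 2xx status counts as delivered are all assumptions for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical helper that POSTs one event to a customer endpoint. */
final class WebhookPoster {
    private final HttpClient client = HttpClient.newHttpClient();

    /** Builds the POST request; kept separate so it can be inspected without a network call. */
    static HttpRequest buildRequest(String endpointUrl, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(endpointUrl))
                .timeout(Duration.ofSeconds(5)) // assumed timeout
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    /** Sends the event and reports whether the endpoint answered with a 2xx status. */
    boolean post(String endpointUrl, String jsonBody) throws IOException, InterruptedException {
        HttpResponse<Void> response =
                client.send(buildRequest(endpointUrl, jsonBody), HttpResponse.BodyHandlers.discarding());
        int status = response.statusCode();
        return status >= 200 && status < 300;
    }
}
```

A caller would treat a `false` return or an exception as a failed attempt and hand the event to whatever retry policy applies to that endpoint.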
public class EventJoiner {
    private Map<JoinKey, SendEvent> sends;

    public void process(ResponseEvent e) {
        SendEvent send = sends.get(e.getKey());
        if (null != send) {
            Event joined = join(send, e);
            routeEvent(joined);
        }
    }
}
Business logic
15
● Scaling data store is easy - use HBase
● Scaling application involves
○ Transactions
○ Application stack
○ Lifecycle management
○ Data movement
○ Coordination
How to scale?
16
17
● An open source framework to build and deploy data applications on Apache™ Hadoop®
● Provides abstractions to represent data access and processing pipelines
● Framework level guarantees for exactly-once semantics
● Transaction support on HBase
● Supports real time and batch processing
● Built on YARN and HBase
Cask Data Application Platform (CDAP)
18
Webhooks in CDAP
19
Business logic
public class EventJoiner {
    private Map<JoinKey, SendEvent> sends;

    public void process(ResponseEvent e) {
        SendEvent send = sends.get(e.getKey());
        if (null != send) {
            Event joined = join(send, e);
            routeEvent(joined);
        }
    }
}
20
Business logic in CDAP - Flowlet
public class EventJoiner extends AbstractFlowlet {
    @UseDataSet("sends")
    private SendEventDataset sends;
    private OutputEmitter<Event> outQueue;

    @ProcessInput
    public void join(ResponseEvent e) {
        SendEvent send = sends.get(e.getKey());
        if (send != null) {
            Event joined = join(e, send);
            outQueue.emit(joined);
        }
    }
}
21
public class EventJoiner extends AbstractFlowlet {
    @UseDataSet("sends")
    private SendEventDataset sends;
    private OutputEmitter<Event> outQueue;

    @ProcessInput
    public void join(ResponseEvent e) {
        SendEvent send = sends.get(e.getKey());
        if (send != null) {
            Event joined = join(e, send);
            outQueue.emit(joined);
        }
    }
}
Access data with Datasets
22
Chain Flowlets with Queues
public class EventJoiner extends AbstractFlowlet {
    @UseDataSet("sends")
    private SendEventDataset sends;
    private OutputEmitter<Event> outQueue;

    @ProcessInput
    public void join(ResponseEvent e) {
        SendEvent send = sends.get(e.getKey());
        if (send != null) {
            Event joined = join(e, send);
            outQueue.emit(joined);
        }
    }
}
23
Tigon Flow
● Real time streaming processor
● Composed of Flowlets
● Exactly-once semantics
[Diagram labels: HBase Queue, Event Joiner Flowlet (Start Tx, End Tx), HBase Queue, Event Router Flowlet (Start Tx, End Tx), HBase Queue]
24
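The Start Tx / End Tx pattern around each Flowlet can be illustrated with a small commit-or-rollback step. This is a sketch only: `TransactionalStep` and its in-memory deques are hypothetical stand-ins, whereas the deck describes transactions over HBase-backed queues and datasets.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration of one transactional dequeue-process-enqueue step. */
final class TransactionalStep {
    /**
     * Processes the head of the input queue. If the logic throws, neither queue changes,
     * so the same event is seen again on the next call (rollback). If it succeeds, the
     * input is removed and all outputs are published together (commit).
     */
    static <I, O> boolean processOne(Deque<I> in, Deque<O> out, Function<I, List<O>> logic) {
        I event = in.peekFirst(); // read without removing: "Start Tx"
        if (event == null) {
            return false;
        }
        List<O> staged;
        try {
            staged = new ArrayList<>(logic.apply(event)); // outputs stay staged until commit
        } catch (RuntimeException e) {
            return false; // rollback: input still queued, nothing emitted
        }
        in.pollFirst();       // "End Tx": consume the input
        out.addAll(staged);   // and publish the outputs
        return true;
    }
}
```

Because a failed attempt leaves no partial output behind, retrying the step does not duplicate downstream events, which is the property the exactly-once bullet refers to.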
Scaling Flowlets
[Diagram labels: Event Joiner Flowlets, Event Router Flowlets, HBase Queue, YARN Containers; queue partitioning strategies: FIFO, Round Robin, Hash Partitioning]
25
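Two of the partitioning strategies named on this slide, round robin and hash partitioning, can be sketched in plain Java as below. `QueuePartitioning` is a hypothetical helper written for illustration; it does not use or reproduce the CDAP API.

```java
/** Hypothetical helper that picks which Flowlet instance receives an event. */
final class QueuePartitioning {
    private int next = 0;

    /** Round robin: spread events evenly across instances, ignoring their content. */
    int roundRobin(int instanceCount) {
        int chosen = next;
        next = (next + 1) % instanceCount;
        return chosen;
    }

    /** Hash partitioning: events with the same key always go to the same instance. */
    static int hashPartition(String key, int instanceCount) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), instanceCount);
    }
}
```

Hash partitioning suits a join step, since every event for one join key reaches the same instance, while round robin balances load when events are independent.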
Summary
● CDAP makes development easier by handling the overhead of scalability
○ Transactions
○ Application stack
○ Lifecycle management
○ Data movement
○ Coordination
26
Datasets and Tephra
27
Data abstraction using Dataset
● Store and retrieve data
● Reusable data access patterns
● Abstraction of underlying data storage
○ HBase
○ LevelDB
○ In-memory
● Can be shared between Flows (real-time) and MapReduce (batch)
28
● Transactions make exactly-once semantics possible
● Transactions span multiple rows and multiple HBase regions
● Optimistic concurrency control (Omid style)
● Open source (Apache 2.0 License)
● http://tephra.io
Transaction support with Tephra
29
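Optimistic concurrency control, as named on this slide, lets transactions proceed without locks and checks for conflicts only at commit time. The sketch below shows write-write conflict detection in memory. `OptimisticStore` and its methods are hypothetical and are not Tephra's API; snapshot reads and other parts of a real transaction system are omitted.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory store with optimistic, commit-time conflict detection. */
final class OptimisticStore {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, Long> lastCommit = new HashMap<>(); // key -> timestamp of last committed write
    private long clock = 0;

    /** A transaction buffers its writes until commit. */
    final class Tx {
        private final long startTs;
        private final Map<String, String> writes = new HashMap<>();

        private Tx(long startTs) {
            this.startTs = startTs;
        }

        void put(String key, String value) {
            writes.put(key, value);
        }
    }

    synchronized Tx begin() {
        return new Tx(++clock);
    }

    /** Commits unless another transaction committed a write to one of the same keys after this one began. */
    synchronized boolean commit(Tx tx) {
        for (String key : tx.writes.keySet()) {
            Long committedAt = lastCommit.get(key);
            if (committedAt != null && committedAt > tx.startTs) {
                return false; // conflict: buffered writes are discarded, caller may retry
            }
        }
        long commitTs = ++clock;
        for (Map.Entry<String, String> w : tx.writes.entrySet()) {
            data.put(w.getKey(), w.getValue());
            lastCommit.put(w.getKey(), commitTs);
        }
        return true;
    }

    synchronized String get(String key) {
        return data.get(key);
    }
}
```

When two transactions write the same row, the first to commit wins and the second is rejected, so no lock is held while the business logic runs.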
● Used today in enterprise cloud applications
● CDAP is open source (Apache 2.0 License)
Use and contribute
http://cdap.io/
30
Alan Steckley
asteckley@salesforce.com
http://salesforce.com
Q&A
Poorna Chandra
poorna@cask.co
http://cdap.io
31
