Cloud Access Reliability Engineering Initiative
Interoperable SLAs for digital supply chains
Technical Prospectus
4th November 2018
Overview of the problem: digital supply chains exist to support distributed applications
[Figure: the digital supply chain. Labels: cloud service (SaaS/PaaS); telco & cloud access; customer premises; device; application software.]
Users have portfolios of cloud applications that they wish to use. These have a mixture
of availability demands. The telecoms-cloud industry is building supply using a variety of
technologies, such as 5G slicing, SDN/NFV, SD-WAN, serverless computing, and Wi-Fi.
That supply is becoming ever more dynamic (higher speeds, more statistical resource
sharing, more wireless) and more distributed (e.g. network function virtualisation, edge
apps). Cloud technologies also add start-up latency for the container compared with “bare
metal” computing, as well as waits for shared resources (as with packet data).
The increasing variability is making it harder to reason about performance and
engineer reliability to meet that demand. We have to decide where to put the compute,
and when to communicate. The number of design and configuration degrees of
freedom is rising: location, capacity, scheduling, loss vs delay.
Questions about too much demand or not enough supply have become hard to answer.
Safety margins have become opaque. The trade-offs of cost and performance lack
predictable outcomes. The resulting unreliability is driving user frustration (the
“motherbuffer”), which in turn creates costly workarounds. The main one has been
“spend money on bandwidth”, but this does not solve growing variability.
The answer is the software industry equivalent of “containerisation”: what
the “rack” does for cloud hardware, we need for cloud services. But how?
Digital supply chains are systems of supply and demand… just like any other industry
[Figure: supply chain across customer premises, telco & cloud access, and application provider. Labels: user device experience; presentation layer runtime; LAN; access network; long-haul network; SD-WAN; VPN; data centre; cloud application container. Flows: demand for copying data downstream and upstream; supply of information downstream and upstream.]
Industries typically have a
standard unit of supply and
demand that meaningfully
“adds up” and interoperates
Natural gas: BTU
Electricity: MW
Water: litre
Corn: bushel
Sugar: pound
Oil: barrel
Ethanol: gallon
Copper: ton
Wool: kilogram
Gold: ounce
Shipping: 40ft container
Cloud application access: ?
Digital supply chains involve complex interactions between multiple technology stacks
[Figure: layered technology stacks across customer premises, telco & cloud access, and application provider. Labels: distributed application software (VDI, web browser, UC/VoIP, AppTV); Wi-Fi, Ethernet; SD-WAN / no SD-WAN; VPN / no VPN; xDSL, cable, FTTx, 2G/3G/4G/5G; public Internet, private cloud global access, MPLS/Carrier Ethernet/etc.; hosted app, public or private cloud, serverless functions. Axes: horizontal interoperability, vertical interoperability.]
ICT suppliers want
to manage “vertical”
interoperability:
How do I deliver enough
reliability end-to-end to meet
the customer’s need?
How can I optimise cost for
my sub-path without
sacrificing that reliability?
End users demand
“horizontal”
interoperability:
How do I know what on-premise
capabilities I need?
How do I select a service
provider and network access
technology and know my
applications will work?
“Vertical” interoperability needs standardised and (de)composable “availability SLAs”
The end-to-end
requirement is for a
bounded probability of
packet latency and loss.
The concept of “quality
attenuation” unifies
latency and loss into a
single mathematical
object, analogous to how
complex numbers bring
together real and
imaginary numbers.
“Quality attenuation” can
be expressed as
composable “availability
SLAs” (when using ∆Q
metrics) that “add up”
along the end-to-end path.
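What “adding up” can mean in practice might be sketched as follows, assuming a discretised model that is not taken from this prospectus. Each hop's ∆Q is held as delay probability masses that sum to less than 1, and the shortfall is read as loss. Composing two independent hops is then a convolution of those masses. The class `DeltaQ`, its fields and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DeltaQ:
    """Hypothetical discrete ∆Q: pmf[i] = P(delay == i * step_ms); missing mass = loss."""
    pmf: List[float]
    step_ms: float = 1.0

    @property
    def loss(self) -> float:
        # The "improper" part: probability mass that never arrives.
        return 1.0 - sum(self.pmf)

    def compose(self, other: "DeltaQ") -> "DeltaQ":
        """Sequential composition of two independent hops (discrete convolution)."""
        if self.step_ms != other.step_ms:
            raise ValueError("both hops must use the same delay step")
        out = [0.0] * (len(self.pmf) + len(other.pmf) - 1)
        for i, p in enumerate(self.pmf):
            for j, q in enumerate(other.pmf):
                out[i + j] += p * q
        return DeltaQ(out, self.step_ms)

    def prob_within(self, deadline_ms: float) -> float:
        """P(packet is delivered AND its delay is <= deadline_ms)."""
        last = int(deadline_ms // self.step_ms)
        return sum(self.pmf[: last + 1])


# Illustrative numbers only: an access hop with 1% loss, a core hop with 0.2% loss.
access = DeltaQ([0.0, 0.5, 0.3, 0.19])
core = DeltaQ([0.0, 0.0, 0.9, 0.098])
end_to_end = access.compose(core)
print(f"loss={end_to_end.loss:.5f}  P(delay<=5ms)={end_to_end.prob_within(5):.4f}")
```

Because loss is carried inside the same object as delay, one operation composes both along the path, which is the sense in which latency and loss are unified here.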
[Figure label: end-to-end quality requirement]
“Horizontal” interoperability decouples operational implementation from the availability SLA
HORIZONTAL INTEROPERABILITY
Availability is something you can
only lose: the baseline is “always
available”, and every element can
only detract from that standard of
perfection!
We can use the “availability SLAs”
to create a “budget” for quality
attenuation for each sub-system
and element of the system.
“Horizontal” interoperability
means we can safely make
independent operational choices
over how to meet that “budget”.
As long as we are “under budget”
(to an agreed probability) then we
know the end-to-end availability
requirement will still be met.
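The “under budget (to an agreed probability)” test can be illustrated with a small sketch. It assumes, purely for illustration, that a budget is written as a list of (deadline, minimum fraction delivered) points and that lost packets are recorded as `None`. The names `Budget`, `fraction_within` and `under_budget` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical budget format: (deadline_ms, minimum fraction of sent packets delivered by then).
Budget = Sequence[Tuple[float, float]]


def fraction_within(delays_ms: Sequence[Optional[float]], deadline_ms: float) -> float:
    """Fraction of all sent packets delivered within the deadline (None means lost)."""
    if not delays_ms:
        raise ValueError("no samples")
    ok = sum(1 for d in delays_ms if d is not None and d <= deadline_ms)
    return ok / len(delays_ms)


def under_budget(delays_ms: Sequence[Optional[float]], budget: Budget) -> bool:
    """True if the samples meet every (deadline, fraction) point of the budget."""
    return all(fraction_within(delays_ms, t) >= p for t, p in budget)


# Illustrative samples for one sub-system: nine delivered packets and one loss.
samples = [4.0, 5.0, 5.0, 6.0, 7.0, 8.0, 9.0, 12.0, 20.0, None]
budget = [(10.0, 0.6), (25.0, 0.85)]
print(under_budget(samples, budget))
```

How the sub-system achieves these numbers is left open, which is the point of the decoupling: only the budget is shared between parties.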
Our industry challenge: to develop a supply chain quality management system
HORIZONTAL INTEROPERABILITY
VERTICAL INTEROPERABILITY
Relating “vertical” to
“horizontal” needs a
common language of
metrics (for SLAs) and
measures (for operations):
Standardised metrics
Standardised SLAs
Standardised operational
measurement methods
Standardised service
lifecycle management
processes
Standardised network
quality assurance
mechanisms
We have to create the performance integration framework (“glue”) for interoperable SLAs
[Figure: many distributed computing elements spread across customer premises, telco & cloud access, and application provider.]
Other industries have solved this
reliability engineering integration
problem. For instance, the oil
industry can relate “upstream”
extraction and refining activities
to “downstream” distribution.
Telecoms had this capability
during the telephony era with
erlangs as a rational unit of supply
and demand. We lost this when
we moved to packet-based
statistical multiplexing.
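For readers who have not met the erlang, the sketch below shows how it related supply to demand in circuit telephony. It uses the standard Erlang B recurrence, which is textbook material rather than anything from this prospectus. The function names and the example figures are illustrative.

```python
def erlang_b(offered_erlangs: float, circuits: int) -> float:
    """Blocking probability when `offered_erlangs` of traffic meets `circuits` lines."""
    if offered_erlangs < 0 or circuits < 0:
        raise ValueError("traffic and circuits must be non-negative")
    blocking = 1.0  # with zero circuits every call is blocked
    for n in range(1, circuits + 1):
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def circuits_needed(offered_erlangs: float, max_blocking: float) -> int:
    """Smallest number of circuits that keeps blocking at or below `max_blocking`."""
    if not 0 < max_blocking <= 1:
        raise ValueError("max_blocking must be in (0, 1]")
    n = 0
    while erlang_b(offered_erlangs, n) > max_blocking:
        n += 1
    return n


# Demand of 10 erlangs, supply sized for at most 1% blocked calls.
print(circuits_needed(10.0, 0.01))
```

Demand (erlangs offered), supply (circuits) and outcome (blocking probability) sit in one formula, which is the property the text says was lost with packet multiplexing.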
∆Q-based “availability SLAs” are
“cloud access erlangs”. We can
now performance engineer digital
supply chains, create new
assurance revenue and reduce the
cost of workarounds and failure.
Solving these reliability engineering problems needs interoperable metrics and measures

Requirements
Define availability SLA …in the user/customer’s own terms
Decompose SLA …‘vertically’ in supply chains to specify SLA requirement on each subsystem and sub-subsystem
Market differentiated SLA …to win the customer’s business in ‘horizontal’ competitive market
Operationally deliver SLA …to meet our service availability promises (with optimised cost and risk trade-offs for profitability)

Challenges
Availability SLA metrics …don’t sufficiently reflect QoE and aren’t properly composable/causal
SLA operational measures …aren’t calibrated against testing and inspection reference standards
SLA management models …don’t fully incorporate established and proven management theory (e.g. 6-sigma, TOC, lean, Vanguard)
SLA assurance methods …require improved or new business processes across service lifecycle

Opportunities
Capture the basic science …as ∆Q calculus is the only known scientifically sound approach
Create standard metrics …that accurately reflect availability in the customer’s own terms
Calibrate measures …to manage error bounds, done based on shared cost and IPR
Construct assurance SLAs …that have the essential “horizontal” and “vertical” interoperability properties
We have a strong baseline of ∆Q research and technology that now requires industrialisation
Fully developed ∆Q theoretical framework.
Quality attenuation theory training materials available.
“Wind tunnel” for cloud apps to establish ∆Q-based SLA.
Demonstrated ability to “budget” performance using ∆Q.
Mature first generation ∆Q measurement system.
Industrialisation and scaling (e.g. TWAMP) in progress.
Contention Management technology developed and
demonstrated to assure ∆Q SLA even in overload.
Operational trial platform available for shared use
(collaboration with Just Right Networks Ltd & SureTec Ltd).
15+ years of development by leading team of reliability engineering and distributed systems experts
Proven in application at tier 1 network operators as well as in many boutique and exotic applications
The next steps are clear… and interoperability by its nature is a collaborative activity
Develop ex ante reliability engineering curriculum from existing
materials — ‘train the trainer’ and disseminate to industry
Create reference ∆Q-based SLAs for common application types
Construct software libraries to persist and manipulate the SLAs
Frictionless deployment of packet observation end points
Use high-fidelity measures to audit lower-fidelity measures
Establish the ‘state of the possible’ (i.e. highest yield of QoE at
lowest cost and risk) as a technical reference study that
feeds into the business case for industry investment in the initiative.
The time has come to ‘up our game’ and collectively engage with the problem of interoperable metrics
This requires an industry effort that no single player can deliver as every process and system is affected
We help you to apply these reliability engineering breakthroughs
to solve the cloud access interoperability and integration problem
for distributed applications in complex digital supply chains
We feed specialist skills into
the development of existing
industry initiatives and
projects.
Examples: 5G, SDN/NFV, Zero Touch
Automation, UCaaS, distributed apps
(inc. blockchain), machine learning
We adapt established quality
management methods from
other industries, and drive
the adoption of new scientific
metrics and measures that
are suitable for our own.
We develop methods,
standards and tools by a
process of action learning. We
target the full service lifecycle
(product development,
marketing and sales, in-life
service and support).
Purpose of the Cloud Access Reliability Engineering Initiative
…performance integration
technology
…management methods and
processes
…skills, people and
relationships
Cloud access reliability engineering…
CARE initiative technical prospectus
[Figure: high-fidelity network measures. Labels: network demand; network supply; QoE ‘SLAZARD’.]
We know the ‘interoperable unit’ answer: the ∆Q calculus (see http://qualityattenuation.science/)
SUPPLIER A: ∆Q(A); SUPPLIER B: ∆Q(B); SUPPLIER C: ∆Q(C)
$$V_A + V_B + V_C = \sum V$$
$$S_A + S_B + S_C = \sum S$$
$$G_A + G_B + G_C = \sum G$$
End to end: ∆Q(∑ A+B+C)
[Figure: one-way delay plotted against packet size, separated into ∆Q|G (geographic delay), ∆Q|S (size of packet delay) and ∆Q|V (variable delay due to load).]
G/S/V are independent probability
functions using improper random
variables or improper cumulative
distributions. These can be
(de)convolved and “budgeted”
along the supply chain using
(de)composable “quality SLAs”.
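One way to separate G, S and V from measurements is sketched below. It assumes an estimation method the slide does not spell out: fit a straight line through the minimum observed one-way delay at each packet size, read the intercept as G and the slope as S per byte, and treat whatever lies above the line as V. The function name `decompose_gsv` and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def decompose_gsv(samples: List[Tuple[int, float]]) -> Tuple[float, float, List[float]]:
    """samples: (packet_size_bytes, one_way_delay_ms) -> (G ms, S ms per byte, V per sample)."""
    # The minimum delay seen at each packet size approximates the unloaded path.
    floor: Dict[int, float] = {}
    for size, delay in samples:
        floor[size] = min(delay, floor.get(size, delay))
    sizes = sorted(floor)
    if len(sizes) < 2:
        raise ValueError("need at least two distinct packet sizes")
    # Ordinary least-squares line through the per-size minima.
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(floor[s] for s in sizes) / n
    sxx = sum((s - mean_x) ** 2 for s in sizes)
    sxy = sum((s - mean_x) * (floor[s] - mean_y) for s in sizes)
    s_per_byte = sxy / sxx
    g = mean_y - s_per_byte * mean_x
    # V is the delay left over once G and the size-dependent part are removed.
    v = [delay - (g + s_per_byte * size) for size, delay in samples]
    return g, s_per_byte, v


# Illustrative measurements: three packet sizes, two samples delayed by load.
samples = [(100, 10.1), (100, 12.1), (500, 10.5), (1500, 11.5), (1500, 14.5)]
g, s_per_byte, v = decompose_gsv(samples)
print(f"G={g:.3f} ms  S={s_per_byte:.4f} ms/byte  V={[round(x, 3) for x in v]}")
```

A least-squares fit is only one choice; a lower-envelope fit would be stricter about keeping every V value non-negative.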
∆Q metrics have an algebra for engineering predictable performance (& nothing else does!)
There are other metrics and measures, some of which meet some needs, but none meet all

Requirement | Other metrics and measures | ∆Q-based metrics and measures
Be a strong proxy for QoE | Yes: e.g. effective bandwidth, Actual Experience | Yes
Isolate problems in supply chains | Partial: Correlation, but not strong causation | Yes
Offer an auditable evidence chain | No: Would not stand up in court as standard of proof | Yes
Be non-intrusive | Some: Only for passively observed single point average metrics; others more like DoS attacks! | Yes
Work for all types of bearer | Some: Separate worlds of cable, 5G, SDN/NFV all doing own thing. No user-centric end-to-end view. | Yes
Be cheap to gather and operate | Some: But high fidelity remains expensive. | Yes
Be non-proprietary | Some: Cheap to gather data is of low fidelity | Yes
Have a scientific basis | Partial: Have scientific basis, but limited generality | Yes
Able to define ‘safety margin’ | Partial: Only weak proxies for safety margin | Yes
Can engineer spatial (location of compute, routing of data) as well as temporal (scheduling of resources) | Partial: Some ability to separate static from dynamic but no formal algebra | Yes
SLAs are (de)composable | No: Composition not a meaningful operation | Yes
Martin Geddes
mail@martingeddes.com

CARE initiative technical prospectus

  • 1. Cloud Access Reliability Engineering Initiative Interoperable SLAs for digital supply chains Technical Prospectus 4th November 2018
  • 2. CLOUD SERVICE (SaaS/PaaS) TELCO & CLOUD ACCESS CUSTOMER PREMISES DEVICE APPLICATION SOFTWARE Overview of the problem: digital supply chains exist to support distributed applications 2 DIGITALSUPPLYCHAIN The increasingly variability has making it harder to reason about performance and engineer reliability to meet that demand. We have to decide where to put the compute, and when to communicate. The number of design and configuration degrees of freedom are rising: location, capacity, scheduling, loss vs delay. Questions about too much demand or not enough supply have become hard to answer. Safety margins have become opaque. The trade-offs of cost and performance lack predictable outcomes. The resulting unreliability is driving user frustration – the “motherbuffer”, which in turn creates costly workarounds. The main one has been “spend money on bandwidth” – but this doesn’t work to solve growing variability. The answer is the software industry equivalent of “containerisation”: what the “rack” does for cloud hardware, we need for cloud services. But how? Users have portfolios of cloud applications that they wish to use. These have a mixture of availability demands. The telecoms-cloud industry is building supply using a variety of technologies, such as 5G slicing, SDN/NFV, SD-WAN, serverless computing, and WiFi. That supply is becoming ever more dynamic (higher speeds, more statistical resource sharing, more wireless) and more distributed (e.g. network function virtualisation, edge apps). Technologies like cloud also have a start-up latency for the container versus “bare metal” computing, as well as having to wait for shared resources (as with packet data).
  • 3. LAN Access network Longhaul network Cloud application container SD-WAN VPN DATA CENTRE USER DEVICE EXPERIENCE Presentation layer runtime APPLICATION PROVIDER TELCO & CLOUD ACCESS CUSTOMER PREMISES APPLICATION PROVIDER Digital supply chains are systems of supply and demand… just like any other industry 3 DEMANDFORCOPYINGDATADOWNSTREAM DEMANDFORCOPYINGDATAUPSTEAM SUPPLYOFINFORMATIONDOWNSTREAM SUPPLYOFINFORMATIONUPSTREAM Industries typically have a standard unit of supply and demand that meaningfully “adds up” and interoperates Natural gas BTU Electricity MW Water litre Corn bushel Sugar pound Oil barrel Ethanol gallon Copper ton Wool kilogram Gold ounce Shipping 40ft container Cloud application access = ?
  • 4. Wi-Fi Ethernet xDSL Cable FTTx 2G/3G/4G/5G Public Internet Private cloud global access MPLS/Carrier Ethernet/etc. Hosted app Public or private cloud Serverless functions SD-WAN No SD-WAN VPN No VPN DISTRIBUTED APPLICATION SOFTWARE DISTRIBUTED APPLICATION SOFTWARE VDI Web browser UC/VoIP AppTV APPLICATION PROVIDER TELCO & CLOUD ACCESS CUSTOMER PREMISES APPLICATION PROVIDER Digital supply chains involve complex interactions between multiple technology stacks 4 HORIZONTAL INTEROPERABILITY VERTICALINTEROPERABILITY ICT suppliers want to manage “vertical” interoperability: How do I deliver enough reliability end-to-end to meet the customer’s need? How can I optimise cost for my sub-path without sacrificing that reliability? End users demand “horizontal” interoperability: How do I know what on- premise capabilities I need? How do I select a service provider and network access technology and know my applications will work?
  • 5. Wi-Fi Ethernet xDSL Cable FTTx 2G/3G/4G/5G Public Internet Private cloud global access MPLS/Carrier Ethernet/etc. Hosted app Public or private cloud Serverless functions SD-WAN No SD-WAN VPN No VPN DISTRIBUTED APPLICATION SOFTWARE DISTRIBUTED APPLICATION SOFTWARE VDI Web browser UC/VoIP AppTV APPLICATION PROVIDER TELCO & CLOUD ACCESS CUSTOMER PREMISES APPLICATION PROVIDER “Vertical” interoperability needs standardised and (de)composable “availability SLAs” 5 The end-to-end requirement is for a bounded probability of packet latency and loss. The concept of “quality attenuation” unifies latency and loss into a single mathematical object, analogous to how complex numbers bring together real and imaginary numbers. ”Quality attenuation” can be expressed as composable “availability SLAs” (when using ∆Q metrics) that “add up” along the end-to-end path. END-TO-ENDQUALITYREQUIREMENT
  • 6. Wi-Fi Ethernet xDSL Cable FTTx 2G/3G/4G/5G Public Internet Private cloud global access MPLS/Carrier Ethernet/etc. Hosted app Public or private cloud Serverless functions SD-WAN No SD-WAN VPN No VPN DISTRIBUTED APPLICATION SOFTWARE DISTRIBUTED APPLICATION SOFTWARE VDI Web browser UC/VoIP AppTV APPLICATION PROVIDER TELCO & CLOUD ACCESS CUSTOMER PREMISES APPLICATION PROVIDER “Horizontal” interoperability decouples operational implementation from the availability SLA 6 HORIZONTAL INTEROPERABILITY Availability is something you can only lose: the baseline is “always available”, and every element can only detract from that standard of perfection! We can use the “availability SLAs” to create a “budget” for quality attenuation for each sub-system and element of the system. ”Horizontal” interoperability means we can safely make independent operational choices over how to meet that “budget”. As long as we are “under budget” (to an agreed probability) then we know the end-to-end availability requirement will still be met.
  • 7. Wi-Fi Ethernet xDSL Cable FTTx 2G/3G/4G/5G Public Internet Private cloud global access MPLS/Carrier Ethernet/etc. Hosted app Public or private cloud Serverless functions SD-WAN No SD-WAN VPN No VPN DISTRIBUTED APPLICATION SOFTWARE DISTRIBUTED APPLICATION SOFTWARE VDI Web browser UC/VoIP AppTV APPLICATION PROVIDER TELCO & CLOUD ACCESS CUSTOMER PREMISES APPLICATION PROVIDER Our industry challenge: to develop a supply chain quality management system 7 HORIZONTAL INTEROPERABILITY VERTICALINTEROPERABILITY Relating “vertical” to “horizontal” needs a common language of metrics (for SLAs) and measures (for operations): Standardised metrics Standardised SLAs Standardised operational measurement methods Standardised service lifecycle management processes Standardised network quality assurance mechanisms
  • 8. DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIBUTED COMPUTING DISTRIB UTED COMPU TING APPLICATION PROVIDER TELCO & CLOUD ACCESS CUSTOMER PREMISES APPLICATION PROVIDER We have to create the performance integration framework (“glue”) for interoperable SLAs 8 Other industries have solved this reliability engineering integration problem. For instance, the oil industry can relate “upstream” extraction and refining activities to “downstream” distribution. Telecoms had this capability during the telephony era with erlangs as a rational unit of supply and demand. We lost this when we moved to packet-based statistical multiplexing. ∆Q-based “availability SLAs” are ”cloud access erlangs”. We can now performance engineer digital supply chains, create new assurance revenue and reduce the cost of workarounds and failure.
  • 9. Requirements Challenges Opportunities Define availability SLA …in the user/customer’s own terms Availability SLA metrics …don’t sufficiently reflect QoE and aren’t properly composable/causal Capture the basic science …as ∆Q calculus is the only known scientifically sound approach Solving these reliability engineering problems needs interoperable metrics and measures 9 Decompose SLA …’vertically’ in supply chains to specify SLA requirement on each subsystem and sub-subsystem Market differentiated SLA …to win the customer’s business in ‘horizontal’ competitive market Operationally deliver SLA …to meet our service availability promises (with optimised cost and risk trade-offs for profitability) SLA operational measures …aren’t calibrated against testing and inspection reference standards SLA management models …don’t fully incorporate established and proven management theory (e.g. 6-sigma, TOC, lean, Vanguard) SLA assurance methods …require improved or new business processes across service lifecycle Create standard metrics …that accurately reflect availability in the customer’s own terms Calibrate measures …to manage error bounds, done based on shared cost and IPR Construct assurance SLAs …that have the essential “horizontal” and “vertical” interoperability properties
  • 10. We have a strong baseline of ∆Q research and technology that now requires industrialisation
    Opportunities
      Capture the basic science …as ∆Q performance calculus is the only scientifically sound approach
      Create standard metrics …that accurately reflect availability in the customer’s own terms
      Calibrate measures …to manage error bounds, done based on shared cost and IPR
      Construct assurance SLAs …that have the essential “horizontal” and “vertical” interoperability properties
    Baseline
      Fully developed ∆Q theoretical framework. Quality attenuation theory training materials available.
      “Wind tunnel” for cloud apps to establish ∆Q-based SLA. Demonstrated ability to “budget” performance using ∆Q.
      Mature first generation ∆Q measurement system. Industrialisation and scaling (e.g. TWAMP) in progress.
      Contention Management technology developed and demonstrated to assure ∆Q SLA even in overload. Operational trial platform available for shared use (collaboration with Just Right Networks Ltd & SureTec Ltd).
    15+ years of development by a leading team of reliability engineering and distributed systems experts
    Proven at tier 1 network operators as well as in many boutique and exotic applications
  • 11. The next steps are clear… and interoperability by its nature is a collaborative activity
    Opportunities
      Capture the basic science …as ∆Q performance calculus is the only scientifically sound approach
      Create standard metrics …that accurately reflect availability in the customer’s own terms
      Calibrate measures …to manage error bounds, done based on shared cost and IPR
      Construct assurance SLAs …that have the essential “horizontal” and “vertical” interoperability properties
    Next steps
      Develop ex ante reliability engineering curriculum from existing materials (‘train the trainer’) and disseminate to industry
      Create reference ∆Q-based SLAs for common application types
      Construct software libraries to persist and manipulate the SLAs
      Frictionless deployment of packet observation end points
      Use high-fidelity measures to audit lower-fidelity measures
      Establish ‘state of the possible’ (i.e. highest yield of QoE at lowest cost and risk) as a technical reference study that feeds into the business case for industry investment in the initiative
    The time has come to ‘up our game’ and collectively engage with the problem of interoperable metrics
    This requires an industry effort that no single player can deliver, as every process and system is affected
  • 12. Purpose of the Cloud Access Reliability Engineering Initiative
    We help you to apply these reliability engineering breakthroughs to solve the cloud access interoperability and integration problem for distributed applications in complex digital supply chains
    Cloud access reliability engineering…
      …performance integration technology
      …management methods and processes
      …skills, people and relationships
    We feed specialist skills into the development of existing industry initiatives and projects. Examples: 5G, SDN/NFV, Zero Touch Automation, UCaaS, distributed apps (inc. blockchain), machine learning
    We adapt established quality management methods from other industries, and drive the adoption of new scientific metrics and measures that are suitable for our own.
    We develop methods, standards and tools by a process of action learning. We target the full service lifecycle (product development, marketing and sales, in-life service and support).
  • 14. We know the ‘interoperable unit’ answer: the ∆Q calculus (see http://qualityattenuation.science/)
    [Figure labels: NETWORK DEMAND, NETWORK SUPPLY, QoE, ‘SLAZARD’, High-fidelity network measures]
  • 15. ∆Q metrics have an algebra for engineering predictable performance (& nothing else does!)
    [Figure: one-way delay against packet size, split into ∆Q|G (geographic delay), ∆Q|S (size of packet delay) and ∆Q|V (variable delay due to load)]
    Suppliers A, B and C each contribute ∆Q(A), ∆Q(B), ∆Q(C), with components V, S and G:
      V_A + V_B + V_C = ∑V
      S_A + S_B + S_C = ∑S
      G_A + G_B + G_C = ∑G
      giving ∆Q(∑ A+B+C) for the whole chain
    G/S/V are independent probability functions using improper random variables or improper cumulative distributions. These can be (de)convolved and “budgeted” along the supply chain using (de)composable “quality SLAs”.
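    The composition on this slide can be illustrated with a minimal sketch. It assumes each supplier's ∆Q is a discrete delay distribution in fixed-width bins whose total mass may be less than 1, with the missing mass read as loss (one way to model an "improper" distribution), and that suppliers are independent so that composing them is a convolution. The bin width, function names and numbers are hypothetical and do not come from the deck.

```python
from typing import List

BIN_MS = 1.0  # hypothetical bin width: index i means "delay of i * BIN_MS"

def compose(p: List[float], q: List[float]) -> List[float]:
    """Convolve two improper delay distributions (independence assumed)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b  # delays add, probabilities multiply
    return out

def delivered_fraction(p: List[float]) -> float:
    """Total mass of the distribution; 1 minus this is the loss."""
    return sum(p)

def prob_within(p: List[float], max_delay_ms: float) -> float:
    """Probability that a packet arrives at all and within max_delay_ms."""
    bins = int(max_delay_ms / BIN_MS)
    return sum(p[: bins + 1])

# Illustrative numbers only: supplier A loses 10% of packets, supplier B 1%.
supplier_a = [0.0, 0.5, 0.4]
supplier_b = [0.5, 0.49]
end_to_end = compose(supplier_a, supplier_b)
```

    In this sketch a "quality SLA" could be a bound such as `prob_within(end_to_end, 2.0) >= target`, and budgeting means choosing per-supplier distributions whose convolution still meets that bound.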
  • 16. There are other metrics and measures, some of which meet some needs, but none meet all
    Requirement | Other metrics and measures | ∆Q-based metrics and measures
    Be a strong proxy for QoE | Yes: e.g. effective bandwidth, Actual Experience | Yes
    Isolate problems in supply chains | Partial: Correlation, but not strong causation | Yes
    Offer an auditable evidence chain | No: Would not stand up in court as standard of proof | Yes
    Be non-intrusive | Some: Only for passively observed single point average metrics; others more like DoS attacks! | Yes
    Work for all types of bearer | Some: Separate worlds of cable, 5G, SDN/NFV all doing own thing. No user-centric end-to-end view. | Yes
    Be cheap to gather and operate | Some: But high fidelity remains expensive. | Yes
    Be non-proprietary | Some: Cheap to gather data is of low fidelity | Yes
    Have a scientific basis | Partial: Have scientific basis, but limited generality | Yes
    Able to define ‘safety margin’ | Partial: Only weak proxies for safety margin | Yes
    Can engineer spatial (location of compute, routing of data) as well as temporal (scheduling of resources) | Partial: Some ability to separate static from dynamic but no formal algebra | Yes
    SLAs are (de)composable | No: Composition not a meaningful operation | Yes

Editor's Notes

  • #15: Only works if you have ∆Q calculus to unify loss and delay into a single statistical resource and high-fidelity metrics to create CDF. Special ingredient is to break latency into three basis functions: geography, packet size, and variable load delay
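    The three basis functions in this note can be sketched as follows. This is an assumption about how such a split might be estimated, not the author's method: the minimum one-way delay seen at each packet size is treated as the uncongested floor, a straight line fitted through those floors gives G (intercept) and S (slope per byte), and whatever a sample shows above the line is counted as V. Function and variable names are hypothetical.

```python
from collections import defaultdict
from typing import List, Tuple

def decompose(samples: List[Tuple[int, float]]) -> Tuple[float, float, List[float]]:
    """samples: (packet_size_bytes, one_way_delay_ms) pairs for delivered packets."""
    # Floor: smallest delay observed for each packet size.
    floor = defaultdict(lambda: float("inf"))
    for size, delay in samples:
        floor[size] = min(floor[size], delay)
    sizes = sorted(floor)
    if len(sizes) < 2:
        raise ValueError("need at least two distinct packet sizes")
    # Ordinary least-squares line through the floors.
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(floor[s] for s in sizes) / n
    sxx = sum((s - mean_x) ** 2 for s in sizes)
    sxy = sum((s - mean_x) * (floor[s] - mean_y) for s in sizes)
    s_per_byte = sxy / sxx             # S: delay added per byte of packet size
    g = mean_y - s_per_byte * mean_x   # G: size-independent (geographic) delay
    # V: what each sample shows above the G + S line.
    v = [delay - (g + s_per_byte * size) for size, delay in samples]
    return g, s_per_byte, v

# Illustrative measurements only.
g, s_per_byte, v = decompose([(100, 10.1), (100, 10.5), (1100, 11.1), (1100, 12.0)])
```

    Lost packets are left out of this sketch; in the ∆Q framing they would be carried as the missing probability mass rather than as delay samples.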