End-to-End
Quality of Experience
Evaluation for
HTTP Adaptive Streaming

Babak Taraghi
Univ.-Prof. DI Dr. Christian Timmerer
Assoc.-Prof. DI Dr. Mathias Lux
Assoc.-Prof. DI Dr. Klaus Schöffmann
Assoc.-Prof. DI Dr. Ali Cengiz Begen
Class of 2020
ATHENA Christian Doppler (CD) Laboratory
ITEC - Institute of Information Technology
Agenda
• Introduction and Context (9 minutes)
• Evaluation Frameworks (8 minutes)
• Studies on QoE Impacting Factors
(12 minutes)
• Comprehensive Dataset Presentation
(7 minutes)
• Highlights and Future Directions
(3 minutes)
• Q&A
Introduction
Context, HAS and QoE
Research Questions
Research Methodology
Contributions and Publications
HTTP Adaptive Streaming I
Figure 1: HTTP Adaptive Streaming (HAS) concept and how the delivered quality of segments depends on the shape of the network.
4
• Provisioning
– Codecs and Encoders, Encryptors
• Delivery
– Network Protocols, and
Topologies
• Consumption
– Media players and ABR
algorithms
HTTP Adaptive Streaming II
Consumption
Delivery
Provisioning
5
End-to-end
Aspect
Quality of Experience I
The degree of delight or annoyance of the user of
an application or service. It results from the
fulfilment of his or her expectations with respect to
the utility and/or enjoyment of the application or
service in the light of the user’s personality and
current state. – Brunnström et al. [27]
6
Quality of Experience II
• How can we evaluate or measure the user's degree of annoyance or delight?
– Objective Evaluation
• Understand and formulate the metrics
– Start-up Delay: How long does it take for the user to see the first frame
of the video from the moment s/he clicks the play button?
– Delivered Media Quality: What is the delivered media quality
at each moment and on average?
• E.g.: VMAF, Resolution, and Bitrate
– Stall Events (rebuffering): How many times a
stall event happens and for how long?
• Using quality models
– Subjective Evaluation
• Investigate the quality as perceived by the user
– Conduct evaluation with human subjects
7
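The objective metrics above can be derived directly from player event logs. The following sketch assumes a hypothetical log format of (timestamp, event) pairs; it is not tied to any particular player's telemetry schema.

```python
# Sketch: deriving objective QoE metrics from a player event log.
# The log format (timestamp in seconds, event name) is hypothetical.

def qoe_metrics(events):
    """Compute start-up delay and stall statistics from (time, event) pairs."""
    startup_delay = None
    stalls = []          # durations of completed stall events
    stall_start = None
    play_clicked = None
    for t, ev in events:
        if ev == "play_clicked":
            play_clicked = t
        elif ev == "first_frame" and play_clicked is not None:
            startup_delay = t - play_clicked
        elif ev == "stall_begin":
            stall_start = t
        elif ev == "stall_end" and stall_start is not None:
            stalls.append(t - stall_start)
            stall_start = None
    return {
        "startup_delay": startup_delay,
        "stall_count": len(stalls),
        "total_stall_duration": sum(stalls),
    }

log = [(0.0, "play_clicked"), (1.8, "first_frame"),
       (30.0, "stall_begin"), (32.5, "stall_end")]
print(qoe_metrics(log))
```

Delivered media quality (e.g., VMAF per segment) would be accumulated the same way, from quality-switch events carrying the active representation.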
Research Questions
RQ1) How can we design, develop, and deploy
scalable end-to-end QoE evaluation
groundwork for HAS, encompassing both
video-on-demand content and low-latency live
streaming?
RQ2) What are the perceptual factors
influencing QoE, and how can they be effectively
evaluated through subjective assessment
methods in HAS? And how well do existing quality
models align with the findings derived from
the subjective assessments?
8
• Empirical Research Methodology
– An approach to investigation that relies on direct or indirect observation and experience to
gather data and generate knowledge. It involves systematically collecting and analysing
empirical evidence, such as measurements, experiments, and observations, to test
hypotheses and validate theories.
– Data-driven Assessment
– Real-world Evaluation and User-Centric Perspective
– Allows Objective and Subjective Measures
• Objective: Unbiased and quantifiable, using predetermined criteria
and standards [9]
• Subjective: The process of assessment based on personal
opinions, feelings, or individual judgments [9]
– Supports Iterative Improvement
– Helps with Industry and Standardization
Research Methodology
9
10
CAdViSE: cloud-based adaptive video streaming evaluation framework for the automated testing of media players.
In Proceedings of the 11th ACM Multimedia Systems Conference (MMSys), 2020
Understanding Quality of Experience of Heuristic-based HTTP Adaptive Bitrate Algorithms. In Proceedings of the
31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV), 2021
INTENSE: In-Depth Studies on Stall Events and Quality Switches and Their Impact on the Quality of Experience in
HTTP Adaptive Streaming. In IEEE Access, 2021
Multi-codec ultra high definition 8K MPEG-DASH dataset. In Proceedings of the 13th ACM Multimedia Systems
Conference (MMSys), 2022
LLL-CAdViSE: Live Low-Latency Cloud-Based Adaptive Video Streaming Evaluation Framework. In IEEE Access, 2023
Contributions
Evaluation Frameworks
11
CAdViSE: Cloud-based Adaptive Video
Streaming Evaluation
LLL-CAdViSE: Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation
• Media Players Evaluation with CAdViSE
• Live Low-latency Evaluation with LLL-CAdViSE
Use Cases
• A Quality of Experience evaluation framework for HTTP Adaptive Streaming
– Facilitates an organized and structured evaluation
• The test environment remains the same; therefore, results can be
interpreted as improved or degraded performance
– It is cloud-based, since scalability is a key factor
– Can assess multiple ABR algorithms and media players simultaneously
– Simulates network conditions; accepts network traces as plugins
• Mimics real-world network characteristics
– Provides unified insights into quality metrics
• Measures raw metrics
• Works seamlessly with analytic tools (graphs and plots)
CAdViSE (What?)
12
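The trace-plugin idea can be sketched as a mapping from a bandwidth trace to Linux traffic-control commands. The trace format (duration in seconds, bandwidth in kbit/s) and the interface name are assumptions for illustration; CAdViSE's actual plugin interface may differ.

```python
# Sketch: turning a network trace into Linux `tc` token-bucket commands,
# in the spirit of CAdViSE's network-emulation plugins. Trace format and
# interface name are illustrative assumptions.

def trace_to_tc(trace, iface="eth0"):
    """Return (hold_seconds, command) pairs that replay a bandwidth trace."""
    cmds = []
    verb = "add"  # first invocation creates the qdisc, later ones change it
    for duration, kbps in trace:
        cmds.append((duration,
                     f"tc qdisc {verb} dev {iface} root tbf "
                     f"rate {kbps}kbit burst 32kbit latency 400ms"))
        verb = "change"
    return cmds

# A toy fluctuation profile: 4 Mbit/s for 30 s, then 800 kbit/s for 30 s.
for hold, cmd in trace_to_tc([(30, 4000), (30, 800)]):
    print(f"hold {hold}s -> {cmd}")
```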
• Application Layer
– Runner, Initializer and Starter scripts
– Written with Bash Script, Python and JavaScript
• Cloud Components
– Player Container (VNC and Selenium)
– Network Emulator
– EC2 Instances, SSM Execution, DynamoDB, S3 and
CloudWatch
• Logs and Analytics
– Comprehensive Logs
– Analytic Players Plugin
CAdViSE (How?)
13
Live Low-Latency (CAdViSE)
14
Server (AWS EC2)
- Generate the live feed
- Encode
- Package (DASH & HLS)
- Ingest & Deliver
- Calculate MOS
- Manipulate network
Client (AWS EC2)
- Run media player
- Redirect requests to
server
- Record logs
- Manipulate
network
Database (AWS DynamoDB)
- Store log records
- Index the data
- Retrieve log
records
LLL-CAdViSE Console (Shell)
- Manage EC2 instances
- Initialize server and client(s)
- Execute the experiment
- Execute QoE
calculation
Preliminary Evaluation with CAdViSE
15
• 5 experiments; 9 minutes each
• AWS EC2 t2.medium instances (4 GiB RAM, 2 CPU
cores)
• Emulated network profiles: between 4 Mbit/s and 800 kbit/s
• Target Latencies: 1s, 3s, 5s, and 10s
• Two streaming formats:
– MPEG-DASH (dash.js 4.4.1)
– HLS (hls.js 1.2.0)
• ABR algorithms:
– Learn2Adapt-LowLatency (L2A-LL)
– Low-on-Latency Plus (LoLP)
• 3 Experiments of 420 seconds
• Network profiles:
– Bicycle commuter LTE network
– Car driver LTE network
– Train commuter LTE network
– Tram commuter LTE network
– Network0: up to 10 Gbps
LLL-CAdViSE Evaluation Setup
16
Figure: available bandwidth (kbps) over time (seconds, 0-410 s) for the four real-world LTE network profiles (Bicycle commuter, Car driver, Train commuter, Tram commuter), each with a polynomial trend line over the available bandwidth.
LLL-CAdViSE Evaluation Result I
17
• All time values are in seconds.
• a: Experiment title, format: [protocol]-[ABR]-[network]-
[target latency] (def: Default, l2a: L2A-LL).
• b: Average total duration of stall events.
• c: Average start-up delay.
• d: Average total duration of seek events.
• e: Average number of quality switches.
• f: Playback bitrate (min-max-avg) in kbps.
• g: Latency (min-max-avg).
• h: Playback rate (min-max-avg).
• i: Average MOS predicted by the ITU-T P.1203
quality model.
LLL-CAdViSE Evaluation Result II
18
Figure: average latency (seconds) and average P.1203 MOS per network profile (Bicycle, Car, Train, Tram, Net0) for the Default, L2A-LL, and LoLP ABR algorithms, shown for MPEG-DASH and HLS at a 5 s target latency.
Studies on QoE
Impacting Factors
19
• Exploring Adaptive Bitrate (ABR) Algorithms
• Objective and Subjective Evaluation
• Empirical Findings
Understanding Quality of Experience
• Minimum Noticeable Stall event Duration (MNSD) Evaluation
• Stall event vs. Quality level switch (SvQ) Evaluation
• Short stall events vs. a Longer stall event (SvL) Evaluation
• Relation of Stall event impact on the QoE with Video Quality level
(RSVQ) Evaluation
• Objective QoE Models Comparison
In-depth Studies on Stall Events and Quality Switches
• Throughput-based
– Uses throughput prediction heuristics to optimize streaming quality by estimating available network
bandwidth.
– Examples: PANDA, Festive, CrystalBall.
• Buffer-based
– Relies solely on buffer occupancy to make streaming decisions, aiming to prevent buffer underruns and
stalling.
– Examples: BBA0, BOLA, Quetra.
• Hybrid
– Integrates multiple heuristics such as throughput, buffer level, and latency
to make comprehensive streaming decisions.
– Examples: GTA, Elastic, MPC.
• Learning-based
– Utilizes machine learning techniques to adapt streaming quality based on
historical data and real-time network conditions.
– Examples: Pensieve, Fugu, Stick.
Exploring ABR Algorithms
20
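The first two categories can be illustrated with minimal decision rules. This is an illustrative sketch only, not the logic of PANDA, BBA0, or any other named algorithm; the bitrate ladder and thresholds are invented for the example.

```python
# Sketch: minimal throughput-based and buffer-based ABR decision rules.
# Illustrative only -- not the logic of any named algorithm.

LADDER = [400, 800, 1600, 3200, 6400]  # available bitrates in kbit/s (assumed)

def throughput_based(estimated_kbps, safety=0.8):
    """Pick the highest bitrate that fits under a safety-scaled estimate."""
    budget = estimated_kbps * safety
    candidates = [b for b in LADDER if b <= budget]
    return candidates[-1] if candidates else LADDER[0]

def buffer_based(buffer_s, reservoir=5.0, cushion=20.0):
    """Map buffer occupancy linearly onto the ladder (BBA-style idea)."""
    if buffer_s <= reservoir:
        return LADDER[0]
    if buffer_s >= reservoir + cushion:
        return LADDER[-1]
    frac = (buffer_s - reservoir) / cushion
    return LADDER[int(frac * (len(LADDER) - 1))]

print(throughput_based(3000))   # -> 1600 (budget 2400, highest fitting rung)
print(buffer_based(25.0))       # -> 6400 (buffer at the top of the cushion)
```

Hybrid algorithms combine both signals, and learning-based ones replace the hand-written rule with a trained policy.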
CAdViSE Testbed:
Cloud-based platform for assessing ABR algorithms under diverse network conditions.
Ensures reproducibility with session logs for accurate recreation of streaming sessions.
Experiment Logs:
Logs archived in DynamoDB.
Script processes logs to simulate and inject stall events using FFmpeg.
Video Processing:
Generates a JSON file for ITU-T P.1203 model to obtain Mean Opinion Score (MOS).
Concatenates audio and video tracks for finalized mp4 files.
Evaluation Portal:
Developed using Serverless Architecture and AWS Lambda.
Based on ITU-T P.910 standards for subjective assessments.
Crowdsourced Testing:
Uses Amazon Mechanical Turk for participant recruitment.
Custom web media player delivers test sequences to users.
Evaluation Process:
Participants watch and rate 10 test sequences on a 1 to 5 scale.
Reliability questions ensure valid votes.
Results stored and processed via AWS services.
Objective and Subjective Evaluation
21
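The rating and filtering steps above can be sketched as follows. The session data shape and field names are assumptions for illustration, not the portal's actual storage format.

```python
# Sketch: computing per-sequence MOS from crowdsourced ratings while
# discarding participants who fail the reliability questions.
# The data shape is hypothetical.

def mos_per_sequence(sessions):
    """sessions: list of dicts with a 'reliable' flag and 'ratings' {seq: 1..5}."""
    totals = {}
    for s in sessions:
        if not s["reliable"]:      # drop invalid votes entirely
            continue
        for seq, rating in s["ratings"].items():
            totals.setdefault(seq, []).append(rating)
    return {seq: sum(r) / len(r) for seq, r in totals.items()}

sessions = [
    {"reliable": True,  "ratings": {"seq1": 4, "seq2": 3}},
    {"reliable": True,  "ratings": {"seq1": 5, "seq2": 2}},
    {"reliable": False, "ratings": {"seq1": 1, "seq2": 1}},  # filtered out
]
print(mos_per_sequence(sessions))  # -> {'seq1': 4.5, 'seq2': 2.5}
```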
Figure: available bandwidth (kbps) over time (seconds, 0-120 s) for the four synthetic network profiles used in the evaluation: Ramp Up, Ramp Down, Stable, and Fluctuation.
Empirical Findings I
22
Average stall duration (seconds) per ABR algorithm and network profile:

             FastMPC  Elastic   BBA0  Quetra   BOLA  dash.js  Shaka
Fluctuation    73.23     5.85   7.95   10.88  28.46    41.40  52.25
Ramp Down      30.63     8.35   6.18   10.33  11.29    21.29  34.90
Ramp Up        17.18     0.00   0.19    0.00   4.13     4.55  13.39
Stable         12.84     0.16   0.00    0.00   4.20     4.26  20.12

Average start-up delay (seconds) per ABR algorithm and network profile:

             FastMPC  Elastic   BBA0  Quetra   BOLA  dash.js  Shaka
Fluctuation     5.48     5.36   5.48    5.36   5.56     5.50   5.28
Ramp Down       5.56     5.29   5.41    5.43   5.57     5.54   5.40
Ramp Up         7.22     6.37   6.56    6.78   7.48     7.51   9.65
Stable          5.65     5.46   5.40    5.42   5.62     5.65   5.65
Empirical Findings II
23
Objective vs. subjective MOS per ABR algorithm and network profile, with Pearson's correlation coefficient (PCC) between the two:

Stable network profile (PCC 0.84):
                BBA0   BOLA  dash.js  Elastic  FastMPC  Quetra  Shaka
Objective MOS   2.64   2.93    3.13     2.24     3.07    2.26   2.70
Subjective MOS  3.66   3.87    3.98     3.67     3.80    3.34   3.67

Fluctuation network profile (PCC 0.52):
                BBA0   BOLA  dash.js  Elastic  FastMPC  Quetra  Shaka
Objective MOS   2.22   1.86    1.99     2.07     1.91    1.98   1.98
Subjective MOS  3.39   3.21    3.29     3.12     3.10    3.08   3.30

Ramp Up network profile (PCC 0.94):
                BBA0   BOLA  dash.js  Elastic  FastMPC  Quetra  Shaka
Objective MOS   2.56   2.67    2.63     2.26     2.84    2.26   2.79
Subjective MOS  3.62   3.73    3.65     3.45     3.68    3.41   3.73

Ramp Down network profile (PCC 0.90):
                BBA0   BOLA  dash.js  Elastic  FastMPC  Quetra  Shaka
Objective MOS   2.33   2.43    2.33     2.00     2.48    2.00   2.35
Subjective MOS  3.48   3.65    3.48     3.26     3.48    3.26   3.45
In-depth Studies on Stall Events and
Quality Switches
• Minimum Noticeable Stall Duration (MNSD):
– Investigated the threshold below which stall events are not noticeable to users,
thus not affecting perceived QoE.
• Stall Event vs. Quality Switch (SvQ):
– Evaluated user preference between experiencing a stall event or a quality drop
during unfavourable network conditions.
• Short vs. Long Stall Events (SvL):
– Studied the impact on QoE of multiple short stall events versus a single longer
stall event, considering both predicted and perceived MOS.
• Stall Impact and Video Quality (RSVQ):
– Examined the relationship between the impact of stall events on QoE and video
quality level, addressing conflicting findings from previous studies.
• QoE Models Comparison:
– Compared various QoE objective evaluation models with subjective MOS results
to study their correlations.
24
Subjective Evaluation Portal
25
Minimum Noticeable Stall Duration
26
Figure: share of noticed vs. missed stall events per stall-event duration bucket (from <0.051 s up to <1.001 s), with a logarithmic trend line over the missed stall events.
• Decrease in noticed stall events starts at
durations less than 0.301 seconds.
• Over 45% of subjects did not notice stall
events shorter than 0.051 seconds.
• Stall events under 0.004 seconds were not
noticeable to participants.
Stall Event vs. Quality Switch
27
Set A - Case I: A pattern with 6s stall
event and upward quality switch.
Set A - Case II: A pattern without a stall
event and continuous low-quality
streaming.
Set B - Case I: A pattern with high
video quality streaming but with a 6s
stall event.
Set B - Case II: A pattern with a
downward quality switch and without
stall event.
Stall Event vs. Quality Switch
28
Mean Opinion Score per stall-event pattern:

                Set A Case I  Set A Case II  Set B Case I  Set B Case II
Perceived MOS       3.28          3.11           3.75          3.52
Predicted MOS       2.96          2.45           3.62          2.76
• Preference for Case I in both Set A
and Set B over Case II.
• Preference for higher-quality
versions even with a 6-second stall.
Short vs. Long Stall Events
29
Mean Opinion Score per stall-event pattern (count-duration in seconds):

                (0-0)  (1-4)  (1-8)  (4-1)  (4-2)  (8-1)
Perceived MOS    4.54   4.11   3.83   3.44   3.35   3.23
Predicted MOS    4.71   4.31   4.12   3.33   3.23   2.67
• Preference for longer stall
events over frequent, shorter
ones
Stall Impact and Video Quality
30
Mean Opinion Score per VMAF video quality level, with and without a stall event:

                  Q1  Q1+Stall    Q2  Q2+Stall    Q3  Q3+Stall
Perceived MOS   2.85      2.57  3.81      3.08  4.48      3.77
Predicted MOS   1.88      1.65  2.60      2.11  4.63      3.36
• Minor QoE penalty from stall events
at low-quality videos (Q1).
• Higher penalty on QoE for middle
(Q2) and high-quality (Q3) videos
with stall events.
QoE Models Comparison
31
• BiQPS and FINEAS:
- Inconsistent performance across
evaluations.
• P.1203 model:
- Best overall performance.
- Highest PCC and SRCC (> 0.8)
- Lowest RMSE: 0.326.
• Pearson Correlation Coefficient (PCC)
• Spearman’s Rank Correlation Coefficient (SRCC)
• Root Mean Square Error (RMSE)
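These three agreement measures can be computed without external dependencies. A minimal plain-Python sketch follows; the predicted and subjective values in the example are hypothetical.

```python
# Sketch: PCC, SRCC, and RMSE between predicted and subjective MOS,
# in plain Python (no SciPy dependency).
from math import sqrt

def pcc(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(v):
    """Average ranks (1-based), handling ties."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def srcc(x, y):
    return pcc(ranks(x), ranks(y))

def rmse(x, y):
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

predicted  = [2.9, 3.6, 4.1, 2.2, 3.3]   # hypothetical model outputs
subjective = [3.1, 3.5, 4.4, 2.5, 3.2]   # hypothetical MOS votes
print(pcc(predicted, subjective), srcc(predicted, subjective),
      rmse(predicted, subjective))
```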
A Comprehensive
Dataset
32
Video Codecs and Development Procedures
Source Video Sequences
Available Representations
Video Codecs and Development Procedures
33
• Advanced Video Coding (AVC)
– Library: libx264 (version 0.160.3011) from FFmpeg, slow preset.
• High Efficiency Video Coding (HEVC)
– Library: libx265 (version 3.4) from FFmpeg, slow preset.
• AOMedia Video 1 (AV1)
– Library: libsvtav1 (version 0.9.0) from FFmpeg, preset 8.
• Versatile Video Coding (VVC)
– Library: Fraunhofer VVenC (version 1.3.1); requires 8-bit YUV input, processed with FFmpeg and encoded with VVenC.
– At dataset preparation time, MP4Box (part of the GPAC project) supported VVC in nightly builds, enabling MP4 file
packaging, VVC bitstream dumping, and MPEG-DASH content packaging
• ISOBMFF incl. VVC
• DASH manifest
Figure: VVC workflow. The encoder outputs VVC elementary streams; GPAC packages them together with the encoded audio track; playback uses the GPAC MP4Client decoder.
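The encode-and-package steps above can be sketched as command lines, built here in Python for clarity. The invocations follow the public vvencapp and MP4Box interfaces, but file names and parameter values are illustrative; the exact settings used for the dataset may differ.

```python
# Sketch: VVC encode + ISOBMFF packaging + MPEG-DASH segmentation as
# command lines. File names and parameter values are illustrative.
import shlex

def vvc_pipeline(yuv="source_8bit.yuv", w=3840, h=2160, fps=60, seg_ms=4000):
    # Encode an 8-bit YUV input to a VVC elementary stream with vvencapp.
    encode = ["vvencapp", "-i", yuv, "-s", f"{w}x{h}", "-r", str(fps),
              "--preset", "medium", "-o", "video.266"]
    # Wrap the VVC elementary stream (plus audio) into ISOBMFF with MP4Box.
    package = ["MP4Box", "-add", "video.266", "-add", "audio.mp4", "out.mp4"]
    # Produce MPEG-DASH segments and a manifest.
    dash = ["MP4Box", "-dash", str(seg_ms), "-profile", "live",
            "-out", "manifest.mpd", "out.mp4"]
    return [shlex.join(c) for c in (encode, package, dash)]

for cmd in vvc_pipeline():
    print(cmd)
```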
Source Video Sequences
34
Available Representations
35
• Resolutions up to 7680x4320 (8K)
• Maximum media duration of 322 seconds
• Segment lengths of 4 and 8 seconds
• Publicly available at:
http://www.itec.aau.at/ftp/datasets/mmsys22
Highlights And
Conclusion
Three Main Categories of Contributions:
1. Evaluation frameworks (CAdViSE and LLL-CAdViSE) for VOD and
live streaming.
– Directly addresses RQ1.
2. Studies on subjective and objective QoE assessments and the
impacts of HAS defects on QoE.
– Directly addresses RQ2.
3. Comprehensive dataset with up-to-date video technologies,
including 8K VVC.
– Directly addresses RQ1.
36
Future Work
37
• Support for New Protocols and Codecs: Extend evaluation frameworks to include
emerging standards like WebRTC and VVC.
• Machine Learning for QoE: Apply machine learning techniques to predict and optimize
QoE based on assessment data.
• Enhance Quality Models: Align existing quality models with subjective assessment
findings for better prediction accuracy.
• Real-time QoE Monitoring: Develop tools for real-time
QoE monitoring and feedback to enable dynamic
adjustments during streaming sessions.
• User-centric QoE Personalization: Investigate methods
for personalizing QoE based on individual user
preferences and viewing habits.
Thank You!
38
Q&A
  • 10. Contributions • CAdViSE: cloud-based adaptive video streaming evaluation framework for the automated testing of media players. In Proceedings of the 11th ACM Multimedia Systems Conference (MMSys), 2020 • Understanding Quality of Experience of Heuristic-based HTTP Adaptive Bitrate Algorithms. In Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV), 2021 • INTENSE: In-Depth Studies on Stall Events and Quality Switches and Their Impact on the Quality of Experience in HTTP Adaptive Streaming. In IEEE Access, 2021 • Multi-codec ultra high definition 8K MPEG-DASH dataset. In Proceedings of the 13th ACM Multimedia Systems Conference (MMSys), 2022 • LLL-CAdViSE: Live Low-Latency Cloud-Based Adaptive Video Streaming Evaluation Framework. In IEEE Access, 2023
  • 11. Evaluation Frameworks 11 CAdViSE: Cloud-based Adaptive Video Streaming Evaluation LLL-CAdViSE: Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation • Media Players Evaluation with CAdViSE • Live Low-latency Evaluation with LLL-CAdViSE Use Cases
  • 12. CAdViSE (What?) • A Quality of Experience evaluation framework for HTTP Adaptive Streaming – Facilitates an organized and structured evaluation • The test environment remains the same; therefore, the results can be interpreted as improved or degraded performance – It is cloud-based, since scalability is a key factor – Can assess multiple ABR algorithms and media simultaneously – Simulates network conditions; accepts network traces as plugins • Mimics real-world network characteristics – Provides unified insights into quality metrics • Measures raw metrics • Works seamlessly with analytic tools (graphs and plots)
  • 13. • Application Layer – Runner, Initializer and Starter scripts – Written with Bash Script, Python and JavaScript • Cloud Components – Player Container (VNC and Selenium) – Network Emulator – EC2 Instances, SSM Execution, DynamoDB, S3 and CloudWatch • Logs and Analytics – Comprehensive Logs – Analytic Players Plugin CAdViSE (How?) 13
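The network emulation mentioned above can be sketched in a few lines: the snippet below turns a bandwidth trace into a sequence of Linux `tc` token-bucket shaping commands. The `(duration, kbps)` trace format and the `tbf` burst/latency parameters are assumptions for illustration; CAdViSE's actual network emulator plugin interface may differ.

```python
# Sketch: convert a bandwidth trace into Linux `tc` (tbf) shaping commands.
# The trace format (duration in seconds, bandwidth in kbit/s) is an
# assumption for illustration, not CAdViSE's actual plugin interface.

def trace_to_tc_commands(trace, dev="eth0"):
    """trace: list of (duration_s, kbps) tuples -> list of shell commands."""
    commands = [f"tc qdisc del dev {dev} root || true"]  # reset old qdisc
    for i, (duration, kbps) in enumerate(trace):
        verb = "add" if i == 0 else "change"
        commands.append(
            f"tc qdisc {verb} dev {dev} root tbf "
            f"rate {kbps}kbit burst 32kbit latency 400ms"
        )
        commands.append(f"sleep {duration}")  # hold the rate for this step
    return commands

# e.g. the preliminary evaluation profile: 4 Mbit/s alternating with 800 kbit/s
profile = [(30, 4000), (30, 800)]
for cmd in trace_to_tc_commands(profile):
    print(cmd)
```

In the real framework the client container would execute these commands; printing them here only illustrates the shape of the emulation schedule.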
  • 14. Live Low-Latency (CAdViSE) 14 Server (AWS EC2) - Generate the live feed - Encode - Package (DASH & HLS) - Ingest & Deliver - Calculate MOS - Manipulate network Client (AWS EC2) - Run media player - Redirect requests to server - Record logs - Manipulate network Database (AWS DynamoDB) - Store log records - Index the data - Retrieve log records LLL-CAdViSE Console (Shell) - Manage EC2 instances - Initialize server and client(s) - Execute the experiment - Execute QoE calculation
  • 15. Preliminary Evaluation with CAdViSE • 5 experiments; 9:00 minutes each • AWS EC2 t2.medium instances (4 GiB RAM, 2 CPU cores) • Emulated network profiles: 4 Mbit/s <> 800 kbit/s
  • 16. LLL-CAdViSE Evaluation Setup • Target latencies: 1s, 3s, 5s, and 10s • Two streaming formats: – MPEG-DASH (dash.js 4.4.1) – HLS (hls.js 1.2.0) • ABR algorithms: – Learn2Adapt-LowLatency (L2A-LL) – Low-on-Latency Plus (LoLP) • 3 experiments of 420 seconds • Network profiles: – Bicycle commuter LTE network – Car driver LTE network – Train commuter LTE network – Tram commuter LTE network – Network0 up to 10 Gbps [Plots: available bandwidth (kbps) over time (seconds) for the Bicycle, Car Driver, Train Commuter, and Tram Commuter network profiles, each with a polynomial trend line.]
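Low-latency players typically hold a target latency partly by nudging the playback rate, which is why the evaluation reports playback rate as a metric. A minimal sketch of such a proportional catch-up controller follows; the dead zone and rate bounds are hypothetical values, and dash.js/hls.js implement their own, more elaborate controllers.

```python
def catchup_rate(latency_s, target_s, dead_zone_s=0.5,
                 min_rate=0.95, max_rate=1.05):
    """Proportional playback-rate catch-up (illustrative parameters only).

    Speeds playback up when live latency exceeds the target and slows it
    down when the player runs ahead; inside the dead zone the adjustment
    is proportional to the drift.
    """
    drift = latency_s - target_s
    if drift >= dead_zone_s:
        return max_rate      # far behind the live edge: play faster
    if drift <= -dead_zone_s:
        return min_rate      # ahead of the target: play slower
    return 1.0 + (drift / dead_zone_s) * (max_rate - 1.0)
```

For a 5 s target, a player sitting at 5.6 s latency would play at 1.05x until it catches up, while one at exactly 5 s plays at normal speed.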
  • 17. LLL-CAdViSE Evaluation Result I 17 • All time values are in seconds. • a: Experiment title, format: [protocol]-[ABR]-[network]- [target latency] (def: Default, l2a: L2A-LL). • b: Average of the sum of stall events duration. • c: Average start-up delay. • d: Average of the sum of seek events duration. • e: Average quantity of quality switches. • f: Playback bitrate (min-max-avg) in kbps. • g: Latency (min-max-avg). • h: Playback rate (min-max-avg). • i: Average MOS predicted by the ITU-T P.1203 quality model.
  • 18. LLL-CAdViSE Evaluation Result II [Charts: average P.1203 MOS (1-5) and average latency (seconds) per network profile (Bicycle, Car, Train, Tram, Net0) for the Default, L2A-LL, and LoLP ABR algorithms; left panel: MPEG-DASH at 5 s target latency, right panel: HLS at 5 s target latency.]
  • 19. Studies on QoE Impacting Factors 19 • Exploring Adaptive Bitrate (ABR) Algorithms • Objective and Subjective Evaluation • Empirical Findings Understanding Quality of Experience • Minimum Noticeable Stall event Duration (MNSD) Evaluation • Stall event vs. Quality level switch (SvQ) Evaluation • Short stall events vs. a Longer stall event (SvL) Evaluation • Relation of Stall event impact on the QoE with Video Quality level (RSVQ) Evaluation • Objective QoE Models Comparison In-depth Studies on Stall Events and Quality Switches
  • 20. • Throughput-based – Uses throughput prediction heuristics to optimize streaming quality by estimating available network bandwidth. – Examples: PANDA, Festive, CrystalBall. • Buffer-based – Relies solely on buffer occupancy to make streaming decisions, aiming to prevent buffer underruns and stalling. – Examples: BBA0, BOLA, Quetra. • Hybrid – Integrates multiple heuristics such as throughput, buffer level, and latency to make comprehensive streaming decisions. – Examples: GTA, Elastic, MPC. • Learning-based – Utilizes machine learning techniques to adapt streaming quality based on historical data and real-time network conditions. – Examples: Pensieve, Fugu, Stick. Exploring ABR Algorithms 20
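To make the buffer-based category concrete, here is a minimal BBA0-style sketch that maps buffer occupancy linearly onto a bitrate ladder. The reservoir/cushion values and the ladder itself are hypothetical illustrations, not BBA0's published parameters.

```python
def buffer_based_bitrate(buffer_s, ladder_kbps, reservoir_s=5, cushion_s=20):
    """BBA0-style mapping from buffer occupancy to a bitrate ladder.

    Below the reservoir: lowest rendition (protect against stalls).
    Above reservoir + cushion: highest rendition.
    In between: linear interpolation across the ladder.
    """
    ladder = sorted(ladder_kbps)
    if buffer_s <= reservoir_s:
        return ladder[0]
    if buffer_s >= reservoir_s + cushion_s:
        return ladder[-1]
    fraction = (buffer_s - reservoir_s) / cushion_s
    return ladder[int(fraction * (len(ladder) - 1))]

# hypothetical 5-step ladder in kbps
ladder = [800, 1500, 3000, 4500, 8000]
```

Throughput-based and hybrid algorithms would replace or combine the buffer signal with a bandwidth estimate; learning-based ones replace the hand-tuned map with a trained policy.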
  • 21. Objective and Subjective Evaluation • CAdViSE Testbed: cloud-based platform for assessing ABR algorithms under diverse network conditions; ensures reproducibility with session logs for accurate recreation of streaming sessions. • Experiment Logs: logs archived in DynamoDB; a script processes the logs to simulate and inject stall events using FFmpeg. • Video Processing: generates a JSON file for the ITU-T P.1203 model to obtain a Mean Opinion Score (MOS); concatenates audio and video tracks into finalized MP4 files. • Evaluation Portal: developed using a serverless architecture and AWS Lambda; based on the ITU-T P.910 standard for subjective assessments. • Crowdsourced Testing: uses Amazon Mechanical Turk for participant recruitment; a custom web media player delivers test sequences to users. • Evaluation Process: participants watch and rate 10 test sequences on a 1 to 5 scale; reliability questions ensure valid votes; results are stored and processed via AWS services. [Plots: the four emulated bandwidth profiles (kbps over seconds): Ramp Up, Ramp Down, Stable, and Fluctuation.]
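The log-processing step can be sketched as follows: deriving (start, duration) stall intervals from archived player events, which is the information both the FFmpeg stall-injection step and the P.1203 input need. The event format here is hypothetical; the actual CAdViSE log schema is not reproduced.

```python
def stall_intervals(events):
    """Derive (start_s, duration_s) stall intervals from player log events.

    events: chronological (timestamp_s, state) pairs, where state is
    'stalled' or 'playing'. This log format is illustrative only.
    """
    intervals, stall_start = [], None
    for ts, state in events:
        if state == "stalled" and stall_start is None:
            stall_start = ts                      # stall begins
        elif state == "playing" and stall_start is not None:
            intervals.append((stall_start, ts - stall_start))  # stall ends
            stall_start = None
    return intervals

# hypothetical session: a 2.7 s stall at t=12.4 and a 0.8 s stall at t=40.0
log = [(0.0, "playing"), (12.4, "stalled"), (15.1, "playing"),
       (40.0, "stalled"), (40.8, "playing")]
```

Each interval then drives one frozen-frame insertion in the processed video and one stalling entry in the model input.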
  • 22. Empirical Findings I

    Avg. stall (seconds) per ABR algorithm and network profile:
                 FastMPC  Elastic  BBA0   Quetra  BOLA   dash.js  Shaka
    Fluctuation  73.23    5.85     7.95   10.88   28.46  41.40    52.25
    Ramp Down    30.63    8.35     6.18   10.33   11.29  21.29    34.90
    Ramp Up      17.18    0.00     0.19   0.00    4.13   4.55     13.39
    Stable       12.84    0.16     0.00   0.00    4.20   4.26     20.12

    Avg. start-up delay (seconds):
                 FastMPC  Elastic  BBA0   Quetra  BOLA   dash.js  Shaka
    Fluctuation  5.48     5.36     5.48   5.36    5.56   5.50     5.28
    Ramp Down    5.56     5.29     5.41   5.43    5.57   5.54     5.40
    Ramp Up      7.22     6.37     6.56   6.78    7.48   7.51     9.65
    Stable       5.65     5.46     5.40   5.42    5.62   5.65     5.65
  • 23. Empirical Findings II

    Objective vs. subjective MOS per ABR algorithm and network profile:

    Stable network profile (Pearson's correlation coefficient 0.84):
                    BBA0  BOLA  dash.js  Elastic  FastMPC  Quetra  Shaka
    Objective MOS   2.64  2.93  3.13     2.24     3.07     2.26    2.70
    Subjective MOS  3.66  3.87  3.98     3.67     3.80     3.34    3.67

    Fluctuation network profile (PCC 0.52):
    Objective MOS   2.22  1.86  1.99     2.07     1.91     1.98    1.98
    Subjective MOS  3.39  3.21  3.29     3.12     3.10     3.08    3.30

    RampUp network profile (PCC 0.94):
    Objective MOS   2.56  2.67  2.63     2.26     2.84     2.26    2.79
    Subjective MOS  3.62  3.73  3.65     3.45     3.68     3.41    3.73

    RampDown network profile (PCC 0.90):
    Objective MOS   2.33  2.43  2.33     2.00     2.48     2.00    2.35
    Subjective MOS  3.48  3.65  3.48     3.26     3.48     3.26    3.45
  • 24. In-depth Studies on Stall Events and Quality Switches • Minimum Noticeable Stall Duration (MNSD): – Investigated the threshold below which stall events are not noticeable to users, thus not affecting perceived QoE. • Stall Event vs. Quality Switch (SvQ): – Evaluated user preference between experiencing a stall event or a quality drop during unfavourable network conditions. • Short vs. Long Stall Events (SvL): – Studied the impact on QoE of multiple short stall events versus a single longer stall event, considering both predicted and perceived MOS. • Stall Impact and Video Quality (RSVQ): – Examined the relationship between the impact of stall events on QoE and video quality level, addressing conflicting findings from previous studies. • QoE Models Comparison: – Compared various QoE objective evaluation models with subjective MOS results to study their correlations. 24
  • 26. Minimum Noticeable Stall Duration [Plots: share of noticed vs. missed stall events per stall-duration bucket (< 0.051 s up to < 1.001 s), with a logarithmic trend line for missed stall events.] • The decrease in noticed stall events starts at durations below 0.301 seconds. • Over 45% of subjects did not notice stall events shorter than 0.051 seconds. • Stall events under 0.004 seconds were not noticeable to participants.
  • 27. Stall Event vs. Quality Switch 27 Set A - Case I: A pattern with 6s stall event and upward quality switch. Set A - Case II: A pattern without a stall event and continuous low-quality streaming. Set B - Case I: A pattern with high video quality streaming but with a 6s stall event. Set B - Case II: A pattern with a downward quality switch and without stall event.
  • 28. Stall Event vs. Quality Switch

                   Set A Case I  Set A Case II  Set B Case I  Set B Case II
    Perceived MOS  3.28          3.11           3.75          3.52
    Predicted MOS  2.96          2.45           3.62          2.76

    • Preference for Case I in both Set A and Set B over Case II. • Preference for higher-quality versions even with a 6-second stall.
  • 29. Short vs. Long Stall Events

    Stall event pattern (count-duration):
                   (0-0)  (1-4)  (1-8)  (4-1)  (4-2)  (8-1)
    Perceived MOS  4.54   4.11   3.83   3.44   3.35   3.23
    Predicted MOS  4.71   4.31   4.12   3.33   3.23   2.67

    • Preference for a longer stall event over frequent, shorter ones.
  • 30. Stall Impact and Video Quality

    Video quality levels Q1-Q3 (by VMAF):
                   Q1    Q1+Stall  Q2    Q2+Stall  Q3    Q3+Stall
    Perceived MOS  2.85  2.57      3.81  3.08      4.48  3.77
    Predicted MOS  1.88  1.65      2.6   2.11      4.63  3.36

    • Minor QoE penalty from stall events for low-quality videos (Q1). • Higher QoE penalty for middle- (Q2) and high-quality (Q3) videos with stall events.
  • 31. QoE Models Comparison 31 • BiQPS and FINEAS: - Inconsistent performance across evaluations. • P.1203 model: - Best overall performance. - Highest PCC and SRCC (> 0.8) - Lowest RMSE: 0.326. • Pearson Correlation Coefficient (PCC) • Spearman’s Rank Correlation Coefficient (SRCC) • Root Mean Square Error (RMSE)
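The three comparison measures used above can be computed without external libraries; a minimal sketch follows (tie handling in the Spearman ranks is omitted for brevity):

```python
from math import sqrt

def pcc(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def srcc(x, y):
    """Spearman's rank correlation: Pearson computed over the ranks."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    return pcc(ranks(x), ranks(y))

def rmse(predicted, observed):
    """Root mean square error between model output and subjective MOS."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed))
                / len(predicted))
```

Feeding each model's predicted MOS and the subjective MOS from the studies into these functions yields exactly the PCC/SRCC/RMSE figures a comparison like the one above reports.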
  • 32. A Comprehensive Dataset 32 Video Codecs and Development Procedures Source Video Sequences Available Representations
  • 33. Video Codecs and Development Procedures • Advanced Video Coding (AVC) – Library: libx264 (version 0.160.3011) from FFmpeg, slow preset. • High Efficiency Video Coding (HEVC) – Library: libx265 (version 3.4) from FFmpeg, slow preset. • AOMedia Video 1 (AV1) – Library: libsvtav1 (version 0.9.0) from FFmpeg, preset 8. • Versatile Video Coding (VVC) – Library: Fraunhofer VVenC (version 1.3.1); requires 8-bit YUV input, pre-processed with FFmpeg and encoded with VVenC. – At dataset preparation time, MP4Box (part of the GPAC project) supported VVC in nightly builds, enabling MP4 file packaging (ISOBMFF incl. VVC), VVC bitstream dumping, and MPEG-DASH content packaging (DASH manifest). [Diagram: Encoder → VVC elementary streams → GPAC packaging with the encoded audio track → playback via decoder and GPAC MP4Client.]
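A sketch of how per-representation encode commands for the AVC/HEVC/AV1 parts of such a ladder could be assembled. The flags follow standard FFmpeg usage for these encoder libraries, but the resolution/bitrate pair and file names below are placeholders, not the dataset's actual ladder.

```python
# Encoder arguments per codec, mirroring the libraries and presets above.
CODEC_ARGS = {
    "avc":  ["-c:v", "libx264", "-preset", "slow"],
    "hevc": ["-c:v", "libx265", "-preset", "slow"],
    "av1":  ["-c:v", "libsvtav1", "-preset", "8"],
}

def encode_command(src, codec, width, height, bitrate_kbps, out):
    """Build one FFmpeg encode command (as an argv list) for a representation."""
    return (["ffmpeg", "-y", "-i", src]
            + CODEC_ARGS[codec]
            + ["-b:v", f"{bitrate_kbps}k",
               "-vf", f"scale={width}:{height}",
               out])

# hypothetical 4K HEVC representation
cmd = encode_command("source.y4m", "hevc", 3840, 2160, 16000, "hevc_2160p.mp4")
```

VVC is the odd one out: it goes through VVenC and GPAC's MP4Box instead of a single FFmpeg invocation, so it is not covered by this helper.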
  • 35. Available Representations 35 • Resolutions up to 7680x4320 or 8K • Maximum media duration of 322 seconds • Segment lengths of 4 and 8 seconds • Available publicly with the following link: http://www.itec.aau.at/ftp/datasets/mmsys22
  • 36. Highlights and Conclusion Three main categories of contributions: 1. Evaluation frameworks (CAdViSE and LLL-CAdViSE) for VOD and live streaming; directly addresses RQ1. 2. Studies on subjective and objective QoE assessments and the impacts of HAS defects on QoE; directly addresses RQ2. 3. A comprehensive dataset with up-to-date video technologies, including 8K VVC; directly addresses RQ1.
  • 37. Future Works • Support for New Protocols and Codecs: Extend the evaluation frameworks to include emerging standards such as WebRTC and VVC. • Machine Learning for QoE: Apply machine learning techniques to predict and optimize QoE based on assessment data. • Enhance Quality Models: Align existing quality models with subjective assessment findings for better prediction accuracy. • Real-time QoE Monitoring: Develop tools for real-time QoE monitoring and feedback to enable dynamic adjustments during streaming sessions. • User-centric QoE Personalization: Investigate methods for personalizing QoE based on individual user preferences and viewing habits.