Video Summarization Using
Clustering

Sachin
DTU/2K12/EC-149
Mentor: Mr. Avinash Ratre
Introduction
YouTube and other media sources have pushed the bounds of video
consumption in the past few years. As media sources compete for more of a
viewer's time every day, one way to ease the load is a video summarization
system. A movie teaser is an example of a video summary. However, not
everyone has the time to edit their videos into a concise version.
This presentation describes a fast and efficient algorithm that uses k-means
clustering on RGB histograms to create a video summary. It is aimed
particularly at low-quality media, specifically YouTube videos.
Approach
1. Split the input file into time segments of k seconds: f0...fn.
2. Take the first frame of each segment and let it represent the
   segment. Call these frames x0...xn.
3. Compute the histograms of x0...xn and call them y0...yn.
4. Cluster the histograms (y0...yn) into k groups using k-means,
   with Euclidean distance as the error function.
5. Round robin for segment selection: iterate through the k groups,
   select a segment at random from each cluster, and add it to list l
   until the desired number of segments has been chosen.
6. Join the segments in list l to generate the video summary.
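The segmenting and round-robin steps are not elaborated on later slides, so here is a minimal Python sketch of how they might look. The function names, the `seed` parameter, drawing without replacement, and sorting the final picks back into chronological order are all assumptions made for illustration; the slides only say that a segment is chosen at random from each cluster in turn.

```python
import math
import random


def segment_starts(duration_seconds, segment_seconds):
    """Step 1: start time (in seconds) of each fixed-length segment."""
    count = math.ceil(duration_seconds / segment_seconds)
    return [i * segment_seconds for i in range(count)]


def round_robin_select(clusters, num_segments, seed=None):
    """Step 5: cycle over the clusters, taking one random segment per visit.

    clusters: list of lists of segment indices (one inner list per cluster).
    Segments are drawn without replacement (an assumption), so the summary
    never contains the same segment twice.
    """
    rng = random.Random(seed)
    # Shuffle a copy of each cluster so that pop() yields a random member.
    pools = [list(cluster) for cluster in clusters]
    for pool in pools:
        rng.shuffle(pool)

    selected = []
    # Stop when enough segments are chosen or every cluster is exhausted.
    while len(selected) < num_segments and any(pools):
        for pool in pools:
            if pool and len(selected) < num_segments:
                selected.append(pool.pop())

    # Assumption: play the chosen segments in their original order.
    return sorted(selected)
```

With the parameters reported in the results, this would be called as `round_robin_select(clusters, 20)` on the 8 clusters produced by k-means.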
Overview

RGB Histogram
Feature Selection
• We selected RGB color histograms as our feature because they
are global and fast to compute. Histograms are a good
trade-off between accuracy and speed.
• A histogram is a frequency-based representation: it compresses
the information in a video frame into a vector.
• Most YouTube videos are low quality, so more sophisticated
features tend to be harder to extract. Histograms can perform
well because they do not attempt to infer any semantic meaning
in the segments.
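To make the "frame compressed into a vector" idea concrete, here is a small sketch of one way to build an RGB histogram. The slides do not state the bin count or the histogram layout, so the 8 bins per channel, the concatenation of three per-channel histograms, and the normalization by pixel count are assumptions, as is the function name.

```python
def rgb_histogram(pixels, bins_per_channel=8):
    """Return a normalized histogram vector for one frame.

    pixels: iterable of (r, g, b) tuples with 8-bit values (0..255).
    The result concatenates the R, G and B histograms, so its length is
    3 * bins_per_channel and each channel's bins sum to 1.
    """
    hist = [0] * (3 * bins_per_channel)
    bin_width = 256 / bins_per_channel
    count = 0
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Clamp so that value 255 falls into the last bin.
            bin_index = min(int(value / bin_width), bins_per_channel - 1)
            hist[channel * bins_per_channel + bin_index] += 1
        count += 1
    if count == 0:
        return [0.0] * len(hist)
    # Normalize so frames of different sizes are comparable.
    return [h / count for h in hist]
```

A 320x240 frame has 76,800 pixels, so with these settings it is reduced to a 24-element vector, which is what makes the later clustering step cheap.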
Algorithm
Group similar histograms into k clusters. Each histogram
represents its corresponding video segment.
The k-means algorithm is defined below:
1. Select k random centroid points in the multi-dimensional
histogram space.
2. Compare each histogram against all the cluster centroids.
3. Assign each histogram to the cluster that minimizes the
error function.
4. Recompute the cluster centroids.
5. Check whether the centroids have converged. If not, go back
to step 2.
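The five steps above can be sketched in plain Python as follows. This is an illustrative version, not the presenters' implementation: the names `kmeans`, `max_iters` and `seed` are hypothetical, the initial centroids are drawn from the data points themselves (the slides only say "random centroid points"), and an empty cluster simply keeps its previous centroid.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iters=100, seed=None):
    """Cluster points (a list of equal-length lists) into k groups.

    Returns (labels, centroids), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iters):
        # Steps 2-3: assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        # Step 4: recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(dim) / len(members) for dim in zip(*members)]
                )
            else:
                # Assumption: an empty cluster keeps its old centroid.
                new_centroids.append(centroids[c])
        # Step 5: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids
```

In the summarization pipeline, `points` would be the list of segment histograms y0...yn, and the returned labels say which cluster each segment belongs to.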
Error function
We use Euclidean distance as our error function. This is a
common approach when directly comparing histograms.

We also experimented with cosine similarity and saw no
noticeable difference in the clustering output.
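For reference, the two measures can be written as below; the function names are illustrative. One possible reason the two behave alike, offered here as an observation rather than something the slides claim, is that for vectors scaled to unit length the squared Euclidean distance equals 2 * (1 - cosine similarity), so both rank neighbors in the same order.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two histograms (smaller is closer)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two histograms (larger is closer)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)
```

Note that cosine similarity is a similarity, not a distance, so a k-means variant built on it would assign each histogram to the centroid with the largest value rather than the smallest.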
Results
• We selected k = 8 as our k-means parameter and used 20
segments for the output video.
Dataset
We ran the following YouTube videos through our system. All of
these videos are 320x240.
• MotoGP: a recent round of the world motorcycle racing series.
This represents a typical sports video.
• Man vs Wild episode.
Clusters Generated
• When we clustered the MotoGP clip, the system was able to
separate the action footage from the pit stand footage. This is
particularly useful for viewers who only want to watch the
race and not the pit stand.
• For the Man vs Wild episode, the system correctly clustered
the different segments. It particularly helped that the
segments from each location were similar in color. When
Bear (the main actor) was in the desert, the frames had a
higher color intensity. Similarly, when he was in the Florida
Everglades, the colors were lower in intensity.
MotoGP clusters
Man vs Wild clusters
Problems
• Repeated segments

When a static image is present for a long time, two or more
segments will be created from this image. During the
clustering, all of the segments with the static image will be
clustered in the same group.
• Background

In the MotoGP video clip, the majority of the segments consist
of the road in the background. Our algorithm grouped most of
these shots into one cluster. The intended behavior would be
to capture the different teams in different clusters because
each team has a unique color scheme. However, the
background dominated, and most of these segments were grouped
together. It would be interesting future work to see if two levels
of clustering would help: one for the initial segments
and another sub-clustering within each cluster.
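The two-level idea is left as future work in the slides, so the following is only a rough sketch of what it might look like. The function name and the `cluster_fn` parameter are hypothetical: `cluster_fn(vectors, k)` stands for any routine that returns one integer label per vector, such as a k-means implementation.

```python
def two_level_cluster(vectors, cluster_fn, k_top, k_sub):
    """Cluster into k_top groups, then split each group into sub-groups.

    cluster_fn(vectors, k) must return a list with one integer label per
    vector. Returns a list of (top_label, sub_label) pairs, one per vector.
    """
    top_labels = cluster_fn(vectors, k_top)
    result = [None] * len(vectors)
    for top in sorted(set(top_labels)):
        # Indices of the vectors that fell into this top-level cluster.
        indices = [i for i, lab in enumerate(top_labels) if lab == top]
        members = [vectors[i] for i in indices]
        # A group cannot be split into more sub-clusters than it has members.
        k = min(k_sub, len(members))
        sub_labels = cluster_fn(members, k)
        for i, sub in zip(indices, sub_labels):
            result[i] = (top, sub)
    return result
```

For the second pass to separate the MotoGP teams, it would probably need a feature that is less dominated by the road background than the whole-frame histogram; that choice is outside this sketch.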
Conclusion
We have presented a system to automatically create a
summarized video from a YouTube video.
K-means is a simple and effective method for clustering similar
frames together.
Our system is modular in design, so future work can be developed
by substituting in various components. Instead of using
histograms, future work can try to use other features such as
motion vectors or even audio. However, we have demonstrated
that a simple feature with a simple unsupervised learning
technique can be a good starting point for a video summarization
system.
References
• Tommy Chheng, "Video Summarization Using Clustering,"
Department of Computer Science, University of California, Irvine.
• Yu-Fei Ma, Lie Lu, Hong-Jiang Zhang and Mingjing Li,
"A User Attention Model for Video Summarization,"
Microsoft Research Asia.
