Extrapolation Methods for Accelerating PageRank Computations. Sepandar D. Kamvar, Taher H. Haveliwala, Christopher D. Manning, Gene H. Golub. Stanford University.
Motivation. Problem: speed up PageRank. Motivation: personalization and "freshness". Note: PageRank computations don't get faster as computers do. Example (personalization): for the search "Giants", result 1 is "The Official Site of the San Francisco Giants" for one user and "The Official Site of the New York Giants" for another.
Outline. Definition of PageRank; Computation of PageRank; Convergence Properties; Outline of Our Approach; Empirical Results.
Link Counts. [Figure: two home pages, Sep's and Taher's; one is linked by 2 important pages, the other by 2 unimportant pages. Linking pages shown: Yahoo!, CNN, DB Pub Server, CS361.]
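Definition of PageRank. The importance of a page is given by the importance of the pages that link to it: $x_i = \sum_{j \in B_i} \frac{1}{N_j} x_j$, where $x_i$ is the importance of page $i$, $B_i$ is the set of pages $j$ that link to page $i$, $N_j$ is the number of outlinks from page $j$, and $x_j$ is the importance of page $j$.
Definition of PageRank. [Figure: example graph with pages Yahoo!, CNN, DB Pub Server, Taher, and Sep; values shown on links: 1/2, 1/2, 1, 1; values shown on pages: 0.1, 0.1, 0.1, 0.05, 0.25.]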
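PageRank Diagram. Initialize all nodes to the same rank: 0.333, 0.333, 0.333.
PageRank Diagram. Propagate ranks across links (multiplying by link weights). Values carried along the four links: 0.167, 0.167, 0.333, 0.333.
PageRank Diagram. Ranks after the first propagation: 0.333, 0.5, 0.167.
PageRank Diagram. Propagate again. Values carried along the four links: 0.167, 0.167, 0.5, 0.167.
PageRank Diagram. Ranks after the second propagation: 0.5, 0.333, 0.167.
PageRank Diagram. After a while, the ranks settle at 0.4, 0.4, 0.2.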
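Computing PageRank. Initialize: $x_i^{(0)} = 1/n$ for every page $i$. Repeat until convergence: $x_i^{(k+1)} = \sum_{j \in B_i} \frac{1}{N_j} x_j^{(k)}$, where $x_i$ is the importance of page $i$, $B_i$ is the set of pages $j$ that link to page $i$, $N_j$ is the number of outlinks from page $j$, and $x_j$ is the importance of page $j$.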
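The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the three-page graph in `OUT_LINKS` is an assumption, reconstructed so that it reproduces the numbers on the PageRank Diagram slides, and the names `propagate` and `iterate_ranks` are hypothetical.

```python
# Hypothetical three-page graph, reconstructed from the numbers on the
# diagram slides: page 0 links to pages 1 and 2, page 1 links to page 0,
# and page 2 links to page 1.
OUT_LINKS = [[1, 2], [0], [1]]


def propagate(ranks, out_links):
    """One update: each page splits its rank evenly among its outlinks."""
    new_ranks = [0.0] * len(ranks)
    for j, targets in enumerate(out_links):
        share = ranks[j] / len(targets)  # x_j / N_j
        for i in targets:
            new_ranks[i] += share
    return new_ranks


def iterate_ranks(out_links, tol=1e-12, max_iter=10_000):
    """Repeat the update from a uniform start until the ranks stop changing."""
    n = len(out_links)
    ranks = [1.0 / n] * n
    for _ in range(max_iter):
        new_ranks = propagate(ranks, out_links)
        change = sum(abs(a - b) for a, b in zip(new_ranks, ranks))
        ranks = new_ranks
        if change < tol:
            break
    return ranks


if __name__ == "__main__":
    step1 = propagate([1 / 3, 1 / 3, 1 / 3], OUT_LINKS)
    step2 = propagate(step1, OUT_LINKS)
    print([round(r, 3) for r in step1])                     # [0.333, 0.5, 0.167]
    print([round(r, 3) for r in step2])                     # [0.5, 0.333, 0.167]
    print([round(r, 3) for r in iterate_ranks(OUT_LINKS)])  # [0.4, 0.4, 0.2]
```

The sketch assumes every page has at least one outlink; a page with no outlinks would need separate handling.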
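Matrix Notation. The update for all pages can be written as one matrix-vector product, $x^{(k+1)} = P^T x^{(k)}$, where $(P^T)_{ij} = 1/N_j$ if page $j$ links to page $i$ and 0 otherwise. [Figure: numeric example of the matrix-vector product.]
Matrix Notation. Find $x$ that satisfies: $x = P^T x$.
Power Method. Initialize: $x^{(0)} = \frac{1}{n}\mathbf{1}$. Repeat until convergence: $x^{(k+1)} = P^T x^{(k)}$.
A side note. PageRank doesn't actually use $P^T$. Instead, it uses $A = cP^T + (1-c)E^T$. So the PageRank problem is really: find $x$ that satisfies $x = Ax$, not: find $x$ that satisfies $x = P^T x$.
Power Method. And the algorithm is really: initialize $x^{(0)} = \frac{1}{n}\mathbf{1}$; repeat until convergence: $x^{(k+1)} = A x^{(k)}$.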
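A minimal sketch of the Power Method on $A = cP^T + (1-c)E^T$ might look as follows. Several things here are assumptions rather than statements from the slides: $E^T x$ is taken to spread the total rank uniformly over all pages, the default $c = 0.85$ is only a commonly used value, every page is assumed to have at least one outlink, and the names `apply_google_matrix` and `pagerank_power` are hypothetical.

```python
def apply_google_matrix(x, out_links, c):
    """Return A x for A = c P^T + (1 - c) E^T, without forming A explicitly."""
    n = len(x)
    # Assumed teleport term: (1 - c) E^T x spreads the total rank uniformly.
    new_x = [(1.0 - c) * sum(x) / n] * n
    # Link term: c P^T x, where page j gives x_j / N_j to each page it links to.
    for j, targets in enumerate(out_links):
        share = c * x[j] / len(targets)
        for i in targets:
            new_x[i] += share
    return new_x


def pagerank_power(out_links, c=0.85, tol=1e-10, max_iter=1000):
    """Power Method: start uniform, repeat x <- A x until the L1 change < tol."""
    n = len(out_links)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        new_x = apply_google_matrix(x, out_links, c)
        change = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x


if __name__ == "__main__":
    # Hypothetical three-page graph: 0 -> 1, 0 -> 2, 1 -> 0, 2 -> 1.
    links = [[1, 2], [0], [1]]
    print(pagerank_power(links, c=0.85))
```

Working with the link lists directly keeps each iteration proportional to the number of links, which is what makes the method practical when $A$ is huge and sparse.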
Outline. Definition of PageRank; Computation of PageRank; Convergence Properties; Outline of Our Approach; Empirical Results.
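Power Method. Express $x^{(0)}$ in terms of the eigenvectors of $A$: $x^{(0)} = u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \alpha_4 u_4 + \alpha_5 u_5$.
Power Method. $x^{(1)} = u_1 + \alpha_2 \lambda_2 u_2 + \alpha_3 \lambda_3 u_3 + \alpha_4 \lambda_4 u_4 + \alpha_5 \lambda_5 u_5$.
Power Method. $x^{(2)} = u_1 + \alpha_2 \lambda_2^2 u_2 + \alpha_3 \lambda_3^2 u_3 + \alpha_4 \lambda_4^2 u_4 + \alpha_5 \lambda_5^2 u_5$.
Power Method. $x^{(k)} = u_1 + \alpha_2 \lambda_2^k u_2 + \alpha_3 \lambda_3^k u_3 + \alpha_4 \lambda_4^k u_4 + \alpha_5 \lambda_5^k u_5$.
Power Method. As $k$ grows, the coefficients of $u_2, \dots, u_5$ go to 0, so $x^{(k)} \to u_1$.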
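Why does it work? Imagine our $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors $u_i$. Then you can write any $n$-dimensional vector as a linear combination of the eigenvectors of $A$: $x^{(0)} = u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \alpha_4 u_4 + \alpha_5 u_5$.
Why does it work? From the last slide: $x^{(0)} = u_1 + \alpha_2 u_2 + \dots + \alpha_5 u_5$. To get the first iterate, multiply $x^{(0)}$ by $A$: $x^{(1)} = A x^{(0)} = \lambda_1 u_1 + \alpha_2 \lambda_2 u_2 + \dots + \alpha_5 \lambda_5 u_5$. The first eigenvalue is 1. Therefore: $x^{(1)} = u_1 + \alpha_2 \lambda_2 u_2 + \dots + \alpha_5 \lambda_5 u_5$, where $\lambda_2, \dots, \lambda_5$ are all less than 1 in magnitude.
Power Method. $x^{(0)} = u_1 + \alpha_2 u_2 + \dots + \alpha_5 u_5$; $x^{(1)} = u_1 + \alpha_2 \lambda_2 u_2 + \dots + \alpha_5 \lambda_5 u_5$; $x^{(2)} = u_1 + \alpha_2 \lambda_2^2 u_2 + \dots + \alpha_5 \lambda_5^2 u_5$.
Convergence. $x^{(k)} = u_1 + \alpha_2 \lambda_2^k u_2 + \alpha_3 \lambda_3^k u_3 + \alpha_4 \lambda_4^k u_4 + \alpha_5 \lambda_5^k u_5$. The smaller $|\lambda_2|$, the faster the convergence of the Power Method.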
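To see the dependence on $\lambda_2$ concretely, here is a small sketch on a made-up $2 \times 2$ example (not from the slides). The matrix $[[1-p, p], [p, 1-p]]$ has eigenvalues 1 and $1 - 2p$, with dominant eigenvector $(0.5, 0.5)$, so the error should shrink by a factor of $1 - 2p$ per iteration. The function name `power_errors` and the choice of matrix are assumptions for illustration.

```python
def power_errors(p, steps):
    """L1 errors of the Power Method on A = [[1-p, p], [p, 1-p]] from x = (1, 0).

    A has eigenvalues 1 and 1 - 2p, and its dominant eigenvector
    (normalized to sum to 1) is (0.5, 0.5).
    """
    x = [1.0, 0.0]
    errors = []
    for _ in range(steps):
        x = [(1 - p) * x[0] + p * x[1], p * x[0] + (1 - p) * x[1]]
        errors.append(abs(x[0] - 0.5) + abs(x[1] - 0.5))
    return errors


if __name__ == "__main__":
    for p in (0.1, 0.4):
        errs = power_errors(p, 10)
        # The ratio of successive errors approximates |lambda_2| = 1 - 2p.
        print(p, errs[-1], errs[-1] / errs[-2])
```

With $p = 0.1$ ($\lambda_2 = 0.8$) the error after 10 steps is still around 0.1, while with $p = 0.4$ ($\lambda_2 = 0.2$) it is around $10^{-7}$.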
Our Approach. Estimate the components of the current iterate in the directions of the second and third eigenvectors ($u_2$ and $u_3$), and eliminate them. [Figure: components of the iterate along $u_1, \dots, u_5$.]
Why this approach? For traditional problems, $A$ is smaller and often dense, and $\lambda_2$ is often close to $\lambda_1$, making the Power Method slow. In our problem, $A$ is huge and sparse. More importantly, $\lambda_2$ is small [1], in the sense that it is well separated from $\lambda_1 = 1$ (the cited report bounds $|\lambda_2|$ by $c$). Therefore, the Power Method is actually much faster than other methods. [1] "The Second Eigenvalue of the Google Matrix", dbpubs.stanford.edu/pub/2003-20.
Using Successive Iterates. [Figure sequence: the successive iterates $x^{(0)}$, $x^{(1)}$, $x^{(2)}$ are plotted relative to $u_1$, with components along $u_1, \dots, u_5$; in the final frame the extrapolated point is $x' = u_1$.]
How do we do this? Assume $x^{(k)}$ can be written as a linear combination of the first three eigenvectors ($u_1$, $u_2$, $u_3$) of $A$. Compute an approximation to the $\{u_2, u_3\}$ components, and subtract it from $x^{(k)}$ to get $x^{(k)\prime}$.
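Assume. Assume that $x^{(k)}$ can be represented by the first 3 eigenvectors of $A$: $x^{(k)} = u_1 + \alpha_2 u_2 + \alpha_3 u_3$. Multiplying by $A$ gives the next three iterates: $x^{(k+1)} = u_1 + \alpha_2 \lambda_2 u_2 + \alpha_3 \lambda_3 u_3$, $x^{(k+2)} = u_1 + \alpha_2 \lambda_2^2 u_2 + \alpha_3 \lambda_3^2 u_3$, $x^{(k+3)} = u_1 + \alpha_2 \lambda_2^3 u_2 + \alpha_3 \lambda_3^3 u_3$.
Linear Combination. Let's take some linear combination of these 3 iterates: $x' = \beta_1 x^{(k+1)} + \beta_2 x^{(k+2)} + \beta_3 x^{(k+3)}$.
Rearranging Terms. We can rearrange the terms to get: $x' = (\beta_1 + \beta_2 + \beta_3) u_1 + \alpha_2 \lambda_2 (\beta_1 + \beta_2 \lambda_2 + \beta_3 \lambda_2^2) u_2 + \alpha_3 \lambda_3 (\beta_1 + \beta_2 \lambda_3 + \beta_3 \lambda_3^2) u_3$. Goal: find $\beta_1$, $\beta_2$, $\beta_3$ so that the coefficients of $u_2$ and $u_3$ are 0, and the coefficient of $u_1$ is 1.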
Summary. We make an assumption about the current iterate. We solve for the dominant eigenvector as a linear combination of the next three iterates. We use a few iterations of the Power Method to "clean it up".
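The steps summarized above can be sketched in code. This is one possible least-squares formulation, offered as an assumption rather than as the exact derivation on the slides: if the current iterate lies in the span of three eigenvectors with eigenvalues $1, \lambda_2, \lambda_3$, then a cubic polynomial with those roots annihilates it, and its coefficients can be estimated from differences of four successive iterates. The names `quadratic_extrapolation`, `g1`, `g2`, `q0`, `q1`, `q2` and the three-page test matrix are hypothetical, and the $2 \times 2$ normal equations are used here only for brevity.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def quadratic_extrapolation(x0, x1, x2, x3):
    """Estimate the dominant eigenvector from four successive power iterates.

    Assumes x0 lies (approximately) in the span of three eigenvectors whose
    eigenvalues are 1, lambda_2, lambda_3, and that the difference vectors
    y1, y2 below are linearly independent.
    """
    y1 = [b - a for a, b in zip(x0, x1)]
    y2 = [b - a for a, b in zip(x0, x2)]
    y3 = [b - a for a, b in zip(x0, x3)]

    # Least squares for g1*y1 + g2*y2 = -y3 via the 2x2 normal equations.
    a11, a12, a22 = dot(y1, y1), dot(y1, y2), dot(y2, y2)
    r1, r2 = -dot(y1, y3), -dot(y2, y3)
    det = a11 * a22 - a12 * a12
    g1 = (r1 * a22 - a12 * r2) / det
    g2 = (a11 * r2 - a12 * r1) / det

    # Divide the cubic (leading coefficient 1) by (lambda - 1).
    q0, q1, q2 = g1 + g2 + 1.0, g2 + 1.0, 1.0

    # Combine the three most recent iterates, then rescale to sum to 1.
    estimate = [q0 * a + q1 * b + q2 * c for a, b, c in zip(x1, x2, x3)]
    total = sum(estimate)
    return [v / total for v in estimate]


def step(x):
    """One power iteration on a hypothetical 3-page graph (0->1, 0->2, 1->0, 2->1)."""
    return [x[1], 0.5 * x[0] + x[2], 0.5 * x[0]]


if __name__ == "__main__":
    iterates = [[1 / 3, 1 / 3, 1 / 3]]
    for _ in range(3):
        iterates.append(step(iterates[-1]))
    print(quadratic_extrapolation(*iterates))  # approximately [0.4, 0.4, 0.2]
```

On a $3 \times 3$ matrix the three-eigenvector assumption holds exactly, so four iterates recover the stationary ranks up to rounding. On a large matrix the assumption is only approximate, which is why a few further Power Method iterations follow the extrapolation step.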
Outline. Definition of PageRank; Computation of PageRank; Convergence Properties; Outline of Our Approach; Empirical Results.
Results. Quadratic Extrapolation speeds up convergence. Extrapolation was only used 5 times!
Results. Extrapolation dramatically speeds up convergence for high values of $c$ ($c = 0.99$).
Take-home message. Quadratic Extrapolation speeds up PageRank by a fair amount, but not by enough for true Personalized PageRank. The ideas are useful for further speedup algorithms. Quadratic Extrapolation can be used for a whole class of problems.
The End. Paper available at http://dbpubs.stanford.edu/pub/2003-16


Editor's Notes

  • #26: Why does the Power Method Work?
  • #27: Assume that lambda 1 is equal to 1 and all other eigenvalues are strictly less than 1 in magnitude.
  • #29: Here, talk about how, in the past, lambda 2 was often close to 1, so the power method was not useful. However, in our case,
  • #38: Note: the derivation given here is slightly different from what's in the paper; the one here is perhaps more intuitive, while the one in the paper is more compact.