Current Topics in MapReduce and Virtualization. Presented by Chris Bunch at UCSB, CS595D - Seminar on Large-Scale Data Management, February 2, 2010. http://cs.ucsb.edu/~cgb
To Recap: The “Comparison Paper” by DeWitt, Stonebraker, et al. [1] claims: Data movement is fast for Hadoop MR but slow for Vertica and DBMS-X. Queries are fast on Vertica and DBMS-X and slow on Hadoop MR. Conclusion: Hadoop MR bad, Vertica good.
Specifically: The comparison paper claims Hadoop MR is slow because: Hadoop MR must always read the entire file. MR cannot enforce a schema on the input data (parsing it becomes a bottleneck). Fault tolerance requires data shuffling between Map and Reduce.
Update: In Jan. 2010’s CACM, DeWitt and Stonebraker [2] update their point of view: Hadoop MR and relational DBs complement each other. Use Hadoop MR for “complex” or “quick-and-dirty” analyses. Use relational DBs for everything else.
Another Update: Dean and Ghemawat also respond in Jan. 2010’s CACM [3]: Problems are with Hadoop MR, not MR itself. MR does not need to read all the input data. Can use BigTable / HBase to get a subset of the input data for processing.
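To illustrate why an indexed store lets a job avoid reading all of its input, here is a minimal sketch of a key-range scan over sorted rows. The `SortedTable` class and the sample keys are hypothetical stand-ins for illustration only; this is not the BigTable or HBase API.

```python
from bisect import bisect_left


class SortedTable:
    """Hypothetical in-memory table that keeps rows sorted by key."""

    def __init__(self, rows):
        ordered = sorted(rows, key=lambda row: row[0])
        self.keys = [key for key, _ in ordered]
        self.values = [value for _, value in ordered]

    def scan(self, start_key, end_key):
        """Return rows with start_key <= key < end_key without touching other rows."""
        lo = bisect_left(self.keys, start_key)
        hi = bisect_left(self.keys, end_key)
        return list(zip(self.keys[lo:hi], self.values[lo:hi]))


table = SortedTable([("com.b/x", 2), ("com.a/y", 1), ("org.c/z", 3)])
subset = table.scan("com.", "com/")  # only keys that start with "com."
```

A map task handed `subset` processes two rows instead of three; with a large table the saving grows with the fraction of keys outside the requested range.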
Continuing: MR input / output doesn’t need to be simple text files (use BigTable / HBase). MR input / output data can have schemas. Can be stored as Protocol Buffers. Parsing a string: 1731 ns / record. Parsing a Protocol Buffer: 20 ns / record.
Fundamentally: Bad Representation of Data: 137|http://www.somehost.com/index.html|602 Good Representation of Data: message Rankings { required string pageurl = 1; required int32 pagerank = 2; required int32 avgduration = 3; }
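The contrast between the two representations can be sketched as follows. This is only an analogy: it uses Python's `struct` module with a made-up fixed layout as a stand-in for a schema'd binary encoding, not Protocol Buffers itself, and the field order of the text record (pagerank, pageurl, avgduration) is an assumption.

```python
import struct

# Hypothetical binary layout: int32 pagerank, int32 avgduration, uint16 URL length, URL bytes.
HEADER = "<iiH"


def parse_text_ranking(line):
    """Text record: every field must be located by delimiter and converted from a string."""
    pagerank, pageurl, avgduration = line.split("|")
    return pageurl, int(pagerank), int(avgduration)


def encode_ranking(pageurl, pagerank, avgduration):
    url = pageurl.encode("utf-8")
    return struct.pack(HEADER, pagerank, avgduration, len(url)) + url


def decode_ranking(buf):
    """Binary record: field positions and types are known in advance from the layout."""
    pagerank, avgduration, url_len = struct.unpack_from(HEADER, buf, 0)
    start = struct.calcsize(HEADER)
    return buf[start:start + url_len].decode("utf-8"), pagerank, avgduration


record = ("http://www.somehost.com/index.html", 137, 602)
from_text = parse_text_ranking("137|http://www.somehost.com/index.html|602")
from_binary = decode_ranking(encode_ranking(*record))
```

Both paths recover the same record; the binary path skips delimiter search and string-to-integer conversion, which is the kind of work the per-record parsing figures above refer to.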
Conclusion: DeWitt and Stonebraker’s arguments are valid against Hadoop MR but not against MR itself. Dean’s rebuttal clearly shows that Google MR overcomes DeWitt’s objections to it. No native support for PB serialization in Hadoop MR [4] (hybrid approach possible).
Part 2: Virtualization. Software layer for isolated execution of one or more virtual guest systems on real hardware (multicores). Improves hardware utilization, improves portability, other benefits. Multiplexes hardware resources between guests.
Virtualization: Emulates ISA (captures privileged instructions) and devices, manages state. Without OS modification: full virtualization. With OS modification: paravirtualization. Hardware support for virtualization (modern AMD / Intel processors).
Migrating VMs: Why? Load balancing. Online maintenance. Proactive fault tolerance. Power management.
Live Migration of Virtual Machines [5]. Authored by Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen, Eric Jul, Christian Limpach, Ian Pratt, and Andrew Warfield (Cambridge and University of Copenhagen). Published in NSDI 2005.
In a Nutshell: Perform continuous migration while the system is running to ensure that when migration is needed, it can be done quickly. Recorded service downtimes as low as 60 ms using Xen.
Motivation: Process-level migration is hard. Small interface between OS and VMM makes VM migration much easier. Goals are to minimize application downtime, minimize total migration time, and ensure that migration does not impact active services.
Memory Migration Techniques: Push phase: Source sends memory pages to destination VM. Stop-and-copy phase: Source stops, sends pages, starts destination. Pull phase: Destination retrieves memory pages from source VM as needed. This hybrid technique uses the first two.
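The combination of a push phase followed by stop-and-copy can be sketched with a small model. This is an illustrative simplification, not the paper's implementation: the function name, the constant dirtying rate, and the `max_rounds` and `stop_threshold` parameters are all assumptions.

```python
def precopy_rounds(total_pages, wws_pages, dirty_rate, bandwidth,
                   max_rounds=30, stop_threshold=50):
    """Model iterative pre-copy. Rates are in pages per second (integers).

    Returns (push_rounds, stop_and_copy_pages, total_pages_sent).
    """
    to_send = total_pages  # the first push round sends every page
    sent = 0
    rounds = 0
    while rounds < max_rounds and to_send > stop_threshold:
        sent += to_send
        rounds += 1
        # Pages dirtied while this round was in flight must be re-sent next round.
        # The round lasted to_send / bandwidth seconds; the writable working set
        # bounds how many distinct pages can be dirty.
        to_send = min(wws_pages, dirty_rate * to_send // bandwidth)
    # Stop-and-copy: the source is paused and the remaining dirty pages are sent.
    sent += to_send
    return rounds, to_send, sent


easy = precopy_rounds(100_000, 20_000, dirty_rate=1_000, bandwidth=10_000)
hard = precopy_rounds(100_000, 20_000, dirty_rate=10_000, bandwidth=10_000)
```

In the `easy` case each round shrinks by a factor of ten, so only 10 pages remain for the paused final copy. In the `hard` case pages are dirtied as fast as they are sent, the loop hits `max_rounds`, and the final copy still has to move the whole writable working set.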
Migrating Local Resources: To migrate network traffic, simply send an ARP reply with the new destination. Does not always work. Can also create destination VM with same MAC address. Local disk storage problem not addressed. For now, use NFS.
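For reference, an unsolicited ARP reply announcing that an IP address now lives at a new MAC address can be assembled as below. This is a sketch of one common frame layout, not code from the paper; the function name and the example addresses are hypothetical, and the frame is only built here, not sent (sending would need a raw socket and elevated privileges).

```python
import socket
import struct


def build_unsolicited_arp_reply(mac, ip):
    """Build an Ethernet frame carrying an ARP reply that maps `ip` to `mac`."""
    hw = bytes.fromhex(mac.replace(":", ""))
    proto = socket.inet_aton(ip)
    broadcast = b"\xff" * 6
    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    ethernet = broadcast + hw + struct.pack("!H", 0x0806)
    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # address lengths 6 and 4, operation 2 (reply).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    # Sender and target protocol addresses are both the migrated VM's IP.
    arp += hw + proto + broadcast + proto
    return ethernet + arp


frame = build_unsolicited_arp_reply("02:00:00:aa:bb:cc", "192.0.2.10")
```

Switches and hosts that accept the frame update their tables so traffic for the IP flows to the new host; devices that ignore unsolicited replies are one reason this approach does not always work.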
The Algorithm
Writable Working Sets: Modified pages need to be re-copied. The set of pages that keep being modified is dubbed the “Writable Working Set” (WWS). Measure this by reading the dirty bitmap every 50 ms. Small WWS ⇔ easy to migrate. Large WWS ⇔ hard to migrate.
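The measurement described above amounts to counting distinct dirtied pages per sampling window. A minimal sketch, assuming a hypothetical write trace of `(time_ms, page_number)` pairs in place of a real dirty bitmap:

```python
from collections import defaultdict


def writable_working_set(write_trace, window_ms=50):
    """Return the number of distinct pages written in each window_ms interval."""
    windows = defaultdict(set)
    for time_ms, page in write_trace:
        windows[time_ms // window_ms].add(page)
    if not windows:
        return []
    last_window = max(windows)
    # Windows with no writes report a WWS of zero.
    return [len(windows[w]) for w in range(last_window + 1)]


trace = [(0, 1), (10, 1), (20, 2), (55, 3), (160, 1), (170, 4)]
sizes = writable_working_set(trace)
```

Repeated writes to the same page within a window count once, which is why a workload that hammers a few pages has a small WWS even when its write rate is high.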
WWS for SPEC CINT2000
Implementation Issues: Managed migration: Daemon in a separate VM copies pages from source to destination. Requires modification to Xen so that daemon can read shadow page tables. Can stop source for final copy easily.
Implementation Issues: Self migration: Source copies pages to destination. No modification to source OS needed. Stopping source for final copy is hard. First stop everything except migrator program, then copy final dirty pages.
Rate Limiting: If the migration process uses too much bandwidth, it can hamper other processes. Relies on administrator specifying a min and max bandwidth to use. Seems like it could be determined programmatically.
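One way the bandwidth for each round could be chosen within the administrator's bounds is to track the observed dirtying rate and add some headroom. The heuristic, the function name, and the `headroom_mbit` default below are assumptions for illustration, not something stated on the slide.

```python
def next_round_bandwidth(dirty_rate_mbit, min_mbit, max_mbit, headroom_mbit=50):
    """Pick a transfer rate slightly above the dirtying rate, clamped to [min, max]."""
    wanted = dirty_rate_mbit + headroom_mbit
    return max(min_mbit, min(max_mbit, wanted))


low = next_round_bandwidth(10, min_mbit=100, max_mbit=500)
mid = next_round_bandwidth(200, min_mbit=100, max_mbit=500)
high = next_round_bandwidth(480, min_mbit=100, max_mbit=500)
```

Sending only slightly faster than pages are dirtied keeps early rounds cheap for co-located services; once the cap is reached, further rounds cannot catch up and the migration has to fall back to stop-and-copy.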
Optimizations: Don’t copy pages that are frequently dirtied. Slow down write-heavy services. Don’t do this to essential processes. Free all unused cache pages when migration starts. Can incur a greater cost if needed later.
Evaluation: Hardware: 2 Dell PE-2650 servers, dual Xeon 2GHz CPUs (one disabled), 2GB memory, Gigabit Ethernet. Software: XenLinux 2.4.27. Disk attached via NAS.
SPECweb99
Quake 3 Server
Memory Muncher
Future Work: Intelligently choose the placement and movement of VMs in a cluster. Expand this technique to work for VMs not on the same subnet. Add support for migrating hard drives. Suggest using mirrored disks for now.
Conclusions: This new technique allows us to migrate VMs with low downtime. Works well on applications with a small WWS. Optimizations may help other cases but could impact application performance. Future work looks promising.
Live Migration of Virtual Machine Based on Full System Trace and Replay [6]. Authored by Haikun Liu, Hai Jin, Xiaofei Liao, Liting Hu, and Chen Yu (Huazhong University). Published in HPDC 2009.
In a Nutshell: Previous methods migrate VMs but incur too much downtime and use too much network bandwidth. Reports up to 72.4% reduction in app downtime, up to 31.5% reduction in migration time, and up to 95.9% reduction in data needed to synchronize VM state.
Motivation: Pre-copy methods fail in three ways: Can’t do memory intensive operations. Slowing down write-heavy processes is infeasible in real-world applications. The algorithm doesn’t recover the CPU’s cache, resulting in cache and TLB misses and possible performance degradation.
Goals: Minimize application downtime. Minimize total migration time. Minimize total data transferred. All are similar to goals from previous work.
Basic Idea: Synchronize the state of the two machines. Second machine then will follow same state as the first unless a non-deterministic event occurs. Remedy this by keeping a log of non-deterministic events (time, external input) and replaying them.
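The idea can be sketched with a toy deterministic machine. Everything here (the `Guest` class, its arithmetic, the event values) is hypothetical and only illustrates why replaying a log of non-deterministic events at the same points reproduces the source's state.

```python
MODULUS = 1_000_003


class Guest:
    """Toy machine whose state changes deterministically except for external events."""

    def __init__(self, state=0):
        self.state = state

    def step(self):
        self.state = (self.state * 31 + 7) % MODULUS

    def apply_event(self, value):
        self.state = (self.state + value) % MODULUS


def run_and_log(guest, events_by_step, steps):
    """Run the source, recording each external event with the step at which it arrived."""
    log = []
    for i in range(steps):
        if i in events_by_step:
            guest.apply_event(events_by_step[i])
            log.append((i, events_by_step[i]))
        guest.step()
    return log


def replay(guest, log, steps):
    """Run the target from the same checkpoint, injecting logged events at the same steps."""
    pending = dict(log)
    for i in range(steps):
        if i in pending:
            guest.apply_event(pending[i])
        guest.step()


source, target, unsynced = Guest(5), Guest(5), Guest(5)
log = run_and_log(source, {2: 99, 7: 12345}, steps=10)
replay(target, log, steps=10)    # receives the log
replay(unsynced, [], steps=10)   # same checkpoint, but no log
```

Only the two logged events cross the network, not the memory they modified, which is where the reduction in synchronization data comes from.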
Getting Around Limitations: Checkpoint / replay scheme succeeds: Can do memory intensive operations. Doesn’t slow down write-heavy processes. Does recover the CPU’s cache, avoiding cache and TLB misses and avoiding possible performance degradation.
Specifically
Implementation Details: Logging and sending logs done by source. Replay performed by target. Also entails monitoring R_log and initializing the migration.
Implementation Details: Checkpointing: Pause source VM, change all pages to read-only, unpause VM. Start copying pages and if writes come, redirect them to a Copy-On-Write (COW) buffer. When done, merge pages and COW buffer.
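The checkpointing steps above can be sketched as follows. The `CowCheckpoint` class and its dictionary-based memory are hypothetical simplifications; a real hypervisor would work with page protection bits and write faults rather than method calls.

```python
class CowCheckpoint:
    """Toy model: writes during a checkpoint copy are redirected to a COW buffer."""

    def __init__(self, memory):
        self.memory = memory   # page number -> contents
        self.cow = {}          # writes that arrived while copying
        self.copying = False

    def begin(self):
        # Corresponds to pausing briefly and marking all pages read-only.
        self.copying = True

    def write(self, page, data):
        if self.copying:
            self.cow[page] = data   # redirected, so the checkpoint image stays consistent
        else:
            self.memory[page] = data

    def read(self, page):
        # The running guest must see its own latest writes.
        if page in self.cow:
            return self.cow[page]
        return self.memory[page]

    def snapshot(self):
        return dict(self.memory)

    def finish(self):
        # Merge the COW buffer back into memory once the copy is done.
        self.memory.update(self.cow)
        self.cow.clear()
        self.copying = False


vm = CowCheckpoint({0: b"a", 1: b"b"})
vm.begin()
vm.write(1, b"B")            # arrives mid-copy
image = vm.snapshot()        # checkpoint still reflects the state at begin()
seen_by_guest = vm.read(1)
vm.finish()
```

The guest keeps running during the copy, and the checkpoint image corresponds to a single instant, which is what the target needs as the starting point for replay.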
Implementation Details: File system access - must be SAN. Reading / writing forbidden on target VM, redirected to log file (external input). Network redirection: same as before, uses ARP broadcasting.
Experimental Setup: Hardware: 2 AMD Athlon 3500+ CPUs, 1 GB DDR RAM (VM only uses half), Gigabit Ethernet. Software: UMLinux w/ RHEL AS3. Disk attached via NAS.
Application Downtime
Total Migration Time
Data Transferred
Lessons: Looking at kernel-build: Has low non-determinism, so R_log is small. Total migration time is long because R_replay ≈ R_log. Recall we want apps with the original condition, R_replay >> R_log, for best migration time.
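Why the ratio of the two rates dominates migration time can be seen from a simple constant-rate model, which is an assumption made here for illustration and not the paper's analysis. If the target starts with a backlog of B units of log, the source keeps producing log at R_log, and the target consumes it at R_replay, the backlog shrinks at R_replay - R_log, so the catch-up time is B / (R_replay - R_log).

```python
import math


def catch_up_time(backlog, r_replay, r_log):
    """Seconds until the target has replayed everything, under constant rates."""
    if r_replay <= r_log:
        return math.inf  # the target never catches up
    return backlog / (r_replay - r_log)


fast = catch_up_time(100, r_replay=60, r_log=10)
slow = catch_up_time(100, r_replay=11, r_log=10)
never = catch_up_time(100, r_replay=10, r_log=10)
```

With R_replay well above R_log the backlog clears in 2 time units; when the rates are nearly equal the same backlog takes 100, and at equality it never clears.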
Synchronization Data
Summary: Pros: Excels when R_replay >> R_log. Incurs less application downtime than previous work. Total migration time less than previously. Migrates with less traffic than previously.
Summary: Cons: Works only on single processor systems. Works only when ARP redirect works. Performs poorly when R_replay ≈ R_log. Still does not address regular hard drives. Large size makes migration infeasible.
References: [1] Pavlo et al., A Comparison of Approaches to Large-Scale Data Analysis, SIGMOD 2009. [2] Stonebraker et al., MapReduce and Parallel DBMSs: Friends or Foes?, CACM, Jan. 2010. [3] Dean et al., MapReduce: A Flexible Data Processing Tool, CACM, Jan. 2010. [4] Add serialization support for Protocol Buffers, http://issues.apache.org/jira/browse/MAPREDUCE-377. [5] Clark et al., Live Migration of Virtual Machines, NSDI 2005. [6] Liu et al., Live Migration of Virtual Machine Based on Full System Trace and Replay, HPDC 2009.
