CLOUD COMPUTING
SYLLABUS
WHY CLOUD COMPUTING?
 On-demand delivery of IT resources over the Internet.
 The delivery of computing services—including servers, storage, databases,
networking, and software—over the Internet.
CLOUD COMPUTING MODELS
SCALABLE COMPUTING OVER THE INTERNET
 Over the past 60 years, computing technology has undergone changes in:
machine architecture,
operating system platforms,
network connectivity, and
workloads.
 There has been a shift from centralized computing to parallel and distributed
computing for solving large-scale problems over the Internet.
 Distributed computing has become data-intensive and network-centric.
 Large-scale Internet applications leverage parallel and distributed
computing to enhance quality of life and information services.
THE AGE OF INTERNET COMPUTING
 Billions of people access the Internet daily. As a result, supercomputer sites and large data
centers must provide high-performance computing services to huge numbers of Internet users
concurrently.
 The Linpack Benchmark, traditionally used to measure HPC performance, is no longer
suitable.
 This is because modern computing focuses more on handling a large number of tasks
efficiently rather than just on solving complex equations quickly.
 As a result, there is a shift from high-performance computing (HPC) to high-throughput
computing (HTC) systems.
 HTC systems are built with parallel and distributed computing
technologies.
 To meet growing demand, data centers need better servers, storage, and high-bandwidth
networks.
THE PLATFORM EVOLUTION
 1950–1970: Mainframe Era : Large computers (IBM 360, CDC 6400) used by
governments & businesses.
 1960–1980: Minicomputer Era: Smaller, cost-effective systems (DEC PDP-11, VAX)
for businesses & universities.
 1970–1990: Personal Computer (PC) Era : VLSI microprocessors led to the rise of
affordable PCs.
 1980–2000: Portable & Wireless Computing Laptops, mobile devices, and early wireless
connectivity.
 1990–Present: Cloud & High-Performance Computing Cloud computing, HPC, and
HTC powering large-scale applications.
TECHNOLOGIES FOR NETWORK-BASED SYSTEMS
 1. Multicore CPUs and Multithreading Technologies
• Advances in CPU Processors
 Today, advanced CPUs or microprocessor chips adopt a multicore architecture with
dual, quad, six, or more processing cores. These processors exploit parallelism at the
instruction level (ILP) and thread level (TLP).
 Over the years, both processor speed and network bandwidth have grown by orders
of magnitude.
 Today's multicore CPU and many-core GPU processors can handle multiple instruction
threads, though at very different scales.
 The figure below shows the architecture of a typical multicore processor.
 Each core is essentially a processor with its own private (L1) cache.
 Multiple cores are housed on the same chip with an L2 cache that is shared by all cores.
MULTICORE CPU AND MANY-CORE GPU ARCHITECTURES
MULTITHREADING TECHNOLOGY
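These multicore designs pay off only when software actually runs multiple threads or processes at once. The sketch below (a minimal illustration in Python; the language choice, the heavy_task function, and the workload sizes are our assumptions, not from the slides) spreads a CPU-bound job across all available cores using a process pool:

```python
# Minimal sketch of exploiting a multicore CPU: one worker process per core,
# so the OS can schedule each worker onto its own core.
import math
import os
from concurrent.futures import ProcessPoolExecutor

def heavy_task(n):
    # A CPU-bound computation; each call can keep one core busy.
    return sum(math.sqrt(i) for i in range(n))

if __name__ == "__main__":
    cores = os.cpu_count() or 4
    chunks = [2_000_000] * cores              # one chunk of work per core
    with ProcessPoolExecutor(max_workers=cores) as pool:
        results = list(pool.map(heavy_task, chunks))
    print(f"{len(results)} chunks processed across {cores} cores")
```

A thread pool would illustrate TLP equally well for I/O-bound work; processes are used here only so that the CPU-bound loop is not serialized by the Python interpreter lock.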
2. GPU COMPUTING
GPU Computing Model
The NVIDIA Fermi GPU is built with 16 streaming
multiprocessors (SMs) of 32 CUDA cores each, for 512 cores in total.
POWER EFFICIENCY OF THE GPU
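The GPU computing model is data-parallel: the same operation is applied to many elements, roughly one lightweight thread per element, across the hundreds of CUDA cores. A hedged sketch using the CuPy library follows (CuPy is our choice for illustration, not something the slides prescribe, and it requires an NVIDIA GPU with the CUDA toolkit installed):

```python
# SAXPY (y = a*x + y) sketch: the element-wise arithmetic below is
# dispatched across the GPU's CUDA cores rather than run on the host CPU.
import cupy as cp   # assumed library; needs an NVIDIA GPU + CUDA toolkit

n = 1 << 20                                   # about a million elements
x = cp.random.random(n, dtype=cp.float32)
y = cp.random.random(n, dtype=cp.float32)
a = cp.float32(2.0)

y = a * x + y                                 # executes on the GPU
print(float(y[:5].sum()))                     # copy a small result to host
```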
3. MEMORY, STORAGE, AND WIDE-AREA NETWORKING
 Memory Technology
 DRAM chip capacity grew from 16 KB in 1976 to 64 GB in
2011.
 In other words, memory chips have roughly quadrupled in capacity every three
years.
 Memory access time has not improved at a comparable pace; in fact, the memory wall
problem gets worse as processors get faster.
 For hard drives, capacity increased from 260 MB in 1981 to 250 GB in 2004.
 Disks and Storage Technology
 Beyond 2011, disks or disk arrays have exceeded 3 TB in capacity.
 The rapid growth of flash memory and solid-state drives (SSDs) also impacts the future of HPC
and HTC systems.
 A typical SSD can handle 300,000 to 1 million write cycles per block.
 Power consumption, cooling, and packaging will limit large system development.
 Wide-Area Networking
 Ethernet bandwidth grew rapidly from 10 Mbps in
1979 to 1 Gbps in 1999, and on to 40–100 GbE by 2011.
 High-bandwidth networking increases the capability of building massively distributed
systems.
 Most data centers use Gigabit Ethernet as the interconnect in their server clusters.
 System-Area Interconnects
 The nodes in small clusters are mostly interconnected by an Ethernet switch or a local
area network (LAN).
 As the figure below shows, a LAN is typically used to connect client hosts to big servers.
 A storage area network (SAN) connects servers to network storage such as disk arrays.
 Network-attached storage (NAS) connects client hosts directly to the disk arrays.
4. VIRTUAL MACHINES AND VIRTUALIZATION MIDDLEWARE
 Virtual Machines
 A Virtual Machine (VM) is a software-based emulation of a physical
computer.
 The VM is built with virtual resources managed by a guest OS to run a specific
application.
 Between the VMs and the host platform, we need to deploy a middleware
layer called a virtual machine monitor (VMM).
 The VMM is also called a hypervisor.
 VM Primitive Operations
 VMs can be multiplexed between hardware machines.
 A VM can be suspended and stored in stable storage.
 A suspended VM can be resumed.
 A VM can be migrated from one hardware platform to another.
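As a hedged illustration of these primitives, here is a sketch using the libvirt Python bindings (libvirt is one common hypervisor-management API, chosen by us; the VM name "demo-vm", the remote host name, and the save path are all hypothetical):

```python
# Sketch of the VM primitive operations via libvirt; all names hypothetical.
import libvirt

conn = libvirt.open("qemu:///system")             # connect to local hypervisor
dom = conn.lookupByName("demo-vm")                # hypothetical VM

dom.suspend()                                     # pause the VM in memory
dom.resume()                                      # resume it where it left off

dom.save("/tmp/demo-vm.state")                    # suspend to stable storage
conn.restore("/tmp/demo-vm.state")                # resume the stored image

# Migrate the running VM to another hardware platform (hypothetical host).
dest = libvirt.open("qemu+ssh://host2.example.com/system")
dom = conn.lookupByName("demo-vm")                # re-acquire after restore
dom.migrate(dest, libvirt.VIR_MIGRATE_LIVE, None, None, 0)
```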
5. DATA CENTER VIRTUALIZATION FOR CLOUD COMPUTING
 Cloud platforms typically choose popular x86 processors; low-cost terabyte disks and
Gigabit Ethernet are used to build data centers.
 Data center design prioritizes overall efficiency and cost-effectiveness rather than
just maximizing processing speed.
 A large data center may be built with thousands of servers; smaller data centers are
typically built with hundreds of servers.
 The cost to build and maintain data center servers has increased over the years.
 Only about 30 percent of data center costs go toward purchasing IT equipment; the
remaining 70 percent goes to management and maintenance.
 Convergence of Technologies
 Hardware virtualization and multicore chips enable the existence of dynamic
configurations in the cloud.
 Utility and grid computing technologies lay the necessary foundation for computing
clouds.
 SOA, Web 2.0, and mashups of platforms are pushing the cloud another step forward.
 Autonomic Computing
 Data Center Automation
SOFTWARE ENVIRONMENTS FOR DISTRIBUTED SYSTEMS AND CLOUDS
 Service-Oriented Architecture (SOA)
 Layered Architecture for Web Services and Grids
 Web Services and Tools
 The Evolution of SOA
 Grids versus Clouds
SERVICE-ORIENTED ARCHITECTURE (SOA)
LAYERED ARCHITECTURE FOR WEB SERVICES AND GRIDS
WEB SERVICES AND TOOLS
THE EVOLUTION OF SOA
GRIDS VERSUS CLOUDS
• The boundary between grids and clouds has become blurred in recent
years.
• For web services, workflow technologies are used to coordinate or
orchestrate services, with certain specifications used to define critical
business process models such as two-phase transactions.
• In general, a grid system applies static resources, while a cloud
emphasizes elastic resources.
• For some researchers, the differences between grids and clouds are limited
to dynamic resource allocation based on virtualization and autonomic
computing.
• One can build a grid out of multiple clouds.
• This type of grid can do a better job than a pure cloud, because it can
explicitly support negotiated resource allocation.
• Thus one may end up building a system of systems: a cloud of
clouds, a grid of clouds, a cloud of grids, or inter-clouds as a basic SOA
architecture.
TRENDS TOWARD DISTRIBUTED OPERATING SYSTEMS
• The computers in most distributed systems are loosely
coupled.
– This is mainly due to the fact that all node machines run with an
independent operating system.
• To promote resource sharing and fast communication among node
machines, it is best to have a distributed OS that manages all
resources coherently and efficiently.
• Such a system is most likely to be a closed system, and it will likely
rely on message passing and RPCs for internode communications.
– It should be pointed out that a distributed OS is crucial for upgrading
the performance, efficiency, and flexibility of distributed applications.
FEATURES OF THREE DISTRIBUTED OPERATING SYSTEMS
PARALLEL AND DISTRIBUTED PROGRAMMING MODELS
MESSAGE-PASSING INTERFACE (MPI)
• This is the primary programming standard used to develop
parallel and concurrent programs to run on a distributed
system.
– MPI is essentially a library of subprograms that can be
called from C or FORTRAN to write parallel programs
running on a distributed system.
• The idea is to embody clusters, grid systems, and P2P
systems with upgraded web services and utility computing
applications.
– Besides MPI, distributed programming can also be supported
with low-level primitives such as the Parallel Virtual
Machine (PVM).
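A minimal MPI sketch follows, written with mpi4py (the Python MPI bindings) rather than the C/FORTRAN libraries the slides mention; the chunk sizes and file name are illustrative. Rank 0 scatters the work, every rank computes a partial sum, and a reduce merges the results:

```python
# Minimal MPI program: scatter work, compute locally, reduce to rank 0.
# Run with, e.g.:  mpiexec -n 4 python mpi_sum.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # One chunk of numbers per process.
    chunks = [list(range(i * 100, (i + 1) * 100)) for i in range(size)]
else:
    chunks = None

local = comm.scatter(chunks, root=0)      # each rank receives one chunk
partial = sum(local)                      # purely local computation
total = comm.reduce(partial, op=MPI.SUM, root=0)

if rank == 0:
    print("global sum:", total)
```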
MAPREDUCE
• MapReduce is a web programming model for
scalable data processing on large clusters over large
data sets.
– The model is applied mainly in web-scale search and
cloud computing applications.
• The user specifies a Map function to generate a set
of intermediate key/value pairs.
• Then the user applies a Reduce function to merge
all intermediate values with the same intermediate
key.
• MapReduce is highly scalable to explore high
degrees of parallelism at different job levels.
• A typical MapReduce computation process can
handle terabytes of data on tens of thousands or
more client machines:
– Hundreds of MapReduce programs can be executed
simultaneously; in fact, thousands of MapReduce jobs
are executed on Google’s clusters every day.
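To make the Map and Reduce contract concrete, here is a single-machine word-count sketch in plain Python. It only illustrates the model (map, shuffle by key, reduce); it is not Google's implementation, and the helper names are ours:

```python
# Word count in the MapReduce style: map -> shuffle (group by key) -> reduce.
from collections import defaultdict

def map_fn(doc):
    # Emit an intermediate (key, value) pair for every word.
    for word in doc.split():
        yield word.lower(), 1

def reduce_fn(key, values):
    # Merge all intermediate values that share the same key.
    return key, sum(values)

docs = ["the cloud scales", "the grid shares the load"]

grouped = defaultdict(list)               # the "shuffle" phase
for doc in docs:
    for key, value in map_fn(doc):
        grouped[key].append(value)

results = [reduce_fn(k, vs) for k, vs in grouped.items()]
print(sorted(results))                    # ('the', 3) among the pairs
```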
HADOOP LIBRARY
• Hadoop offers a software platform that was originally
developed by a Yahoo! group.
• The package enables users to write and run applications over vast
amounts of distributed data.
• Users can easily scale Hadoop to store and process petabytes of data
in the web space.
– Also, Hadoop is economical in that it comes with an open source version
of MapReduce that minimizes overhead in task spawning and massive
data communication.
• It is efficient, as it processes data with a high degree of parallelism
across a large number of commodity nodes, and it is reliable in that
it automatically keeps multiple data copies to facilitate redeployment
of computing tasks upon unexpected system failures.
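As a hedged example of what such an application looks like, here is a word-count mapper/reducer pair for Hadoop Streaming, which lets any executable that reads stdin and writes stdout act as the map or reduce task (the file names and invocation below are our assumptions):

```python
# mapper.py -- emits "word<TAB>1" for every word read from stdin.
import sys

for line in sys.stdin:
    for word in line.split():
        print(f"{word}\t1")
```

```python
# reducer.py -- sums the counts per word; Hadoop sorts the mapper output
# by key, so all lines for a given word arrive consecutively.
import sys

current, count = None, 0
for line in sys.stdin:
    line = line.rstrip("\n")
    if not line:
        continue
    word, value = line.split("\t")
    if word != current:
        if current is not None:
            print(f"{current}\t{count}")
        current, count = word, 0
    count += int(value)
if current is not None:
    print(f"{current}\t{count}")
```

A typical invocation (the streaming jar path varies by installation): hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar -input in/ -output out/ -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py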