On-demand delivery of IT resources over the Internet.
The delivery of computing services, including servers, storage, databases, networking, and software, over the Internet.
SCALABLE COMPUTING OVER THE INTERNET
Over the past 60 years, computing technology has undergone changes in:
• Machine architecture
• Operating system platforms
• Network connectivity
• Application workloads
There has been a shift from centralized computing to parallel and distributed computing for solving large-scale problems over the Internet.
Distributed computing has become data-intensive and network-centric.
Large-scale Internet applications leverage parallel and distributed computing to enhance the quality of life and information services.
THE AGE OF INTERNET COMPUTING
Billions of people access the Internet daily. As a result, supercomputer sites and large data centers must provide high-performance computing services to huge numbers of Internet users concurrently.
The Linpack Benchmark, traditionally used to measure HPC performance, is no longer suitable, because modern computing focuses more on handling a large number of tasks efficiently than on solving complex equations quickly.
As a result, there is a shift from high-performance computing (HPC) to high-throughput computing (HTC) systems, which are built with parallel and distributed computing technologies.
To meet growing demand, data centers need better servers, storage, and high-bandwidth
networks.
THE PLATFORM EVOLUTION
1950–1970: Mainframe Era: Large computers (IBM 360, CDC 6400) used by governments and businesses.
1960–1980: Minicomputer Era: Smaller, cost-effective systems (DEC PDP-11, VAX) for businesses and universities.
1970–1990: Personal Computer (PC) Era: VLSI microprocessors led to the rise of affordable PCs.
1980–2000: Portable and Wireless Computing Era: Laptops, mobile devices, and early wireless connectivity.
1990–Present: Cloud and High-Performance Computing Era: Cloud computing, HPC, and HTC powering large-scale applications.
TECHNOLOGIES FOR NETWORK-BASED SYSTEMS
1. Multicore CPUs and Multithreading Technologies
• Advances in CPU Processors
Today, advanced CPUs or microprocessor chips use a multicore architecture with dual, quad, six, or more processing cores. These processors exploit parallelism at the instruction level (ILP) and the thread level (TLP). Over the years, both processor speed and network bandwidth have grown dramatically.
MULTICORE CPU AND MANY-CORE GPU ARCHITECTURES
Multicore CPU and many-core GPU processors can handle multiple instruction threads at different magnitudes today.
In a typical multicore processor, each core is essentially a processor with its own private (L1) cache; multiple cores are housed on the same chip with an L2 cache that is shared by all cores.
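As an illustration (not from the original slides), the short Python sketch below exploits thread-level (task-level) parallelism on a multicore CPU by running one worker process per core; the workload, chunk sizes, and function name are made up for this example.

    # Illustrative TLP sketch: one worker process per core sums a slice of the range.
    # Processes (rather than threads) are used so CPython can keep all cores busy.
    from multiprocessing import Pool, cpu_count

    def partial_sum(bounds):
        lo, hi = bounds
        return sum(i * i for i in range(lo, hi))

    if __name__ == "__main__":
        n, cores = 10_000_000, cpu_count()
        step = n // cores
        chunks = [(i * step, n if i == cores - 1 else (i + 1) * step)
                  for i in range(cores)]
        with Pool(processes=cores) as pool:          # one worker per core
            total = sum(pool.map(partial_sum, chunks))
        print(f"{cores} cores, sum of squares below {n}: {total}")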
3. MEMORY, STORAGE, AND WIDE-AREA NETWORKING
Memory Technology
DRAM chip capacity grew from 16 KB in 1976 to 64 GB in 2011, roughly a 4x increase in capacity every three years.
Memory access time, however, has improved much more slowly, so the memory wall problem gets worse as processors get faster.
For hard drives, capacity increased from 260 MB in 1981 to 250 GB in 2004.
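As a quick back-of-the-envelope check of the DRAM growth rate quoted above (a sketch in Python, using only the figures on this slide):

    # 16 KB (1976) -> 64 GB (2011): verify "roughly 4x every three years".
    import math
    factor = (64 * 2**30) / (16 * 2**10)       # total capacity growth factor
    steps = math.log(factor, 4)                # number of 4x increases
    years_per_step = (2011 - 1976) / steps
    print(round(factor), round(steps), round(years_per_step, 1))  # ~4.2 million, 11, ~3.2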
Disks and Storage Technology
Beyond 2011, disks or disk arrays have exceeded 3 TB in capacity.
The rapid growth of flash memory and solid-state drives (SSDs) also impacts the future of HPC
and HTC systems.
A typical SSD can handle 300,000 to 1 million write cycles per block.
Power consumption, cooling, and packaging will limit large system development.
Wide-Area Networking
Ethernet bandwidth has grown rapidly, from 10 Mbps in 1979 to 1 Gbps in 1999 and to 40 to 100 GbE in 2011.
High-bandwidth networking increases the capability of building massively distributed systems.
Most data centers use Gigabit Ethernet as the interconnect in their server clusters.
System-Area Interconnects
The nodes in small clusters are mostly interconnected by an Ethernet switch or a local
area network (LAN).
A LAN is typically used to connect client hosts to large servers.
A storage area network (SAN) connects servers to network storage such as disk arrays.
Network attached storage (NAS) connects client hosts directly to the disk arrays.
4. VIRTUAL MACHINES AND VIRTUALIZATION MIDDLEWARE
Virtual Machines
A Virtual Machine (VM) is a software-based emulation of a physical
computer.
The VM is built with virtual resources managed by a guest OS to run a specific
application.
Between the VMs and the host platform, we need to deploy a middleware
layer called a virtual machine monitor (VMM).
The VMM is also called a hypervisor.
VM Primitive Operations
VMs can be multiplexed between hardware machines.
A VM can be suspended and stored in stable storage.
A suspended VM can be resumed.
A VM can be migrated from one hardware platform to another.
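As a concrete illustration of these primitives (not from the original slides), the sketch below uses the libvirt Python bindings; the guest name "demo-vm", the save-file path, and the destination host are hypothetical, and a real deployment would add error handling.

    # Sketch of VM primitive operations via libvirt (assumes libvirt-python is
    # installed and a guest named "demo-vm" exists; names and paths are placeholders).
    import libvirt

    conn = libvirt.open("qemu:///system")            # connect to the local hypervisor
    dom = conn.lookupByName("demo-vm")

    dom.suspend()                                    # pause the guest's virtual CPUs
    dom.resume()                                     # resume execution

    dom.save("/var/tmp/demo-vm.state")               # suspend the VM to stable storage
    conn.restore("/var/tmp/demo-vm.state")           # resume it from the saved state

    dest = libvirt.open("qemu+ssh://host2/system")   # another hardware platform
    dom = conn.lookupByName("demo-vm")               # fresh handle after restore
    dom.migrate(dest, libvirt.VIR_MIGRATE_LIVE, None, None, 0)   # live migration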
5. DATA CENTER VIRTUALIZATION FOR CLOUD COMPUTING
Cloud platforms typically choose popular x86 processors; low-cost terabyte disks and Gigabit Ethernet are used to build data centers.
Data center design prioritizes overall efficiency and cost-effectiveness rather than
just maximizing processing speed.
A large data center may be built with thousands of servers. Smaller data centers are
typically built with hundreds of servers.
The cost to build and maintain data center servers has increased over the years.
Only about 30 percent of data center costs goes toward purchasing IT equipment; the remaining 70 percent goes to management and maintenance.
Convergence of Technologies
Hardware virtualization and multicore chips enable dynamic configurations in the cloud.
Utility and grid computing technologies lay the necessary foundation for computing
clouds.
SOA, Web 2.0, and mashups of platforms are pushing the cloud another step forward.
Autonomic computing and data center automation also contribute to the growth of cloud computing.
SOFTWARE ENVIRONMENTS FOR DISTRIBUTED SYSTEMS AND CLOUDS
Service-Oriented Architecture (SOA)
Layered Architecture for Web Services and Grids
Web Services and Tools
The Evolution of SOA
Grids versus Clouds
GRIDS VERSUS CLOUDS
• The boundary between grids and clouds has been getting blurred in recent years.
• For web services, workflow technologies are used to coordinate or orchestrate services, with specifications that define critical business process models such as two-phase transactions.
• In general, a grid system applies static resources, while a cloud emphasizes elastic resources.
• For some researchers, the differences between grids and clouds are limited to dynamic resource allocation based on virtualization and autonomic computing.
• One can build a grid out of multiple clouds.
• This type of grid can do a better job than a pure cloud, because it can
explicitly support negotiated resource allocation.
• Thus one may end up building a system of systems: a cloud of clouds, a grid of clouds, a cloud of grids, or inter-clouds, with SOA as the basic architecture.
TRENDS TOWARD DISTRIBUTED OPERATING SYSTEMS
• The computers in most distributed systems are loosely
coupled.
– This is mainly because each node machine runs its own independent operating system.
• To promote resource sharing and fast communication among node
machines, it is best to have a distributed OS that manages all
resources coherently and efficiently.
• Such a system is most likely to be a closed system, and it will likely
rely on message passing and RPCs for internode communications.
– It should be pointed out that a distributed OS is crucial for upgrading
the performance, efficiency, and flexibility of distributed applications.
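To illustrate the RPC style of internode communication mentioned above, here is a minimal sketch using Python's standard xmlrpc module; the method name, host, and port are placeholders, not part of the original material.

    # --- on the serving node ---
    from xmlrpc.server import SimpleXMLRPCServer

    def load_average():
        return 0.42        # placeholder; a real node would report its own load

    server = SimpleXMLRPCServer(("0.0.0.0", 8000), allow_none=True)
    server.register_function(load_average)
    server.serve_forever()

    # --- on a client node (run separately) ---
    # from xmlrpc.client import ServerProxy
    # node = ServerProxy("http://node1.example.org:8000")
    # print(node.load_average())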
MESSAGE-PASSING INTERFACE (MPI)
• This is the primary programming standard used to develop
parallel and concurrent programs to run on a distributed
system.
– MPI is essentially a library of subprograms that can be
called from C or FORTRAN to write parallel programs
running on a distributed system.
• The idea is to embody clusters, grid systems, and P2P
systems with upgraded web services and utility computing
applications.
– Besides MPI, distributed programming can also be supported with low-level primitives such as the Parallel Virtual Machine (PVM).
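As an illustration (not part of the original slides), the same message-passing model is also available from Python through the mpi4py bindings; a minimal two-process send/receive sketch, launched with something like mpiexec -n 2 python hello_mpi.py:

    # Minimal point-to-point message passing with mpi4py.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()                            # this process's ID in the communicator

    if rank == 0:
        comm.send({"msg": "hello"}, dest=1, tag=11)   # send to rank 1
    elif rank == 1:
        data = comm.recv(source=0, tag=11)            # matching receive
        print("rank 1 received:", data)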
MAPREDUCE
• MapReduce is a web programming model for
scalable data processing on large clusters over large
data sets.
– The model is applied mainly in web-scale search and
cloud computing applications.
• The user specifies a Map function to generate a set
of intermediate key/value pairs.
• Then the user applies a Reduce function to merge
all intermediate values with the same intermediate
key.
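A toy, single-machine Python sketch of this Map/Reduce model (word count): map_fn emits intermediate key/value pairs and reduce_fn merges all values that share a key, while a real framework would shuffle the pairs across many machines.

    from collections import defaultdict

    def map_fn(document):                     # Map: emit intermediate (key, value) pairs
        for word in document.split():
            yield (word, 1)

    def reduce_fn(word, counts):              # Reduce: merge values with the same key
        return (word, sum(counts))

    docs = ["the cloud", "the grid and the cloud"]
    groups = defaultdict(list)
    for doc in docs:                          # map phase
        for key, value in map_fn(doc):
            groups[key].append(value)         # "shuffle": group values by intermediate key
    results = [reduce_fn(k, v) for k, v in groups.items()]
    print(results)                            # [('the', 3), ('cloud', 2), ('grid', 1), ('and', 1)]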
• MapReduce is highly scalable and can exploit high degrees of parallelism at different job levels.
• A typical MapReduce computation process can
handle terabytes of data on tens of thousands or
more client machines:
– Hundreds of MapReduce programs can be executed
simultaneously; in fact, thousands of MapReduce jobs
are executed on Google’s clusters every day.
HADOOP LIBRARY
• Hadoop offers a software platform that was originally
developed by a Yahoo! group.
• The package enables users to write and run applications over vast
amounts of distributed data.
• Users can easily scale Hadoop to store and process petabytes of data
in the web space.
– Also, Hadoop is economical in that it comes with an open source version
of MapReduce that minimizes overhead in task spawning and massive
data communication.
• It is efficient, as it processes data with a high degree of parallelism
across a large number of commodity nodes, and it is reliable in that
it automatically keeps multiple data copies to facilitate redeployment
of computing tasks upon unexpected system failures.
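One simple way to use the platform without writing Java is Hadoop Streaming, which runs arbitrary executables as the Map and Reduce steps; the illustrative word-count scripts below read stdin and write tab-separated key/value pairs to stdout, as Streaming expects (how the streaming JAR is invoked depends on the installation, so that command is omitted here).

    # --- mapper.py: read raw text on stdin, emit tab-separated (word, 1) pairs ---
    import sys
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

    # --- reducer.py: input arrives sorted by key, so equal words are adjacent ---
    import sys
    current, count = None, 0
    for line in sys.stdin:
        word, value = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(value)
    if current is not None:
        print(f"{current}\t{count}")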