International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 2, Sept - Oct (2010), pp. 85-96
© IAEME, http://www.iaeme.com/ijcet.html

       ADAPTIVE LOAD BALANCING TECHNIQUES IN
              GLOBAL SCALE GRID ENVIRONMENT
                                        D.Asir
                                     PG Scholar
                    Department of Computer Science and Engineering
                           Karunya University, Coimbatore
                            E-Mail: asird@karunya.edu.in

                                  Shamila Ebenezer
                                  Assistant Professor
                    Department of Computer Science and Engineering
                           Karunya University, Coimbatore
                          E-Mail: shamila_cse@karunya.edu

                                      Daniel.D
                                     PG Scholar
                    Department of Computer Science and Engineering
                           Karunya University, Coimbatore
                           E-Mail: Daniel_joen@yahoo.com

 ABSTRACT
        Data partitioning and load balancing are important components of parallel
 computations. Many different partitioning strategies have been developed, with great
 effectiveness in parallel applications. But the load-balancing problem is not yet solved
 completely; new applications and architectures require new partitioning features.
 Increased use of heterogeneous computing architectures requires partitioners that account
 for non-uniform computing, network, and memory resources. This paper surveys
 different adaptive techniques for solving the load-balancing problem in the parallel
 solution of partial differential systems.
        Index Terms: Dynamic load balancing; Performance characterization;
 Adaptive mesh refinement.






I. INTRODUCTION
        Adaptive load balancing allows a grid to operate smoothly and scale reliably when
facing spikes in data volumes or unexpected utilization loads. It also selects the best
node for session execution based on resource requirements and availability. Application-
centric performance characterizations exist for dynamic partitioning and load-balancing
techniques on the distributed adaptive grid hierarchies that underlie parallel adaptive mesh
refinement (AMR) techniques [1,14] for the solution of partial differential equations.
Early adaptive techniques of mesh motion (r-refinement) have been giving way to
methods that combine mesh refinement/coarsening (h-refinement) with order variation
(p-refinement) [3]. As advances in computer architecture enable the solution of complex
three-dimensional problems, the efficiency, reliability, and robustness provided by
adaptivity will make its use even more advantageous. Parallel computation will be
essential in these simulations. Processor load balancing must be dynamic, since frequent
adaptive enrichment will upset a balanced computation. Applications such as adaptive
finite element methods have workloads that are unpredictable or change during the
computation; they require dynamic load balancers that adjust the decomposition as the
computation proceeds. Numerous strategies for static and dynamic load balancing have
been developed, such as recursive bisection (RB) methods, space-filling curve (SFC)
partitioning, and graph partitioning (including multilevel and diffusive methods) [7,10]. These
methods provide effective partitioning for many applications, perhaps suggesting that the
load-balancing problem is solved. Efficient parallel execution of these irregular grid
applications requires the partitioning of the associated graph into p parts with the
following two objectives: (i) each partition has an equal amount of total vertex weight;
(ii) the total weight of the edges cut by the partitions is minimized [2]. Simulation of
three-dimensional flow with chemical reactions and plasma discharge in complex
geometries is one of the most resource-demanding problems in computational science,
requiring both high-performance and high-throughput computing. Grid computing
technologies opened up new opportunities to access virtually unlimited computational
resources, and inspired many researchers to develop new methodologies and algorithms
for parallel distributed applications on the Grid.
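
The two partitioning objectives stated above, balanced vertex weight and minimal edge cut, can be measured for any candidate partition. The following is a minimal sketch, assuming the graph is given as a list of vertex weights and a dictionary of weighted edges; the function name `partition_quality` and the small example graph are illustrative and are not taken from [2].

```python
def partition_quality(vertex_weights, edges, part, p):
    """Return (imbalance, edge_cut) for a partition of a weighted graph.

    vertex_weights: list of vertex weights, indexed by vertex id
    edges:          dict mapping (u, v) -> edge weight
    part:           list giving the partition id (0..p-1) of each vertex
    p:              number of partitions
    """
    load = [0.0] * p
    for v, w in enumerate(vertex_weights):
        load[part[v]] += w
    average = sum(load) / p
    # Objective (i): 1.0 means every partition carries exactly the average weight.
    imbalance = max(load) / average
    # Objective (ii): total weight of edges whose endpoints lie in different partitions.
    edge_cut = sum(w for (u, v), w in edges.items() if part[u] != part[v])
    return imbalance, edge_cut


# Hypothetical 4-vertex path graph 0-1-2-3 with unit weights.
weights = [1, 1, 1, 1]
edges = {(0, 1): 1, (1, 2): 1, (2, 3): 1}
contiguous = partition_quality(weights, edges, [0, 0, 1, 1], p=2)
interleaved = partition_quality(weights, edges, [0, 1, 0, 1], p=2)
```

Both partitions in the example are perfectly balanced, but the contiguous one cuts a single edge while the interleaved one cuts all three, which is why balance alone is not a sufficient criterion.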





II. ALB ALGORITHMS
A. Adaptive mesh-refinement algorithms (AMR)
1) Space-Filling Curves: Space-filling curves (SFC) [1] are a class of locality-preserving
  mappings from d-dimensional space to 1-dimensional space. The self-similar or
  recursive nature of these mappings can be exploited to represent a hierarchical structure and
  to maintain locality across different levels of hierarchy. The SFC representation of the
  adaptive grid hierarchy is a 1-D ordered list of composite grid blocks where each
  composite block represents a block of the entire grid hierarchy and may contain more
  than one grid level.
2) Independent Grid Distribution: This scheme distributes the grids independently across
  the processors. This distribution leads to balanced loads, and no redistribution is
  required when grids are created or deleted. In the adaptive grid hierarchy, a fine grid
  typically corresponds to a small region of the underlying coarse grid. If both the fine
  and coarse grids are distributed over the entire set of processors, all the processors will
  communicate with the small set of processors corresponding to the associated coarse
  grid region, causing a serialization bottleneck.
3) Combined Grid Distribution: This scheme distributes the total work load in the grid
  hierarchy by first forming a simple linear structure by abutting grids at a level and
  then decomposing this structure into partitions of equal load. Regridding operations
  involving the creation or deletion of a grid are extremely expensive, as they require an
  almost complete redistribution of the grid hierarchy [4]. The combined grid decomposition
  does not exploit the parallelism available within a level of the hierarchy.
4) Independent Level Distribution: Each level of the grid hierarchy is distributed by
  partitioning the combined load of all component grids at the level among the
  processors. This scheme overcomes some of the drawbacks of the independent grid
  distribution. Parallelism within a level of the hierarchy is exploited. Although the inter-
  grid communication bottleneck is reduced in this case, the required scatter
  communications can be expensive. Creation or deletion of component grids at any level
  requires a redistribution of the entire level.






5) Iterative Tree Balancing: A table is created from the grids at each time step, which
  keeps pointers to neighboring and parent grids. For every grid, immediate neighbors and
  children are also considered along with load distribution. Thus load balancing, inter-
  level communication, and intra-level communication are addressed together. This
  scheme is used for distributing finite-element meshes and is promising, as it deals with
  all the constraints to some extent.
6) Weighted Distribution: A weight is first assigned to each of the overheads discussed
  above. This weight defines the significance and contribution of the overhead to the
  overall application performance. The next step uses these weights to compute the
  affinity of each component grid to the different processors. Initially, grids have no
  affinity for any processor.
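
The space-filling curve representation in item 1 can be illustrated with a short sketch: grid blocks are ordered along a curve, and the resulting 1-D list is cut into contiguous pieces of roughly equal load. The Morton (Z-order) curve is used here only because it is simple to compute; it is an assumption of this sketch, and the scheme in [1] may use a different curve. The names `morton_key` and `sfc_partition` are hypothetical.

```python
def morton_key(x, y, bits=16):
    """Interleave the bits of (x, y) to get a block's position on a Z-order curve."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def sfc_partition(blocks, nprocs):
    """Assign grid blocks to processors by cutting the curve-ordered list.

    blocks: list of (x, y, load) tuples, one per grid block (positive loads)
    Returns a list of processor ids aligned with `blocks`.
    """
    order = sorted(range(len(blocks)),
                   key=lambda i: morton_key(blocks[i][0], blocks[i][1]))
    total = sum(b[2] for b in blocks)
    owner = [0] * len(blocks)
    running = 0.0
    for i in order:
        # The midpoint of the block's load interval selects its equal-load chunk.
        mid = running + blocks[i][2] / 2.0
        owner[i] = min(int(mid * nprocs / total), nprocs - 1)
        running += blocks[i][2]
    return owner


# Hypothetical 4x4 array of unit-load blocks distributed over 4 processors.
blocks = [(x, y, 1.0) for y in range(4) for x in range(4)]
owner = sfc_partition(blocks, 4)
```

In this example each processor receives one 2x2 quadrant of blocks, which shows how ordering along the curve keeps spatially close blocks on the same processor.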
B. Dynamic Load Balancing via Tiling
        The tiling load-balancing system [3] is a modification of a global load-balancing
technique that is applicable to a wide class of two-dimensional, uniform-grid
applications. Global balance is achieved by performing local balancing within
overlapping processor neighborhoods, where each processor is defined to be the center of
a neighborhood. Local balance involves element migrations to processors in the same
neighborhood that have elements sharing edges. The adaptive refinement algorithm places
an additional requirement on the tiling system: because elemental workloads may vary due
to refinement, the tiling algorithm must account for elemental workloads when performing
local load balancing.
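
The idea of reaching global balance through local balancing in overlapping neighborhoods can be sketched as follows. This is a simplified illustration, not the tiling algorithm of [3]: it ignores the requirement that migrated elements share edges with the receiving processor, and the function `tiling_step`, the data layout, and the migration rule are assumptions made for the example.

```python
def tiling_step(loads, neighbors):
    """One sweep of local balancing over overlapping processor neighborhoods.

    loads:     dict processor -> list of element workloads it currently holds
    neighbors: dict processor -> list of adjacent processors
    Returns the migrations performed as (source, destination, workload) tuples.
    """
    migrations = []
    for center in sorted(loads):
        hood = [center] + neighbors[center]
        average = sum(sum(loads[q]) for q in hood) / len(hood)
        # Try the heaviest elements first; element workloads may differ after refinement.
        for work in sorted(loads[center], reverse=True):
            target = min(neighbors[center], key=lambda q: sum(loads[q]))
            # Skip the element if the center is no longer above the neighborhood
            # average or the receiver would end up above it.
            if sum(loads[center]) <= average or sum(loads[target]) + work > average:
                continue
            loads[center].remove(work)
            loads[target].append(work)
            migrations.append((center, target, work))
    return migrations


# Hypothetical line of three processors; processor 0 holds most of the work.
loads = {0: [4, 3, 2, 1], 1: [1], 2: [1]}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
moves = tiling_step(loads, neighbors)
```

After one sweep the work has started to spread from processor 0 toward processor 2 through their shared neighbor, which is how the overlap between neighborhoods propagates balance globally over repeated sweeps.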
C. Multi-criteria Geometric Partitioning
        Crash simulations are “multiphase” applications consisting of two separate
phases: computation of forces and contact detection. Obtaining a single decomposition
that is good with respect to both phases would remove the need for communication
between phases. Each object would have multiple loads, corresponding to its workload in
each phase. The challenge would be computing a single decomposition that is balanced
with respect to all loads. Such a multi-criteria partitioner could be used in other situations
as well, such as balancing both computational work and memory usage. Most geometric
partitioners reduce the partitioning problem [6] to a one-dimensional problem. Multi-criteria
load balancing can be formulated as either a multi-constraint or a multi-objective
problem. Often, the balance of each load is considered a constraint and has to satisfy a
certain tolerance. Such a formulation fits the standard form, where, in this case, there is
no objective, only constraints. Unfortunately, there is no guarantee that a solution exists
to this problem. In practice, we want a “best possible” decomposition [7], even if the
desired balance criteria cannot be satisfied. Thus, an alternative is to make the constraints
objectives; that is, we want to achieve as good a balance as possible with respect to all
loads.
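
The difference between the multi-constraint and multi-objective formulations can be made concrete with a small sketch. It assumes each object carries a vector of loads (for example, force computation and contact detection) and measures imbalance per criterion as the maximum part load divided by the average part load. The function names and the choice of the worst imbalance as a scalar objective are assumptions of this example, not definitions from [6] or [7].

```python
def criteria_imbalance(loads, part, nparts):
    """Imbalance (max part load / average part load) for each load criterion.

    loads: list of per-object load vectors, e.g. (force_work, contact_work)
    part:  list of part ids, one per object
    """
    ncrit = len(loads[0])
    totals = [[0.0] * ncrit for _ in range(nparts)]
    for vec, p in zip(loads, part):
        for c, w in enumerate(vec):
            totals[p][c] += w
    result = []
    for c in range(ncrit):
        column = [totals[p][c] for p in range(nparts)]
        result.append(max(column) / (sum(column) / nparts))
    return result


def satisfies_constraints(imbalances, tolerance):
    """Multi-constraint view: every criterion must be within the tolerance."""
    return all(i <= 1.0 + tolerance for i in imbalances)


def worst_imbalance(imbalances):
    """Multi-objective view, scalarized here as the worst criterion (one possible choice)."""
    return max(imbalances)


# Hypothetical objects with (phase 1 load, phase 2 load).
loads = [(2, 0), (0, 2), (1, 1), (1, 1)]
poor = criteria_imbalance(loads, [0, 1, 0, 1], 2)
good = criteria_imbalance(loads, [0, 0, 1, 1], 2)
```

With a 5% tolerance the first decomposition violates both constraints, so the constraint formulation simply rejects it, whereas the objective formulation still ranks it against other candidates.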
D. Repartitioning Algorithms Based on Multilevel Diffusion
         The multilevel graph partitioning algorithm [2] implemented in METIS has three
phases: a coarsening phase, a partitioning phase, and a refinement phase. During the
coarsening phase, a sequence of smaller graphs is constructed from an input graph by
collapsing vertices together. When enough vertices have been collapsed together so that
the coarsest graph is sufficiently small, a k-way partition is found. Finally, the partition of
the coarsest graph is projected back to the original graph by refining it at each
uncoarsening level using a k-way partitioning refinement algorithm. For repartitioning,
only pairs of nodes that belong to the same partition are considered for merging in the
coarsening phase. Hence, the initial partition of the coarsest-level graph is identical to
the input partition of the graph that is being repartitioned and thus does not need to be
computed. This makes the coarsening phase completely parallelizable, as coarsening is
local to each processor. The uncoarsening phase of multilevel diffusion repartitioning
contains two subphases: multilevel diffusion and multilevel refinement. In the multilevel
diffusion phase, balance is sought on the coarsest graph in a process similar to
multilevel refinement. This is accomplished by forcing the migration of vertices out of
overbalanced partitions.








                       Figure 2.1 Multilevel diffusion repartitioning
        Multilevel diffusion repartitioning algorithms are made up of three phases: graph
coarsening, multilevel diffusion, and multilevel refinement. The coarsening phase results
in a series of contracted graphs. The multilevel diffusion phase balances the graph using
the very coarsest graphs. The multilevel refinement phase seeks to improve the edge-cut
disturbed by the balancing process. Optionally, the multilevel diffusion can be guided by
a diffusion solution. In [2], the multilevel undirected diffusion repartitioning
algorithm is referred to as MLD and the multilevel directed diffusion repartitioning
algorithm as MLDD. Single-level directed diffusion (SLDD) is used to provide a comparison
with the multilevel diffusion schemes. In SLDD, diffusion and refinement are performed
only on the original input graph and thus no graph contraction is performed.
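
The coarsening rule used for repartitioning, in which only vertices of the same partition are merged, can be sketched as one level of graph contraction. The sketch below uses heavy-edge matching as the pairing rule, which is a common choice but an assumption here; the function `coarsen_once` and its data layout are hypothetical and do not reproduce the METIS implementation. The input graph is assumed to be undirected, symmetric, and free of self-loops.

```python
def coarsen_once(adj, part):
    """Collapse matched vertex pairs that lie in the same partition.

    adj:  dict vertex -> dict neighbor -> edge weight
    part: dict vertex -> current partition id
    Returns (coarse_adj, coarse_part, cmap), where cmap maps each fine vertex
    to the coarse vertex that contains it.
    """
    cmap = {}
    next_id = 0
    for v in sorted(adj):
        if v in cmap:
            continue
        cmap[v] = next_id
        # Heavy-edge matching, restricted to unmatched neighbors in the same partition.
        candidates = [(w, u) for u, w in adj[v].items()
                      if u not in cmap and part[u] == part[v]]
        if candidates:
            _, mate = max(candidates)
            cmap[mate] = next_id
        next_id += 1

    coarse_adj = {c: {} for c in range(next_id)}
    coarse_part = {}
    for v, nbrs in adj.items():
        cv = cmap[v]
        coarse_part[cv] = part[v]
        for u, w in nbrs.items():
            cu = cmap[u]
            if cu != cv:
                # Parallel edges between the same coarse vertices add their weights.
                coarse_adj[cv][cu] = coarse_adj[cv].get(cu, 0) + w
    return coarse_adj, coarse_part, cmap


# Hypothetical path graph 0-1-2-3 whose heaviest edge crosses the partition boundary.
adj = {0: {1: 1}, 1: {0: 1, 2: 5}, 2: {1: 5, 3: 1}, 3: {2: 1}}
part = {0: 0, 1: 0, 2: 1, 3: 1}
coarse_adj, coarse_part, cmap = coarsen_once(adj, part)
```

Because the heavy edge between vertices 1 and 2 crosses the partition boundary, it is never collapsed, so the coarse graph inherits the input partition and its edge cut without any additional computation.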
E. SAMR (Structured Adaptive Mesh Refinement)
        The adaptive characteristics of SAMR applications [14] are analyzed from four
aspects: granularity, dynamicity, imbalance, and dispersion.





1) Granularity: The basic entity for data movement is a grid. Each grid consists of a
    computational interior and a ghost zone. The computational interior is the region of
    interest that has been refined from the immediately coarser level; the ghost zone is the
    part added to the exterior of the computational interior in order to obtain boundary
    information. For the computational interior, there is a requirement for the minimum
    number of cells, which is equal to the refinement ratio to the power of the number of
    dimensions.
2) Dynamicity: After each time-step of every level, the adaptation process is invoked
    based on one or more refinement criteria defined at the beginning of the simulation.
    The local regions satisfying the criteria will be refined. High frequency of adaptation
    requires the underlying DLB method to execute very fast, as well as to maintain high
    quality of load balancing.
3) Load Imbalance: The ideal balanced load is calculated. The standard deviation is
    small compared to the average load, which means that the average load reflects
    the entire load distribution.
4) Dispersion: The loads of a few processors increase dramatically, while most
    processors see little or no change. All the processors can be grouped into four
    subgroups, and each subgroup has similar characteristics, with the percentage of
    refinement ranging from zero to 86%. These calculations indicate that different
    datasets exhibit different load distributions, and the underlying DLB scheme should
    provide high-quality load balancing for all these datasets. Taking into
    consideration the adaptive characteristics of SAMR applications, an improved DLB
    scheme was developed [14]. It is composed of two phases: the moving-grid phase and
    the splitting-grid phase.
Moving Grid Phase:
Step 1: Set Moveflag and Splitflag to one, and Lastmin and Lastmax to zero.
Step 2: If Maxload/Avgload > threshold, the load is considered imbalanced.
Step 3: Maxproc then moves a grid to Minproc (identified using global information),
        provided that the load of the grid is no more than (threshold * Avgload - Minload).
Step 4: This phase continues until all grids residing on Maxproc are too large to be
        moved.




Splitting Grid Phase:
Step 1: Maxproc finds its largest grid, Maxgrid.
Step 2: If the size of Maxgrid is no more than (Avgload - Minload), the grid is moved
          from Maxproc to Minproc.
Step 3: Otherwise, Maxproc splits the grid into two smaller grids.
Step 4: The piece whose size is around (Avgload - Minload) is redistributed to Minproc.
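
The two phases above can be sketched as a single balancing pass. This is a simplified illustration under stated assumptions: grid loads are treated as positive, freely divisible numbers, the bookkeeping flags of Step 1 (Moveflag, Splitflag, Lastmin, Lastmax) are omitted, and the names `dlb_pass` and `grids` are hypothetical rather than taken from [14].

```python
def dlb_pass(grids, threshold=1.2):
    """One moving-grid phase followed by at most one split.

    grids: dict processor id -> list of positive grid loads (modified in place)
    """
    # Moving-grid phase.
    while True:
        load = {p: sum(g) for p, g in grids.items()}
        avg = sum(load.values()) / len(load)
        maxproc = max(load, key=load.get)
        minproc = min(load, key=load.get)
        if load[maxproc] / avg <= threshold:
            return grids  # balanced within the threshold, nothing to split
        limit = threshold * avg - load[minproc]
        movable = [g for g in grids[maxproc] if 0 < g <= limit]
        if not movable:
            break  # every grid on Maxproc is too large to be moved
        grid = max(movable)
        grids[maxproc].remove(grid)
        grids[minproc].append(grid)

    # Splitting-grid phase, using the loads computed in the last iteration above.
    maxgrid = max(grids[maxproc])
    room = avg - load[minproc]
    grids[maxproc].remove(maxgrid)
    if maxgrid <= room:
        grids[minproc].append(maxgrid)         # small enough to move whole
    else:
        grids[maxproc].append(maxgrid - room)  # remainder stays on Maxproc
        grids[minproc].append(room)            # piece of size Avgload - Minload
    return grids


# Hypothetical loads: processor 0 holds one large and one small grid.
grids = {0: [10.0, 2.0], 1: [1.0], 2: [2.0]}
dlb_pass(grids)
```

In this example the small grid is moved first, and the large grid, which cannot be moved whole, is then split so that the least-loaded processor is brought up to the average. A caller would repeat the pass until the imbalance ratio falls below the threshold.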
F. Adaptive workload balancing (AWLB) on heterogeneous resources
        One of the factors that determine the performance of parallel applications on
heterogeneous resources is the quality of the workload distribution, e.g. through
functional decomposition or domain decomposition. Optimal load distribution is
characterized by two things: (1) all processors have a workload proportional to their
computational capacity and (2) communications between the processors are minimized.
These goals are conflicting since the communication is minimized when all the workload
is processed by a single processor and no communication takes place, and distributing the
workload inevitably incurs communication overheads. Thus, it is necessary to find a
balance and define a metric [15] that characterizes the quality of workload distribution
for a parallel problem.
1. Benchmark the resources dynamically assigned to the parallel application; measure the
  resource characteristics that constitute the set of resource parameters µ (available
  processing power, memory, and link bandwidth).
2. Estimate the range of possible values of the application parameter fc. The minimal
  value is fmin=0, which corresponds to the case when no communications occur
  between the parallel processes of the application. The upper bound can be calculated
  based on the following reasoning: For the parallel processing to make sense, that is to
  ensure that running a parallel program on several processors is faster than sequential
  execution, the calculation time should exceed communication time. For homogeneous
  resources this can be expressed as follows:


3. Search through the range of possible values of fc in [0 . . . fc max] to find the optimal
  value fc* minimizing the application execution time. For each value of fc calculate the
  corresponding load distribution based on the resource parameters µ. With this
  distribution perform one time step, and measure the execution time, which is the target
  optimization function. Selection of the next value of fc can be done by any optimization
  method for unimodal smooth functions; for instance a simple line-search method can be
  used.
4. Execute further calculations using the discovered fc*.
5. In the case of dynamic resources where performance is influenced by other factors
  (which is generally the case on the Grid), a periodic re-estimation of the resource
  parameters µ and load redistribution shall be performed during run-time of the
  application. Re-balancing shall be invoked if the application performance over the last
  step drops by more than a certain user-defined threshold.
6. If the application is dynamically changing, then fc* must be periodically re-estimated on
  the same set of resources.
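
The search in step 3 can be carried out with any line-search method for unimodal functions. The sketch below uses golden-section search as one such method; this choice, the function name `golden_section_min`, and the synthetic timing model `step_time` are assumptions made for illustration. In a real run, each evaluation would compute the load distribution for the given fc, execute one time step, and return the measured execution time.

```python
import math


def golden_section_min(f, lo, hi, tol=1e-3):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    left = b - inv_phi * (b - a)
    right = a + inv_phi * (b - a)
    f_left, f_right = f(left), f(right)
    while b - a > tol:
        if f_left < f_right:
            # The minimum lies in [a, right]; reuse `left` as the new right probe.
            b, right, f_right = right, left, f_left
            left = b - inv_phi * (b - a)
            f_left = f(left)
        else:
            # The minimum lies in [left, b]; reuse `right` as the new left probe.
            a, left, f_left = left, right, f_right
            right = a + inv_phi * (b - a)
            f_right = f(right)
    return (a + b) / 2.0


def step_time(fc):
    """Synthetic stand-in for the measured time of one step (hypothetical model)."""
    return (fc - 0.3) ** 2 + 1.0


fc_max = 1.0  # assumed upper bound of the search range
fc_star = golden_section_min(step_time, 0.0, fc_max)
```

Golden-section search needs only one new measurement per iteration, which matters here because every evaluation costs a full time step of the application.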
G. The PATH Algorithm
The PATH algorithm is implemented in two steps:
          First Step: A simple single-packet algorithm (SMSP) is used to check the network
structure and to find the bottleneck link Lk. Compared with the standard single-packet
algorithm (SDSP) [12], the SMSP algorithm does not have to measure the bandwidth of each
link of the whole network.
          Second Step: A packet train with a header probe is used to measure the bandwidth
of the link Lk. The source sends out a header packet H and a packet train T1, T2, …, Tn.
Both the header and the packet train are UDP packets. All the packets Ti of the packet
train are of the same size. Sh, the size of the header packet H, is much larger than St,
the size of Ti. Each packet Ti contains only 8 bytes, used for identifying the packet.
          We denote the time-to-live (TTL) of a packet by tj if the packet expires after
reaching router Rj. The TTL of all the packet-train packets Ti is tj, so the Ti packets
stop at router Rj, which responds to the source with ICMP time-exceeded packets.
III. EVALUATION
          Efficient data structures used for adaptive refinement and tiling include trees of
grids with finer grids regarded as offspring of coarser ones. Within each grid, AVL tree
structures [3] permit easy insertion and deletion of elements as they migrate between
processors. Similar tree structures at inter-processor boundaries facilitate the transfer of
data between neighboring processors. Most previous work focuses on incorporating
environment information into preselected partitioning algorithms [6,7,10]. As an
alternative, such information could be used to select appropriate partitioning strategies.
In a hierarchical approach, work is first divided among the top-level nodes of a tree that
models the computing environment; the work assigned to these nodes is then recursively
partitioned among the nodes in their subtrees. Different partitioning methods can be used
in each level and subtree to produce effective partitions with respect to the network; for
example, graph or hypergraph partitioners could minimize communication between nodes
connected by slow networks while fast geometric partitioners operate within each node.
There are two strategies for repartitioning a dynamic graph. The first is simply to
partition the new graph from scratch. However, since no concern is given to the existing
partition, most vertices are not likely to be assigned to their initial partitions with this
method. Intelligent remapping of the resulting partition can reduce the required movement
of vertices, but vertex migration can still be quite high. The second strategy is to use
the existing partitioning as input for a repartitioning algorithm and to attempt to
minimize the difference between the original partition and the output partition. This
strategy can result in much smaller vertex migration compared to schemes that partition
the modified graph from scratch. The multilevel diffusion repartitioning algorithms [2]
follow this second strategy through their three phases of graph coarsening, multilevel
diffusion, and multilevel refinement. The DLB scheme [14] is not a scratch-remap scheme,
because it takes into consideration the previous load distribution during the current
redistribution process. It differs from a diffusion scheme in two ways. First, it addresses
the coarse granularity of SAMR applications: it splits large grids located on overloaded
processors if the movement of grids alone is not enough to handle the load imbalance.
Second, it moves data directly between overloaded and underloaded processors instead of
just between neighboring processors.
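
The remapping step of the scratch-based strategy can be illustrated with a short sketch: after partitioning from scratch, the new partition labels are renamed so that each new part lands on the processor that already holds most of its vertices. The greedy matching used below is one simple heuristic and an assumption of this example; the names `remap_labels` and `migration` are hypothetical.

```python
from collections import Counter


def remap_labels(old_part, new_part):
    """Relabel new_part greedily to maximize its overlap with old_part."""
    overlap = Counter(zip(new_part, old_part))
    mapping, used = {}, set()
    # Take the largest (new label, old label) overlaps first.
    for (n, o), _ in sorted(overlap.items(), key=lambda kv: -kv[1]):
        if n not in mapping and o not in used:
            mapping[n] = o
            used.add(o)
    # Any new label left unmatched receives one of the labels not yet used.
    spare = [l for l in sorted(set(old_part) | set(new_part)) if l not in used]
    for n in sorted(set(new_part)):
        if n not in mapping:
            mapping[n] = spare.pop(0)
    return [mapping[n] for n in new_part]


def migration(old_part, new_part):
    """Number of vertices whose owner changes between two partitions."""
    return sum(1 for o, n in zip(old_part, new_part) if o != n)


# Hypothetical case: the scratch partition is close to the old one but with swapped labels.
old = [0, 0, 0, 1, 1, 1]
scratch = [1, 1, 1, 1, 0, 0]
remapped = remap_labels(old, scratch)
```

In this example, relabeling alone reduces the number of migrating vertices from five to one without changing the shape of the new partition or its edge cut.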




IV. CONCLUSION
        In this paper we surveyed various adaptive techniques for balancing the load in a
global scale grid environment. With the DLB scheme, which comprises a moving-grid phase
and a splitting-grid phase, the total execution time of SAMR applications was reduced by
up to 47%, and the quality of load balancing was improved by more than a factor of two,
especially when the number of processors is larger than 16. For the multilevel diffusion
technique, results on a variety of synthetic and application meshes show that it is a
robust scheme for repartitioning a wide variety of adaptive meshes. For adaptive finite
element methods, data movement from an old decomposition to a new one can consume orders
of magnitude more time than the actual computation of a new decomposition; highly
incremental partitioning strategies that minimize data movement are therefore important
for high performance of adaptive simulations.
REFERENCES
[1] Characterizing the Performance of Dynamic Distribution and Load-Balancing
    Techniques for Adaptive Grid Hierarchies, Mausumi Shee, Samip Bhavsar, and
    Manish Parashar, Proceedings of the IASTED International Conference Parallel and
    Distributed Computing and Systems November 3-6, 1999 in Cambridge
    Massachusetts, USA.
[2] Multilevel Diffusion Schemes for Repartitioning of Adaptive Meshes, Kirk Schloegel,
    George Karypis, and Vipin Kumar, JOURNAL OF PARALLEL AND DISTRIBUTED
    COMPUTING 47, 109–124 (1997) ARTICLE NO. PC971410
[3] Parallel Adaptive hp-Refinement Techniques for Conservation Laws, Karen D.
    Devine and Joseph E. Flaherty, Applied Numerical Mathematics, 20 (1996) 367-386
    Sandia National Laboratories Tech. Rep. SAND95-1142J
[4] Adaptive Performance Modeling on Hierarchical Grid Computing Environments,
    Wahid Nasri, Luiz Angelo Steffenel and Denis Trystram, Laboratoire ID-IMAG,
    INPG, Grenoble, France, Author manuscript (2007)
[5] Object-Based Adaptive Load Balancing for MPI Programs, Milind Bhandarkar, L. V.
    Kalé, Eric de Sturler, and Jay Hoeflinger, Research funded by the U.S. Department of
    Energy through the University of California under Subcontract number B341494,
    October 6, 2000
[6] Parallel Dynamic Graph Partitioning for Adaptive Unstructured Meshes, C. Walshaw,
    M. Cross, and M. G. Everett, JOURNAL OF PARALLEL AND DISTRIBUTED
    COMPUTING 47, 102–108 (1997) ARTICLE NO. PC971407
[7] New Challenges in Dynamic Load Balancing, Karen D. Devine, Erik G. Boman,
    Robert T. Heaphy, Bruce A. Hendrickson, Sandia contract PO15162 and the
    Computer Science Research Institute at Sandia National Laboratories.
[8] H. Casanova, “Simgrid: A Toolkit for the Simulation of Application Scheduling,” in
    Proceedings of the IEEE International Symposium on Cluster Computing and the
    Grid (CCGrid’01), May 2001, pp. 430–437.
[9] G. Shao, Adaptive Scheduling of Master/Worker Applications on Distributed
    Computational Resources, Ph.D. thesis, University of California, San Diego, May
    2001.
[10] On Partitioning Dynamic Adaptive Grid Hierarchies, Manish Parashar and James
    C. Browne, Binary Black-Hole NSF Grand Challenge (NSF ACS/PHY 9318152),
    January 1996.
[11] Hash-Storage Techniques for Adaptive Multilevel Solvers and their Domain
    Decomposition Parallelization, Contemporary Mathematics volume 218, 1998.
[12] A. B. Downey, “Using Pathchar to Estimate Internet Link Characteristics” ACM
    SIGCOMM '99 Pages: 241-250.
[13] Adaptive Load Balancing for Divide-and-Conquer Grid Applications, Rob V. van
    Nieuwpoort, Jason Maassen, Gosia Wrzesińska, Thilo Kielmann, Henri E. Bal, 2004
    Kluwer Academic Publishers
[14] Dynamic Load Balancing for Structured Adaptive Mesh Refinement Applications,
    Zhiling Lan, Valerie E. Taylor, Greg Bryan, National Computational Science
    Alliance (ACI- 9619019)
[15] V.V. Korkhov, et al., A Grid-based Virtual Reactor: Parallel performance and
    adaptive load balancing, J. Parallel Distrib. Comput. (2007),
    doi: 10.1016/j.jpdc.2007.08.010



                                                 96

More Related Content

What's hot (15)

PDF
A Novel Algorithm for Watermarking and Image Encryption
cscpconf
 
PDF
Energy-aware VM Allocation on An Opportunistic Cloud Infrastructure
Mario Jose Villamizar Cano
 
PDF
Chapter on Book on Cloud Computing 96
Michele Cermele
 
PDF
Application Of Local Search Methods For Solving A Quadratic Assignment Probl...
ertekg
 
PDF
Effective Sparse Matrix Representation for the GPU Architectures
IJCSEA Journal
 
PDF
Inversions of MobileMT data and forward modelling
Expert Geophysics Limited
 
PPT
Poster 2D Thinning
RMwebsite
 
PDF
Energy and latency aware application
csandit
 
PDF
Investigation of SLF-EMF effects on Human Body using Computer Simulation Tech...
IRJET Journal
 
PDF
FPGA Implementation of 2-D DCT & DWT Engines for Vision Based Tracking of Dyn...
IJERA Editor
 
PDF
Application of diversity techniques for multi user idma communication system
Alexander Decker
 
PDF
Median based parallel steering kernel regression for image reconstruction
csandit
 
PDF
MEDIAN BASED PARALLEL STEERING KERNEL REGRESSION FOR IMAGE RECONSTRUCTION
csandit
 
PDF
GENERALIZED LEGENDRE POLYNOMIALS FOR SUPPORT VECTOR MACHINES (SVMS) CLASSIFIC...
IJNSA Journal
 
PDF
D04011824
IJMER
 
A Novel Algorithm for Watermarking and Image Encryption
cscpconf
 
Energy-aware VM Allocation on An Opportunistic Cloud Infrastructure
Mario Jose Villamizar Cano
 
Chapter on Book on Cloud Computing 96
Michele Cermele
 
Application Of Local Search Methods For Solving A Quadratic Assignment Probl...
ertekg
 
Effective Sparse Matrix Representation for the GPU Architectures
IJCSEA Journal
 
Inversions of MobileMT data and forward modelling
Expert Geophysics Limited
 
Poster 2D Thinning
RMwebsite
 
Energy and latency aware application
csandit
 
Investigation of SLF-EMF effects on Human Body using Computer Simulation Tech...
IRJET Journal
 
FPGA Implementation of 2-D DCT & DWT Engines for Vision Based Tracking of Dyn...
IJERA Editor
 
Application of diversity techniques for multi user idma communication system
Alexander Decker
 
Median based parallel steering kernel regression for image reconstruction
csandit
 
MEDIAN BASED PARALLEL STEERING KERNEL REGRESSION FOR IMAGE RECONSTRUCTION
csandit
 
GENERALIZED LEGENDRE POLYNOMIALS FOR SUPPORT VECTOR MACHINES (SVMS) CLASSIFIC...
IJNSA Journal
 
D04011824
IJMER
 




Adaptive load balancing techniques in global scale grid environment

International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367 (Print), ISSN 0976 – 6375 (Online), Volume 1, Number 2, Sept – Oct (2010), pp. 85-96, © IAEME, http://www.iaeme.com/ijcet.html

ADAPTIVE LOAD BALANCING TECHNIQUES IN GLOBAL SCALE GRID ENVIRONMENT

D. Asir, PG Scholar, Department of Computer Science and Engineering, Karunya University, Coimbatore. E-Mail: asird@karunya.edu.in

Shamila Ebenezer, Assistant Professor, Department of Computer Science and Engineering, Karunya University, Coimbatore. E-Mail: shamila_cse@karunya.edu

Daniel D., PG Scholar, Department of Computer Science and Engineering, Karunya University, Coimbatore. E-Mail: Daniel_joen@yahoo.com

ABSTRACT

Data partitioning and load balancing are important components of parallel computations. Many different partitioning strategies have been developed, with great effectiveness in parallel applications. But the load-balancing problem is not yet solved completely; new applications and architectures require new partitioning features. Increased use of heterogeneous computing architectures requires partitioners that account for non-uniform computing, network, and memory resources. This paper surveys different adaptive techniques that solve the load-balancing problem for partial differential equation systems.

Index Terms: Dynamic load balancing; Performance characterization; Adaptive mesh refinement.
I. INTRODUCTION

Adaptive load balancing lets a grid operate smoothly and scale reliably when facing spikes in data volumes or unexpected utilization loads. It also selects the best node for session execution based on resource requirements and availability. This survey covers the application-centric performance characterization of dynamic partitioning and load-balancing techniques for the distributed adaptive grid hierarchies that underlie parallel adaptive mesh refinement (AMR) techniques [1,14] for the solution of partial differential equations.

Early adaptive techniques of mesh motion (r-refinement) have been giving way to methods that combine mesh refinement/coarsening (h-refinement) with order variation (p-refinement) [3]. As advances in computer architecture enable the solution of complex three-dimensional problems, the efficiency, reliability, and robustness provided by adaptivity will make its use even more advantageous. Parallel computation will be essential in these simulations. Processor load balancing must be dynamic, since frequent adaptive enrichment will upset a balanced computation. Applications such as adaptive finite element methods have workloads that are unpredictable or change during the computation; such applications require dynamic load balancers that adjust the decomposition as the computation proceeds.

Numerous strategies for static and dynamic load balancing have been developed, including recursive bisection (RB) methods, space-filling curve (SFC) partitioning, and graph partitioning with multilevel and diffusive methods [7,10]. These methods provide effective partitioning for many applications, perhaps suggesting that the load-balancing problem is solved.

Efficient parallel execution of these irregular grid applications requires partitioning the associated graph into p parts with the following two objectives: (i) each partition has an equal amount of total vertex weight; (ii) the total weight of the edges cut by the partitions is minimized [2].

Simulation of three-dimensional flow with chemical reactions and plasma discharge in complex geometries is one of the most resource-demanding problems in computational science, requiring both high-performance and high-throughput computing. Grid computing technologies have opened up new opportunities to access virtually unlimited computational resources, and have inspired many researchers to develop new methodologies and algorithms for parallel distributed applications on the Grid.
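The two partitioning objectives stated in this section (equal vertex weight per partition and minimal weight of cut edges) can be measured directly for any candidate partition. The following is a minimal sketch, not taken from the surveyed papers; the function name `partition_quality` and the dictionary-based graph representation are assumptions made for illustration.

```python
def partition_quality(vertex_weight, edges, part, num_parts):
    """Return (imbalance, edge_cut) for a partition of a weighted graph.

    vertex_weight: dict mapping vertex -> weight
    edges:         list of (u, v, weight) tuples
    part:          dict mapping vertex -> partition id in range(num_parts)
    """
    # Objective (i): total vertex weight per partition.
    load = [0.0] * num_parts
    for v, w in vertex_weight.items():
        load[part[v]] += w
    average = sum(load) / num_parts
    imbalance = max(load) / average  # 1.0 means perfectly balanced

    # Objective (ii): total weight of edges whose endpoints are separated.
    edge_cut = sum(w for u, v, w in edges if part[u] != part[v])
    return imbalance, edge_cut


# A path graph a-b-c-d with unit weights, split two different ways.
weights = {"a": 1, "b": 1, "c": 1, "d": 1}
path_edges = [("a", "b", 1), ("b", "c", 1), ("c", "d", 1)]
contiguous = {"a": 0, "b": 0, "c": 1, "d": 1}
alternating = {"a": 0, "b": 1, "c": 0, "d": 1}
print(partition_quality(weights, path_edges, contiguous, 2))   # (1.0, 1)
print(partition_quality(weights, path_edges, alternating, 2))  # (1.0, 3)
```

Both partitions are perfectly balanced, but the contiguous one cuts a single edge while the alternating one cuts all three, which is why balance alone is not a sufficient criterion.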
II. ALB ALGORITHMS

A. Adaptive mesh-refinement algorithms (AMR)

1) Space-Filling Curves: Space-filling curves (SFC) [1] are a class of locality-preserving mappings from d-dimensional space to 1-dimensional space. The self-similar, recursive nature of these mappings can be exploited to represent a hierarchical structure and to maintain locality across different levels of the hierarchy. The SFC representation of the adaptive grid hierarchy is a 1-D ordered list of composite grid blocks, where each composite block represents a block of the entire grid hierarchy and may contain more than one grid level.

2) Independent Grid Distribution: This scheme distributes the grids independently across the processors. The distribution leads to balanced loads, and no redistribution is required when grids are created or deleted. However, in the adaptive grid hierarchy a fine grid typically corresponds to a small region of the underlying coarse grid. If both the fine and the coarse grid are distributed over the entire set of processors, all processors will communicate with the small set of processors holding the associated coarse grid region, causing a serialization bottleneck.

3) Combined Grid Distribution: This scheme distributes the total workload in the grid hierarchy by first forming a simple linear structure by abutting the grids at a level and then decomposing this structure into partitions of equal load. Regridding operations involving the creation or deletion of a grid are extremely expensive, as they require an almost complete redistribution of the grid hierarchy [4]. The combined grid decomposition also does not exploit the parallelism available within a level of the hierarchy.

4) Independent Level Distribution: Each level of the grid hierarchy is distributed by partitioning the combined load of all component grids at that level among the processors. This scheme overcomes some of the drawbacks of the independent grid distribution, and parallelism within a level of the hierarchy is exploited. Although the inter-grid communication bottleneck is reduced in this case, the required scatter communications can be expensive. Creation or deletion of component grids at any level requires a redistribution of the entire level.
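Several of the schemes above reduce to the same primitive: order the grid blocks along a line and cut that line into contiguous pieces of roughly equal load. The sketch below illustrates this under stated assumptions. It uses a Morton (Z-order) key as the space-filling curve, which is one simple choice and not necessarily the curve used in [1], and the helper names `morton_key` and `assign_by_prefix` are hypothetical.

```python
def morton_key(x, y, bits=16):
    """Interleave the bits of (x, y) to get a Z-order (Morton) index."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def assign_by_prefix(loads, num_procs):
    """Assign an ordered list of block loads to processors.

    Each block goes to the processor whose share of the total load
    contains the midpoint of the block, so every processor receives a
    contiguous run of blocks with roughly equal total load.
    """
    total = float(sum(loads))
    owners = []
    prefix = 0.0
    for load in loads:
        midpoint = prefix + load / 2.0
        owners.append(min(num_procs - 1, int(midpoint * num_procs / total)))
        prefix += load
    return owners


# Order four blocks of a 2 x 2 grid along the curve.
blocks = [(1, 1), (0, 0), (0, 1), (1, 0)]
ordered = sorted(blocks, key=lambda b: morton_key(*b))
print(ordered)  # [(0, 0), (1, 0), (0, 1), (1, 1)]

# Cut an ordered list of block loads into three equal-load pieces.
owners = assign_by_prefix([4, 1, 1, 2, 2, 2], 3)
print(owners)  # [0, 1, 1, 1, 2, 2]
```

Because the ownership follows the curve order, blocks that are close in space tend to land on the same processor, which is the locality property the SFC representation is meant to provide.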
5) Iterative Tree Balancing: A table is created from the grids at each time step, which keeps pointers to neighboring and parent grids. For every grid, immediate neighbors and children are considered along with the load distribution. Thus load balancing, inter-level communication, and intra-level communication are addressed together. This scheme is used for distributing finite-element meshes and is promising, as it deals with all the constraints to some extent.

6) Weighted Distribution: First, a weight is assigned to each of the overheads discussed above. This weight defines the significance and contribution of the overhead to the overall application performance. The next step uses these weights to compute the affinity of each component grid to the different processors. Initially, grids have no affinity for any processor.

B. Dynamic Load Balancing via Tiling

The tiling load-balancing system [3] is a modification of a global load-balancing technique that is applicable to a wide class of two-dimensional, uniform-grid applications. Global balance is achieved by performing local balancing within overlapping processor neighborhoods, where each processor is defined to be the center of a neighborhood. Local balance involves element migrations to processors in the same neighborhood that have elements sharing edges. The tiling system is required by the adaptive refinement algorithm. Because elemental workloads may vary due to refinement, the tiling algorithm must account for elemental workloads when performing local load balancing.

C. Multicriteria Geometric Partitioning

Crash simulations are "multiphase" applications consisting of two separate phases: computation of forces and contact detection. Obtaining a single decomposition that is good with respect to both phases would remove the need for communication between phases.
Each object would have multiple loads, corresponding to its workload in each phase. The challenge would be computing a single decomposition that is balanced with respect to all loads. Such a multicriteria partitioner could be used in other situations as well, such as balancing both computational work and memory usage. Most geometric partitioners reduce the partitioning problem [6] to a one-dimensional problem.

Multicriteria load balancing can be formulated as either a multiconstraint or a multiobjective problem. Often, the balance of each load is considered a constraint and has to satisfy a certain tolerance. Such a formulation fits the standard form where, in this case, there is no objective, only constraints. Unfortunately, there is no guarantee that a solution to this problem exists. In practice, we want a "best possible" decomposition [7], even if the desired balance criteria cannot be satisfied. Thus, an alternative is to make the constraints objectives; that is, we want to achieve as good a balance as possible with respect to all loads.

D. Repartitioning Algorithms Based on Multilevel Diffusion

The multilevel graph partitioning algorithm [2] implemented in METIS has three phases: a coarsening phase, a partitioning phase, and a refinement phase. During the coarsening phase, a sequence of smaller graphs is constructed from an input graph by collapsing vertices together. When enough vertices have been collapsed together so that the coarsest graph is sufficiently small, a k-way partition is found. Finally, the partition of the coarsest graph is projected back to the original graph by refining it at each uncoarsening level using a k-way partitioning refinement algorithm.

When the algorithm is used for repartitioning, only pairs of nodes that belong to the same partition are considered for merging in the coarsening phase. Hence, the initial partition of the coarsest-level graph is identical to the input partition of the graph that is being repartitioned and thus does not need to be computed. This makes the coarsening phase completely parallelizable, as coarsening is local to each processor. The uncoarsening phase of MLD contains two subphases: multilevel diffusion and multilevel refinement.

In the multilevel diffusion phase, balance is sought on the coarsest graph in a process similar to multilevel refinement. This is accomplished by forcing the migration of vertices out of overbalanced partitions.
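The idea of forcing vertices out of overbalanced partitions can be illustrated on a single graph level. The sketch below is a simplified greedy, single-level diffusion and not the multilevel scheme of [2]; the function name `diffuse`, the `tolerance` parameter, and the rule of always moving toward the lightest neighboring partition are assumptions made for illustration.

```python
def diffuse(vertex_weight, adjacency, part, num_parts, tolerance=1.05):
    """Move boundary vertices out of overloaded partitions (in place).

    vertex_weight: dict vertex -> positive weight
    adjacency:     dict vertex -> iterable of neighboring vertices
    part:          dict vertex -> partition id, updated in place
    """
    load = [0.0] * num_parts
    for v, w in vertex_weight.items():
        load[part[v]] += w
    limit = tolerance * sum(load) / num_parts

    while True:
        src = max(range(num_parts), key=lambda p: load[p])
        if load[src] <= limit:
            break  # the heaviest partition is within tolerance
        best = None
        for v, p in part.items():
            if p != src:
                continue
            w = vertex_weight[v]
            for u in adjacency[v]:
                dst = part[u]
                # Only accept moves that leave dst lighter than src was;
                # this strictly reduces imbalance, so the loop terminates.
                if dst != src and load[dst] + w < load[src]:
                    if best is None or load[dst] < load[best[1]]:
                        best = (v, dst)
        if best is None:
            break  # no boundary vertex can be moved usefully
        v, dst = best
        load[src] -= vertex_weight[v]
        load[dst] += vertex_weight[v]
        part[v] = dst
    return part


# Path graph 0-1-2-3-4-5 where partition 0 initially owns five vertices.
unit = {v: 1 for v in range(6)}
neighbors = {v: [u for u in (v - 1, v + 1) if 0 <= u < 6] for v in range(6)}
start = {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 1}
print(diffuse(unit, neighbors, start, 2))
# {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
```

Only vertices on the partition boundary migrate, so the existing partition is largely preserved, which is the property that distinguishes diffusion from partitioning from scratch.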
Figure 2.1 Multilevel diffusion repartitioning

Multilevel diffusion repartitioning algorithms are made up of three phases: graph coarsening, multilevel diffusion, and multilevel refinement. The coarsening phase results in a series of contracted graphs. The multilevel diffusion phase balances the graph using the very coarsest graphs. The multilevel refinement phase seeks to improve the edge-cut disturbed by the balancing process. Optionally, the multilevel diffusion can be guided by a diffusion solution. In [2], the multilevel undirected diffusion repartitioning algorithm is referred to as MLD and the multilevel directed diffusion repartitioning algorithm as MLDD. Single-level directed diffusion (SLDD) is used to provide a comparison with the multilevel diffusion schemes. In SLDD, diffusion and refinement are performed only on the original input graph, and thus no graph contraction is performed.

E. SAMR (Structured Adaptive Mesh Refinement)

The adaptive characteristics of SAMR applications [14] are analyzed from four aspects: granularity, dynamicity, imbalance, and dispersion.
1) Granularity: The basic entity for data movement is a grid. Each grid consists of a computational interior and a ghost zone. The computational interior is the region of interest that has been refined from the immediately coarser level; the ghost zone is the part added to the exterior of the computational interior in order to obtain boundary information. For the computational interior, there is a minimum number of cells, equal to the refinement ratio raised to the power of the number of dimensions.

2) Dynamicity: After each time step of every level, the adaptation process is invoked based on one or more refinement criteria defined at the beginning of the simulation. The local regions satisfying the criteria are refined. A high frequency of adaptation requires the underlying DLB method to execute very fast, as well as to maintain a high quality of load balancing.

3) Load Imbalance: The ideal balanced load is calculated. The standard deviation is small compared to the average load, which means that the average load reflects the entire load distribution.

4) Dispersion: The loads of a few processors increase dramatically, while most processors see little or no change. The processors can be grouped into four subgroups, each with similar characteristics, with the percentage of refinement ranging from zero to 86%. These calculations indicate that different datasets exhibit different load distributions, and the underlying DLB scheme should provide a high quality of load balancing for all of these datasets.

After taking the adaptive characteristics of SAMR applications into consideration, an improved DLB scheme has been developed [14]. DLB is composed of two steps: a moving-grid phase and a splitting-grid phase.
Moving Grid Phase:
Step 1: Set Moveflag and Splitflag to one, and Lastmin and Lastmax to zero.
Step 2: If Maxload/Avgload > threshold, the load is considered imbalanced.
Step 3: Maxproc then moves one of its grids to Minproc (selected using global information), provided the load of that grid is no more than (threshold * Avgload - Minload).
Step 4: This phase continues until all grids residing on Maxproc are too large to be moved.
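The moving-grid steps above can be sketched as a small loop. This is an illustration under stated assumptions rather than the implementation from [14]: each grid is reduced to a single scalar load, the flags of Step 1 are omitted, the largest movable grid is chosen at each step (the text does not say which grid is picked), and the name `moving_grid_phase` is hypothetical.

```python
def moving_grid_phase(grids, threshold=1.2):
    """Move whole grids from the most to the least loaded processor.

    grids[p] is the list of grid loads held by processor p; the lists
    are modified in place. Returns the number of grids moved.
    """
    nprocs = len(grids)
    moves = 0
    while True:
        loads = [sum(g) for g in grids]
        avgload = sum(loads) / nprocs
        maxproc = max(range(nprocs), key=loads.__getitem__)
        minproc = min(range(nprocs), key=loads.__getitem__)
        # Step 2: the load is imbalanced only if Maxload/Avgload > threshold.
        if avgload == 0 or loads[maxproc] / avgload <= threshold:
            break
        # Step 3: a grid may move only if it fits into Minproc's headroom.
        budget = threshold * avgload - loads[minproc]
        movable = [g for g in grids[maxproc] if 0 < g <= budget]
        # Step 4: stop when every grid on Maxproc is too large to move.
        if not movable:
            break
        grid = max(movable)  # assumption: move the largest grid that fits
        grids[maxproc].remove(grid)
        grids[minproc].append(grid)
        moves += 1
    return moves


layout = [[4, 3, 2, 1], [1], [1]]
moved = moving_grid_phase(layout)
print(moved, [sum(g) for g in layout])  # 3 [4, 4, 4]
```

In this example three moves are enough to balance the processors. When the remaining grids on Maxproc all exceed the headroom of Minproc, the loop exits with the imbalance unresolved, which is the situation the splitting-grid phase is meant to handle.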
Splitting Grid Phase:
Step 1: Maxproc finds its largest grid, Maxgrid.
Step 2: If the size of Maxgrid is no more than (Avgload - Minload), the grid is moved from Maxproc to Minproc.
Step 3: Otherwise, Maxproc splits the grid into two smaller grids.
Step 4: The piece whose size is around (Avgload - Minload) is redistributed to Minproc.

F. Adaptive workload balancing (AWLB) on heterogeneous resources

One of the factors that determine the performance of parallel applications on heterogeneous resources is the quality of the workload distribution, e.g. through functional decomposition or domain decomposition. Optimal load distribution is characterized by two things: (1) all processors have a workload proportional to their computational capacity, and (2) communication between the processors is minimized. These goals conflict: communication is minimized when the entire workload is processed by a single processor and no communication takes place, whereas distributing the workload inevitably incurs communication overheads. Thus, it is necessary to find a balance and to define a metric [15] that characterizes the quality of the workload distribution for a parallel problem. The procedure is as follows.

1. Benchmark the resources dynamically assigned to the parallel application; measure the resource characteristics that constitute the set of resource parameters µ (available processing power, memory, and link bandwidth).

2. Estimate the range of possible values of the application parameter fc. The minimal value is fmin = 0, which corresponds to the case when no communication occurs between the parallel processes of the application.
The upper bound can be calculated based on the following reasoning: for parallel processing to make sense, that is, to ensure that running a parallel program on several processors is faster than sequential execution, the calculation time should exceed the communication time. For homogeneous resources, this requirement gives the upper bound fc max used in the next step.

3. Search through the range of possible values of fc in [0 . . . fc max] to find the optimal value fc* that minimizes the application execution time. For each value of fc, calculate the corresponding load distribution based on the resource parameters µ. With this distribution, perform one time step and measure the execution time, which is the target optimization function. The next value of fc can be selected by any optimization method for unimodal smooth functions; for instance, a simple line-search method can be used.

4. Execute further calculations using the discovered fc*.

5. In the case of dynamic resources, where performance is influenced by other factors (which is generally the case on the Grid), periodically re-estimate the resource parameters µ and redistribute the load during the run time of the application. Rebalancing shall be invoked if the application performance over the last step drops by more than a certain user-defined threshold.

6. If the application is dynamically changing, then fc* must be periodically re-estimated on the same set of resources.

G. The PATH Algorithm

There are two steps to implement the PATH algorithm.

First Step: A simple single-packet algorithm (SMSP) is used to check the network structure and to find the bottleneck link Lk. Compared with the standard single-packet algorithm (SDSP) [12], the SMSP algorithm does not have to measure the bandwidth of each link of the whole network.

Second Step: A packet train with a header probe is used to measure the bandwidth of the link Lk. The source sends out a header packet H and a packet train T1, T2, …, Tn. Both the header and the packet train are UDP packets. All the packets Ti of the packet train are of the same size. Sh, the size of the header packet H, is much larger than St, the size of Ti. Each packet Ti contains only 8 bytes, used for identifying the packet. We denote the time-to-live (TTL) of a packet by tj if the packet expires after reaching router Rj. The TTL of all the packet-train packets Ti is tj.
So the Ti packets will stop at router Rj, and Rj responds to the source with ICMP time-exceeded packets.

III. EVALUATION

Efficient data structures used for adaptive refinement and tiling include trees of grids, with finer grids regarded as offspring of coarser ones. Within each grid, AVL tree structures [3] permit easy insertion and deletion of elements as they migrate between processors. Similar tree structures at inter-processor boundaries facilitate the transfer of data between neighboring processors.

Most previous work focuses on incorporating environment information into preselected partitioning algorithms [6,7,10]. As an alternative, such information could be used to select appropriate partitioning strategies. The work assigned to these nodes is then recursively partitioned among the nodes in their subtrees. Different partitioning methods can be used in each level and subtree to produce effective partitions with respect to the network; for example, graph or hypergraph partitioners could minimize communication between nodes connected by slow networks, while fast geometric partitioners operate within each node.

A repartitioning of a dynamic graph can be computed by simply partitioning the new graph from scratch. However, since no concern is given to the existing partition, most vertices are not likely to be assigned to their initial partitions with this method. Intelligent remapping of the resulting partition can reduce the required movement of vertices, but vertex migration can still be quite high. The second strategy is to use the existing partitioning as input for a repartitioning algorithm and to attempt to minimize the difference between the original partition and the output partition. This strategy can result in much smaller vertex migration compared to schemes that partition the modified graph from scratch. The multilevel diffusion repartitioning algorithms [2] are made up of three phases: graph coarsening, multilevel diffusion, and multilevel refinement. The coarsening phase results in a series of contracted graphs.
The multilevel diffusion phase balances the graph using the very coarsest graphs. The multilevel refinement phase [3] seeks to improve the edge-cut disturbed by the balancing process. Optionally, the multilevel diffusion can be guided by a diffusion solution.

DLB is not a scratch-remap scheme, because it takes the previous load distribution into consideration during the current redistribution process. The DLB scheme [14] also differs from a diffusion scheme in two ways. First, it addresses the coarse granularity of SAMR applications [14]: it splits large grids located on overloaded processors if moving whole grids is not enough to handle the load imbalance. Second, it moves data directly between overloaded and underloaded processors instead of only between neighboring processors.
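The splitting behaviour described above follows the splitting-grid steps of Section II.E and can be sketched in a few lines. This is an illustration only: grids are reduced to scalar loads, so a grid can be cut into a piece of exactly (Avgload - Minload), whereas a real grid can only be cut along cell boundaries and the piece is merely close to that size. The function name `splitting_grid_phase` is hypothetical.

```python
def splitting_grid_phase(grids, maxproc, minproc):
    """Move or split the largest grid on maxproc; return the load moved.

    grids[p] is the list of grid loads held by processor p and is
    modified in place. Returns None when nothing needs to move.
    """
    loads = [sum(g) for g in grids]
    avgload = sum(loads) / len(grids)
    room = avgload - loads[minproc]  # headroom on the least loaded processor
    if room <= 0 or not grids[maxproc]:
        return None

    # Step 1: Maxproc finds its largest grid.
    maxgrid = max(grids[maxproc])
    grids[maxproc].remove(maxgrid)

    # Step 2: a grid that fits into the headroom is moved whole.
    if maxgrid <= room:
        grids[minproc].append(maxgrid)
        return maxgrid

    # Steps 3 and 4: otherwise split it and send the piece of size
    # (Avgload - Minload) to Minproc, keeping the remainder on Maxproc.
    grids[maxproc].append(maxgrid - room)
    grids[minproc].append(room)
    return room


layout = [[10], [1], [1]]
sent = splitting_grid_phase(layout, maxproc=0, minproc=1)
print(sent, [sum(g) for g in layout])  # 3.0 [7.0, 4.0, 1]
```

A single call brings Minproc up to the average load; repeating the phase with the next least loaded processor would continue to drain the overloaded one.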
IV. CONCLUSION

In this paper we surveyed various adaptive techniques for balancing the load in a global-scale grid environment. With the DLB scheme, comprising the moving-grid phase and the splitting-grid phase, the total execution time of SAMR applications was reduced by up to 47%, and the quality of load balancing improved more than twofold, especially when the number of processors is larger than 16. For the multilevel diffusion technique, results on a variety of synthetic and application meshes show that it is a robust scheme for repartitioning a wide variety of adaptive meshes. For adaptive finite element methods, data movement from an old decomposition to a new one can consume orders of magnitude more time than the actual computation of a new decomposition; highly incremental partitioning strategies that minimize data movement are therefore important for high performance of adaptive simulations.

REFERENCES

[1] Mausumi Shee, Samip Bhavsar, and Manish Parashar, Characterizing the Performance of Dynamic Distribution and Load-Balancing Techniques for Adaptive Grid Hierarchies, Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Systems, November 3-6, 1999, Cambridge, Massachusetts, USA.

[2] Kirk Schloegl, George Karypis, and Vipin Kumar, Multilevel Diffusion Schemes for Repartitioning of Adaptive Meshes, Journal of Parallel and Distributed Computing 47, 109–124 (1997), Article No. PC971410.

[3] Karen D. Devine and Joseph E. Flaherty, Parallel Adaptive hp-Refinement Techniques for Conservation Laws, Applied Numerical Mathematics, 20 (1996) 367-386, Sandia National Laboratories Tech. Rep. SAND95-1142J.

[4] Wahid Nasri, Luiz Angelo Steffenel, and Denis Trystram, Adaptive Performance Modeling on Hierarchical Grid Computing Environments, Laboratoire ID-IMAG, INPG, Grenoble, France, author manuscript (2007).

[5] Milind Bhandarkar, L. V. Kalé, Eric de Sturler, and Jay Hoeflinger, Object-Based Adaptive Load Balancing for MPI Programs, research funded by the U.S. Department of Energy through the University of California under Subcontract number B341494, October 6, 2000.

[6] C. Walshaw, M. Cross, and M. G. Everett, Parallel Dynamic Graph Partitioning for Adaptive Unstructured Meshes, Journal of Parallel and Distributed Computing 47, 102–108 (1997), Article No. PC971407.

[7] Karen D. Devine, Erik G. Boman, Robert T. Heaphy, and Bruce A. Hendrickson, New Challenges in Dynamic Load Balancing, Sandia contract PO15162 and the Computer Science Research Institute at Sandia National Laboratories.

[8] H. Casanova, "Simgrid: A Toolkit for the Simulation of Application Scheduling," in Proceedings of the IEEE International Symposium on Cluster Computing and the Grid (CCGrid'01), May 2001, pp. 430–437.

[9] G. Shao, Adaptive Scheduling of Master/Worker Applications on Distributed Computational Resources, Ph.D. thesis, University of California, San Diego, May 2001.

[10] Manish Parashar and James C. Browne, On Partitioning Dynamic Adaptive Grid Hierarchies, Binary Black-Hole NSF Grand Challenge (NSF ACS/PHY 9318152), January 1996.

[11] Hash-Storage Techniques for Adaptive Multilevel Solvers and Their Domain Decomposition Parallelization, Contemporary Mathematics, volume 218, 1998.

[12] A. B. Downey, "Using Pathchar to Estimate Internet Link Characteristics," ACM SIGCOMM '99, pp. 241-250.

[13] Rob V. van Nieuwpoort, Jason Maassen, Gosia Wrzesińska, Thilo Kielmann, and Henri E. Bal, Adaptive Load Balancing for Divide-and-Conquer Grid Applications, Kluwer Academic Publishers, 2004.

[14] Zhiling Lan, Valerie E. Taylor, and Greg Bryan, Dynamic Load Balancing for Structured Adaptive Mesh Refinement Applications, National Computational Science Alliance (ACI-9619019).

[15] V. V. Korkhov, et al., A Grid-based Virtual Reactor: Parallel performance and adaptive load balancing, J. Parallel Distrib. Comput. (2007), doi: 10.1016/j.jpdc.2007.08.010.