Cluster Setup Manual
Using Ubuntu and MPICH
prepared by BSSE-0516
Institute of Information Technology,
University of Dhaka
Date of Submission
13 June 2015
Submitted to
Emon Kumar Dey
Course Instructor of SE-501
Lecturer
Institute of Information Technology
University of Dhaka
Submitted by
Md. Rakib Hossain
(BSSE 0516)
Submitted to
Amit Seal Ami
Lab Instructor of SE-501
Lecturer
Institute of Information Technology
University of Dhaka
TABLE OF CONTENTS
BACKGROUND
OBJECTIVES
    BROAD OBJECTIVE
    SPECIFIC OBJECTIVES
ORIGIN OF THE DOCUMENT
WHAT IS PARALLEL COMPUTING?
WHERE IS PARALLEL COMPUTING USED?
    SCIENCE AND ENGINEERING
    INDUSTRIAL AND COMMERCIAL
WHAT IS CLUSTER COMPUTING?
WHAT ARE MPI AND MPICH?
    MPI
    MPICH
MAKE A CLUSTER FOR PARALLEL COMPUTING
    WHAT ARE THE PREREQUISITES?
WHAT ARE THE REQUIRED PACKAGES NEEDED TO BE INSTALLED?
    INSTALLING MPICH
    INSTALLING SSH SERVER
    INSTALLING GCC
HOW TO SETUP THE CLUSTER ENVIRONMENT?
    DEFINING HOSTNAMES
    SETTING UP AUTHORIZED KEYS FOR PASSWORD-LESS COMMUNICATION BETWEEN NODES
    SETTING UP A MACHINE-FILE
    HOW TO WRITE A PROGRAM USING MPICH?
    HOW TO COMPILE AND EXECUTE THE PROGRAM USING MPICH?
        1) Using USB Flash Drive
        2) Using scp command
        3) Using sharing master folder
CONCLUSION
APPENDIX
    LIST OF COMMANDS USED IN THIS MANUAL
REFERENCES
Background
Traditionally, software has been written for serial computation. To solve a computing problem, an algorithm is formulated and implemented as a sequential stream of instructions. These instructions are executed on a single CPU in one computer. Only one instruction executes at a time; when it finishes, the next one is executed.
Many real-world computing problems require enormous amounts of calculation and concurrency, for example weather forecasting, planetary movements, and galaxy formation. Problems of this kind are so large or complex that it is impractical or impossible to solve them on a single computer using serial computation, especially given limited computer memory.
Objectives
Broad Objective:
The main objective of this document is to show how to set up a cluster of two computers running a Linux-based OS (Ubuntu 14.04) using MPICH.
Specific Objectives:
• To learn what parallel computing is and how it works
• To understand why parallel computing is needed and important
• To learn what cluster computing is and how it works
• To learn what MPI and MPICH are and how they work
• To learn how to make a cluster
• To learn how to write a first parallel program and execute it on the cluster
Origin of the Document
Our course tutor, Mr. Emon Kumar Dey, instructed us to submit a report as part of the course evaluation, so I prepared this document. By writing it I have learnt how to set up a cluster for parallel computing and how to write code that executes in parallel on a cluster. I am very thankful to our course tutor for giving us this opportunity.
What is Parallel Computing?
Parallel computing is the simultaneous use of multiple processing components to solve a problem. The problem is split into independent sections so that each processing component can execute its section of the algorithm simultaneously with the others. The processing components may be diverse and can include resources such as a single computer with multiple processors, several networked computers, specialized hardware, or any combination of these.
Figure 1: Parallel Computing
Where is Parallel Computing Used?
Science and Engineering:
• Historically, parallel computing has been considered to be "the high end of computing", and has been used to model difficult problems in many areas of science and engineering:
    o Atmosphere, Earth, Environment
    o Physics - applied, nuclear, particle, condensed matter, high pressure, fusion, photonics
    o Bioscience, Biotechnology, Genetics
    o Chemistry, Molecular Sciences
    o Geology, Seismology
    o Mechanical Engineering - from prosthetics to spacecraft
    o Electrical Engineering, Circuit Design, Microelectronics
    o Computer Science, Mathematics
    o Defense, Weapons
Industrial and Commercial:
• Today, commercial applications provide an equal or greater driving force in the development of faster computers. These applications require the processing of large amounts of data in sophisticated ways. For example:
    o Databases, data mining
    o Oil exploration
    o Web search engines, web based business services
    o Medical imaging and diagnosis
    o Pharmaceutical design
    o Financial and economic modeling
    o Management of national and multi-national corporations
    o Advanced graphics and virtual reality, particularly in the entertainment industry
    o Networked video and multi-media technologies
    o Collaborative work environments
What is Cluster Computing?
Cluster computing is a model of computing in which a collection of computers is interconnected so that they behave like a single system. The components of a cluster are normally linked to one another through fast local area networks, with each node running its own instance of an operating system. In most cases all the nodes use the same hardware and the same operating system, although in some configurations different hardware or operating systems may be used on each computer.
Clusters are generally deployed to improve performance and availability over that of a single computer, while usually being much more cost-effective than single computers of comparable speed or availability.
Cluster computing is used for parallel processing, load balancing, and similar tasks. Clustering is a popular strategy for implementing parallel programs because it lets a program run each independent component simultaneously on a different computer. Clusters can handle complex workloads by distributing the work across all connected computers. Clustering enhances the system's availability to users, its aggregate performance, and its overall tolerance to faults and component failures.
What are MPI and MPICH?
MPI: The Message Passing Interface (MPI) is a standardized and portable message-passing system designed by a group of researchers from academia and industry to work on a wide range of parallel computers. The goal of MPI is to establish a portable, efficient, and flexible standard for message passing that will be widely used for writing message-passing programs. Using this API, the computers connected in a cluster can communicate with each other through message passing. It enables them to send and receive messages and to synchronize with one another.
The standard defines the syntax and semantics of a core of library routines useful to a wide range of users writing portable message-passing programs. It specifies bindings for C and Fortran; other languages such as C++ and Java are typically supported through additional bindings.
MPICH: MPICH is a high-performance and widely portable implementation of the Message Passing Interface (MPI) standard.
Make a Cluster for Parallel Computing
What are the Prerequisites?
Here we build a cluster, then write a program and compile and execute it using MPICH.
The prerequisites are:
1. At least two computers with a Linux distribution installed (here we use Ubuntu 14.04 LTS). GCC must be installed on each system. Ubuntu normally comes with GCC, so for the time being we do not need to install it.
2. A network connection between the two computers, each with an IP address assigned.
Now let us assume that we have two computers fulfilling these prerequisites. We call each of these computers a node. Their host names and IP addresses are:
misubeimp 10.255.4.125
minhas-pc 10.255.4.98
What are the Required Packages needed to be installed?
Installing MPICH
As we are using MPICH2 as our message-passing system, we should install its packages on all the nodes. To install it, update the package index and install the mpich2 package with apt-get (Appendix, commands 1 and 2).
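The installation step can be sketched as a small shell script. The two apt-get commands are the ones listed in the Appendix; the DRY_RUN switch and the run helper are hypothetical additions for this example, so that by default the script only prints what it would execute. On newer Ubuntu releases the package may be named mpich rather than mpich2.

```shell
#!/bin/sh
# Install MPICH on one node; repeat on every node of the cluster.
# DRY_RUN and run() are hypothetical helpers for this sketch:
# with DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run sudo apt-get update            # refresh the package index
run sudo apt-get install mpich2    # package name used in this manual
```

Running the script with DRY_RUN=0 in the environment performs the actual installation.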
*** One important point must be noted here: the same version of MPICH has to be installed on all the nodes. We use MPICH2. To check the installed version, run the mpichversion command (Appendix, command 3).
We can test that the program did indeed install successfully, and see where it was installed, by running the which commands for mpiexec and mpirun on all nodes (Appendix, commands 4 and 5).
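The checks described above can be combined into one script that is safe to run on any node. This is a sketch rather than the author's exact procedure: it uses command -v in place of which, and the variable names are made up for the example.

```shell
#!/bin/sh
# Report whether the MPICH tools are on the PATH of this node.
# Missing tools are counted and reported instead of aborting the script.
missing=0
for tool in mpichversion mpiexec mpirun; do
    if tool_path="$(command -v "$tool")"; then
        echo "$tool found at $tool_path"
    else
        echo "$tool not found"
        missing=$((missing + 1))
    fi
done

# Print the version details so they can be compared across nodes.
if command -v mpichversion >/dev/null 2>&1; then
    mpichversion
fi
echo "missing tools: $missing"
```

Comparing the version output from each node by eye is enough for a two-node cluster.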
Installing SSH Server
MPICH communicates among the nodes, and distributes processes among them, through remote login. So we need to confirm that our nodes support remote login. Remote login can be performed in different ways, for example with telnet or OpenSSH. In our case we use OpenSSH, as it gives better security of data than telnet. To set it up, install the openssh-server package on all nodes (Appendix, command 6).
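A minimal sketch of this step, assuming the server binary is called sshd and lives either on the PATH or in /usr/sbin (the usual location on Ubuntu). The install command from the Appendix is only printed, never executed, so the script can be run without root privileges.

```shell
#!/bin/sh
# Check whether an SSH server binary is present on this node.
# The install command is printed, not executed, when it is missing.
if command -v sshd >/dev/null 2>&1 || [ -x /usr/sbin/sshd ]; then
    ssh_status="installed"
else
    ssh_status="missing"
    echo "Install it with: sudo apt-get install openssh-server"
fi
echo "SSH server: $ssh_status"
```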
Installing GCC
As said before, Ubuntu normally comes with GCC installed, but to check which version of GCC is installed we can run gcc --version (Appendix, command 7).
If GCC happens to be missing, it can easily be installed on all nodes through the build-essential package (Appendix, command 8).
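The same check-then-install pattern works for GCC. The sketch below prints the first line of the version banner when gcc is present and otherwise prints the install command from the Appendix; the variable name gcc_status is made up for the example.

```shell
#!/bin/sh
# Show the installed GCC version, or print how to install it.
if command -v gcc >/dev/null 2>&1; then
    gcc_status="installed"
    gcc --version | head -n 1
else
    gcc_status="missing"
    echo "Install it with: sudo apt-get install build-essential"
fi
echo "GCC: $gcc_status"
```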
How to Setup the Cluster Environment?
Defining hostnames:
The host names of both nodes must be defined in the hosts file of every node. To do that, open /etc/hosts in an editor with root privileges (Appendix, command 9).
Initially the hosts file looks like the first picture. Add one line per node, giving the node's IP address followed by its host name, as shown in the second picture.
After editing, our hosts file looks like the following picture.
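Since the pictures show only the final result, here is a sketch of the two entries, using the host names and IP addresses given in the prerequisites. The script writes them to a scratch file (hosts_snippet, a hypothetical name) for review and does not touch /etc/hosts itself.

```shell
#!/bin/sh
# Write the two host entries to a scratch file for review.
# hosts_snippet is a hypothetical temporary file; /etc/hosts is not modified.
hosts_snippet="$(mktemp)"
cat > "$hosts_snippet" <<'EOF'
10.255.4.125    misubeimp
10.255.4.98     minhas-pc
EOF

cat "$hosts_snippet"
# After reviewing, the lines could be appended on each node with, for example:
#   cat "$hosts_snippet" | sudo tee -a /etc/hosts
```

Both nodes need both lines, so that each node can resolve the other by name.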
Add a New User for running MPI programs:
Now we have to add a new user on every node for running MPI programs. The new user must have the same user name on every node, and it is convenient to give it the same password as well. First we create a directory named cluster in the root directory, then we add the new user, mpiuser, with that directory as its home. To do this, run the mkdir and adduser commands (Appendix, commands 10 and 11).
Because mpiuser does not exist yet, the system creates the user and asks for its details.
For the time being we just enter a password and leave everything else at its default.
The new user mpiuser has now been added. Next we change the ownership of the directory to mpiuser with chown (Appendix, command 12).
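The three commands of this section can be sketched together as follows. They are taken from the Appendix with the path spelled /cluster throughout; DRY_RUN and run are hypothetical helpers for this example, so that nothing is changed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Create the shared home directory and the mpiuser account on one node.
# DRY_RUN and run() are hypothetical helpers for this sketch:
# with DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run sudo mkdir /cluster                     # directory that becomes the home of mpiuser
run sudo adduser mpiuser --home /cluster    # prompts for a password and user details
run sudo chown mpiuser /cluster             # hand the directory over to mpiuser
```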
Setting up Authorized Keys for password-less communication between nodes:
After successfully adding the new user, we log in as that user.
Now we generate a new SSH key by executing the ssh-keygen command listed in the Appendix.
It will ask for a passphrase. Leave it blank, as we want password-less SSH (assuming that we have a trusted LAN with no security issues).
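Key generation can be tried out safely in a scratch directory first. Note the substitution: the manual generates a DSA key (ssh-keygen -t dsa), but recent OpenSSH releases disable DSA keys by default, so this sketch uses an RSA key instead. That choice, the keydir variable, and the non-interactive flags are assumptions of this example, not the author's procedure.

```shell
#!/bin/sh
# Generate a key pair with an empty passphrase into a scratch directory.
# keydir is a hypothetical location used so that ~/.ssh is left untouched;
# on a real node, omit -f to accept the default location under ~/.ssh.
keydir="$(mktemp -d)"
if command -v ssh-keygen >/dev/null 2>&1; then
    # -t rsa is substituted here for the manual's -t dsa
    # -N "" sets an empty passphrase, -q suppresses the progress output
    ssh-keygen -q -t rsa -N "" -f "$keydir/id_rsa"
    ls "$keydir"
else
    echo "ssh-keygen not found; install the OpenSSH client first"
fi
```

With an RSA key the public key file is named id_rsa.pub instead of the id_dsa.pub used in the rest of this manual.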
After executing the command, a hidden folder called .ssh is created in the home directory. This folder contains a file id_dsa.pub that holds your public key. This public key is used for encrypting messages. The distinguishing technique of public-key cryptography is the use of asymmetric key algorithms, where the key used to encrypt a message is not the same as the key used to decrypt it. Each user has a pair of cryptographic keys: a public encryption key and a private decryption key. The public key is widely distributed, while the private key is known only to its owner. Messages are encrypted with the recipient's public key and can only be decrypted with the corresponding private key. The keys are related mathematically, but the private key cannot feasibly be derived from the public key.
Now append this key to a file called authorized_keys in the same directory by executing the cat command in the terminal (Appendix, command 16).
The authorized_keys file then contains the key for one node, as shown in the following picture, which shows the authorized keys of mpiuser on misubeimp.
The mpiuser account on minhas-pc has its own authorized_keys file. We now have to make a common authorized_keys file so that both nodes contain the same keys in their authorized_keys files. This can be done by simply copying and pasting the keys with the nano editor.
After making the common file, it looks like the picture below.
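As an alternative to copying and pasting in nano, the common file can be built by concatenating the two public keys. The sketch below demonstrates the idea in a scratch directory; the key strings are placeholders rather than real keys, and the file names are made up for the example.

```shell
#!/bin/sh
# Build one authorized_keys file containing the public keys of both nodes.
# The key strings below are placeholders, not real keys.
workdir="$(mktemp -d)"
echo "ssh-dss AAAA...placeholder1 mpiuser@misubeimp" > "$workdir/misubeimp.pub"
echo "ssh-dss AAAA...placeholder2 mpiuser@minhas-pc" > "$workdir/minhas-pc.pub"

# Concatenate both public keys; sort -u drops accidental duplicates.
cat "$workdir/misubeimp.pub" "$workdir/minhas-pc.pub" | sort -u > "$workdir/authorized_keys"

# sshd may ignore an authorized_keys file that other users can write to.
chmod 600 "$workdir/authorized_keys"
cat "$workdir/authorized_keys"
```

On the real nodes the resulting file belongs in the .ssh folder of mpiuser on both machines.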
With the keys in place, the environment is set up. Now log out of the mpiuser account and restart the PC.
Setting up a machine-file:
Now we create a file called "machinefile" in the mpiuser home directory. Each line contains a node name followed by a colon and the number of processes we want to execute on that node.
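A machinefile for the two nodes of this manual might look like the one written by the sketch below. The process counts (4 per node) are assumed values chosen so that they add up to the 8 processes used later; the file is written to a scratch directory rather than to the mpiuser home.

```shell
#!/bin/sh
# Create a machinefile listing each node and its process count.
# The counts (4 per node) are assumed values for this example.
workdir="$(mktemp -d)"
machinefile="$workdir/machinefile"
cat > "$machinefile" <<'EOF'
misubeimp:4
minhas-pc:4
EOF

cat "$machinefile"
# On a real node this file would be saved as machinefile in the mpiuser home.
```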
How to Write a Program using MPICH?
Now we write our very first program using the MPICH conventions. A demo example is given below.
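The author's demo program is not reproduced in this text, so the following is a hypothetical stand-in: a minimal MPI program in C in which every process reports its rank and host. The file name I_am_alive.c comes from the Appendix; the program body and the workdir variable are assumptions of this sketch. The shell script only writes the source file.

```shell
#!/bin/sh
# Write a minimal MPI "hello" program to a scratch directory.
# The C source is a hypothetical stand-in for the manual's demo program.
workdir="$(mktemp -d)"
cat > "$workdir/I_am_alive.c" <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, name_len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                   /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);     /* id of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);     /* total number of processes */
    MPI_Get_processor_name(name, &name_len);  /* host the process runs on */

    printf("Process %d of %d on %s: I am alive\n", rank, size, name);

    MPI_Finalize();                           /* shut the MPI runtime down */
    return 0;
}
EOF
echo "Wrote $workdir/I_am_alive.c"
```

Every process runs the same code; only the rank returned by MPI_Comm_rank differs between them.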
How to Compile and Execute the Program Using MPICH?
To compile the above program with MPICH, we use the mpicc compiler wrapper (Appendix, command 19).
After compiling, we can execute the compiled file with MPICH on the local node without using the machine file, by launching it with mpiexec (Appendix, command 20).
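The compile and local run steps can be sketched as below, assuming the script is started from the directory that holds I_am_alive.c. Note that mpicc takes the source file first and the output name after -o. The guard and the status variable are additions for this example, so that the script reports the situation instead of failing when MPICH or the source file is absent.

```shell
#!/bin/sh
# Compile the program and run 8 processes on the local node only.
# Expects I_am_alive.c in the current directory.
src="I_am_alive.c"
if command -v mpicc >/dev/null 2>&1 && [ -f "$src" ]; then
    if mpicc "$src" -o I_am_alive && mpiexec -n 8 ./I_am_alive; then
        status="ok"
    else
        status="failed"
    fi
else
    echo "mpicc or $src not found; nothing compiled"
    status="skipped"
fi
echo "local run: $status"
```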
Our goal is to execute the compiled program on both nodes of the cluster, so the compiled file and the machine file must be present in the mpiuser home directory on both nodes. After that we can execute the program with the machine file from either node.
The executable file can be transferred to all nodes in different ways:
1. Using a USB flash drive
2. Using the scp command
3. Using a shared master folder
1) Using USB Flash Drive: When we transfer the executable file between nodes using a USB flash drive, we have to place the file at the same location in the mpiuser account on each node. We must also make sure that the file is executable, which can be done with chmod (Appendix, command 22).
2) Using scp command: We can copy the executable file from one node to the same location on another node with the scp command (Appendix, command 21).
3) Using sharing master folder: Here we first create a folder on all nodes and store our data and programs in it. We then share the contents of this folder on the master node with all the other nodes. We did not set up a master folder in our cluster; readers who want to know how to do so can refer to steps 2 and 3 of the MpichCluster guide listed in the References.
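For the scp option, a sketch of the transfer from misubeimp to minhas-pc is given below. The destination /cluster/ is assumed from the mpiuser home directory created earlier; DRY_RUN and run are hypothetical helpers for this example, so that by default the commands are only printed.

```shell
#!/bin/sh
# Copy the executable to the same location on the other node.
# DRY_RUN and run() are hypothetical helpers for this sketch:
# with DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run chmod +x I_am_alive                          # make sure the file is executable
run scp I_am_alive mpiuser@minhas-pc:/cluster/   # /cluster/ is the assumed mpiuser home
```

Because the keys of mpiuser are authorized on both nodes, the copy should not ask for a password when run as mpiuser.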
Now, to execute the program on multiple nodes, we launch it with mpiexec and the machine file (Appendix, command 23).
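The cluster run can be sketched as follows, assuming the script is started as mpiuser from the directory that holds both the executable and the machinefile. The -f option is how MPICH's mpiexec is given a host file; the guard and the launched variable are additions for this example.

```shell
#!/bin/sh
# Launch 8 processes across the nodes listed in the machinefile.
# Expects ./I_am_alive and machinefile in the current directory.
if command -v mpiexec >/dev/null 2>&1 && [ -x ./I_am_alive ] && [ -f machinefile ]; then
    launched="yes"
    mpiexec -n 8 -f machinefile ./I_am_alive
else
    launched="no"
    echo "mpiexec, ./I_am_alive or machinefile not found; nothing launched"
fi
echo "launched: $launched"
```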
The output then appears as shown in the following picture.
Conclusion
In this manual we first discussed parallel computing: how it works, in which situations it is needed, and why it is necessary for scientific research and industrial work. The main focus of the manual is how to set up a cluster using the Ubuntu 14.04 operating system and MPICH, and how to write a first parallel program and execute it on multiple PCs in parallel. The procedures for doing so are described step by step. We hope this manual will be helpful for understanding cluster computing and for setting up a first cluster.
Appendix
List of commands used in this manual:
1) misubeimp@misubeimp:~$ sudo apt-get update
2) misubeimp@misubeimp:~$ sudo apt-get install mpich2
3) misubeimp@misubeimp:~$ mpichversion
4) misubeimp@misubeimp:~$ which mpiexec
5) misubeimp@misubeimp:~$ which mpirun
6) misubeimp@misubeimp:~$ sudo apt-get install openssh-server
7) misubeimp@misubeimp:~$ gcc --version
8) misubeimp@misubeimp:~$ sudo apt-get install build-essential
9) misubeimp@misubeimp:~$ sudo gedit /etc/hosts
10) misubeimp@misubeimp:~$ sudo mkdir /cluster
11) misubeimp@misubeimp:~$ sudo adduser mpiuser --home /cluster
12) misubeimp@misubeimp:~$ sudo chown mpiuser /cluster
13) misubeimp@misubeimp:~$ su - mpiuser
14) mpiuser@misubeimp:~$ ssh-keygen -t dsa
15) mpiuser@misubeimp:~$ cd .ssh
16) mpiuser@misubeimp:~$ cat id_dsa.pub >> authorized_keys
17) mpiuser@misubeimp:~$ cat authorized_keys
18) mpiuser@misubeimp:~$ nano authorized_keys
19) misubeimp@misubeimp:~$ mpicc I_am_alive.c -o I_am_alive
20) misubeimp@misubeimp:~$ mpiexec -n 8 ./I_am_alive
21) misubeimp@misubeimp:~$ scp I_am_alive mpiuser@minhas-pc:/cluster/
22) misubeimp@misubeimp:~$ sudo chmod +x I_am_alive
23) misubeimp@misubeimp:~$ mpiexec -n 8 -f machinefile ./I_am_alive
References
https://help.ubuntu.com/community/MpichCluster
https://computing.llnl.gov/tutorials/parallel_comp/

More Related Content

Similar to Cluster Setup Manual Using Ubuntu and MPICH (20)

PPTX
Clustering by AKASHMSHAH
Akash M Shah
 
PDF
High–Performance Computing
BRAC University Computer Club
 
PPTX
54665962-Nav-Cluster-Computing.pptx
YashAhire28
 
PDF
HPC_Week_01.pdf
Engr. Mir Muhammad
 
PPTX
Cloud Computing-UNIT 1 claud computing basics
moeincanada007
 
PDF
Complete MPICH2 Clustering Manual in Ubuntu
Minhas Kamal
 
PPT
Cluster Computing
NIKHIL NAIR
 
PPTX
Seminar PPT on computer cluster by unknown.pptx
p4969246
 
PDF
Building SuperComputers @ Home
Abhishek Parolkar
 
PPTX
(19-23)CC Unit-1 ppt.pptx
NithishaYadavv
 
PDF
R21 Sasi Engineering College cloud-computing-notes.pdf
itmohan
 
PPTX
492 final presentation
MaheshWosti
 
PPT
CLUSTER COMPUTING
KITE www.kitecolleges.com
 
PPT
Chapter 1 pc
Hanif Durad
 
PPTX
Cluster computing
Adarsh110
 
PPTX
Cluster computing
Adarsh110
 
PPTX
Rgk cluster computing project
OstopD
 
PDF
lecture01_Introduction.pdf
MarlonMagtibay2
 
ODP
Distributed Computing
Sudarsun Santhiappan
 
PPT
Lecture 1
Mr SMAK
 
Clustering by AKASHMSHAH
Akash M Shah
 
High–Performance Computing
BRAC University Computer Club
 
54665962-Nav-Cluster-Computing.pptx
YashAhire28
 
HPC_Week_01.pdf
Engr. Mir Muhammad
 
Cloud Computing-UNIT 1 claud computing basics
moeincanada007
 
Complete MPICH2 Clustering Manual in Ubuntu
Minhas Kamal
 
Cluster Computing
NIKHIL NAIR
 
Seminar PPT on computer cluster by unknown.pptx
p4969246
 
Building SuperComputers @ Home
Abhishek Parolkar
 
(19-23)CC Unit-1 ppt.pptx
NithishaYadavv
 
R21 Sasi Engineering College cloud-computing-notes.pdf
itmohan
 
492 final presentation
MaheshWosti
 
CLUSTER COMPUTING
KITE www.kitecolleges.com
 
Chapter 1 pc
Hanif Durad
 
Cluster computing
Adarsh110
 
Cluster computing
Adarsh110
 
Rgk cluster computing project
OstopD
 
lecture01_Introduction.pdf
MarlonMagtibay2
 
Distributed Computing
Sudarsun Santhiappan
 
Lecture 1
Mr SMAK
 

Recently uploaded (20)

PPTX
Pyhton with Mysql to perform CRUD operations.pptx
Ramakrishna Reddy Bijjam
 
PPTX
Views on Education of Indian Thinkers Mahatma Gandhi.pptx
ShrutiMahanta1
 
PDF
People & Earth's Ecosystem -Lesson 2: People & Population
marvinnbustamante1
 
PPTX
Mathematics 5 - Time Measurement: Time Zone
menchreo
 
PPTX
How to Create a PDF Report in Odoo 18 - Odoo Slides
Celine George
 
PPTX
Quarter1-English3-W4-Identifying Elements of the Story
FLORRACHELSANTOS
 
PDF
community health nursing question paper 2.pdf
Prince kumar
 
PPT
Talk on Critical Theory, Part II, Philosophy of Social Sciences
Soraj Hongladarom
 
PPTX
THE TAME BIRD AND THE FREE BIRD.pptxxxxx
MarcChristianNicolas
 
PPTX
grade 5 lesson ENGLISH 5_Q1_PPT_WEEK3.pptx
SireQuinn
 
PPTX
Unit 2 COMMERCIAL BANKING, Corporate banking.pptx
AnubalaSuresh1
 
PPTX
How to Manage Large Scrollbar in Odoo 18 POS
Celine George
 
PPTX
BANDHA (BANDAGES) PPT.pptx ayurveda shalya tantra
rakhan78619
 
PPTX
ASRB NET 2023 PREVIOUS YEAR QUESTION PAPER GENETICS AND PLANT BREEDING BY SAT...
Krashi Coaching
 
PDF
Chapter-V-DED-Entrepreneurship: Institutions Facilitating Entrepreneurship
Dayanand Huded
 
PPTX
STAFF DEVELOPMENT AND WELFARE: MANAGEMENT
PRADEEP ABOTHU
 
PDF
0725.WHITEPAPER-UNIQUEWAYSOFPROTOTYPINGANDUXNOW.pdf
Thomas GIRARD, MA, CDP
 
PDF
SSHS-2025-PKLP_Quarter-1-Dr.-Kerby-Alvarez.pdf
AishahSangcopan1
 
PDF
The dynastic history of the Chahmana.pdf
PrachiSontakke5
 
PDF
The Constitution Review Committee (CRC) has released an updated schedule for ...
nservice241
 
Pyhton with Mysql to perform CRUD operations.pptx
Ramakrishna Reddy Bijjam
 
Views on Education of Indian Thinkers Mahatma Gandhi.pptx
ShrutiMahanta1
 
People & Earth's Ecosystem -Lesson 2: People & Population
marvinnbustamante1
 
Mathematics 5 - Time Measurement: Time Zone
menchreo
 
How to Create a PDF Report in Odoo 18 - Odoo Slides
Celine George
 
Quarter1-English3-W4-Identifying Elements of the Story
FLORRACHELSANTOS
 
community health nursing question paper 2.pdf
Prince kumar
 
Talk on Critical Theory, Part II, Philosophy of Social Sciences
Soraj Hongladarom
 
THE TAME BIRD AND THE FREE BIRD.pptxxxxx
MarcChristianNicolas
 
grade 5 lesson ENGLISH 5_Q1_PPT_WEEK3.pptx
SireQuinn
 
Unit 2 COMMERCIAL BANKING, Corporate banking.pptx
AnubalaSuresh1
 
How to Manage Large Scrollbar in Odoo 18 POS
Celine George
 
BANDHA (BANDAGES) PPT.pptx ayurveda shalya tantra
rakhan78619
 
ASRB NET 2023 PREVIOUS YEAR QUESTION PAPER GENETICS AND PLANT BREEDING BY SAT...
Krashi Coaching
 
Chapter-V-DED-Entrepreneurship: Institutions Facilitating Entrepreneurship
Dayanand Huded
 
STAFF DEVELOPMENT AND WELFARE: MANAGEMENT
PRADEEP ABOTHU
 
0725.WHITEPAPER-UNIQUEWAYSOFPROTOTYPINGANDUXNOW.pdf
Thomas GIRARD, MA, CDP
 
SSHS-2025-PKLP_Quarter-1-Dr.-Kerby-Alvarez.pdf
AishahSangcopan1
 
The dynastic history of the Chahmana.pdf
PrachiSontakke5
 
The Constitution Review Committee (CRC) has released an updated schedule for ...
nservice241
 
Ad

Cluster Setup Manual Using Ubuntu and MPICH

  • 1. 1 Cluster Setup Manual Using Ubuntu and MPICH prepared by BSSE-0516
  • 2. 2 Cluster Setup Manual Using Ubuntu and MPICH Institute of Information Technology, University of Dhaka Date of Submission 13 June 2015 Submitted to Emon Kumar Dey Course Instructor of SE-501 Lecturer Institute of Information Technology University of Dhaka Submitted by Md. Rakib Hossain (BSSE 0516) Submitted to Amit Seal Ami Lab Instructor of SE-501 Lecturer Institute of Information Technology University of Dhaka
  • 3. 3 TABLE OF CONTENTS BACKGROUND..............................................................................................5 OBJECTIVES .................................................................................................5 BOARD OBJECTIVE: ..................................................................................................................... 5 SPECIFIC OBJECTIVES: ................................................................................................................. 5 ORIGIN OF THE DOCUMENT............................................................................5 WHAT IS PARALLEL COMPUTING? ..................................................................6 WHERE USED PARALLEL COMPUTING?............................................................6 SCIENCE AND ENGINEERING: ....................................................................................................... 6 INDUSTRIAL AND COMMERCIAL:.................................................................................................. 7 WHAT IS CLUSTER COMPUTING? ....................................................................7 WHAT ARE MPI AND MPICH? ..........................................................................7 MPI: ............................................................................................................................................ 7 MPICH:....................................................................................................................................... 8 MAKE A CLUSTER FOR PARALLEL COMPUTING................................................8 WHAT ARE THE PREREQUISITES? ................................................................................................. 8 WHAT ARE THE REQUIRED PACKAGES NEEDED TO BE INSTALLED?...................9 INSTALLING MPICH.................................................................................................................... 
9 INSTALLING SSH SERVER.......................................................................................................... 10 INSTALLING GCC ...................................................................................................................... 10 HOW TO SETUP THE CLUSTER ENVIRONMENT? ..............................................11 DEFINING HOSTNAMES:.............................................................................................................. 11 SETTING UP AUTHORIZED KEYS FOR PASSWORD LESS COMMUNICATION BETWEEN NODES:....... 13 SETTING UP A MACHINE-FILE: .................................................................................................... 15 HOW TO WRITE A PROGRAM USING MPICH?............................................................................ 16 HOW TO COMPILE AND EXECUTE THE PROGRAM USING MPICH?............................................. 16 1) Using USB Flash Drive: ............................................................................................... 17 2) Using scp command:..................................................................................................... 17 3) Using sharing master folder:........................................................................................ 17
  • 4. 4 CONCLUSION..............................................................................................19 APPENDIX ..................................................................................................19 LIST OF COMMAND USED IN THIS MANUAL:................................................................................ 19 REFERENCES ..............................................................................................20
  • 5. 5 Background Generally, a software program has been developed for serial computation. In order to solve a computing problem, an algorithm is formulated and applied as a successive flow of instructions. All these instructions are usually executed on a CPU in one computer. Merely single instruction may execute at a time after that instruction is finished, the next instruction is executed. In our real world there are lots of computing problem that needs huge calculation and concurrency such as weather forecast, planetary movements, galaxy formulation etc. These types of problems are so large and/or complex that it is impractical or impossible to solve them on a single computer using serial computation especially given limited computer memory. Objectives Board Objective: The main objective of this document is to show how to setup a cluster using two computer having Linux based OS (Ubuntu 14.04) using MPICH. Specific Objectives:  To learn what is parallel computing and how it works.  Why parallel computing is needed and important.  What is cluster computing and how it works?  What are MPI and MPICH and how they work?  How to make a cluster  How to write the first parallel computing code and execute it on cluster Origin of the Document As per our course tutor Mr. Emon Kumar Dey instructed us to submit a report as part of the course evaluation, I prepare the paper. By writing the report I have learnt how to setup a cluster for parallel computing. I have also learnt how to write a code that may execute parallel on a cluster. So we are very thankful to our course tutor for giving us the opportunity.
  • 6. 6 What is Parallel Computing? Parallel computing is used for multiple processing components simultaneously to resolve a problem. This is carried out by splitting the problem into independent section in order that every single processing component can easily execute its section of the algorithm simultaneously with the others. The processing components may be diverse including resources like a single computer with multiple processors, several networked computers, specialized hardware, or any combination of the above Figure 1: Parallel Computing Where used Parallel Computing? Science and Engineering:  Historically, parallel computing has been considered to be "the high end of computing", and has been used to model difficult problems in many areas of science and engineering: o Atmosphere, Earth, Environment o Physics - applied, nuclear, particle, condensed matter, high pressure, fusion, photonics o Bioscience, Biotechnology, Genetics o Chemistry, Molecular Sciences o Geology, Seismology o Mechanical Engineering - from prosthetics to spacecraft o Electrical Engineering, Circuit Design, Microelectronics o Computer Science, Mathematics o Defense, Weapons
  • 7. 7 Industrial and Commercial:  Today, commercial applications provide an equal or greater driving force in the development of faster computers. These applications require the processing of large amounts of data in sophisticated ways. For example: o Databases, data mining o Oil exploration o Web search engines, web based business services o Medical imaging and diagnosis o Pharmaceutical design o Financial and economic modeling o Management of national and multi- national corporations o Advanced graphics and virtual reality, particularly in the entertainment industry o Networked video and multi-media technologies o Collaborative work environments What is Cluster Computing? Cluster computing is a model of computing where a collection of computers are interconnected with each other in order that they can behave like a single entity. The components of a cluster are normally linked to one another through fast local area networks, among each node running its own instance of an operating system. Generally in most situations, all the nodes use the same hardware and the same operating system, even though in a few configurations different operating systems may be used for each computer, as well as different hardware. Clusters are generally implemented to enhance performance and availability over compared to a single computer, while usually being much more cost-effective than single computers of comparable speed or availability. Cluster Computing is used for parallel processing, load balancing etc. Clustering is a popular strategy for implementing parallel processing programs as it enables the programs to run each independent component simultaneously in each computer. Clusters are able to executing multiple complex instructions by distributing workload throughout all connected computers. Clustering enhances the system's availability to users, its aggregate performance, and overall tolerance to faults and component failures. What are MPI and MPICH? 
MPI: Message Passing Interface (MPI) is a standardized and portable message-passing system developed by a group of researchers from academia and industry to work on a wide range
  • 8. 8 of parallel computers. The goal of the Message Passing Interface is to establish a portable, efficient, and flexible standard for message passing that will be widely used for writing message-passing programs. Using this API, the computers connected in a cluster can communicate with each other through message passing: they send and receive messages and use them to synchronize themselves. The standard defines the syntax and semantics of a core of library routines useful to a wide range of users writing portable message-passing programs. The current standard specifies language bindings for C and Fortran; programs in other languages such as C++ and Java typically use the C interface or separately provided bindings.
MPICH:
MPICH is a high-performance and widely portable implementation of the Message Passing Interface (MPI) standard.
Make a Cluster for Parallel Computing
What are the Prerequisites?
Here we build a cluster using MPI, then write a program and compile and execute it using MPICH. The prerequisites are:
1. At least two computers with a Linux distribution installed (here we use Ubuntu 14.04 LTS). We have to make sure that each system has GCC installed. Ubuntu normally ships with GCC, so for the time being we do not need to install it.
2. A network connection between the two computers, each of which must have an IP address assigned.
Now let us assume that we have two computers fulfilling these prerequisites. We call each of these computers a node. Their host names and IP addresses are:
misubeimp 10.255.4.125
minhas-pc 10.255.4.98
  • 9. 9 Which Packages Need to be Installed?
Installing MPICH
As we are using MPICH2 as our message-passing system, we must install its package on all the nodes. To install it, run the following commands:
misubeimp@misubeimp:~$ sudo apt-get update
misubeimp@misubeimp:~$ sudo apt-get install mpich2
*** Important: the same version of MPICH must be installed on all the nodes. We use MPICH2. To check the installed version, run the following command:
misubeimp@misubeimp:~$ mpichversion
To confirm where MPICH2 is installed, and so test that the installation did indeed succeed, enter the following commands on all nodes:
misubeimp@misubeimp:~$ which mpiexec
misubeimp@misubeimp:~$ which mpirun
  • 10. 10 Installing SSH Server
MPICH communicates among the nodes, and distributes processes among them, through remote login. So we need to confirm that our nodes support remote login. Remote login can be performed in different ways, for example with telnet or OpenSSH. In our case we use OpenSSH, as it gives better security of data than telnet. To install OpenSSH, run the following command on all nodes:
misubeimp@misubeimp:~$ sudo apt-get install openssh-server
Installing GCC
As said before, Ubuntu comes with GCC installed, but to check which version of GCC is installed we may execute the following command:
misubeimp@misubeimp:~$ gcc --version
If GCC happens to be missing, we can easily install it using the following command on all nodes:
misubeimp@misubeimp:~$ sudo apt-get install build-essential
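As a quick check after the installs above, a small script along the following lines could be run on each node to confirm that the tools are on the PATH. This is a minimal sketch and not part of the original manual: the helper name check_tool and the list of tools are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not from the manual): report whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# Tools the manual relies on; the selection is an assumption, adjust as needed.
for tool in mpiexec mpirun mpicc gcc ssh; do
    check_tool "$tool"
done
```

Any line starting with "missing:" points at a package that still has to be installed on that node.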
  • 11. 11 How to Set Up the Cluster Environment?
Defining hostnames:
We have to define the host names of both nodes in the hosts file of every node. To do that, open the hosts file for editing by executing the following command:
misubeimp@misubeimp:~$ sudo gedit /etc/hosts
At first the hosts file looks like the given picture. We have to add the IP addresses and host names of our two nodes, as shown in the picture below:
10.255.4.125 misubeimp
10.255.4.98 minhas-pc
After editing, our hosts file looks like the following picture.
Add a New User for running MPI programs:
Now we have to add a new user on every node for running MPI programs. The new user must have the same user name on every node. It is better to use the same password as well, for convenience. So first we make a directory under the root directory (/), then we add our new user with that directory as its home.
  • 12. 12 Our new user name is mpiuser and the directory name is cluster. To do this, run the following commands:
misubeimp@misubeimp:~$ sudo mkdir /cluster
misubeimp@misubeimp:~$ sudo adduser mpiuser --home /cluster
Because the user is created directly with the cluster directory as its home, the system prompts for the new user's details. For the time being we just enter a password and leave everything else at its default. Now mpiuser has been added, and we change the ownership of the directory to mpiuser. To do this, enter the following command:
misubeimp@misubeimp:~$ sudo chown mpiuser /cluster
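The host name entries described earlier can also be added from the shell rather than in an editor. The sketch below assumes the two nodes and addresses used in this manual. To stay harmless it writes to a scratch file named hosts.example (an assumed name) unless HOSTS_FILE is set; pointing it at /etc/hosts would require root privileges. The function name add_host is hypothetical.

```shell
#!/bin/sh
# Scratch copy by default; set HOSTS_FILE=/etc/hosts (as root) to edit the real file.
HOSTS_FILE="${HOSTS_FILE:-./hosts.example}"

# add_host <ip> <name>: append "ip name" unless the name is already listed.
add_host() {
    touch "$HOSTS_FILE"
    if ! grep -qwF -- "$2" "$HOSTS_FILE"; then
        printf '%s\t%s\n' "$1" "$2" >> "$HOSTS_FILE"
    fi
}

# Addresses and names of the two nodes used in this manual.
add_host 10.255.4.125 misubeimp
add_host 10.255.4.98  minhas-pc
cat "$HOSTS_FILE"
```

Because add_host skips names that are already present, the script can be run again on a node without creating duplicate entries.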
  • 13. 13 Setting up Authorized Keys for password-less communication between nodes:
After successfully adding the new user, we log in as the new user:
misubeimp@misubeimp:~$ su - mpiuser
Now we generate a new SSH key by executing the following command:
mpiuser@misubeimp:~$ ssh-keygen -t dsa
It will ask for a passphrase. Leave it blank, as we want password-less SSH (assuming that we have a trusted LAN with no security issues).
  • 14. 14 After executing the command, a folder called .ssh is created in the home directory. It is a hidden folder. This folder contains a file id_dsa.pub that holds your public key.
Public key cryptography is distinguished by its use of asymmetric key algorithms, where the key used to encrypt a message is not the same as the key used to decrypt it. Each user has a pair of cryptographic keys: a public encryption key and a private decryption key. The public key is widely distributed, while the private key is known only to its owner. Messages are encrypted with the recipient's public key and can only be decrypted with the corresponding private key. The keys are related mathematically, but the private key cannot feasibly be derived from the public key. In SSH the key pair is used to authenticate the user: a node that lists a public key in its authorized_keys file accepts logins from the holder of the matching private key.
Now copy this key to another file called authorized_keys in the same directory. Execute the following commands in the terminal:
mpiuser@misubeimp:~$ cd .ssh
mpiuser@misubeimp:~$ cat id_dsa.pub >> authorized_keys
The authorized_keys file now contains the key for one node. Displaying it shows the authorized key of mpiuser on misubeimp, as in the following picture:
mpiuser@misubeimp:~$ cat authorized_keys
There is another authorized_keys file for mpiuser on minhas-pc. Now we have to make a common authorized_keys file so that both nodes contain the same keys. We can do this simply by copying and pasting with the nano editor:
mpiuser@misubeimp:~$ nano authorized_keys
After making the common file, it looks like the picture below.
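Instead of copying and pasting in nano, the common authorized_keys file could be assembled with a few shell commands. The following is a minimal sketch under assumptions: the function name merge_keys and the file names in the usage line are hypothetical, and each node's public key file is assumed to have been copied to one place first.

```shell
#!/bin/sh
# merge_keys <output> <pubkey-file>...
# Combine several public key files into one authorized_keys-style file,
# dropping duplicate lines, and restrict its permissions to the owner.
merge_keys() {
    out=$1
    shift
    cat "$@" | sort -u > "$out.tmp" && mv "$out.tmp" "$out"
    chmod 600 "$out"
}

# Example usage (hypothetical file names):
# merge_keys ~/.ssh/authorized_keys misubeimp_id_dsa.pub minhas-pc_id_dsa.pub
```

The merged file then has to be placed in the .ssh folder of mpiuser on every node. Note that newer OpenSSH releases disable DSA keys by default, so on a more recent system a different key type such as RSA (ssh-keygen -t rsa) may be needed in place of the DSA key used in this manual.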
  • 15. 15 With the keys in place, the environment is set up successfully. At this point, log out from mpiuser and restart the PC.
Setting up a machine file:
Now we create a file called "machinefile" in the mpiuser home directory, containing each node name followed by a colon and the number of processes we want to execute on that node.
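For the two nodes of this manual, the machine file could be created as sketched below, run from the mpiuser home directory. The process counts (4 per node) are an assumption chosen for illustration; the text does not state them.

```shell
#!/bin/sh
# Write a machine file in the "hostname:processes" form described above.
# The counts of 4 are assumed values; use numbers that suit each node.
cat > machinefile <<'EOF'
misubeimp:4
minhas-pc:4
EOF

# Show the result for a visual check.
cat machinefile
```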
  • 16. 16 How to Write a Program using MPICH?
Now we write our very first program using the MPICH conventions. A demo example is given below.
How to Compile and Execute the Program Using MPICH?
To compile the above program using MPICH, execute the following command:
misubeimp@misubeimp:~$ mpicc I_am_alive.c -o I_am_alive
After compiling, we can execute the compiled file with MPICH on our local node without using the machine file. To do so, execute the following command:
misubeimp@misubeimp:~$ mpiexec -n 8 ./I_am_alive
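Since the demo listing is not reproduced in this text, the following is a generic MPI hello-world sketch rather than the author's I_am_alive.c. The file name hello_mpi.c and the message format are assumptions; the MPI calls used (MPI_Init, MPI_Comm_rank, MPI_Comm_size, MPI_Get_processor_name, MPI_Finalize) are standard. The script writes the C source and compiles it only when mpicc is available.

```shell
#!/bin/sh
# Write a generic MPI hello-world source file (hypothetical name hello_mpi.c).
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, name_len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                  /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* id of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes */
    MPI_Get_processor_name(name, &name_len); /* node this process runs on */

    printf("Process %d of %d is alive on %s\n", rank, size, name);

    MPI_Finalize();                          /* shut the runtime down */
    return 0;
}
EOF

# Compile only if the MPICH compiler wrapper is available on this machine.
if command -v mpicc >/dev/null 2>&1; then
    mpicc hello_mpi.c -o hello_mpi
    # Then run locally with, for example: mpiexec -n 4 ./hello_mpi
else
    echo "mpicc not found; install MPICH first"
fi
```

Each process prints one line, so a run with -n 4 should produce four lines, in no guaranteed order.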
  • 17. 17 Our goal is to execute the compiled program on both nodes of our cluster, so we have to make sure that the compiled file and the machine file are both present in the mpiuser home directory on each node. After that we can execute the program with the machine file from any node. We can transfer the executable file to all nodes in different ways:
1. Using a USB flash drive
2. Using the scp command
3. Using a shared master folder
1) Using a USB flash drive: When we transfer the executable file among the nodes with a USB flash drive, we have to place it in the same location in the mpiuser account on each node. We must also make sure that the file is executable. To do that, run the following command:
misubeimp@misubeimp:~$ sudo chmod +x I_am_alive
2) Using the scp command: We can transfer the executable file from one node to the same location on another node using scp. To do that, execute the following command:
misubeimp@misubeimp:~$ scp I_am_alive mpiuser@minhas-pc:~/
3) Using a shared master folder: Here we first make a folder on all nodes and store our data and programs in it. Then we share the contents of this folder on the master node with all the other nodes. We did not set up a master folder in our cluster; readers interested in how to make one can refer to points 2 and 3 of the following link: MpichCluster.
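With more than two nodes, typing one scp command per node becomes tedious. The sketch below reads the machine file and prints, without running it, the scp command for every node other than the local one. The function name plan_copy is hypothetical, and the remote account mpiuser with its home directory as the target is assumed from the setup in this manual.

```shell
#!/bin/sh
# plan_copy <machinefile> <executable> <local-hostname>
# Print (without running) one scp command per remote node in the machine file.
plan_copy() {
    while IFS=: read -r host slots; do
        [ -z "$host" ] && continue          # skip blank lines
        [ "$host" = "$3" ] && continue      # no need to copy to ourselves
        echo "scp $2 mpiuser@$host:~/"
    done < "$1"
}

# Example usage (assumes the machine file and executable from this manual):
# plan_copy machinefile I_am_alive "$(hostname)"
```

Printing the commands first makes it easy to review them; once password-less SSH works, the output can be piped to sh to perform the copies.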
  • 18. 18 Now, to execute the program on multiple nodes, execute the following command with the machine file:
misubeimp@misubeimp:~$ mpiexec -n 8 -f machinefile ./I_am_alive
It will show the following output.
  • 19. 19 Conclusion
In this manual we first discussed parallel computing: how it works, in which situations it is needed, and why it is necessary for scientific research and industrial work. The main focus of the manual is how to set up a cluster using the Ubuntu 14.04 operating system and MPICH, and how to write a first parallel program and execute it on multiple PCs in parallel. Step-by-step procedures for doing so are given in this manual. We hope it will be helpful for understanding cluster computing and for setting up a first cluster.
Appendix
List of commands used in this manual:
1) misubeimp@misubeimp:~$ sudo apt-get update
2) misubeimp@misubeimp:~$ sudo apt-get install mpich2
3) misubeimp@misubeimp:~$ mpichversion
4) misubeimp@misubeimp:~$ which mpiexec
5) misubeimp@misubeimp:~$ which mpirun
6) misubeimp@misubeimp:~$ sudo apt-get install openssh-server
7) misubeimp@misubeimp:~$ gcc --version
8) misubeimp@misubeimp:~$ sudo apt-get install build-essential
9) misubeimp@misubeimp:~$ sudo gedit /etc/hosts
10) misubeimp@misubeimp:~$ sudo mkdir /cluster
11) misubeimp@misubeimp:~$ sudo adduser mpiuser --home /cluster
12) misubeimp@misubeimp:~$ sudo chown mpiuser /cluster
13) misubeimp@misubeimp:~$ su - mpiuser
14) mpiuser@misubeimp:~$ ssh-keygen -t dsa
15) mpiuser@misubeimp:~$ cd .ssh
16) mpiuser@misubeimp:~$ cat id_dsa.pub >> authorized_keys
17) mpiuser@misubeimp:~$ cat authorized_keys
18) mpiuser@misubeimp:~$ nano authorized_keys
19) misubeimp@misubeimp:~$ mpicc I_am_alive.c -o I_am_alive
20) misubeimp@misubeimp:~$ mpiexec -n 8 ./I_am_alive
21) misubeimp@misubeimp:~$ scp I_am_alive mpiuser@minhas-pc:~/
22) misubeimp@misubeimp:~$ sudo chmod +x I_am_alive
23) misubeimp@misubeimp:~$ mpiexec -n 8 -f machinefile ./I_am_alive