Build Your Own Clustermatic Cluster
Installation Instructions for Fedora Core 4 and Clustermatic 5
You should have the following parts:
- 4 single-processor Athlon PCs
- 4 Intel network cards (already installed)
- 4 network cables
- 1 fast ethernet switch and its power adapter
- 1 3Com network card (already installed)
- 1 keyboard
- 1 mouse
- 1 monitor
- 1 power strip
- 5 power cables
- 4 Fedora Core 4 CD-ROMs
(discs 1, 2, 3 & 4, from fedora.redhat.com)
- 3 Clustermatic 5 CD-ROMs
- 1 TCB Cluster CD-ROM
(Sun Grid Engine packages, NAMD examples, and related files)
Part 1: Install Fedora Core 4 on the Master Node
If you've installed Fedora Core before, the following may be quite tedious.
If you've never installed Fedora Core before, the following may be quite
mysterious. It's a necessary evil in either case.
- Plug the monitor into the power strip and turn it on.
- Find the machine with two network cards; this is the master node.
- Plug the master node into the power strip and connect the monitor,
keyboard, and mouse.
- Power on the master node, open the CD-ROM drive, insert
Fedora Core disk 1, and press the reset button. The machine should
boot from the CD-ROM.
If you wait too long and your machine starts booting off of the hard
drive, just press the reset button to make it boot from the CD-ROM.
If your machine still insists on booting from the hard drive
you may need to modify its BIOS settings.
- When the Fedora Core screen comes up, hit enter.
If you don't have a mouse, it is suggested that you type linux
text at this point to do a text-based install. The process is very
similar to the graphical install.
- Skip testing the CD media.
This takes far too long and has no real benefit for fresh installs.
- Click Next to the "Welcome to Fedora Core Linux!" message.
- Select English as your installation language.
- Select a US model keyboard.
- If your mouse was not automatically detected,
select a generic 3-button PS/2 mouse.
- If the installer detects that another version of Linux is installed,
click Install Fedora Core and hit Next.
- Select a Workstation install.
We typically use Custom, but Workstation is good enough for now.
- Select Autopartition.
We don't store files on our cluster machines, even on the master nodes,
so it doesn't matter how the disk is set up. We use dedicated fileservers instead.
- Select "Remove all partitions on this system".
Again, we don't keep data on cluster machines.
- Yes, you really want to delete all existing data.
Of course, at home you might not want to do this.
- Click Next at the GRUB boot loader screen.
- At the network configuration screen set both cards to "Activate on
boot." Select device "Eth0" and click Edit. In the dialog that appears, uncheck
the Configure using DHCP checkbox and then enter 10.0.4.1 in the IP Address
field and 255.255.255.0 in the Netmask field. Click OK after you have made
these changes. This will be the interface to the private network.
- "Eth1" is for the outside network, and you should
input the IP Address given to you by your instructor. The
netmask should be 255.255.255.0. Select "OK" when done
with the interface.
- Enter these settings:
Primary DNS: 18.104.22.168
Secondary DNS: 22.214.171.124
Tertiary DNS: 126.96.36.199
and select "OK" to continue.
Note: these values are specific to our network. If you
want to set up your own cluster later on, you'll have to get
these addresses from your local sysadmin (which might be you).
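For reference, these choices end up in the interface configuration files
under /etc/sysconfig/network-scripts. A sketch of what ifcfg-eth0 should
contain after the install, given the settings above:
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.0.4.1
NETMASK=255.255.255.0
Checking these files is a good first step if an interface misbehaves later.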
- On the next screen, select "America/Chicago" as the
timezone, and change the network timeserver to the address
given by your instructor. Select "OK" to continue.
- Disable the firewall and SELinux.
In most cases your cluster will not be connecting to the outside
world, so it should be safe to disable the firewall if you trust your
network; if not, you'll need to enable it.
- Click Proceed past any warnings the installer raises at this point.
- The hardware clock should be set to GMT. Pick your time zone.
- Pick a root password that you will remember. Write it down.
- You don't need to customize the software selection or pick individual packages.
However, you may want to do this for a production system.
This is by far the easiest time to add packages to your cluster.
On the other hand, the default install has 2 GB of software,
so you could save some time in the next step if you pared the list down.
- Start the installation. It will take between 15 and 25 minutes to
install Fedora Core 4 and will prompt you as necessary for additional disks.
- Make a boot floppy.
Having a Linux boot floppy can be invaluable. A floppy made now
will be unable to load kernel modules once you install Clustermatic, but
it will still allow you to boot your machine and fix any misconfigurations.
You probably won't need it today, though.
- At this point your Fedora Core 4 box is installed. Reboot the system
when prompted to.
- After rebooting, the Welcome to Fedora Core 4 screen will pop up.
- You will need to Agree to the License Agreement before continuing.
- Verify that the computer is set to the correct time; if not, change it.
- At the Display Configuration screen you can either just click next,
or adjust the monitor configuration to "Generic LCD 1024x768". After you have
done this you can adjust the default screen resolution to 1024x768.
You would normally only use the console of a production cluster
during initial configuration or adding nodes, and you don't need a GUI
for either of those, so there is little reason to configure X-Windows.
Having multiple terminals available will be useful for this exercise,
so we'll go ahead and configure X-Windows anyway.
- Create a username and password for yourself.
- Click Next at the sound card configuration screen.
- Click Next at the Additional CDs screen.
- Click Next at the Finish Setup screen.
- Congratulations, you've installed Fedora Core 4.
Part 2: Install Clustermatic 5 on the Master Node
The following will be new to everyone. You'll need to know how to
use a unix text editor. The examples below use the mouse-driven editor
"gedit" rather than the more common "vi".
- Go to Desktop, System Settings,
Security Level. Enable the Firewall, and set eth0 to be a trusted device, or the slaves
won't be able to download the kernel.
- Open a terminal (right-click on the desktop, Open Terminal).
- Run gedit /etc/modules.conf and
swap eth0 and eth1 if necessary so that the network alias lines read:
alias eth0 eepro100
alias eth1 3c59x
The hands-on master nodes have two network cards, one from Intel and one
from 3Com. Editing /etc/modules.conf ensures that the Intel network card
is the private network (eth0) and the 3Com network card is the outside world
(eth1). If you had two identical network cards you would need to determine
which was which by trial and error. The hands-on slave nodes only have the
Intel card.
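To double-check the assignment after the next reboot, the kernel boot
messages record each interface as its driver initializes it:
dmesg | grep eth
Confirm that eth0 and eth1 are claimed by the drivers you expect; the
driver banners typically appear just above each interface's lines.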
- Insert the Clustermatic 5 CD and wait for it to appear on your desktop.
If you're not running X-Windows you need to mount it with mount /media/cdrom
- Install Clustermatic with rpm -ivh --force /media/cdrom/RPMS/i686/kernel*
Substitute the directory for your architecture if you are not using a 32-bit Intel or AMD processor. The --force option lets you install a kernel version older than the one currently running.
- Make an initrd image for this new kernel with /sbin/mkinitrd /boot/initrd-2.6.9-cm46 2.6.9-cm46
If you installed a different kernel in the previous step, adjust accordingly.
- gedit /boot/grub/grub.conf and add a new boot entry. Copy the title and root lines from the existing entry, then add:
kernel /vmlinuz-2.6.9-cm46 ro root=/dev/VolGroup00/LogVol00 rhgb
initrd /initrd-2.6.9-cm46
and edit the default line (a zero-based index into the list of kernels that follows) to point to the new entry you made.
Make sure that the root device is the same as for the old kernel. If you installed a different kernel in the previous step you should adjust the kernel and initrd image names appropriately.
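For orientation, here is a sketch of how grub.conf might look afterward,
assuming the stock FC4 kernel was the only entry before (your root device
and version strings may differ):
default=1
timeout=5
title Fedora Core (2.6.11-1.1369_FC4)
        root (hd0,0)
        kernel /vmlinuz-2.6.11-1.1369_FC4 ro root=/dev/VolGroup00/LogVol00 rhgb quiet
        initrd /initrd-2.6.11-1.1369_FC4.img
title Clustermatic (2.6.9-cm46)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-cm46 ro root=/dev/VolGroup00/LogVol00 rhgb
        initrd /initrd-2.6.9-cm46
Note that default=1 boots the Clustermatic entry because the index is zero-based.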
- Unmount the Clustermatic CD with eject and remove it.
- Copy the compat-libstdc++-33 package (required for NAMD) from
the workshop CD or website and install it with rpm -i compat-libstdc++*.
- Reboot the computer into the new kernel that you just installed.
- Login again and open up a terminal window.
- Insert the Clustermatic 5 CD again.
- Install the remaining packages with rpm -ivh /media/cdrom/RPMS/i586/beo*.rpm /media/cdrom/RPMS/i586/m*.rpm /media/cdrom/RPMS/i586/bp*.rpm
- gedit /etc/clustermatic/config and edit the following lines:
- Verify that your interface line reads:
interface eth0
- Change the nodes line to the number of slave nodes you will have:
nodes 3
- Change the iprange line to provide the corresponding number of addresses:
iprange 0 10.0.4.10 10.0.4.12
- Change the kernelimage line to point at the proper kernel:
kernelimage /boot/vmlinuz-2.6.9-cm46
If you installed a different kernel in previous steps, adjust accordingly.
- Add this line to the libraries section:
libraries /lib/libtermcap* /lib/libdl* /usr/lib/libz* /lib/libgcc_s*
This ensures that libraries needed by NAMD are available on the slave nodes.
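Taken together, the relevant lines of /etc/clustermatic/config should now
look something like this (other lines in the file stay as shipped):
interface eth0
nodes 3
iprange 0 10.0.4.10 10.0.4.12
kernelimage /boot/vmlinuz-2.6.9-cm46
libraries /lib/libtermcap* /lib/libdl* /usr/lib/libz* /lib/libgcc_s*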
- If you are going to want to share home directories to the slave nodes, then gedit /etc/clustermatic/config.boot to add the NFS support the slaves will need (see the Clustermatic 5 README for the exact lines),
also gedit /etc/clustermatic/fstab to add
MASTER:/home /home nfs defaults 0 0
and gedit /etc/exports to add (with a tab between /home and *)
/home	*(rw)
and finally run /sbin/chkconfig nfs on and /sbin/service nfs start
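To verify the export took effect, two commands from the standard NFS
utilities are handy:
/usr/sbin/exportfs
/usr/sbin/showmount -e localhost
Both should list /home; the slaves pick up the mount from
/etc/clustermatic/fstab the next time they boot.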
- You may remove the CD.
Part 3: Attach and Boot the Slave Nodes
There is more information about using Clustermatic at the end of this guide.
- Plug the private network switch into an outlet on the power strip.
- Connect the master node's Intel network card (this is the same type of
card that the slave nodes have, while the other card has a barely legible
"3Com" engraved on it) to the switch.
- Log in as root and open a terminal.
- Create the level 2 boot image with beoboot -2 -n
This builds the second stage boot image, which the slaves will
download from the master over the network. You only need to run this
command when you change the boot options in
/etc/clustermatic/config or /etc/clustermatic/config.boot.
- Start up Clustermatic services with /sbin/service clustermatic start
- Open a second terminal and run /usr/lib/beoboot/bin/nodeadd -a eth0
The nodeadd program will run until you kill it with Ctrl-C. Leave it running!
This process is only needed when adding new nodes to the cluster.
The nodeadd program captures the hardware ethernet address of any machine
trying to boot on the private network (eth0), adds it to the node list in
/etc/clustermatic/config, and makes the beoboot daemon read the new list (-a).
When a new node is detected nodeadd will print the hardware address followed
by a message about sending SIGHUP to beoserv.
- For each slave node, plug in its power cable and network cable.
- Power on each slave node and insert a Clustermatic 5 CD.
- Switch to the second terminal and kill nodeadd with Ctrl-C.
- Run tail /etc/clustermatic/config to see the new (uncommented) node addresses.
- Check the status of the cluster with bpstat
Make sure as many nodes are up as the number of slaves you have.
If you had not modified the nodes and iprange lines in /etc/clustermatic/config
to match the size of your cluster, you would see the extra nodes
harmlessly listed as down.
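To watch the slaves come and go in real time, you can ask bpstat to
continuously update its display (the -U flag from the appendix):
bpstat -U
Each slave should pass from down through boot to up; press Ctrl-C to get
your prompt back.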
- Examine the log file from node 0 with less /var/log/clustermatic/node.0
Each node has its own log file in /var/log/clustermatic.
These log files only contain output from the final stages of slave startup,
after the second stage kernel has contacted the master node.
- View the kernel messages from node 0 with bpsh 0 dmesg | less
The bpsh command allows any binary installed on the master node
to execute on one or more slave nodes (see options in the appendix).
Interpreted scripts or programs requiring files found only on the master node
cannot be run via bpsh.
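For example, the following runs uname -r on every node that is up and
prefixes each line of output with the node number it came from (both
options are listed in the appendix):
bpsh -a -p uname -r
Every node should report the same Clustermatic kernel version as the master.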
- Reboot all the slaves with bpctl --slave all --reboot
If you see any "Node is down" messages, these indicate that fewer
than the number of nodes given in /etc/clustermatic/config were up when you
issued the command.
- Log out.
Part 4: Installing Sun Grid Engine
- Log into the system and open a terminal.
- Begin the installation by adding a user to run SGE: adduser sgeadmin
- Change to the home directory of the sgeadmin user you just created. cd /home/sgeadmin
- Insert the TCB Cluster CD into the Master Node.
- Unpack the Common and Platform specific packages off the CD into the home directory of the sgeadmin user.
tar xzf /media/cdrom/sge-6.0u6-bin-lx24-x86.tar.gz
tar xzf /media/cdrom/sge-6.0u6-common.tar.gz
- Set your SGE_ROOT environment variable to the sgeadmin's home directory. export SGE_ROOT=/home/sgeadmin
- Run the setfileperm.sh script provided to fix file permissions. util/setfileperm.sh $SGE_ROOT
- Run gedit /etc/services and add the lines
sge_qmaster 6444/tcp
sge_execd 6445/tcp
in the appropriate place. Save the file. (6444 and 6445 are the registered
Grid Engine ports; any unused ports will do as long as every machine uses
the same values.)
- Run the QMaster installer: cd $SGE_ROOT and then run ./install_qmaster
- Hit Return at the introduction screen.
- When choosing the Grid Engine admin user account, hit y to specify a non-root user, and then enter sgeadmin as the user account. Hit Return to proceed.
- Verify that the Grid Engine root directory is set to /home/sgeadmin.
- Since we set the ports needed by sge_qmaster and sge_execd in a previous step, we should be able to hit Return through the next two prompts.
- Hit return to set the name of your cell to "default".
- Use the default install options for the spool directory configuration.
- We already ran the file permission script, so we can answer y and skip this step.
- Since we are only going to have one execution host (the cluster itself), we can say y to all hosts being in the same domain name.
- The install script will then create some directories.
- Use the default options for the Spooling/DB questions.
- When prompted for group id range, use the default range of 20000-20100 unless you have a reason to do otherwise.
- Use the default options for the spool directory.
- The next step asks you to input an email address for the user who should receive problem reports. Typically this will be the person responsible for maintaining the cluster, but for now enter root@localhost
- Verify that your configuration options are correct.
- Hit yes so that the qmaster will start up when the computer boots.
- The next step asks you to enter the names of your Execution Hosts (clusters). Say no to using a filename and then, when prompted for a host, enter hostname.
- The next thing that the configuration program will ask you to do is to select a scheduler profile. Normal will work for most situations, so that's what we'll use now.
- Our queue master is now installed. Run . /home/sgeadmin/default/common/settings.sh to set up some environment variables. Note that you should add this line to your login shell's startup script so you always have access to the grid engine utilities.
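One way to do that with the bash login shell Fedora uses by default (this
appends the line to your ~/.bashrc):
echo '. /home/sgeadmin/default/common/settings.sh' >> ~/.bashrc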
- Each cluster must also have the execution host software installed on it. In this case our only cluster is the one we've been setting up. Begin by running qconf -sh. If hostname is not listed, you will need to add it as an administrative host by running qconf -ah hostname. Similarly, add hostname as a submit host by running qconf -as hostname.
- Use . /home/sgeadmin/install_execd to start up the Execution host configuration script.
- Like the qmaster installation we can use all of the default options.
- After install_execd finishes running, use . /home/sgeadmin/default/common/settings.sh to set our environment variables accordingly.
- Congratulations, you now have a queuing system set up for your
cluster. Now to do some real work.
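As a first smoke test, you can submit a trivial job (do this as the regular
user you created earlier, after sourcing the settings.sh file above); the
hello.sh script here is just an illustration:
cat > hello.sh << 'EOF'
#!/bin/sh
#$ -cwd
echo "Hello from `hostname`"
EOF
qsub hello.sh
qstat
The #$ -cwd directive asks SGE to run the job in the submission directory;
once the job leaves the qstat listing, look for its output there in files
named hello.sh.o<jobid> and hello.sh.e<jobid>.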
Appendix: Usage Options for Common Bproc Utilities
bpstat: monitor status of slave nodes
Usage: bpstat [options] [nodes ...]
-h,--help Display this message and exit.
-v,--version Display version information and exit.
The nodes argument is a comma delimited list of the following:
Single node numbers - "4" means node number 4
Node ranges - "5-8" means node numbers 5,6,7,8
Node classes - "allX" means all slave nodes with status X
"all" means all slave nodes
More than one nodes argument can be given.
Valid node states are:
down boot error unavailable up
Node list display flags:
-c,--compact Print compacted listing of nodes. (default)
-l,--long Print long listing of nodes.
-a,--address Print node addresses.
-s,--status Print node status.
-n,--number Print node numbers.
-t,--total Print total number of nodes.
Node list sorting flags:
-R,--sort-reverse Reverse sort order.
-N,--sort-number Sort by node number.
-S,--sort-status Sort by node status.
-O,--keep-order Don't sort node list.
-U,--update Continuously update status
-L,--lock "locked" mode for running on an unattended terminal
-A hostname Print the node number that corresponds to a
host name or IP address.
-p Display process state.
-P Eat "ps" output and augment. (doesn't work well.)
bpctl: alter state of slave nodes
Usage: bpctl [options]
-h,--help Print this message and exit
-v,--version Print version information and exit
-M,--master Send a command to the master node
-S num,--slave num Send a command to slave node num
-s state,--state state Set the state of the node to state
-r dir,--chroot dir Cause slave daemon to chroot to dir
-R,--reboot Reboot the slave node
-H,--halt Halt the slave node
-P,--pwroff Power off the slave node
--cache-purge-fail Purge library cache fail list
--cache-purge Purge library cache
Reconnect to front end.
-m mode,--mode mode Set the permission bits of a node
-u user,--user user Set the user ID of a node
-g group,--group group Set the group ID of a node
-f Fast - do not wait for acknowledgement from
remote nodes when possible.
The valid node states are:
down boot error unavailable up
bpsh: run programs on slave nodes
Usage: bpsh [options] nodenumber command
bpsh -a [options] command
bpsh -A [options] command
-h Display this message and exit
-v Display version information and exit
Node selection options:
-a Run the command on all nodes which are up.
-A Run the command on all nodes which are not down.
IO forwarding options:
-n Redirect stdin from /dev/null
-N No IO forwarding
-L Line buffer output from remote nodes.
-p Prefix each line of output with the node number
it is from. (implies -L)
-s Show the output from each node sequentially.
-d Print a divider between the output from each
node. (implies -s)
-b ## Set IO buffer size to ## bytes. This affects the
maximum line length for line buffered IO. (default=4096)
Redirect standard in from file on the remote node.
Redirect standard out to file on the remote node.
Redirect standard error to file on the remote node.
bpcp: copy files to slave nodes
Usage: bpcp [-p] f1 f2
bpcp [-r] [-p] f1 ... fn directory
-h Display this message and exit.
-v Display version information and exit.
-p Preserve file timestamps.
-r Copy recursively.
Paths on slave nodes are prefixed by nodenumber:, e.g., 0:/tmp/
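For example, to copy the master's /etc/hosts into /tmp on node 0 using that
prefix syntax:
bpcp /etc/hosts 0:/tmp/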
Clustermatic web site (http://www.clustermatic.org)
and Clustermatic 5 README
BProc: Beowulf Distributed Process Space web site (http://bproc.sourceforge.net)