FAI Guide (Fully Automatic Installation)
Chapter 7 How to build a Beowulf cluster using FAI


This chapter describes the particulars of building a Beowulf cluster using Debian GNU/Linux and FAI. For more information about the Beowulf concept, see http://www.beowulf.org.


7.1 Planning the Beowulf setup

The example Beowulf cluster consists of one master node and 25 clients. All cases were put into one big rack, together with a keyboard and a monitor, which are connected to the master server most of the time. Since we have very long cables for the monitor and keyboard, they can also be connected to any node when something has to be changed in the BIOS or when looking for errors on a node that does not boot. Power supply is another topic you have to think about. Don't connect many nodes to one power cord and one outlet; distribute them among several breakout boxes and outlets. And what about heat? A dozen nodes in a small room can produce too much heat, so you will need air conditioning. After a power failure, will the power supply of each node go into standby mode, or will all nodes be turned on simultaneously?

All computers are connected to a Fast Ethernet switch. The master node (or master server) is called nucleus. It has two network cards: one for the connection to the external Internet and one for the connection to the internal cluster network. From the external Internet it is reached as nucleus, but the cluster nodes access the master node under the name atom00, which is the name of the second network interface. The master server is also the install server for the computing nodes. A local Debian mirror will be installed on its local hard disk. The home directories of all user accounts are also located on the master server and will be exported via NFS to all computing nodes. NIS will be used to distribute account, host and printer information to all nodes.

All client nodes atom01 to atom25 are connected via the switch to the second network card of the master node. They can only connect to the other nodes or the master, and can't communicate with any host outside their cluster network. So, all services (NTP, DNS, NIS, ...) must be available on the master server. I chose the class C network address 192.168.42.0 for building the local Beowulf cluster network. You can replace the subnet 42 with any other number you like. If you have more than 253 computing nodes, choose a class A network address (10.X.X.X).

While preparing the installation, you will have to boot the first install client many times, until there are no more faults in your configuration scripts. Therefore you should have physical access to the master server and one client node. If you have little space, connect both computers to a switch box, so that one keyboard and monitor can be shared between them.


7.2 Set up the master server

The master server will be installed by hand, because it could be your first computer installed with Debian. If you already have a Debian host running, you can also install it via FAI. Create a partition mounted on /files/scratch for the local Debian mirror with more than 3 GB of space available.


7.2.1 Set up the network

Add the following lines for the second network card to /etc/network/interfaces:

     # Beowulf cluster connection
     iface eth1 inet static
     address 192.168.42.250
     netmask 255.255.255.0
     broadcast 192.168.42.255

Add the IP addresses of the client nodes to /etc/hosts. The FAI package includes an example of this file:

     # Beowulf nodes
     # atom00 is the master server
     192.168.42.250 atom00
     192.168.42.1 atom01
     192.168.42.2 atom02
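
Typing 25 nearly identical lines by hand is error-prone, so the entries can also be generated. The following is a minimal sketch, assuming the addressing scheme of this chapter (node atomNN gets the address 192.168.42.NN, the master is 192.168.42.250). The function name gen_hosts is made up for this example and is not part of FAI.

```shell
#!/bin/sh
# Print /etc/hosts lines for the master (atom00) and the first N client nodes.
# gen_hosts is a hypothetical helper, not a tool shipped with FAI.
gen_hosts() {
    count=$1
    echo "# Beowulf nodes"
    echo "# atom00 is the master server"
    echo "192.168.42.250 atom00"
    i=1
    while [ "$i" -le "$count" ]; do
        # %02d pads the node number to two digits: 1 -> atom01
        printf '192.168.42.%d atom%02d\n' "$i" "$i"
        i=$((i + 1))
    done
}

# Print the entries for the master and 25 clients
gen_hosts 25
```

Review the output before appending it to /etc/hosts, for example with gen_hosts 25 >> /etc/hosts as root.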

You can give the internal Beowulf network a name by adding this line to /etc/networks:

     beowcluster 192.168.42.0

Activate the second network interface with: /etc/init.d/networking start.


7.2.2 Setting up NIS

Add a normal user account tom for the person who edits the configuration space and manages the local Debian mirror:

     # adduser tom
     # addgroup linuxadmin

This user should also be in the group linuxadmin. So, add a line to /etc/group:

     linuxadmin:x:101:tom

To initialize the master server as a NIS server, call ypinit -m. Then copy the file netgroup from the examples directory to /etc and edit the other files there as needed. Finally, adjust access to the NIS service:

     # cat /etc/ypserv.securenets
     # Always allow access for localhost
     255.0.0.0       127.0.0.0
     # This line gives access to the Beowulf cluster
     255.255.255.0 192.168.42.0
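
Each line in this file is a netmask followed by a network address, and a client is allowed when its address, masked with the netmask, equals the network. The sketch below only illustrates that rule so you can check an entry before relying on it. The helper names ip_to_int and securenets_allows are made up for this example and are not part of the NIS package.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address into a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if the address $3 matches the "netmask network" pair $1 $2.
# Hypothetical helper that mirrors the matching rule described above.
securenets_allows() {
    mask=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    addr=$(ip_to_int "$3")
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# A cluster node is inside the allowed network, a host in 192.168.43.0 is not.
securenets_allows 255.255.255.0 192.168.42.0 192.168.42.7 && echo "192.168.42.7 allowed"
securenets_allows 255.255.255.0 192.168.42.0 192.168.43.7 || echo "192.168.43.7 denied"
```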

Rebuild the NIS maps:

     # cd /var/yp; make


7.2.3 Create a local Debian mirror

Now the user tom can create a local Debian mirror in /files/scratch/ using mkdebmirror. This will need about 2.6 GB of disk space for Debian 2.2 (aka potato). Export this directory read-only to the netgroup @faiclients.
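
As a rough sketch of the export step: an /etc/exports line names a directory, then the clients and their options, and a netgroup is written with a leading @. The helper exports_line below is made up for this example. The path /files/scratch is taken from the text above, so adjust it if your mirror lives in a subdirectory.

```shell
#!/bin/sh
# Print an /etc/exports line that exports a directory read-only to a netgroup.
# $1 = directory, $2 = netgroup name (without the leading @).
# exports_line is a hypothetical helper, not part of FAI or the NFS tools.
exports_line() {
    printf '%s @%s(ro)\n' "$1" "$2"
}

# Show the line for the mirror directory and the faiclients netgroup
exports_line /files/scratch faiclients
```

After adding the printed line to /etc/exports, the NFS server has to reread its export table, for example by restarting it.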


7.2.4 Install FAI package on the master server

Install the following packages on the install server:

     nucleus:/# apt-get install task-dns-server ntp tftp bootp nfs-kernel-server fai fai-kernels

Configure NTP so that the master server and all client nodes will have the same correct system time.

It's very important to use the internal network name atom00 for the master server (not the external name nucleus) in /etc/bootptab and /etc/fai.conf. Replace the string FAISERVER with atom00 in /etc/bootptab and uncomment the following line in /etc/fai.conf, so that the Beowulf nodes can use this name to connect to their master server.

     NFSROOT_ETC_HOSTS="192.168.42.250 atom00"

/etc/bootptab:

     .
     .
     .failocal:\
             :tc=.faiglobal:\
             :sa=atom00:\
             :ts=atom00:\
             :T170="atom00:/usr/local/share/fai":\
             :T171="sysinfo":\
             :T172="verbose createvt sshd":\
             :sm=255.255.255.0:\
             :gw=192.168.42.250:\
             :dn=beowulf.debian.org:\
             :ds=192.168.42.250:\
             :ys=atom00:yd=nisnucleus:\
             :nt=atom00:
     .
     .
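
Besides the template, bootptab needs one entry per install client. The following sketch prints such entries. It assumes the usual bootptab(5) tags ha (hardware address), bf (boot file) and tc (template to inherit from), and it assumes that the boot file is named after the host. The helper bootptab_entry and the MAC address are made up for this example, so compare the result with the example bootptab shipped in the FAI package before using it.

```shell
#!/bin/sh
# Print one bootptab host entry that inherits from the .failocal template.
# $1 = host name, $2 = MAC address in aa:bb:cc:dd:ee:ff form.
# bootptab_entry is a hypothetical helper, not part of FAI or bootpd.
bootptab_entry() {
    # bootptab expects the hardware address as plain hex digits
    mac=$(printf '%s' "$2" | tr -d ':' | tr 'a-f' 'A-F')
    printf '%s:ha=0x%s:bf=%s:tc=.failocal:\n' "$1" "$mac" "$1"
}

# The MAC address below is invented for illustration only
bootptab_entry atom01 00:10:5a:24:00:12
```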


7.2.5 Prepare network booting

Uncomment the following line in /etc/inetd.conf:

     #bootps dgram udp wait root /usr/sbin/bootpd bootpd -i -t 120

and restart the inetd daemon. The user tom should have permission to create the symlinks for booting via network card, so change the group of /boot/fai and install some utilities:

     # chgrp -R linuxadmin /boot/fai; chmod -R g+rwx /boot/fai
     # cp /usr/share/fai/utils/* /usr/local/bin

Now, the user tom can create a symlink in /boot/fai using

     >tlink atom_install atom01

to boot the first client node for the first time. Then start to adjust the configuration for your client nodes. Don't forget to build the kernel for the cluster nodes using make-kpkg(8).
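
Once the first node installs cleanly, every other node needs the same link. The sketch below shows the idea behind the symlinks with plain ln, so it can be tried in a scratch directory first. It is not the source of tlink. The function names link_boot_image and link_all_nodes are made up for this example, and the argument order (image first, host second) is only inferred from the tlink call above.

```shell
#!/bin/sh
# Make <dir>/<host> a symlink to <image> in the same directory.
# $1 = boot directory, $2 = kernel image name, $3 = host name.
link_boot_image() {
    ( cd "$1" && ln -sf "$2" "$3" )
}

# Create the links for atom01 .. atomNN.
# $1 = boot directory, $2 = kernel image name, $3 = number of nodes.
link_all_nodes() {
    i=1
    while [ "$i" -le "$3" ]; do
        link_boot_image "$1" "$2" "$(printf 'atom%02d' "$i")"
        i=$((i + 1))
    done
}

# Example call, as a user who may write to /boot/fai:
#   link_all_nodes /boot/fai atom_install 25
```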


7.3 Tools for Beowulf cluster

The following tools for a Beowulf cluster are now available in /usr/local/bin:

tlink
Change the symbolic link that points to the kernel image for booting from the network card.

all_hosts
Print a list of all hosts, of only the hosts that respond to a ping, or of only the hosts that do not respond. The complete list of hosts is defined by the netgroup allhosts. Look at /usr/share/doc/fai/examples/etc/netgroup for an example.

rshall
Execute a command via rsh on all hosts that are up. Uses all_hosts to get the list of hosts that are up. You can also use the dsh(1) command (dancer's shell, or distributed shell).
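
To make the interplay of all_hosts and rshall concrete, here is a simplified sketch of the same idea: ping each host once and run the command only on those that answer. It is not the code of the FAI utilities. The function names probe, hosts_up and run_on_all are made up, and the host list is passed as arguments instead of being read from the netgroup allhosts.

```shell
#!/bin/sh
# Probe one host; succeeds if it answers a single ping.
probe() {
    ping -c 1 "$1" >/dev/null 2>&1
}

# Print those of the given hosts that answer the probe.
hosts_up() {
    for host in "$@"; do
        if probe "$host"; then
            echo "$host"
        fi
    done
}

# Run a command via rsh on every reachable host.
# $1 = command string, remaining arguments = host names.
run_on_all() {
    cmd=$1
    shift
    for host in $(hosts_up "$@"); do
        rsh "$host" "$cmd"
    done
}

# Example call:
#   run_on_all uptime atom01 atom02 atom03
```

Probing the hosts one after another is slow when many nodes are down, because every unanswered ping has to time out first.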


7.4 Wake on LAN with 3Com network cards

Wake on LAN is a very nice feature for powering on a computer without having physical access to it. When a special Ethernet packet is sent to the network card, the computer is turned on. The following things have to be done to use the wake on LAN (WOL) feature:

  1. Connect the network card to the Wake-On-LAN connector on the motherboard using a 3-pin cable.
  2. My ASUS K7M motherboard has a jumper called Vaux (3VSBSLT), which selects the voltage supplied to add-in PCI cards. Set it to Add 3VSB (3 Volt standby).
  3. Turn on the wake on LAN feature in the BIOS.
  4. For a 2.2 kernel you have to use the following driver: http://www.uow.edu.au/~andrewm/linux/#3c59x-bc

Enabling the wake on LAN feature with a 2.2.19 kernel and a 3Com 3C905C network card is a little tricky: you have to use a patched 3c59x driver, but I managed to get it working. Download the file poll-ioctl-2.2.18-pre16.c.gz, unpack it, and copy it into the kernel sources as drivers/net/3c59x.c. Then make a new kernel package and install this new kernel. To wake up a computer, use the tool etherwake (in woody) or get the single C source file. For more information look at http://www.scyld.com/expert/wake-on-lan.html.
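
The special packet used for wake on LAN is commonly called a magic packet. Its payload is six bytes of 0xFF followed by sixteen repetitions of the MAC address of the card to be woken. The sketch below only builds this payload as a hex string to show the layout. Sending it on the wire is left to a tool such as etherwake. The function name magic_packet_hex and the MAC address are made up for this example.

```shell
#!/bin/sh
# Print the wake on LAN payload for a MAC address as a hex string:
# six 0xFF bytes followed by sixteen copies of the MAC address.
# $1 = MAC address in aa:bb:cc:dd:ee:ff form.
magic_packet_hex() {
    mac=$(printf '%s' "$1" | tr -d ':' | tr 'A-F' 'a-f')
    payload=ffffffffffff
    i=0
    while [ "$i" -lt 16 ]; do
        payload=$payload$mac
        i=$((i + 1))
    done
    echo "$payload"
}

# The MAC address below is invented for illustration only
magic_packet_hex 00:10:5a:24:00:12
```

The payload is always 102 bytes long (6 + 16 * 6), which is a quick sanity check when debugging with a packet sniffer.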


Version 1.3.1 for FAI version 2.2.3, 15 November 2001
Thomas Lange lange@informatik.uni-koeln.de