openMosixview - a cluster-management GUI



openMosix + diskless nodes:


First, set up a DHCP server that answers the DHCP request for an IP address
when a diskless client boots. This DHCP server (called the master in this
howto) also acts as an NFS server and exports the complete client
filesystems, so that the diskless cluster nodes (called slaves in this
howto) can mount their filesystem (FS) for booting as soon as they have
an IP address.
Run a "normal" openMosix setup on the master node.
Make sure NFS server support is included in your kernel configuration.
There are two common kinds of NFS server (and possibly more):
  kernel-nfs
or
  nfs-daemon
Either one works, but in my experience kernel-nfs is the better choice with
"older" kernels (like 2.2.18) and daemon-nfs with "newer" ones.
The NFS support in newer kernels sometimes does not work properly.

Once your master node is running the new openMosix kernel, start by
creating the filesystem for one slave node. Here are the steps:

Allow at least 300-500 MB for each slave. Create an extra directory for the
whole cluster filesystem and make a symbolic link to it named /tftpboot.
(The /tftpboot directory or link is required because each slave looks for
a directory named /tftpboot/ip-address-of-slave when booting. You can change
this only by editing the kernel sources.)
Then create a directory named after the IP address of the first slave you
want to configure, e.g.
  mkdir /tftpboot/192.168.45.45
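
The directory setup described above can be sketched as a small shell
function. This is only an illustration: the function name make_slave_dir and
the example location /data/cluster are assumptions, not part of the original
howto. The link path is a parameter so the function can be tried in a scratch
directory before it is used on /tftpboot.

```shell
#!/bin/sh
# Sketch: create the cluster directory, link it to the tftpboot path and
# create the per-slave directory named after the slave's IP address.
# make_slave_dir and /data/cluster are hypothetical names.

make_slave_dir() {
    cluster_dir=$1   # real directory holding all slave filesystems
    tftp_link=$2     # /tftpboot on a real master
    slave_ip=$3      # e.g. 192.168.45.45

    mkdir -p "$cluster_dir" || return 1
    # Create the symbolic link only if nothing exists at that path yet.
    if [ ! -e "$tftp_link" ] && [ ! -L "$tftp_link" ]; then
        ln -s "$cluster_dir" "$tftp_link" || return 1
    fi
    mkdir -p "$tftp_link/$slave_ip"
}

# Example (as root on the master):
#   make_slave_dir /data/cluster /tftpboot 192.168.45.45
```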

Depending on the space you have on the cluster filesystem, now copy the whole
filesystem from the master node to the directory of the first slave.
If space is limited, copy only:

/bin
/usr/bin
/usr/sbin
/etc
/var

You can configure the slave to get the rest via NFS later.
Be sure to create empty directories for the mount points.
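
Creating the empty mount points could look like the following sketch. The
function name make_mount_points is hypothetical, and the list of directories
in the usage comment is only an example; use the directories your slave will
actually mount.

```shell
#!/bin/sh
# Sketch: create empty mount-point directories inside a slave's root tree.
# make_mount_points is a hypothetical helper name.

make_mount_points() {
    slave_root=$1    # e.g. /tftpboot/192.168.45.45
    shift
    for mount_point in "$@"; do
        # Each mount point is given as an absolute path, e.g. /usr/lib.
        mkdir -p "$slave_root$mount_point" || return 1
    done
}

# Example:
#   make_mount_points /tftpboot/192.168.45.45 /proc /root /opt /usr/local /usr/lib
```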

The filesystem structure in /tftpboot/192.168.45.45/ has to mirror
that of / on the master.
Now edit the following files:

/tftpboot/192.168.45.45/etc/HOSTNAME //insert the hostname of the slave
/tftpboot/192.168.45.45/etc/hosts //insert the hostname+ip of the slave
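
As a rough sketch, the two files can also be written by a script. The
function name set_slave_identity and the hostname firstslave are assumptions
for illustration. The hosts entry is appended only if the IP address is not
listed yet, so running the function twice does not duplicate the line.

```shell
#!/bin/sh
# Sketch: write the slave's hostname and hosts entry into its root tree.
# set_slave_identity is a hypothetical helper name.

set_slave_identity() {
    slave_root=$1    # e.g. /tftpboot/192.168.45.45
    name=$2          # hostname of the slave
    ip=$3            # IP address of the slave

    mkdir -p "$slave_root/etc" || return 1
    printf '%s\n' "$name" > "$slave_root/etc/HOSTNAME" || return 1

    hosts="$slave_root/etc/hosts"
    touch "$hosts" || return 1
    # Append the entry only if no line starts with this IP address.
    if ! awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$hosts"; then
        printf '%s\t%s\n' "$ip" "$name" >> "$hosts"
    fi
}

# Example:
#   set_slave_identity /tftpboot/192.168.45.45 firstslave 192.168.45.45
```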

Depending on your distribution, the IP configuration of the slave is kept
in one of these files:

/tftpboot/192.168.45.45/etc/rc.config

/tftpboot/192.168.45.45/etc/sysconfig/network

/tftpboot/192.168.45.45/etc/sysconfig/network-scripts/ifcfg-eth0

Adjust the IP configuration for the slave there as needed.

Edit the file

/tftpboot/192.168.45.45/etc/fstab //the FS the slave will get via NFS
so that it corresponds to
/etc/exports //the FS the master will export to the slaves

e.g. for a slave fstab:

master:/tftpboot/192.168.45.45 / nfs hard,intr,rw 0 1
none /proc proc defaults 0 0
master:/root /root nfs soft,intr,rw 0 2
master:/opt /opt nfs soft,intr,ro 0 2
master:/usr/local /usr/local nfs soft,intr,ro 0 2
master:/data/ /data nfs soft,intr,rw 0 2
master:/usr/X11R6 /usr/X11R6 nfs soft,intr,ro 0 2
master:/usr/share /usr/share nfs soft,intr,ro 0 2
master:/usr/lib /usr/lib nfs soft,intr,ro 0 2
master:/usr/include /usr/include nfs soft,intr,ro 0 2
master:/cdrom /cdrom nfs soft,intr,ro 0 2
master:/var/log /var/log nfs soft,intr,rw 0 2

e.g. for the exports file on the master:

/tftpboot/192.168.45.45    *(rw,no_all_squash,no_root_squash)
/usr/local    *(rw,no_all_squash,no_root_squash)
/root    *(rw,no_all_squash,no_root_squash)
/opt    *(ro)
/data    *(rw,no_all_squash,no_root_squash)
/usr/X11R6 *(ro)
/usr/share    *(ro)
/usr/lib    *(ro)
/usr/include    *(ro)
/var/log    *(rw,no_all_squash,no_root_squash)
/usr/src    *(rw,no_all_squash,no_root_squash)
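
Because the slave fstab and the master exports have to match, a quick
consistency check can help. The following is only a sketch: the function name
missing_exports is hypothetical, and the server name defaults to "master" as
used in this howto. It prints every directory that the fstab mounts from the
master but that has no line in the exports file; for instance, a mount such
as master:/cdrom without a /cdrom line in /etc/exports would be reported.

```shell
#!/bin/sh
# Sketch: list NFS mounts in a slave fstab that the master does not export.
# missing_exports is a hypothetical helper name.

missing_exports() {
    fstab=$1              # e.g. /tftpboot/192.168.45.45/etc/fstab
    exports=$2            # e.g. /etc/exports
    server=${3:-master}   # NFS server name used in the fstab

    awk -v server="$server" '
        FILENAME == ARGV[1] {           # first file: the exports
            if ($1 ~ /^\//) {
                p = $1
                if (length(p) > 1) sub(/\/+$/, "", p)   # drop trailing slash
                exported[p] = 1
            }
            next
        }
        $3 == "nfs" {                   # second file: the fstab
            n = index($1, ":")
            if (n == 0 || substr($1, 1, n - 1) != server) next
            path = substr($1, n + 1)
            if (length(path) > 1) sub(/\/+$/, "", path)
            if (!(path in exported)) print path
        }
    ' "$exports" "$fstab"
}

# Example:
#   missing_exports /tftpboot/192.168.45.45/etc/fstab /etc/exports
```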

If you mount /var/log (rw) from the NFS server, you have one central log file!
(This worked very well for me: just run "tail -f /var/log/messages" on the
master and you always know what is going on.)
The cluster filesystem for your first slave is now ready.
Now configure the slave kernel. If all cluster nodes have the same hardware,
you can reuse the configuration of the master node.
Change the configuration for the slave as follows:

CONFIG_IP_PNP_DHCP=y
and
CONFIG_ROOT_NFS=y
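
Before compiling, the kernel configuration can be checked for these two
options. This is a minimal sketch; the function name check_slave_config is
hypothetical, and it tests only the two options named above, not any other
options they may depend on.

```shell
#!/bin/sh
# Sketch: verify that a kernel .config enables the two options a diskless
# slave needs according to this howto.
# check_slave_config is a hypothetical helper name.

check_slave_config() {
    config=$1    # e.g. /usr/src/linux/.config
    status=0
    for opt in CONFIG_IP_PNP_DHCP CONFIG_ROOT_NFS; do
        # The option must be built in (=y), not a module and not unset.
        if ! grep -q "^$opt=y\$" "$config"; then
            echo "missing: $opt=y" >&2
            status=1
        fi
    done
    return $status
}

# Example:
#   check_slave_config /usr/src/linux/.config && echo "slave config looks ok"
```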

Use as few modules as possible (maybe no modules at all)
because the configuration is a bit tricky.

Now you have to create an nfsroot device (this is well described in
the Beowulf howtos).
It is used only for patching the slave kernel to boot from NFS.

   mknod /dev/nfsroot b 0 255
   rdev bzImage /dev/nfsroot

Here "bzImage" is your diskless slave kernel; you will find it
in /usr/src/linux-version/arch/i386/boot after successful compilation.

Then you have to set the root flags for that kernel (the value 0 means the
root filesystem is mounted read-write)

   rdev -o 498 -R bzImage 0

and copy the kernel to a floppy-disk

   dd if=bzImage of=/dev/fd0

Now you are nearly ready! You just have to configure DHCP on the master.
You need the MAC address (hardware address) of the network card of
your first slave.
The easiest way to get this address is to boot the client with the boot floppy
you just created (the boot will fail, but the slave will report its MAC
address).
If the kernel was configured correctly for the slave, the system should
come up from the floppy, boot the diskless kernel, detect its
network card and send a DHCP and ARP request. It will show its
hardware address at that moment! It looks like this: 68:00:10:37:09:83.

Edit the file /etc/dhcpd.conf like the following sample:

option subnet-mask 255.255.255.0;
default-lease-time 6000;
max-lease-time 72000;
subnet 192.168.45.0 netmask 255.255.255.0 {
    range 192.168.45.253 192.168.45.254;
    option broadcast-address 192.168.45.255;
    option routers 192.168.45.1;
}
host firstslave
{
    hardware ethernet 68:00:10:37:09:83;
    fixed-address firstslave;
    server-name "master";
}
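
For additional slaves, the host entry can be generated instead of typed. The
sketch below prints an entry in the same layout as the sample above; the
function name dhcp_host_entry is hypothetical. It only checks that the MAC
address has the form of six colon-separated hex pairs and does not edit the
DHCP configuration itself, so the output has to be appended by hand.

```shell
#!/bin/sh
# Sketch: print a DHCP host entry for one slave.
# dhcp_host_entry is a hypothetical helper name.

dhcp_host_entry() {
    name=$1               # hostname of the slave, must resolve to its IP
    mac=$2                # hardware address, e.g. 68:00:10:37:09:83
    server=${3:-master}   # server name as used in this howto

    # Accept only six colon-separated pairs of hex digits.
    if ! printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        echo "invalid MAC address: $mac" >&2
        return 1
    fi

    cat <<EOF
host $name
{
    hardware ethernet $mac;
    fixed-address $name;
    server-name "$server";
}
EOF
}

# Example:
#   dhcp_host_entry secondslave 68:00:10:37:09:84
```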


Now you can start DHCP and NFS with their init scripts:

   /etc/init.d/nfsserver start

   /etc/init.d/dhcp start

That's it! It is (nearly) ready now!

Boot your first slave with the boot floppy (again). It should work now.
Shortly after recognizing its network card, the slave gets its IP address
from the DHCP server and its root filesystem (and the rest) via NFS.
Note that modules included in the slave kernel configuration
must exist on the master too, because the slaves mount the
/lib directory from the master. So they use the same modules (if any).

It is easier to update or install additional libraries or
applications if you mount as much as possible from the master.
On the other hand, if every slave has its own complete
filesystem in /tftpboot, your cluster may be a bit faster because
there are fewer reads and writes on the NFS server.

You have to add a .rhosts file in /root (for user root) on each
cluster member, which should look like this:

node1 root
node2 root
node3 root
....
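
A file in this format can be generated from a list of node names, as in the
following sketch. The function name write_rhosts is hypothetical. The final
chmod is an assumption on my side: many rsh daemons ignore a .rhosts file
that other users can write to, so restrictive permissions are a safe default.

```shell
#!/bin/sh
# Sketch: write a .rhosts file that lets root log in from each listed node.
# write_rhosts is a hypothetical helper name.

write_rhosts() {
    out=$1    # e.g. /root/.rhosts or /tftpboot/192.168.45.45/root/.rhosts
    shift
    : > "$out" || return 1            # start with an empty file
    for node in "$@"; do
        printf '%s root\n' "$node" >> "$out" || return 1
    done
    chmod 600 "$out"
}

# Example:
#   write_rhosts /root/.rhosts node1 node2 node3
```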

You also have to enable remote login via rsh in /etc/inetd.conf.
You should have these two lines in it if your Linux distribution uses inetd:

shell stream tcp nowait root /bin/mosrun mosrun -l -z /usr/sbin/tcpd in.rshd -L
login stream tcp nowait root /bin/mosrun mosrun -l -z /usr/sbin/tcpd in.rlogind

And for xinetd:

service shell
{
socket_type = stream
protocol = tcp
wait = no
user = root
server = /usr/sbin/in.rshd
server_args = -L
}
service login
{
socket_type = stream
protocol = tcp
wait = no
user = root
server = /usr/sbin/in.rlogind
server_args = -n
}


You have to restart inetd afterwards so that it reads the new configuration:

   /etc/init.d/inetd restart

or

   /etc/init.d/xinetd restart

There may be another switch in your distribution's configuration utility
where you can configure the security of the system. Change it to
"enable remote root login".
Do not use this in insecure environments!!! Use SSH instead of RSH!
You can use openMosixview with RSH or SSH.

Configuring SSH for remote login without a password is a bit tricky.
Take a look at the "HOWTO use openMosix/openMosixview with SSH?"
on this website.

If you want to copy files to a node in this diskless cluster, you now
have two possibilities.
You can use rcp or scp to copy remotely, or you can simply use cp on the
master to copy files into the cluster filesystem of a node.
The following two commands are equivalent:

   rcp /etc/hosts 192.168.45.45:/etc

   cp /etc/hosts /tftpboot/192.168.45.45/etc/
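
Since every slave filesystem lives below /tftpboot on the master, the cp
variant extends naturally to all slaves at once. The following is a sketch
only; the function name copy_to_all_slaves is hypothetical. Slave trees that
do not contain the target directory are skipped rather than created.

```shell
#!/bin/sh
# Sketch: copy one file into the same directory of every slave tree.
# copy_to_all_slaves is a hypothetical helper name.

copy_to_all_slaves() {
    src=$1         # file on the master, e.g. /etc/hosts
    tftp_dir=$2    # directory holding the slave trees, e.g. /tftpboot
    dest=$3        # target directory inside each tree, without leading slash

    for slave_root in "$tftp_dir"/*/; do
        # Skip trees without the target directory (and an unmatched glob).
        [ -d "$slave_root$dest" ] || continue
        cp "$src" "$slave_root$dest/" || return 1
    done
}

# Example:
#   copy_to_all_slaves /etc/hosts /tftpboot etc
```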

howto created by Matt Rechenburg
Did I forget something? Surely. Mail me what has to be added.


Linux is a registered trademark of Linus Torvalds; openMosix is developed by Moshe Bar, All rights and copyright on openMosix reserved by Moshe Bar; MOSIX is developed by Prof. Amnon Barak, All rights and Copyright on MOSIX reserved by amnon at cs.huji.ac.il; SuSE Linux is a registered trademark of SuSE; RedHat Linux is a registered trademark of RedHat; Mandrake Linux is a registered trademark of Mandrake; Debian Linux is a registered trademark of Debian; openMosixview/Mosixview is GPL software and based on QT from Trolltech (please read the GPL-Licence policy); the 3dmosmon is developed by Johnny Cache. all other registered trademarks are owned by their owners