HCL cluster/hcl node install configuration log

Revision as of 13:07, 25 April 2010 by Rhiggins (talk | contribs) (General Installation)


HCL nodes are installed from a clone of a root node, hcl07. This page documents the general installation of the root node. The cloning process introduces a number of complications, and the solutions to them are also explained.

General Installation

Partition the disk with a 1GB swap partition at the end of the disk. This size equals the largest amount of memory installed on any cluster node. The root file system occupies the remainder of the disk and is formatted as ext4.
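After installation, the swap sizing can be sanity-checked on a node. The following is a minimal sketch, not part of the original procedure: the function name swap_covers_ram is hypothetical, and it assumes a Linux /proc/meminfo with MemTotal and SwapTotal lines. Note that the kernel reports slightly less than the physical memory and partition sizes, so this is only a rough check.

```shell
# Hypothetical helper: succeeds if SwapTotal >= MemTotal in a meminfo-style file.
# $1 is the file to read; it defaults to /proc/meminfo.
swap_covers_ram() {
  awk '
    /^MemTotal:/  { m = $2 + 0 }   # value in kB
    /^SwapTotal:/ { s = $2 + 0 }   # value in kB
    END { exit ((m > 0 && s >= m) ? 0 : 1) }
  ' "${1:-/proc/meminfo}"
}

# Usage: swap_covers_ram && echo "swap is at least as large as RAM"
```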

Install long list of packages.

Configure the network interfaces in /etc/network/interfaces as follows:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo eth1 eth0

iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet dhcp

allow-hotplug eth1
iface eth1 inet dhcp

Change the hosts file (/etc/hosts) so that it does not list the node's hostname:

127.0.0.1       localhost

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
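Before cloning, it may be worth confirming that the root node's name no longer appears in the hosts file. The sketch below is an illustration only: the function name hosts_lists_name is hypothetical, and it simply looks for an exact match among the names on each non-comment line.

```shell
# Hypothetical helper: succeeds if hostname $1 is listed in hosts file $2
# (default /etc/hosts). Comments are stripped before matching.
hosts_lists_name() {
  awk -v h="$1" '
    { sub(/#.*/, "") }                                   # drop comments
    { for (i = 2; i <= NF; i++) if ($i == h) found = 1 } # field 1 is the address
    END { exit (found ? 0 : 1) }
  ' "${2:-/etc/hosts}"
}

# Usage: hosts_lists_name hcl07 && echo "hcl07 is still listed in /etc/hosts"
```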

Ganglia

Install the ganglia-monitor package.

Configure ganglia monitor by editing /etc/ganglia/gmond.conf so that it contains:

cluster {
  name = "HCL Cluster"
  owner = "University College Dublin"
  latlong = "unspecified"
  url = "http://hcl.ucd.ie/"
}

And the following channel definitions:

/* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
udp_send_channel {
  mcast_join = 239.2.11.72
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.72
  port = 8649
  bind = 239.2.11.72
}

After all packages are installed and configured, execute:

service ganglia-monitor restart

Complications

Hostnames

Debian does not pull the hostname from the DHCP server. Without intervention, cloned nodes keep the hostname stored on the image of the root node. A Debian bug report on setting the hostname via DHCP is described here.

The solution we will use is to add the file /etc/dhcp3/dhclient-exit-hooks.d/hostname with the contents:

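# Set the hostname to the name supplied by the DHCP server, if any.
# Uses POSIX test syntax and quoting so it works when sourced by any sh.
if [ -n "$new_host_name" ]; then
  echo "$new_host_name" > /etc/hostname
  /bin/hostname "$new_host_name"
fi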

The effect of this is to set the hostname of the machine after an interface is configured using dhclient (the DHCP client). Note that the hostname is set by the last interface to be configured via DHCP. In the current configuration that is eth0. If an interface is later reconfigured using dhclient, the hostname is reset to the name belonging to that interface.

Further, the current hostnames for the second interface (eth1) on the nodes are invalid. They follow the format hcl??_eth1.ucd.ie, but the '_' character is not permitted in hostnames, and attempting to set such a hostname fails.
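The restriction on '_' can be checked before a name is handed to /bin/hostname. The following is a minimal sketch, assuming the usual hostname rules (dot-separated labels of letters, digits and hyphens, no leading or trailing hyphen, at most 63 characters per label). The function name is_valid_hostname and the example name hcl07-eth1.ucd.ie are hypothetical and not part of the cluster configuration.

```shell
# Hypothetical helper: succeeds if $1 looks like a valid hostname.
# A trailing dot is rejected, since it leaves an empty final label.
is_valid_hostname() {
  rest=$1
  [ -n "$rest" ] && [ "${#rest}" -le 253 ] || return 1
  while :; do
    label=${rest%%.*}                       # text up to the first dot
    case $label in
      ''|-*|*-|*[!A-Za-z0-9-]*) return 1 ;; # empty, edge hyphen, or bad character
    esac
    [ "${#label}" -le 63 ] || return 1
    [ "$label" = "$rest" ] && return 0      # that was the last label
    rest=${rest#*.}                         # drop the label just checked
  done
}

# Usage:
#   is_valid_hostname hcl07_eth1.ucd.ie || echo "rejected: contains '_'"
#   is_valid_hostname hcl07-eth1.ucd.ie && echo "accepted"
```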

udev and Network Interfaces

The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wireless cards that are plugged in and out, but it causes problems when installing our root node image across all machines in the cluster. A description of the problem can be read here.

The solution is to remove the udev rules for persistent network interfaces and to disable the generator script for these rules. On the root cloning node, do the following:

  1. Remove the file /etc/udev/rules.d/70-persistent-net.rules
  2. Add the following lines to the top of the file /lib/udev/rules.d/75-persistent-net-generator.rules:
# skip generation of persistent network interfaces
ACTION=="*",                            GOTO="persistent_net_generator_end"
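The two steps above could be scripted roughly as follows. This is a sketch under stated assumptions rather than the procedure actually used on hcl07: the function name disable_persistent_net and its optional root-prefix argument are hypothetical, added so the function can be tried on a scratch directory tree before touching the real system.

```shell
# Hypothetical helper: apply both udev steps under the prefix $1 (default: /).
disable_persistent_net() {
  root=${1:-}
  rules="$root/etc/udev/rules.d/70-persistent-net.rules"
  gen="$root/lib/udev/rules.d/75-persistent-net-generator.rules"
  marker='# skip generation of persistent network interfaces'

  # Step 1: remove the saved persistent interface names.
  rm -f "$rules"

  # Step 2: prepend the skip rule, unless a previous run already added it.
  [ -f "$gen" ] || { echo "missing: $gen" >&2; return 1; }
  if ! grep -qxF -- "$marker" "$gen"; then
    tmp=$(mktemp) || return 1
    {
      printf '%s\n' "$marker"
      printf '%s\n' 'ACTION=="*",                            GOTO="persistent_net_generator_end"'
      cat "$gen"
    } > "$tmp" && cat "$tmp" > "$gen"   # overwrite in place to keep file permissions
    status=$?
    rm -f "$tmp"
    return "$status"
  fi
}

# Usage (as root, on the root cloning node): disable_persistent_net
```

Writing the new content back with cat rather than mv keeps the owner and mode of the generator rules file unchanged, and the marker check makes the function safe to run more than once.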