Difference between revisions of "HCL cluster/hcl node install configuration log"
Revision as of 20:01, 24 April 2010
HCL Nodes will be installed from a clone of a root node, hcl07. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained.
General Installation
Partition the disk with a 1GB swap partition at the end, sized to match the largest amount of memory installed on any cluster node. The root filesystem occupies the remainder of the disk and is formatted as EXT4.
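Since the swap size follows the installed memory, it can help to read that figure on each node before partitioning. The following is a minimal sketch, not part of the original procedure; `mem_total_mb` is a hypothetical helper name, and it assumes a Linux-style meminfo file in which `MemTotal` is reported in kB.

```shell
#!/bin/sh
# Hypothetical helper: print the installed memory in MB.
# Reads /proc/meminfo by default; another file can be passed for testing.
mem_total_mb() {
  meminfo=${1:-/proc/meminfo}
  # MemTotal is reported in kB, so divide by 1024 to get MB.
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "$meminfo"
}

# Usage on a node:
#   mem_total_mb
```

Running it on every node and taking the largest value gives the figure the swap partition should match.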
Install the long list of packages.
Configure network interface as follows:
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo eth1 eth0
iface lo inet loopback
# The primary network interface
allow-hotplug eth0
iface eth0 inet dhcp
allow-hotplug eth1
iface eth1 inet dhcp
Ganglia
Install the packages gmetad, ganglia-monitor and ganglia-webfrontend.
Configure the front end by appending the following to /etc/apache2/apache2.conf:
Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following line to /etc/ganglia/gmetad.conf:
data_source "HCL Cluster" localhost
Configure ganglia monitor by editing /etc/ganglia/gmond.conf so that it contains:
cluster {
  name = "HCL Cluster"
  owner = "University College Dublin"
  latlong = "unspecified"
  url = "http://hcl.ucd.ie/"
}
and also:
/* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
udp_send_channel {
  mcast_join = 239.2.11.72
  port = 8649
  ttl = 1
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.72
  port = 8649
  bind = 239.2.11.72
}
Once all packages are installed and configured, execute:
service apache2 restart
service gmetad restart
service gmond restart
Pointing your browser to http://192.168.21.254/ganglia/index.php should display the monitoring page for HCL Cluster. gmond must also be installed and configured on the cluster nodes.
Complications
Hostnames
Debian does not pull the hostname from the DHCP server. Without intervention, cloned nodes will keep the hostname stored on the image of the root node. A bug report describes the setting of a hostname via DHCP.
The solution we will use is to add the file /etc/dhcp3/dhclient-exit-hooks.d/hostname with the contents:
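# Set the hostname supplied by the DHCP server, if one was sent.
# The POSIX [ ] test is used so the hook works whichever shell sources it.
if [ -n "$new_host_name" ]; then
  echo "$new_host_name" > /etc/hostname
  /bin/hostname "$new_host_name"
fi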
The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note that the hostname of the machine will be set by the last interface that is configured via DHCP. In the current configuration that will be eth0. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface.
Further, the current hostnames for the second interface (eth1) on the nodes are invalid. They follow the format hcl??_eth1.ucd.ie, but the '_' character is not permitted in hostnames, and attempting to set such a hostname fails.
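The underscore restriction can be checked before a name is handed to /bin/hostname. The following is a minimal sketch, not part of the original setup; `is_valid_hostname` is a hypothetical helper, and the example names are illustrative instances of the hcl?? format. It accepts only letters, digits, hyphens and dots, and rejects labels that are empty or that begin or end with a hyphen.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 looks like a valid hostname.
is_valid_hostname() {
  case "$1" in
    # Reject the empty string and any character outside letters, digits, '.' and '-'.
    ''|*[!A-Za-z0-9.-]*) return 1 ;;
    # Reject empty labels and labels that start or end with a hyphen.
    -*|*-|.*|*.|*..*|*.-*|*-.*) return 1 ;;
    *) return 0 ;;
  esac
}

# Illustrative names only: the second one contains '_' and is rejected.
for name in hcl07.ucd.ie hcl07_eth1.ucd.ie; do
  if is_valid_hostname "$name"; then
    echo "$name: ok"
  else
    echo "$name: invalid"
  fi
done
```

A check like this could guard the hostname hook so that an invalid name from the DHCP server is skipped instead of failing part-way through.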
udev and Network Interfaces
The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wireless cards that are plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read here.
The solution is to remove the udev rules for persistent network interfaces and to disable the generator script for these rules. On the root cloning node, do the following:
- remove the file /etc/udev/rules.d/70-persistent-net.rules
- add to the top of the file /lib/udev/rules.d/75-persistent-net-generator.rules the following lines:
# skip generation of persistent network interfaces
ACTION=="*",                            GOTO="persistent_net_generator_end"
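The two steps above can be scripted so they are easy to repeat when the root image is rebuilt. The following is a minimal sketch that assumes it is run as root; `prepend_skip_rule` is a hypothetical helper, not part of udev. It prepends the two lines shown above to a rules file and does nothing if the comment line is already present, so it is safe to run more than once.

```shell
#!/bin/sh
# Hypothetical helper: prepend the skip rule to a udev rules file, once.
prepend_skip_rule() {
  rules_file=$1
  comment='# skip generation of persistent network interfaces'
  # Do nothing if the comment line is already in the file.
  if grep -qxF "$comment" "$rules_file"; then
    return 0
  fi
  tmp=$(mktemp) || return 1
  # Build the new contents in a temporary file, then write them back
  # in place so the ownership and mode of the rules file are kept.
  {
    printf '%s\n' "$comment"
    printf '%s\n' 'ACTION=="*", GOTO="persistent_net_generator_end"'
    cat "$rules_file"
  } > "$tmp" && cat "$tmp" > "$rules_file"
  status=$?
  rm -f "$tmp"
  return $status
}

# Usage on the root cloning node (as root):
#   rm -f /etc/udev/rules.d/70-persistent-net.rules
#   prepend_skip_rule /lib/udev/rules.d/75-persistent-net-generator.rules
```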
