Default PXE Issues
As part of the normal (non-PXE) boot process, the nodes must pass through the PXE phase of the BIOS and "fail". The PXE client on each node should contact the TFTP server and request a file named after the node's IP address in hex. If that file is not found, it removes the trailing character from the name and tries again, repeating until it runs out of characters. Once that process fails, the node looks for a file named default (no extension).
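The lookup order described above can be sketched in a few lines of Python. This is only an illustration, not part of the cluster tooling: the function name pxe_config_candidates is hypothetical, the uppercase hex form is an assumption based on how PXELINUX names these files, and the sketch covers only the IP-based names and the final default.

```python
import ipaddress


def pxe_config_candidates(ip: str) -> list[str]:
    """Return the config file names requested for an IPv4 address, in order."""
    # Hex form of the address, e.g. 192.168.1.10 -> C0A8010A
    # (uppercase is an assumption, see the note above).
    hex_name = format(int(ipaddress.IPv4Address(ip)), "08X")
    # Drop one trailing character at a time until none are left.
    names = [hex_name[:n] for n in range(len(hex_name), 0, -1)]
    # Last resort: the file named "default".
    names.append("default")
    return names


if __name__ == "__main__":
    for name in pxe_config_candidates("192.168.1.10"):
        print(name)
```

Since no per-node files exist on the TFTP server, every request except the last one fails, and all nodes end up reading the same default file.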
Until now, the default file used the localboot PXE directive (part of the standard) to instruct the nodes to fail over to disk. That file (now named default.localboot) has the following contents:
    default harddisk
    label harddisk
    localboot 0
A recent test with some of the newest motherboards failed, however. The PXE client did not honor the localboot 0 option (or rather, the nodes locked up when given this option). A workaround documented here (stored as a pdf for future use) recommended trying the syslinux chainloader. This required a new default file with the contents:
    default harddisk
    label harddisk
    KERNEL chain.c32
    APPEND hd0 0
and the addition of the chain.c32 "kernel". The chain.c32 file was acquired here. It was not present in later syslinux versions, so I used the syslinux 4.04 zip file to get the required chain.c32 file. It was placed in the /tftpboot directory.
This workaround changes the boot behavior of the nodes but achieves the same result. Instead of failing over to the BIOS, the PXE client downloads the chain loader. The chain loader then boots from disk 0, partition 0 (hence the parameter APPEND hd0 0). While earlier motherboards shouldn't need this fix, their operation should be unaffected by it, unless the disks are not enumerated the same way on every node (TODO: check this). Documentation for the chain.c32 module is located here.
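For anyone switching between the two variants, the following Python sketch renders either default file and checks that chain.c32 is present before the chainloader variant is used. The function names render_default and chainloader_present are hypothetical, and the script is only an illustration of the two configurations described above, not something that was used on the cluster.

```python
from pathlib import Path


def render_default(use_chainloader: bool) -> str:
    """Return the text of the PXE default file for the chosen variant."""
    lines = ["default harddisk", "label harddisk"]
    if use_chainloader:
        # Workaround variant: download chain.c32 and chain-load disk 0.
        lines += ["KERNEL chain.c32", "APPEND hd0 0"]
    else:
        # Original variant: hand control back for a local boot.
        lines.append("localboot 0")
    return "\n".join(lines) + "\n"


def chainloader_present(tftp_root: str = "/tftpboot") -> bool:
    """Check that chain.c32 sits in the TFTP root before relying on it."""
    return (Path(tftp_root) / "chain.c32").is_file()


if __name__ == "__main__":
    # Fall back to the localboot variant if chain.c32 is missing.
    print(render_default(use_chainloader=chainloader_present()), end="")
```

The script only prints the file contents; writing them to the default file on the TFTP server is left as a manual step.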