
Version 5 (modified by (none), 14 years ago) ( diff )

ParallelImaging

Using GEXEC as a means to communicate with nodes from Console

As a next step in imaging nodes in parallel, we are trying to communicate with the nodes using GEXEC instead of the nodehandler currently in use.

Steps in installing GEXEC

OpenSSL must first be installed on both the nodes and the console, because GEXEC relies on authd for authentication, and authd in turn uses a private/public key pair generated by OpenSSL. The steps involved in installing GEXEC are as follows:

  1. Install openssl using apt-get install openssl.
  2. Download the gexec package and its dependencies, authd and libe.
  3. Generate the public and private keys as follows:
    console.sb1# openssl genrsa -out auth_priv.pem
    console.sb1# chmod 600 auth_priv.pem
    console.sb1# openssl rsa -in auth_priv.pem -pubout -out auth_pub.pem
    
  4. Distribute the keys to all the nodes.
    console.sb1# scp auth_priv.pem node1-1:/etc/auth_priv.pem
    console.sb1# scp auth_pub.pem node1-1:/etc/auth_pub.pem
    
  5. Now install the three packages in this order:
    authd
    libe
    gexec
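
The key distribution step above shows the copy for a single node. A minimal sketch of how it might be repeated for several nodes follows. The node list, the `KEY_DIR` variable and the `SCP` override are assumptions made for illustration; only the destination paths come from the commands above.

```shell
#!/bin/sh
# Sketch: copy the authd key pair from the console to each node.
# NODES and KEY_DIR are illustrative defaults, not values from this page.
# SCP can be overridden (for example SCP=echo) to get a dry run.
NODES="${NODES:-node1-1 node1-2}"
KEY_DIR="${KEY_DIR:-.}"
SCP="${SCP:-scp}"

distribute_keys() {
    for node in $NODES; do
        for key in auth_priv.pem auth_pub.pem; do
            # Destination paths follow the scp commands shown above.
            $SCP "$KEY_DIR/$key" "$node:/etc/$key" || {
                echo "failed to copy $key to $node" >&2
                return 1
            }
        done
    done
}
```

Calling `distribute_keys` with `SCP=echo` set prints the copy commands without running them, which may be useful for checking the node list first.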
    

On newer Linux kernels (e.g., 2.4.x), you'll need to set the LD_ASSUME_KERNEL environment variable to "2.4.2" to avoid LinuxThreads bugs (e.g., incomplete implementation of POSIX cancellation points).

In addition, /etc/services needs to be updated with the following entry:

gexec 2875/tcp #GEXEC
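
Since this entry has to be added on every machine, it may help to add it only when it is not already there. The following is a minimal sketch of that idea; the function name `add_gexec_service` is hypothetical, and the file path is a parameter so the function can be tried on a copy before touching /etc/services.

```shell
#!/bin/sh
# Sketch: append the gexec entry to a services file only if it is missing.
# The file defaults to /etc/services; pass another path to test on a copy.
add_gexec_service() {
    services_file="${1:-/etc/services}"
    if grep -q '^gexec[[:space:]]' "$services_file"; then
        return 0    # entry already present, nothing to do
    fi
    printf 'gexec\t\t2875/tcp\t\t\t# GEXEC\n' >> "$services_file"
}
```

Running the function twice on the same file leaves a single gexec line, so it should be safe to call from an install script that may be re-run.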

To run the client program gexec, the gexec daemon gexecd (in /usr/local/sbin/) and authd (also in /usr/local/sbin/) need to be running on all the clients. A shell script for this purpose (named start, attached to this page) has been written and added to /etc/init.d. The startup links to the script can be created using the command:

update-rc.d start defaults
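
The attached start script is not reproduced on this page. As a rough illustration only, a helper of that kind might look like the sketch below. It is not the attached script: the function name, the `SBIN_DIR` and `RUN` variables, the choice to start authd before gexecd, and the assumption that both daemons background themselves are all guesses. LD_ASSUME_KERNEL is set in each daemon's environment, on the assumption that it must be in place before the daemon starts.

```shell
#!/bin/sh
# Sketch of a start-style helper; NOT the script attached to this page.
# SBIN_DIR defaults to the install location mentioned above.
# RUN is a hypothetical hook: set RUN=echo to print commands instead of running them.
SBIN_DIR="${SBIN_DIR:-/usr/local/sbin}"
RUN="${RUN:-}"

start_gexec_services() {
    # authd goes first, on the assumption that gexecd depends on it.
    for daemon in authd gexecd; do
        if [ ! -x "$SBIN_DIR/$daemon" ]; then
            echo "missing $SBIN_DIR/$daemon" >&2
            return 1
        fi
        # LD_ASSUME_KERNEL is set only in the daemon's environment.
        # Assumes each daemon backgrounds itself once started.
        $RUN env LD_ASSUME_KERNEL=2.4.2 "$SBIN_DIR/$daemon" || return 1
    done
}
```

A script placed in /etc/init.d could define this function and call `start_gexec_services` when invoked with the `start` argument.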

The image running gexec is stored in repository2 in /export/orbit/image/tmp/node-1-1-2006-10-03-13-16-05.ndz.

Installing GEXEC in the PXE-Image

To install the gexec service in the PXE image, the following changes have to be made to the PXE makefile:

  1. GEXEC has problems running on kernel version 2.6.14 (the current PXE version), so change the version to 2.6.12 (the same as the baseline kernel version).
  2. Add all the library dependencies of gexec: /usr/lib/libssl.so.0.9.8 /usr/local/lib/libe.a /usr/lib/libcrypto.so.0.9.8 /usr/lib/libz.so.1 /lib/libcrypt.so.1 /lib/libpthread.so.0
  3. Add the keys auth_priv.pem and auth_pub.pem to /etc/<file_name>.
  4. Add the required binary files (gexec, gexecd and authd) to /usr/sbin.
  5. Add a shell script, start, to the init.d/rcS script so that it is executed when the image boots. The script performs three operations:
    1. Runs authd.
    2. Runs gexecd.
    3. Loads the environment variable LD_ASSUME_KERNEL="2.4.2" for the reasons stated above.
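
Before building the image, it may be worth checking that the libraries and binaries listed above are actually present in the tree that goes into it. The sketch below assumes a staging directory that mirrors the image's root filesystem; that directory and the function name `check_gexec_files` are hypothetical and do not come from the PXE makefile. The key files are left out because their final names under /etc are not given here.

```shell
#!/bin/sh
# Sketch: report which files needed by gexec are missing from a staged image tree.
# The argument is a hypothetical staging directory mirroring the image root.
check_gexec_files() {
    image_root="$1"
    missing=0
    for f in \
        /usr/lib/libssl.so.0.9.8 \
        /usr/local/lib/libe.a \
        /usr/lib/libcrypto.so.0.9.8 \
        /usr/lib/libz.so.1 \
        /lib/libcrypt.so.1 \
        /lib/libpthread.so.0 \
        /usr/sbin/gexec \
        /usr/sbin/gexecd \
        /usr/sbin/authd
    do
        if [ ! -e "$image_root$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Returns 0 when everything is present, 1 otherwise.
    return "$missing"
}
```

The function prints one line per missing file and returns non-zero if any are absent, so it can gate a build step.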
