Changes between Initial Version and Version 1 of Internal/Infrastructure/SetupTestbed/BOOTB


Timestamp:
Nov 7, 2008, 5:21:33 PM (16 years ago)
Author:
korakis
  • Internal/Infrastructure/SetupTestbed/BOOTB

    v1 v1  
     1
     2[[TOC(heading=Table of Contents, depth=3)]]
     3= Building our own Testbed =
     4
     5This section describes in detail how we set up a mid-size ORBIT-like testbed. The table of contents on the right of this page lists every step we took, from the initial stage to a fully functional testbed that can be accessed remotely to upload experiments, run them and collect the results.
     6
     7
     8
     9== Hardware Setup ==
     10
     11=== Orbit Node ===
     12Each ORBIT Radio Node is a PC with a 1 GHz VIA C3 processor, 512 MB of RAM, 40 GB of local disk, two Ethernet ports, two 802.11 a/b/g cards and a Chassis Manager to control the node, see Figure 3.  The Ethernet ports are described below.
     13
     14'''Control port''' - The Ethernet port between the four USB ports. It is an Rtl-8169 Gigabit Ethernet port, used to load and control the ORBIT node and to collect measurements.
     15
     16'''Data port''' - The Ethernet port above the two USB ports. It is a VT6102 Rhine-II 100/10baseT Ethernet port, used for data communication.
     17
     18'''CM port''' - The 10BaseT Ethernet port on the Chassis Manager card, used to communicate with gridservice (not gridservice2). [[BR]]
     19[[BR]]
     20[[Image(node-50.png)]] [[Image(orbit_node.JPG)]]
     21
     22                           
     23=== Testbed ===
     24The testbed consists of nodes and several servers. Technically, all servers can run on one machine with at least two Ethernet ports, but this is not recommended because of potential security concerns. A typical testbed includes the three servers described below.[[BR]]
     25[[Image(testbed.PNG)]]
     26                               
     27
     28'''Services''' - Hosts various services including DHCP, DNS, NTP, TFTP, PXE, Frisbee, NFS, mysql, OML and Apache. We use different aliases for the management host to segregate the services that it hosts. This machine or port must be connected to the '''Control port''' of the nodes. [[BR]]
     29
     30'''Console''' - Used to run experiments with nodehandler4. Console is also connected to the '''Control port''' of the nodes, and it may share one Ethernet port with '''Services'''. A better approach is to set up the console on a dedicated machine and make it accessible to experimenters through ssh or XDMCP.[[BR]]
     31
     32'''CMC''' - The control and monitoring manager for all CM elements of the ORBIT nodes. It is connected to the '''CM port''' of the nodes and can NOT share an Ethernet port with '''Services''' and '''Console'''.
     33
     34In our setup, Services and Console share one Ethernet port with address 10.10.0.10/16, and CMC is on another Ethernet port with address 10.1.200.1/16.
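
The two addresses above sit in different /16 networks, which is why a route between them is needed. The following Ruby sketch checks this with the standard `ipaddr` library. The helper name `same_subnet?` is hypothetical and used only for illustration.

```ruby
require 'ipaddr'

# Hypothetical helper: true when `address` falls inside the network
# described by `cidr` (IPAddr masks the host bits of the CIDR string).
def same_subnet?(cidr, address)
  IPAddr.new(cidr).include?(IPAddr.new(address))
end

# A node on the control network shares the /16 with Console/Services.
puts same_subnet?('10.10.0.10/16', '10.10.1.1')   # prints "true"

# The CMC address does not, so traffic between them must be routed.
puts same_subnet?('10.10.0.10/16', '10.1.200.1')  # prints "false"
```

If your own testbed uses other address ranges, substitute them here before deciding whether an explicit route is required.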
     35
     36You may connect 10.10.0.10 and 10.1.200.1 with an internal route. In our setup, both machines reach the outside network through eth1, so we explicitly set a host route on each machine as follows.[[BR]]
     37On Console/Services
     38{{{
     39console:~# route add -host 10.1.200.1 gw 128.238.34.248 dev eth1
     40console:~# route
     41Kernel IP routing table
     42Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
     43cmc.grid.poly.e cmc.local       255.255.255.255 UGH   0      0        0 eth1
     44localnet        *               255.255.255.0   U     0      0        0 eth1
     4510.10.0.0       *               255.255.0.0     U     0      0        0 eth0
     46default         128.238.34.1    0.0.0.0         UG    0      0        0 eth1
     47}}}
     48
     49On CMC
     50{{{
     51cmc:~# route add -host 10.10.0.10 gw 128.238.34.247 dev eth1
     52cmc:~# route
     53Kernel IP routing table
     54Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
     55console.grid.po console.local   255.255.255.255 UGH   0      0        0 eth1
     56128.238.34.0    *               255.255.255.0   U     0      0        0 eth1
     5710.1.0.0        *               255.255.0.0     U     0      0        0 eth0
     58default         128.238.34.1    0.0.0.0         UG    0      0        0 eth1
     59cmc:~# ping console
     60PING console.grid.poly.edu (10.10.0.10) 56(84) bytes of data.
     6164 bytes from console.grid.poly.edu (10.10.0.10): icmp_seq=1 ttl=64 time=0.128 ms
     6264 bytes from services.grid.poly.edu (10.10.0.10): icmp_seq=2 ttl=64 time=0.139 ms
     63}}}
     64
     65== Software and Services ==
     66=== Linux Installation ===
     67You need to install Linux on two or more machines. If you want to use only one machine, it must be equipped with two or more Ethernet interfaces. You can choose whatever Linux distribution you prefer. However, Debian is strongly recommended because some testbed-related software is distributed both as source code and as Deb packages.
     68The installation guide can be found [http://www.debian.org/releases/stable/installmanual here]. For more information about Debian, please visit http://www.debian.org
     69   
     70=== Configure Apt ===
     71If the Linux distribution you chose is Debian/Ubuntu, please add the following lines to the file '''/etc/apt/sources.list''' so that apt-get can find the Debian packages provided by [http://www.orbit-lab.org Orbit lab].
     72
     73{{{
     74deb http://apt.orbit-lab.org/orbit testing main
     75deb http://apt.orbit-lab.org/orbit unstable main
     76deb http://apt.orbit-lab.org/orbit stable main
     77}}}
     78
     79Each time sources.list is changed, please run the command "apt-get update" to resynchronize the package index files from their sources.
     80For more about sources.list and apt-get, please refer to "man sources.list" and "man apt-get".
     81
     82=== Configure host name ===
     83If Console/Service and CMC are on different machines, please edit the file '''/etc/hostname''' on each of them. The host name of Console/Service is "'''console'''", and that of CMC is "'''cmc'''".
     84If all three servers are on one machine, please set its host name to "'''console'''".
     85
     86=== Configure network interface ===
     87Since we place Console/Service and CMC on different machines, the interface configuration file /etc/network/interfaces should look like the following.[[BR]]
     88For Console/Service
     89{{{
     90# This file describes the network interfaces available on your system
     91# and how to activate them. For more information, see interfaces(5).
     92
     93# The loopback network interface
     94auto lo eth0 eth1
     95iface lo inet loopback
     96
     97# The primary network interface access to outside internet
     98allow-hotplug eth1
     99iface eth1 inet static
     100address 128.238.34.247
     101netmask 255.255.255.0
     102network 128.238.34.0
     103broadcast 128.238.34.255
     104gateway 128.238.34.1
     105#iface eth1 inet dhcp
     106## dns-* options are implemented by the resolvconf package, if installed
     107dns-nameservers 10.10.0.10 128.238.1.68
     108dns-search grid.poly.edu
     109dns-domain grid.poly.edu
     110
     111# The internal network interface for console and services(nodehandler4,
     112# dhcp, dns, gridservice2, OML, etc)
     113allow-hotplug eth0
     114iface eth0 inet static
     115address 10.10.0.10
     116netmask 255.255.0.0
     117network 10.10.0.0
     118broadcast 10.10.255.255
     119## dns-* options are implemented by the resolvconf package, if installed
     120dns-nameservers 10.10.0.10 128.238.1.68
     121dns-search grid.poly.edu
     122dns-domain grid.poly.edu
     123}}}
     124
     125For CMC
     126{{{
     127# This file describes the network interfaces available on your system
     128# and how to activate them. For more information, see interfaces(5).
     129
     130# The loopback network interface
     131auto lo eth0 eth1
     132iface lo inet loopback
     133
     134# The primary network interface access to outside internet
     135allow-hotplug eth1
     136iface eth1 inet static
     137address 128.238.34.248
     138netmask 255.255.255.0
     139network 128.238.34.0
     140broadcast 128.238.34.255
     141gateway 128.238.34.1
     142# dns-* options are implemented by the resolvconf package, if installed
     143dns-nameservers 10.10.0.10 128.238.1.68
     144dns-search grid.poly.edu
     145dns-domain grid.poly.edu
     146#iface eth1 inet dhcp
     147
     148# The internal network interface for CMC(gridservice).
     149allow-hotplug eth0
     150iface eth0 inet static
     151address 10.1.200.1
     152netmask 255.255.0.0
     153network 10.1.0.0
     154broadcast 10.1.255.255
     155## dns-* options are implemented by the resolvconf package, if installed
     156dns-nameservers 10.10.0.10 128.238.1.68
     157dns-search grid.poly.edu
     158dns-domain grid.poly.edu
     159}}}
     160
     161On each server, interface eth0 is for the testbed, and eth1 is for outside access. If all servers are on one machine, the eth0 configuration of CMC should be moved to /etc/network/interfaces on Console/Services.
     162
     163=== Name resolve ===
     164Because eth1 might be served by other DHCP/DNS servers from outside, which tend to change the file /etc/resolv.conf when the servers boot up, we need resolvconf to fix the settings in resolv.conf.
     165
     166First, install "resolvconf" with the command "''apt-get install resolvconf''". Then, run the command "dpkg-reconfigure resolvconf" and agree to symlink /etc/resolv.conf to /etc/resolvconf/run/resolv.conf. Finally, edit the file /etc/resolvconf/interface-order as follows.
     167
     168{{{
     169# interface-order(5)
     170eth*
     171}}}
     172
     173Some comments on the file /etc/network/interfaces:
     174
     175"'''dns-nameservers 10.10.0.10 128.238.1.68'''" tells resolvconf to add the nameservers 10.10.0.10 and 128.238.1.68 to resolv.conf. 10.10.0.10 is the address of the DNS server used for the testbed, and 128.238.1.68 is the address of the outside DNS server, which you can change to suit your situation. Please make sure 10.10.0.10 is the first address that appears in the list.
     176
     177"'''dns-search grid.poly.edu'''" and "'''dns-domain grid.poly.edu'''" tell resolvconf to add "search" and "domain" entries to resolv.conf. "'''grid'''" is taken as the name of the testbed, which will be used in later configuration.
     178
     179The final /etc/resolv.conf might look like the following after rebooting the machine.
     180{{{
     181# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
     182#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
     183nameserver 10.10.0.10
     184nameserver 128.238.1.68
     185search grid.poly.edu
     186}}}
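
Since the testbed DNS server must come first in resolv.conf, it can be handy to check the generated file automatically. The Ruby sketch below is one way to do so. The helper name `first_nameserver` is hypothetical, and the sample text is copied from the listing above rather than read from a live system.

```ruby
# Hypothetical helper: return the address on the first "nameserver" line
# of a resolv.conf-style text, or nil if there is none.
def first_nameserver(resolv_conf_text)
  resolv_conf_text.each_line do |line|
    fields = line.split
    return fields[1] if fields[0] == 'nameserver' && fields[1]
  end
  nil
end

sample = <<~CONF
  # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
  nameserver 10.10.0.10
  nameserver 128.238.1.68
  search grid.poly.edu
CONF

puts first_nameserver(sample)  # prints "10.10.0.10"
```

On a real server you would pass `File.read('/etc/resolv.conf')` instead of the sample text and compare the result with 10.10.0.10.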
     187
     188For more information about resolvconf, please refer to /usr/share/doc/resolvconf/README.gz and "man resolvconf".
     189
     190=== DHCP ===
     191'''Purpose''': This software runs a DHCP server that assigns IP addresses to clients on demand.
     192
     193'''Installation and Configuration:'''
     194 * Run the following command on '''Console/Services''', and select eth0 as the interface.
     195{{{
     196apt-get install dhcp3-server
     197}}}
     198 * Make sure that /etc/default/dhcp3-server has eth0 as the default interface that DHCP will serve, as follows.
     199{{{
     200# On what interfaces should the DHCP server (dhcpd) serve DHCP requests?
     201#       Separate multiple interfaces with spaces, e.g. "eth0 eth1".
     202INTERFACES="eth0"
     203}}}
     204
     205 * Edit /etc/dhcp3/dhcpd.conf. The configuration looks like the following.
     206{{{
     207# The ddns-updates-style parameter controls whether or not the server will
     208# attempt to do a DNS update when a lease is confirmed. We default to the
     209# behavior of the version 2 packages ('none', since DHCP v2 didn't
     210# have support for DDNS.)
     211ddns-update-style interim;
     212use-host-decl-names on;
     213
     214allow booting;
     215allow bootp;
     216
     217# option definitions common to all supported networks...
     218option domain-name "grid.poly.edu";
     219
     220default-lease-time 259200;
     221max-lease-time 259200;
     222
     223# If this DHCP server is the official DHCP server for the local
     224# network, the authoritative directive should be uncommented.
     225#authoritative;
     226
     227# Use this to send dhcp log messages to a different log file (you also
     228# have to hack syslog.conf to complete the redirection).
     229log-facility local7;
     230
     231subnet 10.10.0.0 netmask 255.255.0.0 {
     232   range 10.10.1.1 10.10.255.254;
     233   option domain-name "grid.poly.edu";
     234   ddns-updates off;
     235   ddns-domainname "grid.poly.edu";
     236   ddns-rev-domainname "in-addr.arpa";
     237   option domain-name-servers 10.10.0.10;
     238   next-server 10.10.0.10;
     239   
     240   host services {hardware ethernet 00:1B:2F:BE:EF:94; fixed-address services.grid.poly.edu;}
     241   host console {hardware ethernet 00:1B:2F:BE:DF:6E; fixed-address console.grid.poly.edu;}
     242
     243  filename "/tftpboot/pxelinux.bin";
     244
     245#node 10.10.x.y
     246
     247    group {
     248       host node1-1 {hardware ethernet 00:0F:EA:8C:AE:39; fixed-address node1-1.grid.poly.edu;}
     249       host node1-2 {hardware ethernet 00:03:2D:08:19:fe; fixed-address node1-2.grid.poly.edu;}
     250       host node1-3 {hardware ethernet 00:03:2D:07:67:CE; fixed-address node1-3.grid.poly.edu;}
     251   }
     252}
     253}}}
     254 '''Some comments on the dhcpd.conf'''[[BR]]
     255
     256 '''use-host-decl-names on''' means that DHCP hands out only host names, while DNS holds the actual name-to-IP mappings. As a result, IP-to-name updates need to be made only in DNS.[[br]]
     257 '''next-server''' specifies the host address of the server from which the initial boot file is to be loaded. In our case, it is the address of the TFTP server.[[BR]]
     258 The '''filename''' statement specifies the name of the initial boot file to be loaded by a client. [[BR]]
     259 '''domain-name-servers''' specifies a list of Domain Name System name servers available to the client.[[BR]]
     260 A node name of the form node'''x'''-'''y''' determines that its address must be 10.10.'''x'''.'''y'''. For example, the address of node2-3 is 10.10.2.3. The actual address mapping is done by DNS.
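
The naming convention above can be written down as a small Ruby sketch. The helper names `node_name_to_ip` and `ip_to_node_name` are hypothetical. The sketch only illustrates the node'''x'''-'''y''' to 10.10.'''x'''.'''y''' rule and is not part of the ORBIT software.

```ruby
# Hypothetical helper: map a name such as "node2-3" to its control address.
def node_name_to_ip(name)
  match = /\Anode(\d+)-(\d+)\z/.match(name)
  raise ArgumentError, "not a node name: #{name}" unless match

  x = match[1].to_i
  y = match[2].to_i
  # Each coordinate becomes one octet, so it must fit in 0..255.
  unless (0..255).cover?(x) && (0..255).cover?(y)
    raise ArgumentError, "coordinates out of range: #{name}"
  end

  "10.10.#{x}.#{y}"
end

# Hypothetical helper: the inverse mapping, from address back to node name.
def ip_to_node_name(ip)
  match = /\A10\.10\.(\d+)\.(\d+)\z/.match(ip)
  raise ArgumentError, "not a control address: #{ip}" unless match

  "node#{match[1].to_i}-#{match[2].to_i}"
end

puts node_name_to_ip('node2-3')    # prints "10.10.2.3"
puts ip_to_node_name('10.10.1.1')  # prints "node1-1"
```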
     261
     262'''To Run:''' /etc/init.d/dhcp3-server start  -- errors go to /var/log/daemon.log
     263
     264=== DNS ===
     265'''Purpose''': "services.grid.poly.edu" hosts the primary DNS service for the zone grid.poly.edu. The DNS server is the standard BIND9 software package.
     266
     267
     268'''Installation and Configuration:'''
     269 * ''apt-get install bind9''
     270 * You need to edit named.conf, named.conf.options and named.conf.local under /etc/bind as follows
     271   * named.conf
     272{{{
     273include "/etc/bind/named.conf.options";
     274include "/etc/bind/named.conf.local";
     275
     276controls {
     277      inet 127.0.0.1 port 953
     278      allow { 127.0.0.1; };
     279};
     280
     281// prime the server with knowledge of the root servers
     282zone "." {
     283        type hint;
     284        file "/etc/bind/db.root";
     285};
     286
     287// be authoritative for the localhost forward and reverse zones, and for
     288// broadcast zones as per RFC 1912
     289
     290zone "localhost" {
     291        type master;
     292        file "/etc/bind/db.local";
     293};
     294
     295zone "127.in-addr.arpa" {
     296        type master;
     297        file "/etc/bind/db.127";
     298};
     299
     300zone "0.in-addr.arpa" {
     301        type master;
     302        file "/etc/bind/db.0";
     303};
     304
     305zone "255.in-addr.arpa" {
     306        type master;
     307        file "/etc/bind/db.255";
     308};
     309}}}
     310   * named.conf.options
     311{{{
     312options {
     313        directory "/etc/bind";
     314        auth-nxdomain no;    # conform to RFC1035
     315        listen-on-v6 { any; };
     316};
     317}}}
     318   * named.conf.local
     319{{{
     320// Consider adding the 1918 zones here, if they are not used in your
     321// organization
     322//include "/etc/bind/zones.rfc1918";
     323zone "grid.poly.edu" IN {
     324       type master;
     325       file "/etc/bind/orbit.zone";
     326};
     327zone "in-addr.arpa" IN {
     328      type master;
     329      file "/etc/bind/zone.orbit";
     330};
     331}}}
     332 * You need to create orbit.zone and zone.orbit under /etc/bind as follows
     333   * orbit.zone: Forward lookup
     334{{{
     335$TTL 3600
     336@ IN  SOA services.grid.poly.edu.  root.services.grid.poly.edu. (
     337                          2008072501 ; serial
     338                          3600       ; refresh (1 hour)
     339                          600        ; retry (10 min)
     340                          10000      ; expire (about 2.8 hours)
     341                          3600    );
     342                                    ;
     343@        IN    NS        services.grid.poly.edu.
     344$ORIGIN grid.poly.edu.
     345$TTL 129600
     346windows      IN   A       10.10.1.8
     347rxwarp       IN   A       10.10.1.9
     348node1-1      IN   A       10.10.1.1
     349node1-2      IN   A       10.10.1.2
     350node1-3      IN   A       10.10.1.3
     351
     352console      IN   A      10.10.0.10
     353cmc          IN   A      10.1.200.1
     354
     355services     IN   CNAME  console.grid.poly.edu.
     356dhcp         IN   CNAME  console.grid.poly.edu.
     357frisbee      IN   CNAME  console.grid.poly.edu.
     358pxe          IN   CNAME  console.grid.poly.edu.
     359oml          IN   CNAME  console.grid.poly.edu.
     360repository   IN   CNAME  console.grid.poly.edu.
     361repository1  IN   CNAME  console.grid.poly.edu.
     362repository2  IN   CNAME  console.grid.poly.edu.
     363ntp          IN   CNAME  console.grid.poly.edu.
     364loghost      IN   CNAME  console.grid.poly.edu.
     365idb1         IN   CNAME  console.grid.poly.edu.
     366idb2         IN   CNAME  console.grid.poly.edu.
     367}}}
     368   * zone.orbit: Reverse lookup
     369{{{
     370$TTL 3600
     371@           IN SOA services.grid.poly.edu. root.services.grid.poly.edu. (
     372                                           2008021817    ; serial
     373                                           28800    ; refresh (8hours)
     374                                           900      ; retry (15 mins)
     375                                           604800   ; expire (1 week)
     376                                           86400    ; minimum (1 day)
     377                                    );
     378@              IN    NS              services.grid.poly.edu.
     379$ORIGIN in-addr.arpa.
     380$ORIGIN 10.in-addr.arpa.
     381$ORIGIN 0.10.10.in-addr.arpa.
     382$TTL 129600
     38310     IN  PTR    services.grid.poly.edu.
     38410     IN  PTR    console.grid.poly.edu.
     385
     386$ORIGIN 1.10.10.in-addr.arpa.
     3879     IN  PTR   rxwarp.grid.poly.edu.
     3888     IN  PTR   windows.grid.poly.edu.
     3891     IN  PTR   node1-1.grid.poly.edu.
     3902     IN  PTR   node1-2.grid.poly.edu.
     3913     IN  PTR   node1-3.grid.poly.edu.
     392
     393$ORIGIN 200.1.10.in-addr.arpa.
     3941     IN  PTR   cmc.grid.poly.edu.
     395}}}
     396 * Make sure there are dots at the end of the domain names. The owner and group of orbit.zone and zone.orbit should look like the following.
     397{{{
     398console:/etc/bind# ls -l orbit.zone zone.orbit
     399-rw-r--r-- 1 root bind 1217 2008-08-07 17:00 orbit.zone
     400-rw-r--r-- 1 root bind  928 2008-08-07 17:01 zone.orbit
     401}}}
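
Adding a node means adding one A record to orbit.zone and one PTR record to zone.orbit, and it is easy to forget the trailing dot. The Ruby sketch below generates both lines for a node at coordinates x, y. The helper names `forward_record` and `reverse_record` are hypothetical, and the column widths only approximate the hand-written files above.

```ruby
# Hypothetical helper: A record for orbit.zone
# (relative to $ORIGIN grid.poly.edu.).
def forward_record(x, y)
  format('%-12s IN   A       10.10.%d.%d', "node#{x}-#{y}", x, y)
end

# Hypothetical helper: PTR record for zone.orbit
# (relative to $ORIGIN x.10.10.in-addr.arpa.).
# The fully qualified target name must end with a dot.
def reverse_record(x, y, domain = 'grid.poly.edu')
  format('%-5d IN  PTR   node%d-%d.%s.', y, x, y, domain)
end

[[1, 1], [1, 2], [1, 3]].each do |x, y|
  puts forward_record(x, y)
  puts reverse_record(x, y)
end
```

Remember to increase the serial number in both zone files after pasting in new records, so that the change is picked up.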
     402
     403 '''To Run:'''
     404/etc/init.d/bind9 start  -- errors go to /var/log/daemon.log
     405
     406 * You may run the command "'''host'''" on Console/Services or CMC as follows to verify that bind works well.
     407{{{
     408console:~# host cmc
     409cmc.grid.poly.edu has address 10.1.200.1
     410console:~# host console
     411console.grid.poly.edu has address 10.10.0.10
     412console:~# host services
     413services.grid.poly.edu is an alias for console.grid.poly.edu.
     414console:~# host node1-1
     415node1-1.grid.poly.edu has address 10.10.1.1
     416console:~# host pxe
     417pxe.grid.poly.edu is an alias for console.grid.poly.edu.
     418console.grid.poly.edu has address 10.10.0.10
     419console:~# host frisbee
     420frisbee.grid.poly.edu is an alias for console.grid.poly.edu.
     421console.grid.poly.edu has address 10.10.0.10
     422}}}
     423
     424=== Apache Web Server ===
     425'''Purpose:'''
     426An Apache server is required to maintain the ORBIT local repository of Debian packages and to view the results of experiments.
     427
     428'''Installation:'''
     429 * ''apt-get install apache2'' [[BR]]
     430Note that no additional configuration is needed for Apache. Also, make sure that /var/www/cgi-bin points to /usr/lib/cgi-bin (or, if it does not exist, create a soft link using cd /var/www/; ln -s /usr/lib/cgi-bin cgi-bin). [[BR]]
     431
     432'''To run:''' /etc/init.d/apache2 start -- errors go to /var/log/daemon.log
     433
     434We also need to install libgd, which is used to view the results of experiments. The command below installs it.
     435{{{
     436apt-get install libgd-gd2-perl
     437}}}
     438
     439=== NTP ===
     440
     441'''Purpose:'''
     442 All the machines synchronize their time using the time server as the reference.
     443
     444'''Installation and Configuration:'''
     445 * ''apt-get install ntp''
     446 * You may add the NTP server "pool.ntp.org" to /etc/ntp.conf if there is no server setting in it.
     447
     448'''To run:'''
     449 * /etc/init.d/ntp start -- errors go to /var/log/daemon.log
     450
     451=== TFTP Server ===
     452'''Purpose:'''
     453TFTP is needed to serve the PXE images whenever you install an image onto a node (using Frisbee). It is also used to load a memory-based image that can be used to fetch the current image of a node into the repository.
     454
     455'''Installation and Configuration:'''
     456 * ''apt-get install atftpd''
     457
     458There are two options here: run atftpd as a standalone daemon, or run it under inetd. For heavy-duty TFTP service, you can choose to run it as a standalone daemon.
     459For our installation, we chose the standalone daemon.
     460
     461 * Edit the file /etc/inetd.conf and point the tftp directory to /tftpboot. The configuration may look like the following.
     462{{{
     463#:BOOT: TFTP service is provided primarily for booting.  Most sites
     464#       run this only on machines acting as "boot servers."
     465tftp            dgram   udp     wait    root /usr/sbin/atftpd   /usr/sbin/in.tftpd /tftpboot
     466}}}
     467 * The PXE image can be downloaded from [http://witestlab.poly.edu/attachment/wiki/BOOTB/tftpboot.tar.bz2?format=raw here]. Extract it with the command[[BR]]tar -xjvf tftpboot.tar.bz2[[BR]]
     468The final content of the directory /tftpboot looks like the following.
     469{{{
     470console:~# ls /tftpboot -R
     471/tftpboot:
     472initramfs-orbit-pxe-2.0.3.gz  linux-orbit-pxe-2.6.25.1  pxelinux.bin  pxelinux.cfg
     473
     474/tftpboot/pxelinux.cfg:
     475default  orbit-2.0.3-omf
     476}}}
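
After extracting the archive, a quick check that nothing is missing can save a failed PXE boot later. The Ruby sketch below compares a directory against the file list shown above. The helper names `expected_tftp_files` and `missing_tftp_files` are hypothetical, and the file names are taken from the listing, so adjust them if your archive contains different versions.

```ruby
# File names copied from the directory listing above (an assumption:
# other releases of the archive may ship different versions).
def expected_tftp_files
  [
    'initramfs-orbit-pxe-2.0.3.gz',
    'linux-orbit-pxe-2.6.25.1',
    'pxelinux.bin',
    File.join('pxelinux.cfg', 'default'),
    File.join('pxelinux.cfg', 'orbit-2.0.3-omf')
  ]
end

# Hypothetical helper: return the expected files that do not exist under root.
def missing_tftp_files(root, expected = expected_tftp_files)
  expected.reject { |relative| File.exist?(File.join(root, relative)) }
end

missing = missing_tftp_files('/tftpboot')
if missing.empty?
  puts 'All expected PXE files are present.'
else
  puts "Missing under /tftpboot: #{missing.join(', ')}"
end
```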
     477
     478In case of problems, make sure that the lo interface is up.
     479
     480=== NFS Service ===
     481
     482'''Purpose:'''
     483This service is used to remotely mount directories on the nodes while fetching their images with the imagezip utility. The Frisbee service also uses this directory to install images onto nodes.
     484
     485'''Installation and Configuration:'''
     486
     487 * ''apt-get install nfs-kernel-server''
     488
     489 * Create a path such as "/export/orbit/image/tmp"
     490
     491 * Add a line to the /etc/exports file as follows. "/export/orbit/image/tmp" is the default path the nodes use to save frisbee images.
     492
     493{{{
     494/export/orbit/image/tmp     10.10.0.0/16(rw,sync,no_root_squash)
     495}}}
     496
     497'''To run:'''
     498 * /etc/init.d/nfs-kernel-server start -- errors go to /var/log/daemon.log
     499
     500=== Mysql Server ===
     501
     502'''Purpose:'''
     503This service is used to store the results of the experiments conducted on ORBIT.
     504
     505''' Installation and Configuration:'''
     506 * ''apt-get install mysql-server-4.1''
     507 * Edit /etc/mysql/my.cnf and change bind-address from 127.0.0.1 to 10.10.0.10
     508 * Type mysql and, at the prompt, enter the following
     509
     510 * Replace the password with an appropriate one. Basically, this creates a new account called orbit with the specified password and allows access to the mysql databases from localhost and any other machine on the network.
     511
     512'''To run:'''
     513 * /etc/init.d/mysql start -- errors go to /var/log/mysql.log
     514
     515=== ORBIT Server ===
     516ORBIT-specific services include nodehandler, nodeagent, frisbee, gridservices, gridservices2 and OML (Orbit Measurement Library). If you have added the configuration to sources.list as described [http://witestlab.poly.edu/wiki/BOOTB#ConfigureApt here], you can use the commands below to install them. All these services are installed on Console, except gridservices, which is installed on CMC.
     517
     518For Console
     519 * ''apt-get update''[[BR]]
     520
     521 * ''apt-get install otg''[[BR]]
     522
     523 * ''apt-get install nodehandler4''[[BR]]
     524   Since some files in the nodehandler4 Debian package are obsolete, you need to update it with a [http://witestlab.poly.edu/attachment/wiki/BOOTB/nodehandler.tar.gz?format=raw tar ball]. Extract it and replace the folder '''/opt/nodehandler4-4.2.0/'''.[[BR]]
     525   The configuration of nodehandler4 is based on the [http://www.yaml.org YAML] file /etc/nodehandler4/nodehandler.yaml. You can create a symbolic link named "'''nodehandler.yaml'''" that points to the actual configuration file. The example and comments below may help in understanding it.
     526{{{
     527  1 # NOTE: use only 'spaces' to indent !
     528  2 # ('tab' indents are not supported by the ruby yaml parser used to read this file)
     529  3 #
     530  4 # This is the Config file for the NodeHandler4 on the WINLAB platform
     531  5 #
     532  6 ---
     533  7 nodehandler:
     534  8   name_resolv: |
     535  9     name = nil
     536 10     if NodeHandler.JUST_PRINT
     537 11       name = 'debug'
     538 12     else
     539 13       # take first subdomain as grid name (sb0.orbit-lab.org)
     540 14       IO.popen('hostname -d') {|f| name = f.gets.split('.')[0] }
     541 15     end
     542 16
     543 17   testbed:
     544 18
     545 19     # Config Parameter for the "default" Testbed
     546 20     #
     547 21     # In the WINLAB setting the default testbed is the "grid" testbed, using gridservice2
     548 22     default:
     549 23
     550 24       repository:
     551 25         path: [".", "../repository", "/opt/nodehandler4-4.2.0/repository"]
     552 26
     553 27       pxe:
     554 28         # This is the URL where NH can find the PXE GridService
     555 29         url: 'http://pxe:5022/pxe'
     556 30
     557 31       cmc:
     558 32         # This is the URL where NH can find the CMC GridService
     559 33         url: "http://cmc:5012/cmc"  # Contact the CMC of GS 1 - Not ported yet for GS 2
     560 34
     561 35       oml:
     562 36         # NodeAgents will use the numerical IP address in 'local_host' to connect
     563 37         # to the machine running the NodeHandler, in order to retrieve the OML defs
     564 38         # (in XML, and generated by NH). These OML defs are used by the NAs' applications
     565 39         # Thus, 'local_host' = Control IP address (reachable by NAs) of the NH's machine
     566 40         local_host: '10.10.0.10'
     567 41         # The parameters below are the contact details for the OML GridService
     568 42         url: "http://oml:5022/oml"
     569 43         port: 5022
     570 44         host: "oml"
     571 45
     572 46       frisbee:
     573 47         # The parameters below are the contact details for the Frisbee GridService
     574 48         default_disk: '/dev/sda'
     575 49         url: 'http://frisbee:5022/frisbee'
     576 50
     577 51       inventory:
     578 52         # This is the URL where NH can find the inventory GridService
     579 53         url: 'http://cmc:5022/inventory'
     580 54
     581 55       # Command used to launch the communication module
     582 56       # The type of comm module to launch depends on the cmd line params
     583 57       # '-c PORT' runs a TCP comm. module that will connect to the node Agent's TCP server on PORT
     584 58       # '-l PORT' runs a TCP comm. Server module that will listen for node Agent's connection on PORT
     585 59       # default: runs a Multicast comm. module
     586 60       #
     587 61       # The following line runs the commServer in TCP Client Mode
     588 62       #commServer: /opt/nodehandler4-4.2.0/sbin/commServer --logfile /tmp/commServer-%ID%.log -d 4 --iface eth1 -c 9026
     589 63       # The following line runs the commServer in Multicast Mode
     590 64       commServer: /opt/nodehandler4-4.2.0/sbin/commServer --logfile /tmp/commServer-%ID%.log -d 4 --iface eth0
     591 65
     592 66       #
     593 67       # Return the IP address of the control interface of
     594 68       # the node at coordinates x:y
     595 69       #
     596 70       # @param x X coordinate of node
     597 71       # @param y Y coordinate of node
     598 72       #
     599 73       controlIp: |
     600 74         |x, y|
     601 75           # This is the Node Agents control IP address used in the WINLAB grid testbed
     602 76           "10.10.#{x}.#{y}"
     603 77
     604 78       #
     605 79       # Return the x:y coordinates of a node signing on with
     606 80       # 'idString'. This string is supposed to be of the type
     607 81       # '/ip/CONTROL_IP'.
     608 82       #
     609 83       # @param idString String of type '/ip/CONTROL_IP'
     610 84       # @return Array of [x, y]
     611 85       #
     612 86       nodeId2coord: |
     613 87         |idString|
     614 88           match = /.*\.(\d+)\.(\d+)$/.match(idString)
     615 89           if (match != nil && match.size == 3)
     616 90             x = match[1].to_i
     617 91             y = match[2].to_i
     618 92             if x > 100
     619 93               # sandbox
     620 94               x = y / 100
     621 95               y = y % 100
     622 96             end
     623 97             return [x, y]
     624 98           end
     625 99           raise "Can't parse #{idString}"
     626 100
     627 101      # Return the control IP address (as string) or DNS name for a node
     628 102      # at a given coordinate.
     629 103      #
     630 104      # Throws a ConfigException if no IP address can be found.
     631 105      # In this testbed, nodes are identified using only 1-dimensional coordinate: X
     632 106      # (At WinLab, nodes are identified using 2D coordinates)
     633 107      #
     634 108      coord2ip: |
     635 109        |x, y|
     636 110          # This is the base name used in the WINLAB testbeds
     637 111          name = "node#{x}-#{y}"
     638 112          begin
     639 113            Socket.gethostbyname(name)[3].unpack('C4').join('.')
     640 114          rescue SocketError
     641 115            raise("Unknown host '#{name}'")
     642 116          end
     643 117
     644 118      load: |
     645 119        | uri, evalRuby |
     646 120          path = [ uri.split(':').join('_') + '.rb']
     647 121          postfix = '/' + uri.split(':').join('/') + '.rb'
     648 122          REPOSITORY_DEFAULT().each { |dir|
     649 123            path << dir + postfix
     650 124          }
     651 125          #puts "PATH: #{path.join(':')}"
     652 126          file = path.inject(nil) { |found, p|
     653 127            if found == nil && File.readable?(p)
     654 128              found = p
     655 129            end
     656 130            found
     657 131          }
     658 132          if file == nil
     659 133            raise IOError, "Can't find any of '#{path.join(', ')}'"
     660 134          end
     661 135          str = File.new(file).read()
     662 136          if evalRuby
     663 137            #eval(str, nil, path, 1)
     664 138            require file
     665 139          end
     666 140          [str, 'text/ruby']
     667 141
     668 142    # Config Parameter for the "grid" Testbed
     669 143    #
     670 144    # To use this testbed, call nodeHandler with the option "-d grid"
     671 145    # Any parameter settings within this section will override the settings
     672 146    # done in the "default" section. The nodeHandler first loads the "default"
     673 147    # settings, then it uses the "domain" specific settings to override the
     674 148    # relevant parameters
     675 149    #
     676 150    # For more details: see comments in "default" domain section
     677 151    grid:
     678 152      X_MAX: 2
     679 153      Y_MAX: 20
     680 154      oml:
     681 155        local_host: '10.10.0.10'
     682 156        url: "http://oml:5022/oml"
     683 157        port: 5022
     684 158        host: "oml"
     685 159      controlIp: |
     686 160        |x, y|
     687 161          "10.10.#{x}.#{y}"
     688 162
     689 163    # Config Parameter for the "debug" Testbed
     690 164    #
     691 165    # To use this testbed, call nodeHandler with the option PRINT_ONLY or "-d debug"
     692 166    # Any parameter settings within this section will override the settings
     693 167    # done in the "default" section. The nodeHandler first loads the "default"
     694 168    # settings, then it uses the "domain" specific settings to override the
     695 169    # relevant parameters
     696 170    debug:
     697 171      repository:
     698 172        path: ['../repository']
     699 173      commServer: ../c/commServer/commServer -d 4 --iface eth0
     700 174      coord2ip: |
     701 175        |x, y|
     702 176          "10.99.#{x}.#{y}"
     703}}}
     704
     705 Comments:[[BR]]
     706 Line 29: PXE is one of the services provided by gridservices2, and 5022 is its default port. Make sure that http://pxe can be resolved from the '''Console'''.[[BR]]
     707 Line 33: CMC is the only service provided by gridservices, and 5012 is its default port. In my setup, the host name in http://cmc resolves to 10.1.200.1.[[BR]]
     708 Line 48: '/dev/sda' is the parameter frisbee needs in order to image the hard drives. If the hard drives in the nodes use an ATA interface, change it to "/dev/hda".[[BR]]
     709 Line 49: Like PXE, this is also a service provided by gridservices2.[[BR]]
     710 Line 64: This is the command used to launch the communication module of Nodehandler4. The value of the "--iface" parameter should be the interface that has the IP address 10.10.0.10.[[BR]]
     711 Line 151: "grid" is the name of the testbed, which must match the first component of the domain name of the '''Console'''. For example, if the output of the command "hostname -d" is "grid.poly.edu", then "grid" is the name of the testbed.[[BR]]
     712 Line 152,153: The maximum values of the two dimensions of the testbed. If the '''Control port''' of a node has the IP address 10.10.'''x'''.'''y''', the node's hostname should look like node'''x'''-'''y'''.grid.poly.edu, which is determined by DNS. '''x''' and '''y''' are integers less than X_MAX and Y_MAX respectively.[[BR]]
     713
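The addressing scheme described in the comments above can be sketched in plain Ruby. The methods below mirror the `controlIp` and `nodeId2coord` blocks of the configuration. The method names (`control_ip`, `node_id_to_coord`, `node_hostname`, `testbed_name`) and the default domain `grid.poly.edu` are assumptions made for this illustration only and are not part of nodehandler.

```ruby
# Minimal sketch of the coordinate/IP/hostname mapping used in this setup.
# All method names are hypothetical; only the formats come from the config.

# Control-port IP of the node at coordinates x:y (same format as controlIp).
def control_ip(x, y)
  "10.10.#{x}.#{y}"
end

# Inverse mapping, following the nodeId2coord block: takes a string of the
# form '/ip/CONTROL_IP' and returns [x, y].
def node_id_to_coord(id_string)
  match = /.*\.(\d+)\.(\d+)$/.match(id_string)
  raise "Can't parse #{id_string}" if match.nil?
  x = match[1].to_i
  y = match[2].to_i
  if x > 100
    # sandbox case, as in the original block
    x = y / 100
    y = y % 100
  end
  [x, y]
end

# Hostname expected in DNS for the node (domain is an assumed default).
def node_hostname(x, y, domain = "grid.poly.edu")
  "node#{x}-#{y}.#{domain}"
end

# Testbed name: first component of the Console's domain ("hostname -d").
def testbed_name(domain)
  domain.split(".").first
end

puts control_ip(2, 3)
puts node_hostname(2, 3)
puts testbed_name("grid.poly.edu")
```

With these definitions, `node_id_to_coord("/ip/10.10.2.3")` gives back the coordinates used to build the address, which is a quick way to check that the DNS names and the IP plan agree.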
     714 * The frisbee package is a dependency of gridservices2. It can be downloaded from [http://witestlab.poly.edu/attachment/wiki/BOOTB/frisbee_1.0.3-1_i386.deb?format=raw here] and installed with the command[[BR]]
     715   ''dpkg -i  frisbee_1.0.3-1_i386.deb''[[BR]]
     716   You can also get the latest frisbee source code from [http://www.emulab.net/downloads/emulab-080630.tar.gz here] if you prefer to build it on your own system. You need at least two executables, '''frisbeed''' and '''frisbee'''. If you want to make frisbee images, you must also build '''imagezip''' from the source code, because it is not included in the frisbee Debian package. For more information about frisbee, please visit http://www.emulab.net/software.php3 or read the "README" that comes with the source code.[[BR]][[BR]]
     717 * ''apt-get install gridservices2 oml-collection-server''[[BR]]
     718 The configuration of gridservices2 is under /etc/gridservices2. Please read the README.txt in that directory first. Gridservices2 must enable at least two services, frisbee and PXE. The configuration of frisbee, '''frisbee.yaml''', looks like the following.
     719{{{
     720  1 # NOTE: use only 'spaces' to indent !
     721  2 # ('tab' indents are not supported by the ruby yaml parser used to read this file)
     722  3 #
     723  4 # This is the Config file for the Frisbee GridService on the NICTA platform
     724  5 #
     725  6 ---
     726  7 frisbee:
     727  8
     728  9   # Max. number of active daemons allowed
     729 10   maxDaemons: 10
     730 11
     731 12   testbed:
     732 13     default:
     733 14       # Directory images are stored
     734 15       imageDir: /home/node
     735 16       defaultImage: baseline0.4
     736 17
     737 18       # max bandwidth for frisbee server
     738 19       bandwidth: 50000000
     739 20
     740 21       # Multicast address to use for servicing images
     741 22       mcAddress: 224.0.0.2
     742 23       # Using ports starting at ...
     743 24       startPort: 7000
     744 25
     745 26       # Time out frisbee server if nobody requested it within TIMEOUT sec
     746 27       timeout: 3600
     747 28
     748 29       # Directory to find frisbee daemon
     749 30       frisbeeBin: /usr/sbin/frisbeed
     750 31
     751 32       # Local interface to bind to for frisbee traffic
     752 33       multicastIF: 10.10.0.10
     753 34
     754 35     indoor:
     755 36       imageDir: /home/node
     756 37       defaultImage: baseline0.4
     757 38       bandwidth: 50000000
     758
     759
     760 39       mcAddress: 224.0.0.2
     761 40       startPort: 7000
     762 41       timeout: 3600
     763
     764
     765 42       frisbeeBin: /usr/sbin/frisbeed
     766 43       multicastIF: 10.10.0.10
     767}}}
     768
     769 Comments:[[BR]]
     770 Line 15: The directory where the frisbee image files are stored.[[BR]]
     771 Line 16: The file name of the image without the "ndz" suffix. For example, if the file name is '''baseline0.4.ndz''', it should be set to '''baseline0.4'''. This file is the default image when no image is given explicitly in the omf command.[[BR]]
     772 Line 19: The maximum bandwidth, in bps, that frisbee can use to image hard drives.[[BR]]
     773 Line 30: The place where gridservices2 can find frisbeed.[[BR]]
     774
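To make the relation between `imageDir`, `defaultImage` and the image file on disk concrete, here is a small Ruby sketch. It parses a trimmed copy of the "default" entry shown above and derives the full path of the default image and the bandwidth in Mbit/s. The helper names `default_image_path` and `bandwidth_mbps` are hypothetical, and this is not how gridservices2 itself reads the file.

```ruby
require 'yaml'

# Trimmed copy of the "default" testbed entry from frisbee.yaml above.
FRISBEE_YAML = <<~CONFIG
  frisbee:
    testbed:
      default:
        imageDir: /home/node
        defaultImage: baseline0.4
        bandwidth: 50000000
CONFIG

# Full path of the default image: imageDir + defaultImage + ".ndz" suffix.
# (hypothetical helper, for illustration only)
def default_image_path(testbed_cfg)
  File.join(testbed_cfg['imageDir'], "#{testbed_cfg['defaultImage']}.ndz")
end

# Bandwidth limit converted from bps to Mbit/s (hypothetical helper).
def bandwidth_mbps(testbed_cfg)
  testbed_cfg['bandwidth'] / 1_000_000.0
end

cfg = YAML.safe_load(FRISBEE_YAML)['frisbee']['testbed']['default']
puts default_image_path(cfg)
puts bandwidth_mbps(cfg)
```

A check like this before starting gridservices2 helps confirm that the file named by `defaultImage` really exists under `imageDir` with the `.ndz` suffix.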
     775The configuration of PXE, '''pxe.yaml''', looks like below.
     776{{{
     777  1 pxe:
     778  2   # Name of PXE config file
     779  3   defImage: orbit-2.0.3-omf
     780  4
     781  5   # Directory pxe config files are stored
     782  6   cfgDir: /tftpboot/pxelinux.cfg
     783  7
     784  8   # Maximum age of PXE symbolic link [sec]
     785  9   linkLifetime: 900
     786 10   # linkLifetime: 5 # for testing only
     787 11
     788 12   # toIP: mapping from x@y to IP address
     789 13   # listAll: return array of x/y coordinates of all nodes in '[x,y]' form.
     790 14   #
     791 15   testbed:
     792 16     default:
     793 17       toIP: |
     794 18         |x,y|
     795 19           assertRange(x, 1..1, "unknown node #{x}@#{y}")
     796 20           assertRange(y, 1..20, "unknown node #{x}@#{y}")
     797 21           "10.10.#{x}.#{y}"
     798 22       listAll: |
     799 23         defGrid(1,4)
     800}}}
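The `toIP` block above rejects coordinates outside the configured ranges before it builds the address. The Ruby sketch below reproduces that behaviour in isolation. `assert_range` is a hypothetical stand-in for the `assertRange` helper of gridservices2, whose actual implementation (including the kind of error it raises) is not shown on this page and may differ.

```ruby
# Hypothetical stand-in for gridservices2's assertRange helper: raise an
# error with the given message if value is outside the range.
def assert_range(value, range, message)
  raise ArgumentError, message unless range.include?(value)
end

# Same logic as the toIP block in pxe.yaml: validate x and y, then
# return the control-port IP address of the node.
def to_ip(x, y)
  assert_range(x, 1..1, "unknown node #{x}@#{y}")
  assert_range(y, 1..20, "unknown node #{x}@#{y}")
  "10.10.#{x}.#{y}"
end

puts to_ip(1, 20)

begin
  to_ip(2, 1)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

If the testbed grows, the two ranges in `toIP` have to be widened to match, otherwise requests for the new nodes are rejected.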
     801
     802If all configuration is done, run the following command to start gridservices2
     803 * ''/etc/init.d/gridservices2 start''
     804
     805For CMC,
     806 * ''apt-get install libmysqlclient15''
     807 * ''apt-get install gridservices''[[BR]]
     808 Some gridservices files need to be upgraded. Get this [http://witestlab.poly.edu/attachment/wiki/BOOTB/gridservices.tar.gz?format=raw tar ball] and extract it to replace the folder /etc/gridservices.[[BR]]
     809 Gridservices provides ONLY the CMC service. Its configuration is defined in the file /etc/gridservices/cmc.yaml, which looks like the following.[[BR]]
     810
     811
     812
     813{{{
     814  1 primaryIF: "128.238.34"
     815  2 communicators:
     816  3   default: &comm_default
     817  4     ip: 10.1.200.1
     818  5     port: 9030
     819  6
     820  7 testbeds:
     821  8   grid:
     822  9     x_max: 2
     823 10     y_max: 20
     824 11     ip_block: lambda {|x,y| "10.1.#{x}.#{y}"}
     825 12     inactive_list: [ ]
     826 13     3vStatus: 0.016
     827 14     5vStatus: 0.032
     828 15     12vStatus: 0.064
     829}}}
     830
     831 Comments:[[BR]]
     832 Line 1: This line defines the network address of the interface that connects to the outside network. In my setup, CMC and Console are connected within the network 128.238.34.*.[[BR]]
     833 Line 4: This line defines the IP address of the interface that connects to the nodes.[[BR]]
     834 Line 9,10: The maximum values of the two dimensions of the testbed. The '''CM port''' of each node has the IP address 10.1.'''x'''.'''y''', which is in the same network as 10.1.200.1.[[BR]]
     835
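The `ip_block` entry in cmc.yaml is a Ruby lambda that maps node coordinates to the address of the '''CM port'''. As a rough sketch, the snippet below uses the same lambda to list the CM addresses of a testbed with `x_max: 2` and `y_max: 20`. The method name `cm_addresses` is hypothetical, and the assumption that coordinates run from 1 up to and including the maximum is ours, not something stated by the CMC configuration.

```ruby
# Same lambda as the ip_block entry in cmc.yaml.
IP_BLOCK = lambda { |x, y| "10.1.#{x}.#{y}" }

# List the CM-port addresses of all nodes (hypothetical helper).
# Assumption: coordinates run from 1 to x_max / y_max inclusive.
def cm_addresses(x_max, y_max, ip_block)
  (1..x_max).flat_map do |x|
    (1..y_max).map { |y| ip_block.call(x, y) }
  end
end

addresses = cm_addresses(2, 20, IP_BLOCK)
puts addresses.first
puts addresses.last
puts addresses.size
```

Such a list can be compared with the addresses actually assigned to the CM cards to spot nodes that should go into `inactive_list`.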
     836If all configuration is done, run the following command to start gridservices
     837 * ''/etc/init.d/gridservices start''
     838
     839If gridservices starts successfully, you can open the address below in a web browser. The web page lists the commands provided by CMC, and you can control CMC through this web interface.
     840 * ''http://cmc:5012/cmc''
     841
     842
     843== About the CM ==
     844The ORBIT Chassis Manager (CM) is a simple, reliable, platform-independent subsystem for managing and autonomously monitoring the status of each node in the Witestbed. Basically, it is a small PCI card installed in the node. Administrators can turn nodes on and off, reboot them remotely, and monitor their status through the CM.
     845Users can access the CM through a serial console or telnet. For example, if a node's name is node2-3, the IP address of its '''control port''' is 10.10.2.3 and that of its '''CM port''' is 10.1.2.3. We can telnet to the CM with the command
     846
     847 * ''telnet 10.1.2.3''
     848
     849We can also telnet to the node with either of the two commands below
     850
     851 * ''telnet 10.1.2.3 3025''
     852 * ''telnet 10.10.2.3''
     853
     854If you want to access the CM through a serial console, please set the baud rate to '''57600 bps'''.
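
The naming convention used in the telnet examples above (node'''x'''-'''y''' has control address 10.10.'''x'''.'''y''' and CM address 10.1.'''x'''.'''y''') can be turned into a small helper. The Ruby sketch below derives the three telnet commands from a node name. The method name `node_access` and the hash keys are hypothetical; the address formats and port 3025 are the ones given in this section.

```ruby
# Derive the telnet commands for a node from its name (hypothetical helper).
# Expects names of the form "node<x>-<y>", e.g. "node2-3".
def node_access(node_name)
  match = /\Anode(\d+)-(\d+)\z/.match(node_name)
  raise ArgumentError, "unexpected node name '#{node_name}'" if match.nil?
  x = match[1].to_i
  y = match[2].to_i
  {
    cm: "telnet 10.1.#{x}.#{y}",               # CM itself, via the CM port
    node_via_cm: "telnet 10.1.#{x}.#{y} 3025", # node, through the CM port
    node_direct: "telnet 10.10.#{x}.#{y}"      # node, via the control port
  }
end

node_access("node2-3").each_value { |command| puts command }
```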
     855
     856== Updating the node BIOS ==
     857
     858The new BIOS fixes some bugs and provides better support for network boot and power management. A detailed description can be found [wiki:Internal/Infrastructure/SetupTestbed/UpdateTheNodeBios here].