System Config:
{{{
Ubuntu Version: Ubuntu 10.04
Netfpga Version: 2.1.0
Tunneling Openflow Switch: 2.1.1
Openflow Version: 1.0.0
flowvisor Version: Alpha version 0.4
}}}

Note: Bitfiles come from a separate package and must be made to agree with the versions of the cpci tools and the nf2 kernel module.

Basic Usage:

Assuming the machine is freshly rebooted, the script
{{{
/opt/netfpga1.0.1/netfpga/projects/tunneling_openflow_switch/sw/of_start_winlab.sh
}}}
should be run as root.

Known issues:

 1. When running the of_start_winlab.sh script to wake up the netfpga into tunneling openflow mode, we get this error from the script:
    Virtex design compiled against active CPCI version
    {{{
    RTNETLINK answers: No such file or directory
    RTNETLINK answers: No such file or directory
    RTNETLINK answers: No such file or directory
    RTNETLINK answers: No such file or directory
    ARP_REPLY_ETH_ADDR_PORT_3_LO_REG=0x46324402
    }}}
    However, it still works.
 1. After a few minutes of operation the nf2 module starts dumping buffer messages into the kernel's ring buffer. The messages are:
    {{{
    [ 1161.690033] nf2: no available transmit/receive buffers
    [ 1161.690061] nf2: no available transmit/receive buffers
    [ 1161.690089] nf2: no available transmit/receive buffers
    }}}
    We don't know what these errors are, and they are suspected of eventually bringing down the tunnel (and thus requiring a reboot). That being said, the tunnel does come up in spite of them. - '''SOLVED 6/2/2010'''
 1. The tunnel still dies. When I tried to bring it back up, the interfaces took a little time to actually wake up, but it eventually got to where it needed to be.
 1. Packets only flow in one direction - '''SOLVED 6/2/2010'''
 1. We were seeing flows that persisted even after the controller died. There was, however, a good explanation for this. There are two types of timeouts: a hard timeout that ends a flow after some fixed time regardless of activity, and a soft timeout (expiration) that ends the flow if it has been idle for too long. In its default behavior, Nox/Snac does not set a hard timeout. So if enough packets arrive, the flow will stay alive forever, even in the absence of a controller. We should probably set the hard timeout to some known number.
 1. Not so much an issue as an FYI: when using tcpdump on the netfpga, we don't see all the traffic that is passing through. Tcpdump is basically only useful for catching OpenFlow messages as they come through the control port. Trying to read the traffic on one of the NF2CX ports will probably not show what you're looking for. The quote from Tatsuya (Stanford):
    {{{
    Tcpdump only captures software forwarding packets(i.e. packet-ins and packet-outs). So most of the packets handled by hardware are not shown.
    }}}
 1. TFTP Bug: When imaging a node, the entire imaging process dies during the tftp stage. '''Solved 8/19/2010'''

----

=== 6/2/2010 ===

We've made some progress in getting the tunnel to work. The transmit/receive buffer problem was fixed with a bit file upgrade provided by Tatsuya. The "packets only flow in one direction" problem was identified to be a controller problem. My original setup worked once I manually added flows to the ofprotocol module via the dpctl command:
{{{
On OF1 connect ports 2 and 3 to 4 with vlans 3,29 and 30.
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x0003,idle_timeout=-1,actions=output:2,3
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x001d,idle_timeout=-1,actions=output:2,3
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x001e,idle_timeout=-1,actions=output:2,3
dpctl add-flow unix:/var/run/test in_port=2,dl_vlan=0x0003,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=2,dl_vlan=0x001d,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=2,dl_vlan=0x001e,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=3,dl_vlan=0x0003,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=3,dl_vlan=0x001d,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=3,dl_vlan=0x001e,idle_timeout=-1,actions=output:4

On OF2 connect ports 1 to 4 with vlans 3,29 and 30.
dpctl add-flow unix:/var/run/test in_port=1,dl_vlan=0x0003,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=1,dl_vlan=0x001d,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=1,dl_vlan=0x001e,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x0003,idle_timeout=-1,actions=output:1
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x001d,idle_timeout=-1,actions=output:1
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x001e,idle_timeout=-1,actions=output:1
}}}

Note the difference between how the openflow protocol identifies the ports and how the Ethernet tools identify the ports. Ethernet numbering begins with 0; openflow numbering begins with 1.

Working with Kk (from Stanford) we identified the error in the controller (something about how it handles vlan identifiers). The fix was to use his bleeding edge version of nox.
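The manual flows listed above follow a regular pattern (one uplink port, a set of access ports, a set of vlans), so they can be generated instead of typed by hand. The following is only a sketch: the `gen_flows` helper is hypothetical and not part of the original setup, the socket path and flow syntax are copied from the commands above, and the script only prints the dpctl commands rather than running them.

```shell
#!/bin/sh
# Sketch: print (do not run) dpctl add-flow commands that patch one uplink
# port to one or more access ports, in both directions, for a list of vlans.
# gen_flows is a hypothetical helper; SOCK is the socket used in these notes.

SOCK=unix:/var/run/test

# usage: gen_flows UPLINK "ACCESS_PORTS" "VLANS"
#   ACCESS_PORTS: space-separated OpenFlow port numbers (these start at 1)
#   VLANS:        space-separated decimal vlan ids
gen_flows() {
    uplink=$1
    access=$2
    vlans=$3
    # "2 3" -> "2,3" for the output action list
    outs=$(echo $access | tr ' ' ',')
    # uplink -> all access ports
    for v in $vlans; do
        hex=$(printf '0x%04x' "$v")
        echo "dpctl add-flow $SOCK in_port=$uplink,dl_vlan=$hex,idle_timeout=-1,actions=output:$outs"
    done
    # each access port -> uplink
    for p in $access; do
        for v in $vlans; do
            hex=$(printf '0x%04x' "$v")
            echo "dpctl add-flow $SOCK in_port=$p,dl_vlan=$hex,idle_timeout=-1,actions=output:$uplink"
        done
    done
}

# OF1: ports 2 and 3 patched to port 4, vlans 3, 29 and 30
gen_flows 4 "2 3" "3 29 30"
# OF2: port 1 patched to port 4, same vlans
gen_flows 4 "1" "3 29 30"
```

This prints the same set of flows as the manual list, though not necessarily in the same order. Once the output has been checked by eye, it could be piped to a shell on the corresponding box to install the flows.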
The install steps for the bleeding edge nox go a little like so:
{{{
Build instructions:

First run:
apt-get install git-core build-essential

Then get the source and some required libraries:
git clone git://noxrepo.org/nox
sudo apt-get install autoconf automake g++ libtool python python-twisted swig libboost-dev libxerces-c2-dev libssl-dev make libboost-filesystem-dev libboost-test-dev python-dev
cd nox
git checkout -b destiny origin/destiny

Run a build:
./boot.sh
mkdir build0.8
cd build0.8
../configure
make

This will build the destiny branch (latest and greatest) in the build dir

run:
./nox_core -v -i ptcp:6633 routing
or
./nox_core -v -i ptcp:6633 routing lavi_switches lavi_swlinks &
for some debugging tools.
}}}

=== 8/5/2010 ===

We have a somewhat successful tunnel deployed between CS and winlab. The attached diagram contains all the noteworthy IPs. We're isolating the TFTP problem now. Our colleagues at Stanford suggest trying nox 0.6. James is going to build it on nox.orbit-lab.org. I've followed the steps from the previous build of nox. Let's see if I get stuck.

Apparently a few things have changed. The nox_core executable is now located in the src subdirectory of the build directory, and there is a nox.xml file that's required now. If we installed nox, that file would live in the /etc directory. However, you can change the path (in case you just want to run from the build directory) with the -c flag. I ran this command from /opt/kknox/build/src:
{{{
./nox_core -c ./etc/nox.xml -v -i ptcp:6633 routing
}}}

=== 8/19/2010 ===

Apparently the TFTP problem is due to MTU size. The main issue is that in order to make room for the TCP/IP header, the tunnel has to shift the frame by 40 bytes, and then unshift it on the other side. The default configuration of the tftp server is to negotiate the maximum size that fits an Ethernet frame, which is 1460. When the frame is passed through to the other side, the tftp client sees the frame as corrupt (missing 40 bytes) and then drops it.
Thus the process never gets past the first block, and the whole process times out. The temporary solution (read "hack") was to lower the maximum block size (effectively the mtu) in the tftp daemon so that the frame is underfilled and can pass through the tunnel "uncorrupted". To wit, we needed to change the repository1:/etc/default/tftpd-hpa file and add the option -B 1024. The current file looks like so:
{{{
ssugrim@repository1.:/etc/default$ more tftpd-hpa
# /etc/default/tftpd-hpa

TFTP_USERNAME="root"
TFTP_DIRECTORY="/tftpboot"
TFTP_ADDRESS="0.0.0.0:69"
TFTP_OPTIONS="-l -B 1024 -s"
}}}

=== 8/16/2011 ===

The operating systems on of1 and of2 were grossly out of date (9.04). The Jaunty archives have been moved, so these boxes were in need of an upgrade. I followed these steps to upgrade the systems:

 1. change sources.list to old-releases.ubuntu.com
 1. apt-get update
 1. apt-get dist-upgrade
   * reboot
 1. wget http://old-releases.ubuntu.com/releases/karmic/ubuntu-9.10-alternate-i386.iso
 1. mount -o loop ~/ubuntu-9.10-alternate-i386.iso /media/cdrom
 1. run /media/cdrom/cdromupgrade
   * say no to its request to use the network, because that will start the newer update manager, which will then proceed to fail
   * reboot
 1. Do an apt-get update. (Might take some time)
 1. Do a do-release-upgrade -d
   * reboot
 1. had to do an apt-get install -f (to fix some missing packages)
 1. apt-get install linux-headers-2.6.32-33-generic-pae
 1. ran apt-get autoremove for some package cleanup
   * reboot
 1. apt-get install build-essential ncurses-dev libnet1-dev libxml-simple-perl libio-interface-perl liblist-moreutils-perl liberror-perl libnet-rawip-perl sun-java6-jre sun-java6-jdk libpcap0.8-dev
 1. This process leaves the original grub intact (does not upgrade to grub2). Edit /boot/grub/menu.lst to add vmalloc=512m to the default boot flags.
   * reboot
 1. export NF_ROOT=/opt/netfpga2.1.1/netfpga
 1. source ${NF_ROOT}/bin/nf_profile
 1. make
 1. make install
   * reboot

After this process the tunnel should be startable with the old of_start_winlab.sh script in its usual location. tcpdump on the tunnel interfaces should show traffic moving between both sites.
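For the menu.lst edit in the upgrade steps above, a small helper can make the change repeatable across of1 and of2. This is only a sketch under assumptions not stated in these notes: it assumes the file uses Ubuntu's grub-legacy `# kopt=` line for default kernel options, and `add_kopt_flag` is a hypothetical name. The demo runs against a temporary sample file, not the real /boot/grub/menu.lst.

```shell
#!/bin/sh
# Sketch: append a kernel flag to the "# kopt=" line of a grub-legacy menu.lst.
# Assumption: default kernel options live on a line starting with "# kopt=".
# add_kopt_flag is a hypothetical helper, not an existing tool.

# usage: add_kopt_flag FILE FLAG
add_kopt_flag() {
    file=$1
    flag=$2
    # Nothing to do if the flag is already present on the kopt line.
    if grep -q "^# kopt=.*$flag" "$file"; then
        return 0
    fi
    # Keep a backup, then rewrite the kopt line with the flag appended.
    cp "$file" "$file.bak"
    sed "s|^# kopt=\(.*\)\$|# kopt=\1 $flag|" "$file.bak" > "$file"
}

# Demo on a throwaway sample file (contents are made up for illustration).
demo=$(mktemp)
printf '%s\n' '# kopt=root=/dev/sda1 ro' 'title Ubuntu' > "$demo"
add_kopt_flag "$demo" vmalloc=512m
grep '^# kopt=' "$demo"
rm -f "$demo" "$demo.bak"
```

On a real box the kernel stanzas would presumably still need to be regenerated (for example with update-grub) or edited by hand before the reboot, so check the resulting boot entry before relying on it.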