wiki:Internal/OpenFlow/TunnelNotes

Version 10 (modified by ssugrim, 13 years ago)

System Config:

Ubuntu version: 9.04
NetFPGA version: 2.1.0
Tunneling OpenFlow switch: 2.1.1
OpenFlow version: 1.0.0
FlowVisor version: alpha 0.4

Note: The bitfiles come from a separate package and must match the versions of the cpci tools and the nf2 kernel module.

Basic Usage: Assuming the machine is freshly rebooted, run the following script as root:

/opt/netfpga1.0.1/netfpga/projects/tunneling_openflow_switch/sw/of_start_winlab.sh

Known issues:

  1. When running the of_start_winlab.sh script to wake the netfpga up in tunneling openflow mode, we get this error output from the script:
    Virtex design compiled against active CPCI version
    RTNETLINK answers: No such file or directory
    RTNETLINK answers: No such file or directory
    RTNETLINK answers: No such file or directory
    RTNETLINK answers: No such file or directory
    ARP_REPLY_ETH_ADDR_PORT_3_LO_REG=0x46324402
    
    However, it still works.
  2. After a few minutes of operation the nf2 module starts dumping buffer messages into the kernel's ring buffer. The messages are:
    [ 1161.690033] nf2: no available transmit/receive buffers
    [ 1161.690061] nf2: no available transmit/receive buffers
    [ 1161.690089] nf2: no available transmit/receive buffers
    
    We don't know what these errors are, and we suspect they eventually bring down the tunnel (thus requiring a reboot). That being said, the tunnel does come up in spite of them. - SOLVED 6/2/2010
  3. The tunnel still dies. When I tried to bring it back up, the interfaces took a little time to actually wake up, but it eventually got to where it needed to be.
  4. Packets only flow in one direction - SOLVED 6/2/2010
  5. We were seeing flows that persisted even after the controller died. There is, however, a good explanation for this. There are two types of timeouts: a hard timeout that ends a flow after a fixed time regardless of activity, and a soft timeout (idle expiration) that ends the flow if it has been idle for too long. In its default behavior, Nox/Snac does not set a hard timeout, so if enough packets keep arriving the flow will stay alive forever, even in the absence of a controller. We should probably set the hard timeout to some known number.
  6. Not so much an issue as an FYI: when using tcpdump on the netfpga, we don't see all the traffic that is passing through. Tcpdump is basically only useful for catching OpenFlow messages as they come through the control port. Trying to read the traffic on one of the NF2CX ports will probably not show what you're looking for. To quote Tatsuya (Stanford):
       Tcpdump only captures software forwarding packets (i.e. packet-ins and packet-outs). So most of the packets handled by hardware are not shown.
    
  7. TFTP bug: when imaging a node, the entire imaging process dies during the tftp stage. - SOLVED 8/19/2010
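
For issue 5, one way to experiment with a hard timeout is to add a flow by hand with both timeouts set. The sketch below only prints the dpctl command instead of running it. It assumes the dpctl on these boxes accepts a hard_timeout= field next to idle_timeout= (the OpenFlow 1.0 reference dpctl should, but that is worth verifying here). The function name and the 60/300 second values are made up for illustration.

```shell
# Print (do not run) a dpctl add-flow command with both timeouts set.
# Arguments: in_port, VLAN id (decimal), output port, idle seconds, hard seconds.
# Assumption: this dpctl build understands hard_timeout= as a flow field.
flow_with_timeouts() {
    printf 'dpctl add-flow unix:/var/run/test in_port=%d,dl_vlan=0x%04x,idle_timeout=%d,hard_timeout=%d,actions=output:%d\n' \
        "$1" "$2" "$4" "$5" "$3"
}

# Illustrative values: expire after 60 s idle, or after 300 s no matter what.
flow_with_timeouts 4 3 2 60 300
```

With a hard timeout in place, a flow like this should disappear after the fixed interval even if traffic keeps arriving and no controller is present to remove it.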

6/2/2010

We've made some progress in getting the tunnel to work. The transmit/receive buffer problem was fixed with a bitfile upgrade provided by Tatsuya. The problem of packets flowing in only one direction was identified as a controller problem.

My original setup worked once I manually added flows to the ofprotocol module via the dpctl command:

On OF1, connect ports 2 and 3 to port 4 with vlans 3, 29 and 30:
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x0003,idle_timeout=-1,actions=output:2,3
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x001d,idle_timeout=-1,actions=output:2,3
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x001e,idle_timeout=-1,actions=output:2,3
dpctl add-flow unix:/var/run/test in_port=2,dl_vlan=0x0003,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=2,dl_vlan=0x001d,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=2,dl_vlan=0x001e,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=3,dl_vlan=0x0003,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=3,dl_vlan=0x001d,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=3,dl_vlan=0x001e,idle_timeout=-1,actions=output:4

On OF2, connect port 1 to port 4 with vlans 3, 29 and 30:
dpctl add-flow unix:/var/run/test in_port=1,dl_vlan=0x0003,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=1,dl_vlan=0x001d,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=1,dl_vlan=0x001e,idle_timeout=-1,actions=output:4
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x0003,idle_timeout=-1,actions=output:1
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x001d,idle_timeout=-1,actions=output:1
dpctl add-flow unix:/var/run/test in_port=4,dl_vlan=0x001e,idle_timeout=-1,actions=output:1

Note the difference between how the openflow protocol identifies the ports and how the Ethernet tools identify the ports. Ethernet numbers begin with 0; openflow numbers begin with 1.
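
The hand-typed flow lists above follow a regular pattern, so they can be generated instead. This is a minimal sketch: the helper name gen_flows is hypothetical, VLAN IDs are passed in decimal and printed in the 0x%04x form used above, and the commands are only printed so they can be reviewed before being pasted or piped to a shell.

```shell
SOCK=unix:/var/run/test

# Print one add-flow command per VLAN.
# Arguments: in_port, comma-separated output ports, then VLAN ids in decimal.
gen_flows() {
    in_port=$1
    out_ports=$2
    shift 2
    for vlan in "$@"; do
        printf 'dpctl add-flow %s in_port=%s,dl_vlan=0x%04x,idle_timeout=-1,actions=output:%s\n' \
            "$SOCK" "$in_port" "$vlan" "$out_ports"
    done
}

# OF1: port 4 fans out to ports 2 and 3; ports 2 and 3 each return to port 4.
gen_flows 4 2,3 3 29 30
gen_flows 2 4 3 29 30
gen_flows 3 4 3 29 30
```

The port numbers here are openflow port numbers (starting at 1), not the Ethernet interface numbers.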

Working with Kk (from Stanford), we identified the error in the controller (something about how it handles vlan identifiers). The fix was to use his bleeding edge version of nox. The install steps go roughly like so:

Build instructions:

First run:

apt-get install git-core build-essential

Then get the source and some required libraries:

git clone git://noxrepo.org/nox
sudo apt-get install autoconf automake g++ libtool python python-twisted swig libboost-dev libxerces-c2-dev \
libssl-dev make libboost-filesystem-dev libboost-test-dev python-dev
cd nox
git checkout -b destiny origin/destiny

Run a build:
./boot.sh
mkdir build0.8
cd build0.8
../configure
make

This will build the destiny branch (latest and greatest).

In the build directory, run:

./nox_core -v -i ptcp:6633 routing

or

./nox_core -v -i ptcp:6633 routing lavi_switches lavi_swlinks &

for some debugging tools. 

8/5/2010

We have a somewhat successful tunnel deployed between CS and winlab. The attached diagram contains all the noteworthy IPs. We're isolating the TFTP problem now. Our colleagues at Stanford suggest trying nox 0.6. James is going to build it on nox.orbit-lab.org.

I've followed the steps from the previous build of nox. Let's see if I get stuck.

Apparently a few things have changed. The nox_core executable is now located in the src subdirectory of the build directory, and a nox.xml file is now required. If we had installed nox, that file would live in the /etc directory. However, you can change the path (in case you just want to run from the build directory) with the -c flag. I ran this command from /opt/kknox/build/src:

./nox_core -c ./etc/nox.xml -v -i ptcp:6633 routing

8/19/2010

Apparently the TFTP problem is due to MTU size. The main issue is that, in order to make room for the tcp/ip header, the tunnel has to shift the frame by 40 bytes and then unshift it on the other side. The default configuration of the tftp server is to negotiate the largest size that fits an Ethernet frame, which is 1460. When the frame is passed through to the other side, the tftp client sees the frame as corrupt (missing 40 bytes) and drops it. Thus the transfer never gets past the first block, and the whole process times out. The temporary solution (read "hack") was to lower the maximum block size in the tftp daemon so that the frame is underfilled and can pass through the tunnel "uncorrupted". To wit, we needed to change the repository1:/etc/default/tftpd-hpa file and add the option -B 1024.

The current file looks like so:

ssugrim@repository1.:/etc/default$ more tftpd-hpa
# /etc/default/tftpd-hpa

TFTP_USERNAME="root"
TFTP_DIRECTORY="/tftpboot"
TFTP_ADDRESS="0.0.0.0:69"
TFTP_OPTIONS="-l -B 1024 -s"
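
A rough size budget shows why -B 1024 gets through where 1460 does not. This is a back-of-the-envelope sketch, not a measurement: it assumes a standard 1500-byte Ethernet MTU, a 20-byte IPv4 header without options, an 8-byte UDP header and the 4-byte TFTP data header, plus the 40 bytes the tunnel adds. The function names are made up for illustration.

```shell
MTU=1500            # assumed standard Ethernet MTU
TUNNEL_OVERHEAD=40  # bytes added by the tunnel, per the notes above
HEADERS=32          # assumed: 20 (IPv4) + 8 (UDP) + 4 (TFTP data header)

# Size of one TFTP data packet after tunnel encapsulation.
tunneled_size() {
    echo $(( $1 + HEADERS + TUNNEL_OVERHEAD ))
}

# Report whether a given block size still fits in the MTU once tunneled.
fits() {
    if [ "$(tunneled_size "$1")" -le "$MTU" ]; then
        echo "blksize $1: fits"
    else
        echo "blksize $1: too big"
    fi
}

fits 1460
fits 1024
```

Under these assumptions a 1460-byte block becomes 1532 bytes once tunneled, which exceeds 1500, while a 1024-byte block becomes 1096 bytes and leaves plenty of headroom.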

8/16/2011

The operating systems on of1 and of2 were grossly out of date (9.04). The Jaunty archives have been moved, so these boxes were in need of an upgrade. I followed these steps to upgrade the systems:

  1. change sources.list to point at old-releases.ubuntu.com
  2. apt-get update
  3. apt-get dist-upgrade
    • reboot
  4. wget http://old-releases.ubuntu.com/releases/karmic/ubuntu-9.10-alternate-i386.iso
  5. mount -o loop ~/ubuntu-9.10-alternate-i386.iso /media/cdrom
  6. run /media/cdrom/cdromupgrade
    • reboot
  7. say no to its request to use the network, because that will start the newer update manager, which will then proceed to fail
  8. Do an apt-get update. (Might take some time)
  9. Do a do-release-upgrade -d
    • reboot
  10. had to do an apt-get install -f (to fix some missing packages)
  11. apt-get install linux-headers-2.6.32-33-generic-pae
  12. ran apt-get autoremove for some package cleanup
    • reboot
  13. apt-get install build-essential ncurses-dev libnet1-dev libxml-simple-perl libio-interface-perl liblist-moreutils-perl liberror-perl libnet-rawip-perl sun-java6-jre sun-java6-jdk libpcap0.8-dev
  14. This process leaves the original grub intact (does not upgrade to grub2). Edit /boot/grub/menu.lst to add vmalloc=512m to the default boot flags.
    • reboot
  15. export NF_ROOT=/opt/netfpga2.1.1/netfpga
  16. source ${NF_ROOT}/bin/nf_profile
  17. make
  18. make install
    • reboot

After this process the tunnel should be startable with the old of_start_winlab.sh script in its usual location. tcpdump on the tunnel interfaces should show traffic moving between both sites.
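
Step 1 of the upgrade (pointing sources.list at old-releases.ubuntu.com) can be done with a single sed substitution. This is a sketch under assumptions: the function name is hypothetical, and it assumes the existing entries use hosts ending in archive.ubuntu.com or security.ubuntu.com, which may not match every line on these boxes. It edits the file in place and keeps a .bak copy.

```shell
# Rewrite Ubuntu mirror hostnames in a sources.list-style file so they
# point at old-releases.ubuntu.com. The original is saved as <file>.bak.
use_old_releases() {
    sed -i.bak -E \
        's#https?://[a-z0-9.]*(archive|security)\.ubuntu\.com#http://old-releases.ubuntu.com#g' \
        "$1"
}

# Usage (as root), followed by the apt-get update from step 2:
#   use_old_releases /etc/apt/sources.list
```

Comparing the result against the .bak copy before running apt-get update is a cheap sanity check.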
