
February 28, 2011


Infiniband on Ubuntu 10.10 Meerkat

My current project involves hundreds of mini-ITX Atom machines, and we are testing the performance difference between Infiniband and Intel Gigabit NICs.

In my testing the overhead of processing TCP is too high for a dual-core Atom. There is simply not enough processing power to handle the capabilities of the Intel NICs.

A possible solution is to replace TCP with SDP (Sockets Direct Protocol, which uses RDMA and zero-copy transfers) over Infiniband. Infiniband equipment has come down significantly in price (a dual-port 4xSDR card costs around $50), which makes it attractive for high-performance, cost-sensitive applications like mine.

In theory we can get 4xSDR speeds (8 Gigabit/s), but the tested result was 1.5 Gigabit/s because of the TCP processing over Infiniband. This is almost exactly the performance we achieved using the Intel NICs. We then replaced TCP with SDP over Infiniband. After that change we saw 4.2 Gigabit/s with one process. With two processes, one for each core of the Atom, we saw 7.8 Gigabit/s, which is close to the theoretical limit of the Infiniband NIC.
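
As a quick check on where the 8 Gigabit/s figure comes from: an SDR lane signals at 2.5 Gbit/s and uses 8b/10b encoding, so only 8 of every 10 bits on the wire carry data. The helper name below is made up for illustration.

```shell
# Usable data rate in Mbit/s for an SDR Infiniband link.
# Assumes 2.5 Gbit/s signalling per lane and 8b/10b encoding.
sdr_data_rate_mbit() {
    lanes=$1
    echo $(( lanes * 2500 * 8 / 10 ))
}

sdr_data_rate_mbit 4   # prints 8000, i.e. 8 Gbit/s for a 4x link
```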

It is a significant improvement over the Intel NICs. The limiting factor is the number of context switches and interrupts, as a single process would use 100% CPU. By running two processes we used both cores of the Atom and the full bandwidth of Infiniband.
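
A minimal sketch of the "one process per core" setup, assuming each iperf server is pinned to its own core with taskset and given its own port. The pinning, the port numbers and the helper name are my assumptions for illustration, not necessarily how the test above was run.

```shell
# Hypothetical helper: print one background iperf server command per core.
# $1 = number of cores to use, $2 = first TCP port (iperf's default is 5001).
sdp_server_cmds() {
    cores=$1
    base_port=$2
    i=0
    while [ "$i" -lt "$cores" ]; do
        # taskset -c pins the process to core $i; env sets LD_PRELOAD for iperf only
        echo "taskset -c $i env LD_PRELOAD=libsdp.so iperf -s -p $((base_port + i)) &"
        i=$((i + 1))
    done
}

# To actually start the servers on a dual-core machine:
#   eval "$(sdp_server_cmds 2 5001)"
```

Each client then connects to its own port with iperf -c and the matching -p value.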

Unfortunately Ubuntu does not ship SDP in its kernel yet and there is no way to compile just SDP. Our only option was “to throw the baby out with the bath water” by compiling from scratch and overwriting Ubuntu’s kernel modules.

Steps for a working Infiniband stack:

  1. Download the OpenFabrics Alliance (OFA) source package:
    http://www.openfabrics.org/downloads/OFED/ofed-1.5.3/OFED-1.5.3.1.tgz
  2. Extract it, look in the srpm directory for the kernel package, and unpack that package:
    rpm2cpio ofa_kernel-*.rpm | cpio -idmv
  3. Extract the resulting tarball and step into the directory:
    tar xf ofa_kernel-*.tgz; cd ofa_kernel*
  4. Configure what modules to compile:
    ./configure --with-sdp-mod --with-core-mod --with-ipoib-mod --with-ipoib-cm --with-iser-mod --with-mlx4_inf-mod --with-mlx4_en-mod --with-mlx4_core-mod --with-mlx4-mod --with-mthca-mod --with-addr_trans-mod --with-user_access-mod --with-user_mad-mod
  5. Compile and install modules:
    make && sudo make install
  6. Add these to your /etc/modules file:
    ib_mthca
    ib_ipoib
    ib_sdp
  7. Unload all running ib_* modules and then load them again, or reboot. This makes sure you are no longer running Ubuntu’s IB modules, which would cause symbol conflicts.
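
Step 7 can be sketched roughly as below. The module-name prefixes and both helper names are assumptions for illustration. The "updates" check relies on the INSTALL_MOD_DIR=updates setting that appears in the make install output quoted in the comments.

```shell
# Reduce `lsmod` output (on stdin) to Infiniband-related module names.
# The prefix list is an assumption; adjust it for your hardware.
ib_module_names() {
    awk 'NR > 1 && $1 ~ /^(ib_|mlx4_|rdma_|iw_)/ { print $1 }'
}

# Succeed if a module file path lies under an "updates" directory,
# which is where the OFED build installs its modules.
from_ofed_build() {
    case "$1" in
        */updates/*) return 0 ;;
        *) return 1 ;;
    esac
}

# On a live system (needs root and IB hardware, so left commented out):
#   lsmod | ib_module_names | xargs -r sudo modprobe -r
#   sudo modprobe -a ib_mthca ib_ipoib ib_sdp
#   from_ofed_build "$(modinfo -F filename ib_sdp)" && echo "OFED ib_sdp will be used"
```

If modprobe -r refuses because a module is still in use, a reboot is the simpler route.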

Normal TCP usage:
iperf -s

To allow seamless SDP usage:
LD_PRELOAD=libsdp.so iperf -s

Please note that the shared library overrides the normal creation of sockets, but if SDP cannot be negotiated, it falls back to TCP. That is why both ends need to LD_PRELOAD libsdp.so in order for SDP to be used.
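
Since both ends need the preload, a small wrapper keeps the two invocations consistent. The wrapper name and the address in the comments are hypothetical.

```shell
# Run any command with libsdp.so preloaded, so its sockets try SDP first.
with_sdp() {
    LD_PRELOAD=libsdp.so "$@"
}

# Server side:
#   with_sdp iperf -s
# Client side (192.168.10.1 stands in for the server's IPoIB address):
#   with_sdp iperf -c 192.168.10.1
```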

UPDATE: OFA changed their download directory structure and removed the stand-alone kernel source. You now have to download the whole package to get the kernel sources. The instructions above have been updated.

14 Comments
  1. bruce mckenzie
    Nov 23 2011

    Could you please elaborate on the commands? I'm not an Ubuntu
    user and I just can't follow your steps.

    I have extracted it, but it goes into 4 different folders, so I end up
    with 4 different versions of ofa_kernel-1.5.4-OFED.1.5.4.rc1.src.rpm

    I can't follow your commands, they don't make any sense to me. Can you
    please provide all the commands?

    • Nov 23 2011

      If you are brave, you can try to download the latest stable package of OFED-1.5.3.2

      I have no experience with rc (Release Candidate) builds like the 1.5.4 you are using because they are not considered ready for release.

      If you are not an Ubuntu user, then what OS do you use? If it is RedHat based, then you can just install the RPMs as they are. My instructions are for getting the RPMs to work in Debian/Ubuntu/Source.

      You can literally follow what I posted to get the end result of a working module, but it assumes you are familiar enough with Linux to build your own kernel modules.

  2. Nils
    Jan 18 2012

    Just for those trying and getting here via google, this doesn’t seem to work with oneiric (11.10).

    Tested this out with the most recent OFED-1.5.4 (now stable). It seems to only compile mlx4_core and mlx4_en, notably no mlx4_ib, no sdp and nothing else. Older versions don’t compile with 3.0 kernel.

    All in all, this is pretty annoying. There still is no sdp driver in the mainline kernel and Openfabric as well as Mellanox make it a point to make it extra hard for non redhat distros.

  3. Jan 19 2012

    I just did a test with OFED-1.5.4 on 3.0.0-15-generic (Oneiric) and it compiled and installed without any problems. It included the Mellanox drivers and SDP. Are you sure it is not something else?

    Here is an example:
    bcurtis@zwartevogel:~/OFED-1.5.4/temp/ofa_kernel-1.5.4$ sudo make install
    Installing kernel modules
    make -C /lib/modules/3.0.0-15-generic/build SUBDIRS="/home/bcurtis/OFED-1.5.4/temp/ofa_kernel-1.5.4" \
    INSTALL_MOD_PATH= \
    INSTALL_MOD_DIR=updates \
    modules_install;
    make[1]: Entering directory `/usr/src/linux-headers-3.0.0-15-generic'
    ...
    INSTALL /home/bcurtis/OFED-1.5.4/temp/ofa_kernel-1.5.4/drivers/infiniband/core/ib_core.ko
    ...
    INSTALL /home/bcurtis/OFED-1.5.4/temp/ofa_kernel-1.5.4/drivers/infiniband/ulp/sdp/ib_sdp.ko
    ...
    DEPMOD 3.0.0-15-generic
    make[1]: Leaving directory `/usr/src/linux-headers-3.0.0-15-generic'
    if [ ! -n "" ]; then /sbin/depmod 3.0.0-15-generic;fi;
    bcurtis@zwartevogel:~/OFED-1.5.4/temp/ofa_kernel-1.5.4$ uname -r
    3.0.0-15-generic

  4. mark
    Mar 28 2012

    These instructions worked for me on Ubuntu 3.0.0-15-server (Oneiric), thank you.
    Is there any chance of you extending the article to describe how to bring up an interface?
    Best regards, Mark.

    • Apr 2 2012

      It should be as simple as doing:
      ifconfig ib0 up
      This assumes you are also using the ipoib module, which tunnels IP over Infiniband. Otherwise you will not have an interface to bring up.

  5. Jan van der Lugt
    Apr 24 2012

    Thanks for sharing this, Bret. However, compiling ofa_kernel-1.5.4 fails on Ubuntu 12.04. Could this be because of version 3.2.0 of the Linux kernel? Do you think there is anything that can be done to get this working? Thanks!

    • Apr 24 2012

      I suspect you are correct, as I am also unable to compile either 1.5.4.1 or 1.5.3.2 on 12.04. I’m going to wager that it has something to do with the removal of the BKL (Big Kernel Lock) and various other subsystems that have changed.

    • Jan van der Lugt
      Apr 24 2012

      Thanks for your reply. Since OFED doesn’t support Debian/Ubuntu or even kernel versions >= 3.0.0, I suppose it can take quite a while before this will work. Best bet is to stay on Oneiric for the time being?

    • Apr 25 2012

      According to the release notes for 1.5.4, kernels from 2.6.30 to 3.0 are supported, which makes Precise and Infiniband mutually exclusive at the moment.

    • Nils
      Apr 26 2012

      Yes, looks like we’re screwed for the time being. Oneiric is still receiving upgrades for another year though. I’m thinking of just switching to 10 GigE, as SDP also doesn’t seem to work for me (with DRBD at least) anyways.

    • Jan van der Lugt
      Apr 26 2012

      I’m actually setting up a cluster for the company I work at. We are all most familiar with Ubuntu, so that would have been our first choice. Looks like it’s going to be CentOS now, at least we can get more than a year of support that way 🙂

    • Nils
      Apr 26 2012

      Well, Precise is going to get upgrades for a longer time, but that doesn’t help with the OFED problem. I saw that they are working on a 3.2 version. Of course I have the added problem of using Mellanox cards, and Mellanox rolls their own OFED stack… I don’t know if the mlx4_en in vanilla works.

    • Apr 26 2012

      We are an Ubuntu shop here and have moved away from Infiniband. We are gearing up to make Precise our target platform now, with its support for 10Gbps Ethernet. I’m in the middle of writing a post about available switches and some of the problems we had with some of the ‘brands’. We are using SFP+ copper, which turns out to use a lot less power and have lower latency than the 10GBaseT (cat6) standard. Be sure to be explicit about what you want, or your vendor will just guess instead of asking you what you meant. Good luck!
