
June 8, 2012


Doing battle with a Dell R620 and Ubuntu

We recently got sent a Dell R620 to evaluate and while its technical specification is amazing there are a few things that need to be handled first.

As far as Ubuntu and the Dell R620 go, Precise (12.04) is the only way to go. Every release before Precise has issues with this hardware in one way or another. This is new hardware after all.

For our “use case” we downgraded the PERC H710P controller to an H310 controller so we can have direct access to the drives via pass-through. The H310 allows TRIM support for SSDs and SMART data via smartctl to be used without any problems. If you are interested in SMART information and the PERC H700 series RAID controllers, I posted about possible workarounds at Dell’s customer support site.


Let’s begin:
USB Booting: try as we might, we could not get any USB stick to boot on the R620. We went through the iDRAC virtual drives and looked at BIOS/UEFI methods. The USB stick is recognized, but the R620 just shows us a blank screen. The same stick works in the R610, a VM and other machines. We have a ticket open with Dell support and they have yet to resolve the problem. PXE and CD/DVD booting are our only options at this point.

Intel® Ethernet Server Adapter I350-T4: The igb kernel module in 2.6.35 and 2.6.38 will detect this card and give you connectivity, but it behaves oddly. For example, 3 to 4 ports will have the same MAC address. You need to download, compile, and install the latest igb sources from Intel before you get full functionality out of your I350-T4. The other option is to install Ubuntu Precise (12.04), as the 3.2 kernel has the updated drivers from Intel.
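
A quick way to spot the duplicate-MAC symptom is to compare the addresses the kernel reports for each interface. The following is only a sketch: the function names are made up for illustration, and it assumes the usual Linux sysfs layout under /sys/class/net. Bonded or bridged interfaces legitimately share a MAC, so a hit is a hint rather than proof of the igb problem.

```shell
#!/usr/bin/env bash
# Print every MAC address that is reported by more than one interface.
# Input: lines of "<interface> <mac>" on stdin (hypothetical helper format).
find_duplicate_macs() {
    awk '{ mac = tolower($2); count[mac]++; ifaces[mac] = ifaces[mac] " " $1 }
         END { for (mac in count) if (count[mac] > 1) print mac ":" ifaces[mac] }' | sort
}

# Collect "<interface> <mac>" pairs from sysfs (assumes /sys/class/net exists).
list_macs() {
    for f in /sys/class/net/*/address; do
        [ -r "$f" ] || continue
        printf '%s %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}

# Usage on a live system:
#   list_macs | find_duplicate_macs
```

No output means every interface reported a distinct address.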

DRHD: handling fault status reg: at some point during booting of a freshly installed Ubuntu with the 2.6.35 kernel, we ran into this error that would effectively loop endlessly and cause the R620 to become unresponsive. We got this:

DRHD: handling fault status reg 502
INTR-REMAP: Request device[[42:00.1] fault index b0
INTR-REMAP:[[fault reason 34]] Present field in the IRTE entry is clear

and it would endlessly print that to the console. This apparently has something to do with the IO-MMU part of the kernel dealing with interrupt remapping. Whatever the problem was, it was fixed in the 2.6.38 kernel and caused no more problems.
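
If the machine stays up long enough to get a shell, counting how often the message appears in the kernel log shows whether you are hitting the same loop. This is a minimal sketch; the function name is hypothetical and it only matches the first line of the message quoted above.

```shell
#!/usr/bin/env bash
# Count interrupt-remapping fault reports in kernel log text read from stdin.
# grep -c exits non-zero when nothing matches, so "|| true" keeps the
# function usable in scripts that run with "set -e".
count_intr_remap_faults() {
    grep -c 'DRHD: handling fault status reg' || true
}

# Usage on a live system:
#   dmesg | count_intr_remap_faults
```

A count that keeps climbing between runs would point to the endless loop described here rather than a one-off fault.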

Dell SSD: the SSDs are rebranded Samsung drives which do not support TRIM but are at least over-provisioned. smartctl has a problem with these drives: SMART information is available, but the drive does not (yet) have an entry in the drivedb.h file. You have to use the latest smartctl version (5.42) to get anything useful out of the drive. Older versions give you things like this:

Log Sense failed, IE page [scsi response fails sanity test]
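
To check for yourself whether a drive advertises TRIM, you can look for the relevant line in the identify data that hdparm prints. A small sketch, assuming a working hdparm and the usual "TRIM supported" wording in its output; the function name is made up for illustration.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) if "hdparm -I" output on stdin advertises TRIM support.
advertises_trim() {
    grep -q 'TRIM supported'
}

# Usage on a live system (run as root):
#   hdparm -I /dev/sda | advertises_trim && echo "TRIM advertised"
```

The check only tells you what the drive reports; whether TRIM commands actually reach the drive still depends on the controller in between.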

hdparm: hdparm and other tools such as smartctl and lshw have trouble reading drive data through the PERC H310, even in pass-through mode. You need the latest version of each to even read the serial number off an HDD or SSD. hdparm versions >= 9.37 work; with older versions you get this:

root@node:~# hdparm -I /dev/sda

/dev/sda:
HDIO_DRIVE_CMD(identify) failed: Invalid exchange
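
Since several of these problems come down to tool versions, a small version comparison helps before rolling out a node. The sketch below relies on GNU sort -V; the function name is hypothetical, and the parsing in the usage comment assumes hdparm -V prints something like "hdparm v9.37".

```shell
#!/usr/bin/env bash
# True (exit 0) if version $1 is greater than or equal to version $2.
# Uses GNU sort -V, which orders dotted version strings numerically.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage on a live system (output format of "hdparm -V" is an assumption):
#   have="$(hdparm -V | awk '{ print $2 }' | tr -d v)"
#   version_ge "$have" 9.37 || echo "hdparm $have is too old for the H310"
```

The same function can be pointed at the smartctl version (5.42) or the kernel version mentioned above.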

Once we got all the little inconveniences out of the way, we got down to benchmarking and performance testing. Compared with the Dell R610’s 2x Xeon(R) E5606, the R620’s 2x Xeon(R) CPU E5-2643 has double the CPU performance in our testing. The obvious bottleneck is the 2x 2-port 10Gbps NICs: even at a theoretical maximum of 40Gbps, for our purposes, we would be network bound. Thankfully there is another PCI-Express slot available, just in case.

8 Comments
  1. Clement Kent
    Jul 6 2012

    Thanks for this article – I wasted much of yesterday trying to install 12.04 on an R620 from a USB stick. I switched to CD and got a base level install but ran into numerous problems. Some of them come from trying to do the install off-line – network security folks take a day or two to approve new NICs on the network, so I went offline. The boot sequence was NIC, HDD, then CD and it hung waiting for the network. I changed the BIOS boot sequence to disable the NIC temporarily and got the CD install to work for the base but not for a number of packages. Finally today I got on the network, reinstalled from CD but with network access, and got a clean install of all packages.

    My last issue seems similar to one you mentioned. We have 3 drives: two 300GB SAS drives and one 1TB SAS drive. None were visible to the Ubuntu installer at first. I used the BIOS to configure the two 300GB drives as a single RAID1 volume and Ubuntu is happy with that, but still won’t see the 1TB. Obviously I can’t put a single drive in its own RAID disk group; any thoughts on how to get Ubuntu to see the drive? Is this a controller issue like the ones you highlighted?

    Thanks…
    Clement

    • Jul 7 2012

      You touched on an interesting issue in regard to the disk controller. The default configuration comes with a PERC H710, which is a hardware RAID controller. The workaround of creating a 1 disk RAID 0 ‘works’ but it does not allow passthrough of certain commands (e.g. TRIM). The ‘real’ fix is to order a replacement card, the PERC H310, which allows passthrough so that Ubuntu will see the disks themselves and not just a ‘virtual disk’. It also supports RAID 0 and 1 if necessary. I hope that helps. Cheers!

  2. Aug 17 2012

    Update: We’ve managed to solve the USB mystery and it had to do with ‘floppy emulation’ and partition position. You can read more about that here:
    The case of the non-booting USB thumb-drive and the Dell R620

  3. Mark
    Dec 26 2012

    Thanks for the article! I spent an hour or so trying to get my standard USB installer to boot. If I hadn’t found your article I might have ended up wasting a whole lot more time.

  4. Tom M
    Jul 19 2013

    To anyone reading this article, please be aware that Dell have stated that the PERC H310 is not recommended for pass-through or non-RAID use.

    Please see:

    http://en.community.dell.com/support-forums/servers/f/906/p/19517032/20413894.aspx#20413894

    and

    http://en.community.dell.com/support-forums/servers/f/906/p/19480834/20249968.aspx

    • Jul 30 2013

      We’ve done exhaustive testing of the H710 and the H310 controllers. Our findings are that there are no performance problems with the H310 when using it for pass-through. We saw no reduced performance using a combination of HDDs and SSDs.

      Performance was actually similar between the two. For the H710, we set a disk as a raid0 vdisk.

      For example, we used a multi-threaded dd with direct access (by-passing caching):
      Write:
      H710: 2082 MB/s
      H310: 2004 MB/s
      for d in /dev/sd?; do (dd oflag=direct if=/dev/zero of=$d bs=1M count=10240 2>&1 | awk -F',' -v d="$d" '/ copied,/ { print d, $2, $3 }') & done; wait  # overwrites the listed devices

      Read:
      H710: 2919 MB/s
      H310: 2845 MB/s
      for d in /dev/sd?; do (dd iflag=direct if=$d of=/dev/null bs=1M count=10240 2>&1 | awk -F',' -v d="$d" '/ copied,/ { print d, $2, $3 }') & done; wait

      Individual read measurements with same type of drive.
      H710 – HDD: 164 MB/s
      H310 – HDD: 196 MB/s
      H710 – SSD: 517 MB/s
      H310 – SSD: 504 MB/s
      H710 was slower than H310 with HDD, but slightly faster with SSDs.

      Individual write measurements with same type of drive.
      H710 – SSD: 508 MB/s
      H310 – SSD: 484 MB/s
      H710 is slightly faster with SSD. Sadly we didn’t have a chance to test HDD write speeds.

      In conclusion, we found that since we need neither RAID nor the battery-backed cache, we could not justify the price of the H710 over the similar performance of the H310.

      We also tested for data loss by simulating database writes (writing records out and calling fsync on the device). Using direct IO (again bypassing the write cache), there was no data loss with either controller.

      Our contact with Dell still recommends H310 for our use-case even though we have been pushed by our clients to look at the H710. After our clients reviewed our data and performed their own testing, they also conceded that the H310 is acceptable for use.

      Update: We have done more testing than just dd; we only show dd here for brevity.
