InfiniBand hardware support?

Hi Wyatt,

Have you given thought to providing us with controls that might manage how a hardware fingerprint is established? I ask because TurboFloat isn't happy with my client's somewhat uncommon hardware configuration, but I think it could be, with a small nudge.

We would like to use floating licenses across several Linux systems connected together as a computational cluster. Here's the issue: these systems do not have Ethernet interfaces. They are instead networked using InfiniBand host adapters, a type of high-performance interconnect. At runtime the TurboFloat library fails to generate a hardware fingerprint from this device because the active IP network in "/proc/net/dev" is named ib0 instead of eth0 or en0, etc.

Their network is configured with an emulation layer for IP traffic called IP-over-InfiniBand (IPoIB), so the systems should be able to communicate with the floating license server just fine. The emulated IP device also has a standard MAC address. I feel that if I could just provide TurboFloat with a fallback list of network devices to query, the whole process would probably succeed. Would something like that be possible?

Thanks

Tim

Hi,

We manage an InfiniBand-backed HPC cluster in Australia, and one of our users appears to be running into this problem when launching a piece of software that uses TurboFloat as its licensing engine. So I was wondering if there was any update on this?

Launching the software on the login nodes runs fine, since they have 10 GigE interfaces to connect to the outside world. However, attempting to launch the software on the compute nodes fails with something along the lines of "failed to acquire a license: 28".

A quick strace of the program shows that it's doing this immediately after parsing /proc/net/dev, probably as part of the hardware fingerprinting process. The only two entries in that file on our compute nodes are the loop-back interface "lo" and the IPoIB interface, "ib0".
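
To illustrate what the strace suggests, here is a minimal sketch that lists the interface names in /proc/net/dev-style text and applies a name-prefix filter. The prefixes ("eth", "en") are only a guess based on the names mentioned in this thread. The function names and sample text are hypothetical and are not taken from TurboFloat.

```python
def interface_names(proc_net_dev_text):
    """Return the interface names found in /proc/net/dev-formatted text."""
    names = []
    # The first two lines are column headers; each later line is "name: counters".
    for line in proc_net_dev_text.splitlines()[2:]:
        name, sep, _counters = line.partition(":")
        if sep:
            names.append(name.strip())
    return names


def filter_by_prefix(names, prefixes=("eth", "en")):
    """Hypothetical name-based filter; the real library's rules are unknown."""
    return [name for name in names if name.startswith(prefixes)]


# Illustrative sample resembling a compute node with only loopback and IPoIB.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
   ib0:    2000      20    0    0    0     0          0         0     2000      20    0    0    0     0       0          0
"""
```

With this sample, `interface_names(SAMPLE)` yields "lo" and "ib0", and the assumed prefix filter leaves nothing to fingerprint, which would be consistent with the failure described above.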

Just a word of warning about using the MAC address of an IPoIB interface: it is not a true unique hardware identifier -- each time you bring up the interface, the "MAC address" may change.
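
One plausible reason for this, not confirmed anywhere in this thread: RFC 4391 defines the 20-byte IPoIB link-layer address as 1 reserved byte, a 3-byte queue pair number (QPN), and a 16-byte GID, and the QPN is not guaranteed to stay the same across interface restarts. The sketch below splits an address into those parts. The sample addresses are made up, and treating the trailing 8 bytes as a port GUID assumes the default GID (subnet prefix followed by port GUID).

```python
def split_ipoib_address(text):
    """Split a colon-separated 20-byte IPoIB address into its RFC 4391 fields."""
    raw = bytes.fromhex(text.strip().replace(":", ""))
    if len(raw) != 20:
        raise ValueError(f"expected 20 bytes, got {len(raw)}")
    return {
        "reserved": raw[0],                      # reserved/flags byte
        "qpn": int.from_bytes(raw[1:4], "big"),  # queue pair number, may change
        "subnet_prefix": raw[4:12].hex(),        # upper half of the GID
        "port_guid": raw[12:20].hex(),           # lower half (assumes default GID)
    }


# Made-up addresses that differ only in the QPN field.
BEFORE = "80:00:02:08:fe:80:00:00:00:00:00:00:00:02:c9:03:00:1a:2b:3c"
AFTER = "80:00:02:0a:fe:80:00:00:00:00:00:00:00:02:c9:03:00:1a:2b:3c"
```

Under these assumptions the two sample addresses compare unequal as a whole, yet their `port_guid` fields match, so a fingerprint built on the full address would be less stable than one built on the GUID portion.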

Thanks,
Ben

We don't have any InfiniBand devices to test with. We'll see if we can get our hands on some.

Hi,

Wyatt wrote:
> We don't have any InfiniBand devices to test with. We'll see if we can get
> our hands on some.

Maybe a quicker/easier test is to rename an Ethernet interface to "ib0" and see if it then fails? The name of an interface should be purely cosmetic. The only difference then is that an IPoIB MAC address is 20 bytes whereas an Ethernet MAC address is only 6 bytes.

Some of our compute nodes have a USB-attached Ethernet interface that connects to their BMCs; it gets auto-assigned the name "usb0". TurboFloat is unhappy with this name, but if I rename that interface to "eth0" then it sees it and happily runs the program.

So, if I had to guess, I'd say that you're filtering based on the interface name, even though the name is purely cosmetic and configurable. If so, you'll likely also have problems on newer systems that use predictable network interface names rather than the traditional "eth0" style, since the auto-assigned name is then based on where the device is attached (e.g. USB, PCIe, ...).

There's no reason to filter interfaces based on their name. If you're trying to exclude the loopback interface, use ioctl(SIOCGIFFLAGS) and test for IFF_LOOPBACK.
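
As a rough sketch of that suggestion (Linux only), the following queries each interface's flags with ioctl(SIOCGIFFLAGS) and drops those with IFF_LOOPBACK set. The numeric constants are the Linux values, and the helper names are hypothetical. This only excludes loopback devices; it makes no attempt to tell physical adapters from virtual ones.

```python
import fcntl
import socket
import struct

SIOCGIFFLAGS = 0x8913  # Linux ioctl number from <linux/sockios.h>
IFF_LOOPBACK = 0x8     # flag bit from <linux/if.h>


def flags_from_ifreq(ifreq):
    """Extract ifr_flags, a short stored right after the 16-byte name field."""
    return struct.unpack_from("H", ifreq, 16)[0]


def get_flags(name):
    """Ask the kernel for an interface's flags via ioctl(SIOCGIFFLAGS)."""
    # Keep at most 15 name bytes so the 16-byte field stays NUL-terminated.
    request = struct.pack("256s", name.encode()[:15])
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        reply = fcntl.ioctl(sock.fileno(), SIOCGIFFLAGS, request)
    return flags_from_ifreq(reply)


def non_loopback_interfaces():
    """Return every interface name whose flags lack IFF_LOOPBACK."""
    return [name for _index, name in socket.if_nameindex()
            if not get_flags(name) & IFF_LOOPBACK]
```

On a compute node like the ones described here, `non_loopback_interfaces()` would be expected to return the IPoIB interface regardless of what it is called, because the decision never looks at the name.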

In the meantime, is there a way I can tell it to not use the network adaptors when generating the hardware fingerprint, so that we can get the software running? Renaming the interface isn't an option, given the number of machines we have (and I don't know if it would break other things, given that IPoIB is fairly "special").

Cheers,
Ben

>> "The only difference then is that an IPoIB MAC address is 20 bytes whereas an Ethernet MAC address is only 6 bytes."

That's a big difference -- hence the need for real hardware to test on. Simply re-naming adapters and hoping for the best isn't a realistic solution (and will produce code that does not work).

>> "There's no reason to filter interfaces based on their name. If you're trying to exclude the loopback interface, use ioctl(SIOCGIFFLAGS) and test for IFF_LOOPBACK."

In practice it's more complicated than that. No OS provides a good way to get all "real" adapters and ignore the fake ones. Our algorithms have been developed over more than a decade and used across millions of devices.

>> "In the mean time, is there a way I can tell it to not use the network adaptors when generating the hardware fingerprint, so that we can get the software running?"

No. If you have some InfiniBand hardware we can test on we can develop a solution faster.

Hi,

Wyatt wrote:
> That's a big difference -- hence the need for real hardware to test on. Simply
> re-naming adapters and hoping for the best isn't a realistic solution (and will
> produce code that does not work).

For what it's worth, I tried the opposite and renamed the IPoIB interface to "eth0". It was quite happy with that; the only issue I saw was that it used ioctl(SIOCGIFHWADDR) to get the MAC address, and so only got the first 6 bytes of it.
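
For context, ioctl(SIOCGIFHWADDR) returns the address in a struct sockaddr, whose sa_data field is only 14 bytes, so it cannot carry a full 20-byte IPoIB address in any case. On Linux the full address is exposed as text in /sys/class/net/<interface>/address. The sketch below reads and parses that file. The helper names and the sample address are hypothetical, and this is one possible alternative rather than anything TurboFloat is known to do.

```python
from pathlib import Path


def parse_hw_address(text):
    """Convert a colon-separated hex address into raw bytes of any length."""
    return bytes.fromhex(text.strip().replace(":", ""))


def read_hw_address(name, sysfs_root="/sys/class/net"):
    """Read an interface's full hardware address from sysfs (Linux only)."""
    return parse_hw_address(Path(sysfs_root, name, "address").read_text())


def first_six_bytes(address):
    """Mimic the truncation described above, for comparison only."""
    return address[:6]


# Made-up 20-byte IPoIB-style address, in the format sysfs uses.
SAMPLE_IPOIB = "80:00:02:08:fe:80:00:00:00:00:00:00:00:02:c9:03:00:1a:2b:3c\n"
```

Parsing `SAMPLE_IPOIB` gives 20 bytes, while `first_six_bytes` keeps only the leading flag, QPN, and subnet-prefix bytes. In this sample none of the bytes that distinguish one adapter from another survive the truncation.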

> In practice it's more complicated than that. No OS provides a good way to get all
> "real" adapters and ignore the fake ones. Our algorithms have been
> developed over more than a decade and used across millions of devices.

Sure, but then you need to explicitly whitelist each type of adaptor. There are a couple of other common HPC interconnects on the market, including Cray's Aries and Intel's Omni-Path.
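
If a whitelist is the approach, one option is to key it on the link-layer type the kernel reports in /sys/class/net/<interface>/type rather than on the interface name. A minimal sketch follows, using the ARPHRD_* values from <linux/if_arp.h>. The allowed set and the function names are assumptions for illustration, and I have not checked which type values Aries or Omni-Path devices report.

```python
from pathlib import Path

# Link-layer type codes from <linux/if_arp.h>.
ARPHRD_ETHER = 1
ARPHRD_INFINIBAND = 32
ARPHRD_LOOPBACK = 772

# Hypothetical whitelist: extend it as other interconnects are verified.
ALLOWED_TYPES = frozenset({ARPHRD_ETHER, ARPHRD_INFINIBAND})


def read_link_type(name, sysfs_root="/sys/class/net"):
    """Read an interface's ARPHRD type code from sysfs (Linux only)."""
    return int(Path(sysfs_root, name, "type").read_text())


def fingerprint_candidates(link_types, allowed=ALLOWED_TYPES):
    """Given {name: type code}, return the names whose type is whitelisted."""
    return sorted(name for name, code in link_types.items() if code in allowed)
```

For example, a mapping of "lo" to 772, "ib0" to 32, and "usb0" to 1 would keep "ib0" and "usb0" and drop "lo", independent of how the interfaces happen to be named.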

> If you have some InfiniBand hardware we can test on we can develop a solution
> faster.

If you don't need root access, I can give you access to our main cluster (https://nci.org.au/systems-services/peak-system/raijin/). If that will help, send me an e-mail and I'll set up a project your devs can apply for access under.

Cheers,
Ben