A fix for Intel i211 and i350 adapters not being detected by ESXi


Recently two readers of my blog asked me for help with a strange issue that they encountered when trying to install ESXi on their whitebox hardware. One system was a Shuttle DS57 barebone, the other a Compulab Fitlet-X (an interesting tiny fanless industrial PC with 4 onboard NICs). Both have Intel i211 Gigabit Ethernet adapters that were not detected by ESXi, although they are officially supported by the igb driver built into ESXi 6.0 (see VMware HCL entry).

Looking for a solution, I found that this seems to be a common issue. Multiple reports in the VMware Communities and other forums convinced me that resolving the issue would help a lot of people. Time for some late-night troubleshooting ...


The troubleshooting and fix story
(skip that part if you have the same issue and are only interested in the result)

Strangely enough, on the Shuttle system the shell command
lspci -v | grep "Class 0200" -B 1
which lists the PCI network devices detected by ESXi, showed that the i211 adapter was properly detected with the expected device ID 8086:1539 and should be available as vmnic1. However, the command
esxcli network nic list
listed only vmnic0, the other NIC (an i218LM), which was properly detected and working, but not the i211.

This means that the responsible driver (igb) was loaded for the i211, but failed to initialize the adapter. A look at the VMkernel boot messages provided more details. You can find these messages in the compressed file /var/log/boot.gz and browse them with a command like
gunzip -cd /var/log/boot.gz | less
Searching for igb brought up these interesting messages:
igb 0000:02:00.0: The NVM Checksum Is Not Valid
...
WARNING: vmklinux: pci_announce_device:1488: PCI: driver igb probe failed for device 0000:02:00.0
Apparently the driver was checking the contents of the card's NVM (Non-Volatile Memory) and was not happy with what it found there. The checksum did not match the expected value, and so the driver refused to initialize the adapter.
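The log search described above can be sketched as a small shell helper. This is only an illustration: the function name failed_probes is made up, and it assumes that the warning keeps the exact wording quoted above ("driver <name> probe failed for device <PCI address>").

```shell
# Hypothetical helper (not part of ESXi): reads VMkernel boot messages on
# standard input and prints the PCI address of every device for which the
# given driver's probe failed.
failed_probes() {
    driver="$1"
    # Assumption: the PCI address is the last field of the warning line.
    grep "driver ${driver} probe failed for device" | awk '{ print $NF }'
}

# On an ESXi host (log path as given in this post):
#   gunzip -cd /var/log/boot.gz | failed_probes igb
```

An address printed by this helper can then be matched against the lspci output to see which adapter the driver rejected.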

A quick Google search for i211 NVM checksum revealed that this seems to be a known problem with the Linux driver for the i211 adapter, and - since the ESXi driver is derived from its Linux counterpart and shares most of its code - this also affects ESXi. It was hard to find a real fix for the issue though, but Lukasz (one of the guys who asked me for help with this issue) pointed me to a blog post by someone who had the same issue with an i350 adapter and resolved it by just modifying the igb driver's source code to ignore the invalid NVM checksum (Funny side note: This post is the only one on his blog, which is good proof that starting a blog and immediately abandoning it again can still be useful ... Thanks Gnep!).

Can it be so easy? Yes and No. Yes, because modifying the C source code was really easy even for someone like me who is not really a programmer, but No, because (re-)compiling a driver for ESXi is quite a challenge. Luckily I had a backup of an old CentOS VM that already had the ESXi 5.0 Open Source Code build system installed and prepared on it, so I "only" had to
  • fetch the source code of a recent igb driver version for ESXi 5.x,
  • change the driver's main C source file igb_main.c to continue with device initialization even in case of a bad NVM checksum,
  • fiddle around with the build script until it would produce a valid driver binary (this was the hardest part),
  • package the result in a VIB file and send it to my testers (Easy, because I did that a zillion times already with my own ESXi Community Packaging Tools) ...
... and they gave me a thumbs-up: their i211 adapters were working properly with the modified driver!


How to get the modified igb driver

Well, of course it is available in the V-Front Online Depot. As with every other package available there, you can install it on an already running ESXi system with the following shell commands:
esxcli software acceptance set --level=CommunitySupported
esxcli network firewall ruleset set -e true -r httpClient
esxcli software vib install -n net-igb -d https://vibsdepot.v-front.de
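If you would rather not leave the outgoing HTTP rule enabled after the download, the three commands can be wrapped so that the rule is switched off again afterwards. This is a minimal sketch, not a tested procedure: the function name install_igb_from_depot and the ESXCLI variable are made up for illustration, and the only command added is the firewall command from above with "true" replaced by "false".

```shell
# ESXCLI is a hypothetical override, handy for a dry run (e.g. ESXCLI=echo).
ESXCLI="${ESXCLI:-esxcli}"

# Hypothetical wrapper around the three commands from this post.
install_igb_from_depot() {
    $ESXCLI software acceptance set --level=CommunitySupported &&
    $ESXCLI network firewall ruleset set -e true -r httpClient &&
    $ESXCLI software vib install -n net-igb -d https://vibsdepot.v-front.de
    status=$?
    # Switch the httpClient rule off again, whether or not the install worked.
    # Caveat: this also disables the rule if it was already enabled before.
    $ESXCLI network firewall ruleset set -e false -r httpClient
    return $status
}

# Dry run that only prints the commands:  ESXCLI=echo; install_igb_from_depot
# Real run on the ESXi shell:             install_igb_from_depot
```

The rule only allows outgoing traffic, so leaving it enabled is not a big risk either; the wrapper is purely a convenience.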
If the i211 adapter is the only network card in your system, you will not even be able to install ESXi on it without a custom installation ISO that already includes the modified driver. In this case, build one yourself with the ESXi-Customizer-PS script:
.\ESXi-Customizer-PS-v2.4.ps1 -v60 -vft -load net-igb
This will build an ESXi 6.0 installation ISO with the latest ESXi patch level and the modified igb driver pulled from the V-Front Online Depot. Please note that the driver is also compatible with ESXi 5.0, 5.1 and 5.5.



This post first appeared on the VMware Front Experience Blog and was written by Andreas Peetz. Follow him on Twitter to keep up to date with what he posts.




18 comments:

  1. Thanks a lot Andreas, you solved my problem with the Shuttle DS57U3!
    Great Job! Best regards Rolf Leutert

  2. I buy whitebox servers then put VMware on them because uptime is important to my business, but not important enough to spend the extra $5,000 to get hardware that has been tested.

  3. First of all, THANK YOU for figuring this out. As a Fitlet-X buyer, this is great news for me indeed. However, I ran the command in PowerShell (.\ESXi-Customizer-PS-v2.4.ps1 -v60 -vft -load net-igb), it produced an ISO that I copied to a USB flash drive with Rufus, and I still get the "no network adapters" error during ESXi installation. I'm sure I missed something somewhere--any idea where I might start looking? I have the latest version of the Customizer script and PowerCLI, and I'm running Windows 10 Pro. Thanks again for your time!

    Replies
    1. Hi anonymous,

      please send me the output (screenshot) of the ESXi-Customizer-PS script by E-mail to info (at) v-front.de. We will continue troubleshooting from there.

      Thanks
      Andreas

    2. Thanks for your help, Andreas! For anyone else interested, nope, 2GB of RAM isn't enough. I knew that with only 2GB I wouldn't be able to complete the install process, but I assumed it would be enough to get me to the point where the installer checks for compatible NICs. Not the case! With 8GB in the Fitlet-X and Andreas' driver, the install completed without errors.

  4. This is so helpful, saved me a lot of time. Much appreciated

  5. Just bought a shuttle DS57u5 to create a home lab.
    Had limited ESXi experience before and now i'm slowly getting the hang of it.
    Without your blog I wouldn't get through the installation process in the first place (test run with an old lenovo laptop, production run with SSD, now the second NIC).

    If we ever meet, the first two beers are on me!

  6. ...oh, and in all the excitement, I forgot to ask a question :)

    I assume that
    esxcli network firewall ruleset set -e true -r httpClient
    sets a firewall exception in ESXi, so that the vib can be downloaded. After loading the vib, I would like to delete/disable the rule, since I have no need for it. How to do that? Is replacing the true with false enough?

    Replies
    1. Hi MzR,

      yes, just replace "true" with "false" to revert the change.

      This rule only allows outgoing traffic though, so I do not consider it a big security risk to keep it enabled.

      Andreas

  7. Hello I'm having the same issue with esxi 6.1 which really gets me since I spent hours researching MBO to find one with compatible nics. Will this work with 6.1?

    Replies
    1. Hi,

      the fixed driver is compatible with ESXi 5.0, 5.1, 5.5 and 6.0.
      ESXi 6.1 does not exist. If you mean ESXi 6.0 Update 1 then this also counts as ESXi 6.0.

      Andreas

  8. This fixes the dual LAN on my Gigabyte GA-Z170N-WIFI motherboard. I could previously only see one of the NICs.
    Thanks for figuring this out. Much appreciated.

    Glenn.

  9. I also have a Gigabyte GA-Z170N-WIFI motherboard (like Glenn) and couldn't figure this out, so many thanks for being an all round top bloke and helping me to save my sanity!

  10. I can confirm that this also works on ESXi 6.0U2 on a Gigabyte GA-X99-Designare EX.
    Thanks for putting it together!

  11. also working on a ga h170n-wifi
    thanks a lot!

  12. Perfect, worked for me on a Gigabyte GA Z270N-WIFI motherboard. Thank you

  13. I had a similar issue where ESXi 6.0/6.5/6.7 all reported "Invalid NVM Checksum", except that this happened when loading the 82579LM with the ne1000 (0.8.3-8vmw.650.2.75.10884925) driver. Instead of removing the checksum check and recompiling ne1000, I used Intel's BootUtil (https://www.intel.com/content/www/us/en/support/articles/000005790/software/manageability-products.html) to restore the default config on the 82579LM's NVM: "bootutil -NIC=1 -DEFAULTCONFIG". I rebooted ESXi 6.5u3 and there was no more checksum error. I wonder if this might have also worked for the i211/i350 and the stock net-igb driver? I also think this might be why people are having issues with the 82579LM/82574L and are trying to install https://vibsdepot.v-front.de/wiki/index.php/Net-e1000e even though e1000e/ne1000 already support the 82579LM/82574L.

