How to do an Online Virtual Connect firmware upgrade

Okay, this is a follow-up to my previous post ... I was finally able to find out on my own how to do this. The answer is in HP's white paper "HP Virtual Connect Firmware Upgrade Steps and Procedures". It is a must-read for anyone concerned with the VC firmware upgrade process, and I will try to summarize the most important points here.

You must use the Virtual Connect Support Utility (VCSU). The current version is 1.60 and is available for download here.

It helps to understand how the VCSU performs the upgrade: First it uploads the new firmware to all VC modules simultaneously. This phase is completely harmless, because the VC modules continue working normally during the upload. If you use the default parameters it will then activate the new firmware by rebooting the VC modules one after the other in a controlled manner - and this is the step that really impacts the network availability of your hosts and VMs!
Why? The controlled reboot takes 20 or more seconds, and - of course - the VC module will not forward or receive network traffic during that time. However, the blade servers (or rather their NICs) that are connected to this module are not properly disconnected during that time, i.e. they do not get a link down notification! If you use the default failover detection method for your virtual switches (Link state only) the hosts will continue using the uplinks to the module that is just rebooting, and this results in a loss of network connectivity.

So, how do you cope with that? One possible workaround is to use Beacon probing as the failover detection method for the virtual switches. But in my opinion this is not the best and easiest choice. No, the real answer is on page 13 of the white paper:
"For the customer environments where changing Network Failover Detection options or HA settings is not possible, utilizing VCSU manual firmware activation order (-of manual) is recommended. In this case, modules will be updated but not activated and the user will need to perform manual activation by resetting (rebooting) modules via OA GUI or CLI interface. This option will eliminate potential of up to 20 sec network outage that may occur on a graceful shutdown of VC Ethernet and FlexFabric modules."
Using the manual activation order (parameters "-oe manual" and "-of manual") ensures that the VCSU will not gracefully reboot the VC modules at all. You then need to do that on your own (hence manual) by resetting the VC modules through the Onboard Administrator (OA). When you do a hard reset of a VC module the connected hosts immediately get link down notifications, just as if the module suddenly failed or lost all its own uplinks because the external switch failed. You should just wait about 5 minutes for the reset module to come fully online before you reset the second one.
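For illustration, here is roughly what such a VCSU run looks like (the OA address, the credentials and the firmware package name are placeholders - please verify the exact parameter syntax against the VCSU documentation):

  vcsu -a update -i <OA-IP> -u <OA-user> -p <OA-password> -l <vc-firmware-package.bin> -oe manual -of manual

After the upload has completed you then reset the modules one after the other through the OA, as described above.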

If your ESX(i) hosts are properly and redundantly configured you will notice only a minimal network interruption during this process. In my test it was just a single ping drop.

Yes, that's the whole secret of doing an online VC firmware upgrade! For me only one question remains: Why is HP making it so hard to find this information? If you search hp.com for instructions on how to do this you will find tons of useless and contradictory information on this topic, and even their own support engineers are not able to give a quick and correct answer to the question. At least one of them sent me a copy of the white paper (he could not just provide a link to it, because he was not able to find it on the HP pages...).

HP Virtual Connect firmware update - can you do this online?

I don't know the answer to this question, but I'm trying to find this out ...

We have two HP c7000 enclosures with Virtual Connect FlexFabric modules to connect to external Cisco Ethernet switches and Brocade FC switches. Both enclosures are fully loaded with 8x BL620c G7 blade servers running ESXi 4.1 Update 2.
Right now we are still able to completely evacuate an enclosure if we want to do maintenance on it (mainly firmware upgrades), because we have stretched two clusters across both enclosures, and each cluster uses no more than 50% of its capacity.

However, given our current VM growth rate we will soon reach a point where this will no longer be possible (without purchasing and deploying a third enclosure). So, I'm currently testing and looking for ways to do an online Virtual Connect firmware upgrade without interrupting network and SAN connectivity. With all the redundancy that is in the enclosure this should be possible, and an HP engineer I recently talked to confirmed that it is indeed possible using HP's Virtual Connect Support Utility (VCSU); he pointed me to its manual for instructions.

I remember that I already tried this method a while ago. I no longer know the firmware and tool versions I did this test with, but it was not very successful. Although I followed the instructions given I noticed ping timeouts of up to 15 seconds during the upgrade process (I was pinging the hosts' VMkernel addresses).

I just started a thread in the VMTN forums to get some input from others. Has anyone done this successfully? Is there anything to check and configure that is not obvious before trying this? Please share your experience by posting to the VMTN thread or leaving a comment here. Thanks!

Once I have found a working method I will of course update this post!

Update (2011-12-21): I found it ... Read about it in my next post!

ESXi-Customizer 2.6 and Tgz2Vib5 1.0

I just published the new version 2.6 of my ESXi-Customizer script.

What's new:
  • With this version you can optionally create an (U)EFI-bootable ISO file for the installation of ESXi 5.0. (U)EFI stands for (Universal) Extensible Firmware Interface; it is going to replace the traditional BIOS firmware interface on modern PCs. Please note that the original VMware ESXi 5.0 ISO is already UEFI-capable; the new version of my script merely preserves this capability in the customized ISO.
  • The new version includes an additional utility script called Tgz2Vib5. With this script you are able to convert an OEM.tgz-style driver package (for ESXi 5.0 only!) into a VIB file. That is the "official" VMware format for software packages - read more about it in this earlier post!
I'd like to encourage the developers of community supported ESXi 5.0 drivers to convert their packages into VIB format (using Tgz2Vib5) before publishing them! The VIB format has several advantages over the traditional OEM.tgz format:
  • You can add descriptive meta data (like vendor/author name, version and detailed description) to the driver package
  • Unlike an OEM.tgz file a VIB file can be easily installed into an already running ESXi 5.0 system by running the following commands inside a local or remote ESXi shell:
      esxcli software acceptance set --level=CommunitySupported
      esxcli software vib install -v VIB_URL

    (The host must not run any VMs at install time, because it needs to be rebooted after the installation.)
  • A VIB file can also be updated with a newer version without having to re-install the whole system. This can be achieved by running the following command inside a local or remote ESXi shell:
      esxcli software vib update -v VIB_URL


vSphere 4.1 Update 2 released - What's in it for me (and you)

VMware released Update 2 for vSphere 4.1 on Oct 27th. It includes numerous bug fixes for the vCenter server and client (see VC resolved issues) and ESXi (see ESXi resolved issues).

I will list some of the fixes here, because I personally welcome them very much, and I'm sure that others will feel the same:
  • The vSphere client performed badly on Windows 7 because of frequent screen redraws when the Windows desktop composition feature was enabled. The only workaround was to disable desktop composition while running the vSphere client. This should be fixed now.
  • There are multiple fixes and enhancements to ESXi syslogging:
    • If ESXi fails to reach the syslog server while booting, it now keeps retrying every 10 minutes.
    • Very long syslog messages (like those produced by the vpxa agent ...) are no longer truncated or split into multiple lines. If you are using a third-party solution like Splunk for collecting syslog messages you will certainly welcome this, because it is nearly impossible to handle split messages correctly with such tools.
  • But the most important issue that is resolved in Update 2 is this: "Virtual machine with large amounts of RAM (32GB and higher) loses pings during vMotion". Uh, what? VMs losing pings when being vMotioned? Yes, this can really happen (without Update 2), and I personally experienced this problem: When one of two clustered Microsoft Exchange 2010 VMs with 48GB RAM was vMotioned it lost network connectivity for more than 15 seconds (at between 20 and 30% of the vMotion progress), which triggered a cluster failover. We have not yet checked whether this particular issue is really resolved with Update 2, but VMware Support had pointed us to it when we complained about the problem, so there is a good chance ...

Update: ESXi 5.0 on HP G7 blades, now a Go!

About three weeks back I reported on Emulex firmware problems that prevented the use of ESXi 5.0 on HP G7 blade hardware. This has now been fixed, more or less...

HP has now updated the advisory that describes the issue and published an updated firmware that fixes the VLAN handling problems with ESXi 5.0 if it is used together with the be2net driver 4.0.355.1.

Be sure that you read the release notes of the firmware! It looks like it is an emergency/workaround release that leaves many issues unresolved. A firmware version that you can really trust for production will probably be available mid-November.

Update (2011-12-09): HP and Emulex published the final version of the OneConnect firmware (4.0.360.15a) on Nov 19th. VMware's KB2007397 also lists the recommended drivers to use with this firmware for both ESXi 4.1 and 5.0.

Update (2012-03-09): HP has published yet another firmware update on March 5th. Download version 4.0.360.15b. The previous link has become invalid.

Update (2012-04-16): Please refer to my HP & VMware links page to find the download for the latest version of the firmware.

VMware finally released the Open Source Code of vSphere 5.0!

Great news! Today VMware finally made the vSphere v5.0 Open Source code archives available for download.

Why is that important?

Since the release of VMware's ESXi 5.0 (Aug 24, 2011) many people have been asking for the development of drivers for hardware devices that are not supported by ESXi 5.0 out-of-the-box.

ESXi device drivers are based on Linux device drivers (which led to the persistent misunderstanding that ESXi itself is based on Linux), but the stock Linux driver code must be modified in a specific way to be compatible with ESXi.

With past versions of ESXi (up to 4.1) it was possible to study and reproduce these required modifications, because VMware published the source code of the ESXi device drivers (the original Linux code plus their modifications). The reason for this is that most Linux drivers are licensed under the GPL (General Public License), and the GPL requires that derived works are also published under the GPL and that their source code is made freely available (aka the "Copyleft" principle).

Now that VMware has also published the Open Source code of ESXi 5.0 (including the device drivers that it contains) it will be possible (or at least much easier) to develop custom ESXi 5.0 drivers for devices that are not officially supported by VMware.


HP Virtual Connect profile not applied ...

When I recently rebooted one of our BL620c G7 blades with ESXi 4.1 installed on it I found that the server had suddenly lost network connectivity after the reboot.
A quick check on the console revealed that the Virtual Connect profile that was defined for that blade had not been applied to it. I noticed this because the MAC addresses of the NICs had not been overwritten with the virtual addresses of the Virtual Connect profile.

I tried powering the blade down and up and re-assigning the Virtual Connect profile multiple times, all to no avail ... Then I had the idea that it might be related to the blade's iLO board, and - yes, indeed - after resetting the iLO3 board of the blade the Virtual Connect profile was properly applied and all was fine again.

While later looking on hp.com for some related information I stumbled upon Customer advisory c02820591, which describes an issue with the Virtual Connect profile being lost upon an iLO3 reset. Not exactly the issue that I had, and the advisory also states that it is fixed with iLO3 firmware version 1.20, which is already installed on our iLOs. However, the advisory confirmed my assumption that the Virtual Connect profile is applied by the iLO board.

So, if you have similar problems try resetting the iLO board before you start pulling your hair out, or the blade out of the chassis ...

Currently a No-Go: ESXi 5.0 on HP G7 blades

Back in May I reported on problems with ESXi 4.1 and the Emulex OneConnect CNA that is built into HP's G7 blade servers.
If you now try to install ESXi 5.0 on such hardware you will have a strong déjà vu: The be2net driver that is currently available for ESXi 5.0 is not really functioning due to "VLAN tagging issues". HP has published an advisory on this stating that an updated driver (that should fix these issues) is "currently in the certification process" and will be made available in "Q4 2011".

Okay, I won't update our production hosts to ESXi 5.0 that soon anyway, but I just wanted to install it on some spare blades for testing and evaluation. Too bad ... waiting for a fix again ...

Update (2011-10-27):
HP has now updated the advisory and published an updated firmware that fixes the VLAN handling problems with ESXi 5.0 if it is used together with the be2net driver 4.0.355.1.
Be sure that you read the release notes of the firmware! It looks like it is an emergency/workaround release that leaves many issues unresolved. A firmware version that you can really trust for production will probably be available mid-November.

Update (2012-04-16):
In the meantime it looks like all problems have been fixed with newer firmware and driver versions. Please refer to this newer post of mine!

Unable to assign license after installing a server with the HP ESXi 5.0 ISO

With the availability of vSphere 5.0 HP published a customized ESXi installation ISO for HP servers.
There have been reports that this build includes an annoying bug: HP has included a license file with wrong permissions set. That potentially causes errors once you want to assign your own license to the host.

You can fix this by removing the offending license file with the following command (to be executed on the ESXi host directly after it has been installed):

  esxcli software vib remove -n hp-esx-license --no-live-install

and then rebooting the host. The command will remove the HP license and restore the original state of the host being in evaluation mode.

HP has also published an advisory describing the problem and providing a way to update their license package to fix the problem.


[Announcement] ESXi-Customizer 2.5 ...

... is out now. See the prior post for what's new. Download it from the project page!



VIB files, Offline-Bundles and ESXi-Customizer 2.5

The current version 2.0 of my ESXi-Customizer script is able to add OEM.tgz-style driver packages to ESXi 4.1 and ESXi 5.0 installation ISOs. Tgz (short for tar.gz) is the format that is used to distribute all community-developed drivers for ESXi 4.1 (and earlier versions), so these can currently be added to ESXi 4.1 with my script.

However, for ESXi 5.0 you cannot use the driver packages made for ESXi 4.1; they need to be re-engineered and re-compiled starting with the stock Linux driver code that is the basis for ESXi drivers. It looks like - so far - nobody in the user community has figured out how to compile drivers for ESXi 5.0. I hope that this will change in the near future, but for now the current version of the ESXi-Customizer script is pretty much useless for ESXi 5.0 ...

On the other hand, there are new and updated driver packages for ESXi 5.0 already available directly from VMware or 3rd-party hardware vendors. These drivers are distributed in VIB format and as so-called Offline-Bundles (in zip format). I wondered if my script could also support this "official" VMware package format and had a closer look at the structure of these files.

(Note: To fully understand the following, it is helpful to read my Anatomy post about the structure of the ESXi 5.0 ISO first!)

What is a VIB file?
VIB stands for "VMware Installation Bundle". If you just open a vib file in a text editor you will see that it starts with the (somewhat well-known) header "!<arch>". This means that the file is in the Unix ar format. Wikipedia has a detailed description of this file format, and there you can learn that VIB files use the common ar format, and that you can use the Unix ar command to handle this kind of archive file.
Fortunately, 7-zip - the Swiss army knife of packaging tools - is also able to handle ar files, and it is also available for Windows. Since I already use it in ESXi-Customizer to unpack ISO files I was delighted to realize that it can also unpack a vib file ...

So, what's in the VIB file? Exactly three files:
  1. A file named "descriptor.xml": The name says it all. This file describes the contents and dependencies of the driver package. Later you will find that again in the IMGDB.TGZ file. For more details see section 4 of my anatomy-post. There you will find the descriptor.xml file of the e1000-driver as an example.
  2. A file named "sig.pkcs7": VMware certified drivers need to be electronically signed, and this file includes the uuencoded PKCS7 signature.
  3. The payload file: You will find the exact name of this file in the name-attribute of the <payload> tag in the descriptor.xml file. It does not have any file extension.
By design a VIB file can contain multiple payload files. However, each VIB file I looked at contained exactly one. The payload file's type was always VGZ (short for vmtar.gz), but TGZ format should also be possible. In fact the payload file is just the archive that makes up the actual driver package. Section 3 of my anatomy post describes that in more detail.
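If you want to peek into a VIB yourself, listing and unpacking it is a one-liner (the file name is just an example; ar works on any Unix-like system, 7-zip on Windows):

      ar t net-e1000.vib
      7z x net-e1000.vib

Both will list or extract exactly the three files described above.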

What is an Offline-Bundle?
Offline-Bundles come in ZIP format and are just a collection of one or more VIB files. An Offline-Bundle contains additional metadata in an included archive named metadata.zip; the VIB file(s) are stored in the archive's subdirectory vib20\.

Is every ZIP-file an Offline-Bundle?
At the VMware site you can download new and updated drivers for ESXi 5.0 from the Drivers & Tools / Driver CDs section of the vSphere 5 download page. These downloads are also in ZIP-format. However, if you look at the contents of such a downloaded file you will notice that it is not an Offline-Bundle itself, but includes another ZIP-file that actually is the Offline-Bundle!

So, please be careful... The real Offline-Bundle ZIP files have the string offline_bundle or just bundle in their names (e.g. LSI_5_34-offline_bundle-455140.zip). If you are unsure, look into the zip file and check whether it includes a metadata.zip file and a vib20\ subdirectory.

ESXi-Customizer 2.5
... will be out soon, and it will support adding VIB files and Offline-Bundles to an ESXi 5.0 media! Watch this space for the announcement.

How ESXi-Customizer supports ESXi 5.0 - FAQ

I got a lot of feedback after posting the new ESXi-Customizer (with support for ESXi 5.0) and the "anatomy"-article explaining its technical background. It looks like I haven't been clear enough on some points and need to provide some additional information. So here is a list of frequently asked questions (FAQ); I might update it from time to time, so stay tuned.

1. Can I use existing drivers (made for ESXi 4.x) for customizing ESXi 5.0?

No, you can't. Driver binaries compiled for ESXi 4.x are not compatible with ESXi 5.0. They just won't be loaded. Instead vmkload_mod will throw the error message "Module does not provide a license tag".

2. What input does ESXi-Customizer expect for customizing ESXi 5.0?

It expects a gzip-compressed tar file (with extension .tgz) that includes exactly three files:

  • /usr/lib/vmware/vmkmod/<driver-module> (the binary driver module)
  • /etc/vmware/driver.map.d/<driver-name>.map (maps PCI device IDs to the binary module)
  • /usr/share/hwdata/driver.pciids.d/<driver-name>.ids (maps PCI device IDs to display names)
Nothing more is needed. All other steps outlined in the "anatomy"-post will be done by ESXi-Customizer.
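If you have staged the three files in a matching directory tree, building such a package is a single tar command (the driver name is hypothetical; any gzip-capable tar, e.g. the busybox port mentioned in an earlier post, should do):

      tar -czf my-driver.tgz \
          usr/lib/vmware/vmkmod/my-driver \
          etc/vmware/driver.map.d/my-driver.map \
          usr/share/hwdata/driver.pciids.d/my-driver.ids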

3. When and where will ESXi 5.0 compatible community drivers be available?

ESXi device drivers are derived from device drivers written for the Linux kernel. However, it is necessary to make specific changes to the source code of a stock Linux driver to turn it into an ESXi driver. An experienced Linux developer can find out what changes are necessary by studying the complete source of the existing ESXi drivers that are shipped with ESXi 5.0.

The source code of these drivers has not yet been published by VMware. However, they are obliged to do this (sooner or later), because most of the original Linux drivers are licensed under the GNU GPL, which requires that the source code of derived works also be made publicly available.
So, we need to wait for VMware to publish the Open Source code of its drivers (we can expect it here), and then for some knowledgeable people to compile new ESXi 5.0 compatible drivers.

I am confident that this will happen in the near future. And I expect the new drivers to become available at Dave Mishchenko's vm-help.com, the home of the ESXi Whitebox HCL.

4. Does ESXi-Customizer support creating a bootable USB key with ESXi 5.0?

No, it does not. If the machine that you want to install ESXi on does not have a CD-ROM drive, you can help yourself by using any other machine (that has a CD-ROM drive) to install ESXi 5.0 onto a USB key drive. Once you have a bootable USB key you can use it to boot any other machine!
The easiest and safest method is to use a virtual machine provided by VMware Workstation or VMware Player for the initial install. Yes, ESXi 5.0 can be installed in a VMware Workstation VM - just select "ESX Server 4" as the guest OS type.

The anatomy of the ESXi 5.0 installation CD - and how to customize it

1. Introduction

With vSphere 5 VMware introduced the Auto Deploy Server and the Image Builder, which allow you to customize the ESXi installation ISO with partner-supplied driver and tools packages.
The Image Builder is a PowerShell snap-in that comes with the latest version of the PowerCLI package. It lets you add software packages to a pre-defined set of packages (a so-called ImageProfile) and even lets you create an installation ISO from such a baseline, making it easier than ever to customize the ESXi installation.

However, doing this is not a straightforward task. It requires a working installation of PowerShell, plus the PowerCLI software, access to the offline-bundle that makes up the base installation (which is not included with the free version of ESXi!), a custom driver in VIB format, and some guidance on which PowerShell cmdlets you need to use to add the custom driver package and build an ISO from it.
The developers of custom drivers, in turn, need to supply their packages in VIB format, and building such a package is not trivial and costs extra effort (compared to a simple OEM.TGZ file).

I wondered if it is still possible to customize the ESXi 5.0 install ISO with a simple OEM.TGZ file like you can do with ESXi 4.1, e.g. with my ESXi-Customizer script. And yes, it is possible - but it's very different now! I want to provide some background information here on how this works:

2. The contents of the ESXi 5.0 installation ISO

First let's have a look at the root directory of the ESXi 5.0 install ISO:

Contents of the ESXi 5.0 install CD root directory
Unlike on the ESXi 4.1 ISO you see lots of ISO9660-compatible file names here (all capitals and 8.3 format). You can guess from their names that the files with the V00 (and V01, V02, etc.) extensions are device driver archives. The original type of these files is VGZ, the short form of VMTAR.GZ. That means they are gzip'ed vmtar files.

vmtar is a VMware-proprietary variant of tar, and you need the vmtar tool to pack and unpack vmtar archives. It is part of ESXi 5.0 and also ESXi 4.x. Other files have the extensions TGZ and T00 (like TOOLS.T00). These are gzip'ed standard tar files that the boot loader can also handle. Good.

Comparing with the ESXi 4.1 media you will notice that there is no ddimage.bz2 file any more. In earlier versions of ESXi this was a compressed image that is written to the installation target disk and contains the whole installed ESXi system. You could actually write this image to a USB key drive to produce a bootable ESXi system without ever booting the install CD. You cannot do this with ESXi 5.0 any more. However, customizing the install CD has become easier this way, because you do not need to add a second copy of your oem.tgz file to this system image.

There are also files named ISOLINUX.BIN and ISOLINUX.CFG in the ISO root. That means that ESXi 5.0 still uses the isolinux boot loader to make the installation CD bootable. If you look into ISOLINUX.CFG it includes a reference to the file BOOT.CFG, and in BOOT.CFG you find references to all the VGZ and TGZ files:
Contents of the BOOT.CFG file
A second copy of the BOOT.CFG file is in the directory \EFI\BOOT. The ESXi 5.0 install ISO (and ESXi 5.0 itself) was built to boot not only on a standard x86 BIOS, but also on new (U)EFI-enabled BIOS versions. Just one thing to remember: If you change one BOOT.CFG you'd better make the same change to the other.
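To give you an idea, the interesting entry of BOOT.CFG is the modules= line. On the real media it is one long line listing every archive, separated by " --- " (heavily shortened here):

      modules=/b.b00 --- /useropts.gz --- /k.b00 --- ... --- /net-e100.v00 --- ... --- /imgdb.tgz --- ...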

Now let's have a closer look at a driver VGZ package.

3. What's in a driver's vgz-file?

As mentioned before you need the vmtar tool to look into a VGZ file. Since it is only part of ESXi itself you need access to an installed copy of ESXi (either 4.1 or 5.0). Luckily you can install ESXi 4.1 (and also 5.0!) inside a VMware Workstation 7 VM.
I did this by creating a VM of type "ESX Server 4" with typical settings except for the size of the virtual disk (2GB is enough for ESXi) and installing ESXi 5.0 in it. During installation the driver files from the CD root are uncompressed and copied to the directory /tardisks, so this is where you can find them again. After enabling the local shell (luckily still available with 5.0) I logged in and was finally able to look inside and unpack such a driver archive using the vmtar tool:
Unpacking NET-E100.V00 with the vmtar tool
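In case you want to reproduce this, the two commands are essentially the following (the name of the intermediate tar file is my choice; vmtar prints a short usage summary if you run it without arguments):

      vmtar -x NET-E100.V00 -o net-e100.tar
      tar -xf net-e100.tar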
So there are basically three files in the archive:

1. The driver binary module (with no file name extension, e1000 in this example) that will be unpacked to the well known location /usr/lib/vmware/vmkmod.

2. A text file that maps PCI device IDs to the included driver:
Contents of /etc/vmware/driver.map.d/e1000.map
3. Another text file that maps PCI IDs to vendor and device descriptive names:
Contents of /usr/share/hwdata/driver.pciids.d/e1000.ids

It is good to know that the PCI ID mapping files are now separated by driver. In ESXi 4.1 there is a single pci.ids file and a single simple.map file for all drivers, which raised the potential of having conflicting copies of these files in case you merged multiple OEM drivers into the image.

It now looks easy to add a custom driver to the install CD: Just create a tgz file containing the three files mentioned above, copy it to the ISO root directory and add its name to the two BOOT.CFG files. And yes, this will indeed work for the CD boot! The custom driver will be loaded and you will be able to install ESXi ... but the installation routine will not copy the tgz file to the install media, and when you boot the installed system for the first time it will behave like a regular install without the custom driver.
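For the CD-boot part alone the change therefore boils down to something like this, done on an unpacked copy of the ISO tree (sed is just one way to append the entry, you can of course edit the two files by hand; file name case may differ depending on how you unpacked the ISO):

      cp oem.tgz iso-root/oem.tgz
      sed -i -e 's|^modules=.*|& --- /oem.tgz|' iso-root/boot.cfg iso-root/efi/boot/boot.cfg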

So, there is more to it...

4. The image database IMGDB.TGZ

There is a file named IMGDB.TGZ in the root directory of the CD that is also listed in the BOOT.CFG files and has the following contents:
Unpacking the IMGDB.TGZ file
It contains files that will be unpacked to the directory /var/db/esximg. For each driver (or other software package) an XML file is created under the vibs subdirectory. There are a lot more of these files than shown here (I shortened the output with "..."); one example is net-e1000--925314997.xml for the e1000 driver. Let's look into this file:
The contents of net-e1000--925314997.xml
The XML file contains information about the package including possible dependencies on other packages and a list of all included files. Its file name ("net-e1000--925314997.xml") consists of the name element plus a (probably) unique number with 9 or 10 digits. The list of payloads is the list of included archive files (either of type vgz or tgz); in most cases it's just one. The name of the payload is limited to 8 characters ("net-e100" in this case) and is the name of the corresponding file in the CD's root directory. The extension of this file is expected to be ".v00" if the file is of type vgz and ".t00" if it is of type tgz. If there are name conflicts with other packages the number in the extension is incremented, e.g. the payload file for the e1000e driver is "net-e100.v01".

Then there is the host image profile XML file in the directory /var/db/esximg/profiles. In our example this is the file ESXi-5.0.0-381646-standard1293795055. Let's look into this one:

... (a lot more <vib></vib> entries cut) ...
Contents of the host image profile XML file
Here we find a list of all vib packages that make up the currently installed system. Please note that the vib-id of a package strictly corresponds to the element values in the associated vib XML file (see picture before); it is composed the following way:
<vendor>_<type>_<name>_<version>
So the vib-id element of the net-e1000 driver e.g. is
VMware_bootbank_net-e1000_8.0.3.1-2vmw.0.0.383646

The payload names that are listed in the image profile file are the same as in the distinct vib xml files with the exception that here the exact file names (e.g. "net-e100.v00") are listed rather than just the file type (vgz or tgz).

Conclusion: If we want to add a custom driver to the install CD we need to do the following (in addition to the steps described in section 3): modify the contents of IMGDB.TGZ, add a vib XML file for the driver (similar to net-e1000...xml) to it, and update the contained image profile file to include the driver as an additional <vib> entry.
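Sketched as shell commands this means the following (run in a scratch directory; my-driver.xml is a hypothetical vib XML file for your driver, the profile name is the example from above):

      tar -xzf IMGDB.TGZ
      cp my-driver.xml var/db/esximg/vibs/
      vi var/db/esximg/profiles/ESXi-5.0.0-381646-standard1293795055
      tar -czf IMGDB.TGZ var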

There is another particular XML element in both the vib files and the image profile file that we need to take care of: the <acceptancelevel>. VMware distinguishes four different acceptance levels: VMwareCertified, VMwareAccepted, PartnerSupported and CommunitySupported; in the XML files they are coded as certified, vmware, partner and community. The names are pretty self-explanatory, and one can easily guess that certified is stricter than vmware, which is stricter than partner, which in turn is stricter than community. In other words: If the host image profile is of acceptance level certified, only packages of the same acceptance level can be part of it. If it is of acceptance level vmware, only VMware certified and VMware accepted packages can be installed. If it is of acceptance level partner (and this is the default!), partner supported packages can be installed in addition to that. The least restrictive level is community, which accepts all four types of packages.
My expectation is that custom drivers for whitebox hardware will be community supported (unless they are published by a hardware vendor). Note that if a driver's vib file contains the acceptance level community, the image profile's acceptance level must also be changed to community. Otherwise the installation of the package will fail.
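For a community supported driver both the driver's vib XML file and the image profile file therefore need to contain the line:

      <acceptancelevel>community</acceptancelevel>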

5. Can we automate it?

Yes, we can! The latest version of ESXi-Customizer does automate all the steps described here to add custom drivers in tgz-format to an ESXi 5.0 install ISO. You only need to feed it with a tgz-file that contains the three files listed in section 3 of this post.

Please note: Packages made for earlier ESXi versions will not work with ESXi 5.0, not only because the directory structure has changed, but also because the earlier versions' driver modules won't be loaded by the new version! And - at the time of this writing - there are probably no oem.tgz-style driver packages available that are compatible with ESXi 5.0!
Hopefully, this will soon change. If you are looking for a driver of a device that does not work out-of-the-box with ESXi 5.0 check the Unofficial Whitebox HCL at vm-help.com.


How to throttle that disk I/O hog

We are in the middle of a large server virtualization project and are utilizing two Clariion CX-400 arrays as target storage systems. The load on these arrays is increasing as we put more and more VMs on them. This is somewhat expected, but recently we noticed an unusual and unexpected drop in performance on one of the CX-400s. The load on its storage processors went way up, and its cache was quickly and repeatedly filled up to 100%, causing so-called forced flushes: The array needs to briefly stop any incoming I/O while it destages the cache contents to the hard disks in order to free the cache up again. As a result overall latency went up and throughput went down, and this affected every VM on every LUN of this array!

As the root cause we identified a single VM that fired up to 50,000(!) write I/Os per second. It was an MS SQL Server machine that we had recently virtualized. When it was on physical hardware it used locally attached hard disks that were never able to provide this amount of I/O capacity, but now - being a VM on high-performance SAN storage - it took every I/O it could get, monopolizing the storage array's cache and bringing it to its knees.

We found that we urgently needed to throttle that disk I/O hog, or it would severely impact the whole environment's performance. There are several means to prioritize disk I/O in a vSphere environment: You can use disk shares to distribute available I/Os among VMs running on the same host. This did not help here: the host that ran the VM had no reason to throttle it, because the other VMs it was running did not require lots of I/Os at the same time. So, for the host there was no real need to fairly distribute the available resources.
Storage I/O Control (SIOC) is a rather new feature that allows for I/O prioritization at the datastore level. It utilizes the vCenter server's view of datastore performance (rather than a single host's view) and kicks in when a datastore's latency rises above a defined threshold (30ms by default). It will then adapt the I/O queue depths of all VMs that are on this datastore according to the shares you have defined for them. Nice feature, but it did not help here either, because the I/O hog had a datastore of its own and was not competing with other VMs from a SIOC perspective ...

We needed a way to throttle the VM's I/O absolutely, not relative to other VMs. Luckily there really is a way to do exactly this: It is documented in KB1038241 "Limiting disk I/O from a specific virtual machine". It describes VM advanced-configuration parameters that let you set absolute throughput caps and bandwidth caps on a VM's virtual disks. We did this and it really helped to throttle the VM and restore overall system performance!

By the way, the KB article describes how to change the VM's advanced configuration by using the vSphere client, which requires that the VM is powered off. However, there is a way to do this without powering the VM off. Since this can be handy in a lot of situations I added a description of how to do this on the HowTo page.

Update (2011-08-30): In the comments of this post Didier Pironet pointed out that there are some oddities with using this feature and refers to his blog post Limiting Disk I/O From A Specific Virtual Machine. It features a nice video demonstrating the effect of disk throttling. Let me summarize his findings and add another interesting piece of information that was clarified and confirmed by VMware Support (see also the configuration sketch after this list):
  • Unlike stated in KB1038241 you can also specify plain IOps or Bps values for the caps (e.g. "500IOps"), not only KIOps, MIOps or GIOps (or KBps, MBps or GBps). If you do not specify a unit at all, IOps or Bps is assumed, not KIOps/KBps as stated in the article.
  • The throughput cap can also be specified through the vSphere client (see VM properties / Resources / Disk), but not the bandwidth cap. This can even be done while the machine is powered on, and the change will become immediately effective.
  • And now the part that is the least intuitive: Although you specify the limits per virtual disk the scheduler will manage and enforce the limits on a per datastore(!) basis. That means:
    • If the VM has multiple virtual disks on the same datastore, and you want to limit one of them, then you must specify limits (of the same type, throughput or bandwidth) for all the virtual disks that are on the same datastore. If you don't do this, no limit will be enforced.
    • The scheduler will add up the limits of all virtual disks that are on the same datastore and will then limit them altogether by this sum of their limits. This explains Didier's finding that a single disk is limited to 150IOps although he defined a limit of 100IOps for this disk, but another limit of 50IOps for a second disk that was on the same datastore.
    • So, if you want to enforce a specific limit to only a single virtual disk then you need to put that disk on a datastore where no other disks of the VM are stored.
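For reference, this is what such limits look like as advanced-configuration entries (following KB1038241; the disk names and values are arbitrary examples of two disks on the same datastore, which the scheduler would then cap at 550IOps in total):

      sched.scsi0:0.throughputCap = "500IOps"
      sched.scsi0:1.throughputCap = "50IOps"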

[Update] ESXi-Customizer 1.2 - another bugfix release

If you used the Advanced edit mode of ESXi-Customizer 1.0 or 1.1 and got a "Corrupt boot image" message from ESXi (either when booting the customized ISO or after having installed with it) ... this was caused by a corruption of the OEM.tgz file during re-packaging.

It was very hard to find a Windows version of tar that produces tar archives fully compatible with ESXi. But (I hope) I have finally found one: a Windows port of busybox. Since ESXi uses busybox, too, this should guarantee maximum compatibility. If you ever wondered what a Windows port of busybox could be good for ... now you know ;-)

Please update to version 1.2, which incorporates this fix, and let me know if you are still struck by this bug! You can download it from the project page!

vSphere 5: release date rumors and licensing changes

From what I have heard, VMware's originally targeted release date for vSphere 5 was August 5th. That has now passed, and it did not happen. There are rumors that it will be released on August 22nd (see source)...
I don't know why it is being delayed. One possible reason is the change in licensing that was announced on August 3rd (see VMware's Power of Partnership Blog). With the revelation of vSphere 5 on July 12th VMware introduced a new licensing method based on vRAM (the amount of RAM allocated to running VMs), which led to a storm of protest among customers and partners, especially because of the low amount of vRAM per physical CPU that was originally communicated. With the announcement above VMware has doubled this entitlement for most vSphere editions, and they also capped the accountable vRAM for a single VM at 96GB (even if it has more RAM than that).
This will definitely help to speed up the adoption of vSphere 5 ... once it is released.

Update (2011-08-23): Okay, nothing again ... So it will probably happen on Friday (August 26th), just before VMworld 2011 (starting on Monday 29th).

Update (2011-08-25): It is out now, the official release date was August 24th. Customers with subscription go here to download. The free ESXi version is available here.


[Update] ESXi-Customizer 1.1 - bugfix release

I published an updated version of my ESXi-Customizer script. There was an annoying bug in the "Advanced edit" mode causing the oem.tgz file to become corrupted during re-packaging. This has been fixed, and I also added an update check feature that lets the script check for newer versions of itself.

Download it from the project page.

[Release] ESXi-Customizer

Have you ever tried installing ESXi on a hardware box that is not explicitly supported by VMware? If you try this you will often run into the problem that the original VMware install ISO does not include a driver for the storage controller or network card that is in your box.

There is a community that works on building drivers for such devices (so-called Whitebox hardware, see www.vm-help.com), and there are instructions and scripts available for adding these drivers to the original install ISO. However, they all require some (at least basic) Linux knowledge and access to a Linux system.

This has changed now ... I have written a script that automates this task and runs entirely on Windows 7. Visit the project page on this blog site to learn more!

Improve your vSphere client's performance

Are you tired of staring at this window?
vSphere Client taking ages to load a VM view
If you manage a vSphere environment with several hundred VMs you might notice a disturbing slowness in screen refreshes when you initially look at lists of many VMs, try to refresh such views or re-sort them by clicking on the attribute columns.

We have been struggling with this for a long time (in fact, since we upgraded to vSphere 4) without ever finding out how to improve or resolve this.
Now I got the tip to look at VMware's KB1029665. It exactly describes this symptom and recommends tuning the Java Memory pool of the Tomcat installation that is used on the vCenter server.

And yes, it got better after implementing this! Don't expect miracles - the first load of the complete VM view will still be slow, but subsequent viewing, sorting and scrolling is faster than without this modification.
However, you need to be aware that this actually changes the memory footprint of the vCenter server. So you might want to review its RAM configuration. Easy, if you have it running as a VM ...

vSphere 5 licensing - check your environment now to see how it affects you

There has been a lot of rant about the new licensing model of vSphere 5 (see my previous post), because certain customers (specifically those with a very high RAM-per-CPU ratio, which is more and more common with recent server hardware) will need to buy more vSphere 5 licenses to cover their vRAM usage than they had vSphere 4 licenses before.

Before you start complaining yourself, check your environment now to find out how it will affect you. There are a number of PowerCLI scripts available for doing this. I personally like LucD's the most; get it here: http://www.lucd.info/2011/07/13/query-vram/.

For my production environment it outputs the following:

  vCenter        : [MyVC-FQDN]
  vRAMConfigured : 2732.2
  vRAMUsed       : 2624.8
  vRAMEntitled   : 6000
  LicenseType    : vSphere 4 Enterprise Plus

Note that the used vRAM is lower than the configured vRAM, because it only takes into account the total RAM of all running VMs (and I also have some that are powered off).
The current version of the script also counts only the assigned licenses. However, if you have spare licenses that are currently unassigned, they will also add to your vRAM entitlement once they are upgraded to vSphere 5 (I asked Luc to fix that, so maybe there will be a new version of his script soon).

Anyway, as you can see, I am lucky with the new licensing model and would (yet) have plenty of unused vRAM in my pool if I upgraded today.

Update: There is now an even better script available by Virtu-Al: It can also handle ESX versions earlier than 4.1, looks for unassigned licenses and has a nice HTML output:
  http://www.virtu-al.net/2011/07/14/vsphere-5-license-entitlements/
It is referenced in an official VMware Blog post that tries to better explain the new licensing model and the motivation behind it.

VMware raised the bar - Announcement of vSphere 5 and other new products

Today VMware made some major announcements of new products and product versions that are planned to be available in Q3 2011 (see original press release):

vSphere 5 includes the following improvements and new features (compared to vSphere 4.1):
  • Improved VM scalability (up to 32 vCPUs and 1 TB RAM) and performance (3x to 4x I/O improvements)
  • New and improved HA architecture (easier to set up and more scalable)
  • Autodeploy: On-the-fly deployment of ESXi hosts through PXE-boot
  • Profile driven storage: allows you to define classes of storage (distinguished e.g. by performance and availability) and tie VMs to them by defining "storage policies"
  • Storage DRS: Automatic initial storage placement and balancing of VMs
  • vSphere 5 hosts are ESXi only. No more classic ESX (as previously announced)
  • Change in licensing model: vSphere 5 will still be licensed per physical CPU socket, but introduces another component: vRAM, which is the amount of RAM configured for VMs. Each CPU license entitles for the use of a specific amount of vRAM (dependent on the vSphere edition, e.g. 48 GB for Enterprise Plus). vRAM entitlements can be pooled among all hosts managed by a vCenter instance. For details see the new Licensing Whitepaper.
vCenter Site Recovery Manager (SRM) 5 introduces "vSphere Replication" a.k.a. host-based mirroring and a new "automatic failback" feature. For details see VMware's official product page.

The new vSphere Storage Appliance turns the local hard disks in your ESXi hosts into mirrored, highly available NFS datastores. This way you can use VMotion, DRS and HA without the need for additional shared storage hardware. See the product overview and the technical whitepaper.

vShield 5 introduces new sensitive data discovery and intrusion detection capabilities.

vCloud Director 1.5 now supports fast provisioning with linked clones (a feature that was already available in the Lab Manager product, which is now obsoleted by vCloud Director) and supports Microsoft SQL Server as its database.

Using hardware-assisted virtualization in Windows Server 2003 32-bit virtual machines

This is the title of a VMware KB article (KB2001372) that was recently posted, and it includes very interesting information for anyone running virtualized Windows 2003 servers on vSphere (so, probably all of us).

ESX(i) is able to use different methods for virtualizing the CPU and the associated MMU (memory management unit) instruction sets. You can configure this for a VM under its Advanced Options / CPU/MMU virtualization:

CPU/MMU virtualization settings
In the Binary Translation (BT) mode software emulation is used for both CPU and MMU instructions (the second choice in the picture). For a long time this was the only option, until the CPU vendors Intel and AMD started building virtualization functions into their processors.
Choosing the third option will enable these hardware functions for the CPU instruction set virtualization (if available), but will keep using software virtualization for MMU instructions.
The fourth option will enable hardware virtualization for both types of instructions if available.
It depends on the CPU generation whether none, only the first, or both hardware virtualization options are available. For quite a few years now, Intel's and AMD's processors have supported CPU as well as MMU virtualization.

The default in the above dialog is "Automatic". This means that ESX(i) will choose what it considers the best option for the type of operating system that you have selected for the VM.
With Windows 2003 this is the "Software" mode. The reason is that Windows 2003 with SP1 in fact performs better with software emulation than with hardware virtualization. However, this changed with code changes introduced by Microsoft in SP2: Windows 2003 with SP2 performs better with hardware virtualization in almost every case.
Today, most Windows 2003 servers should have been updated to SP2. So, to ensure the best performance you should go and change the virtualization mode of these VMs to one of the hardware-assisted ones.
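If you prefer editing the VMX file over clicking through the dialog: to my knowledge the dialog maps to these two advanced-configuration entries (with "automatic", "software" and "hardware" being the valid values):

      monitor.virtual_exec = "hardware"
      monitor.virtual_mmu = "hardware"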

For more details see the KB-article mentioned above.

Raising the bar, Part V - vSphere 5 is near

If you look at VMware's homepage these days you will notice an announcement of a live event on July 12th. It is titled "Raising the bar, Part V".
You don't need to be a visionary to figure out that this can only mean that VMware will announce the long-awaited new major release of its virtualization platform: vSphere 5.

This does not necessarily mean that vSphere 5 will become generally available on July 12th. However, once it is available I will post a list of at least the most important new features of it. So, stay tuned!

Mysterious port 903

I recently investigated which network ports are used by ESXi 4.1, because I had to compile the firewall requirements for a new deployment of ESXi hosts in a DMZ. There is a detailed source for that in the VMware KB:
  • KB1012382: TCP and UDP Ports required to access vCenter Server, ESX hosts, and other network components
And there are numerous other sources available (even nice diagrams like this one). In most cases it is obvious that their authors referred to and relied on the above mentioned official VMware KB source.

I'm usually not paranoid, but maybe I talked too much with the IT security guys (who tend to be extremely paranoid ;-)). Anyway, following the rule "Trust no one" I started looking at the network ports that are really used in our current production environment and compared them to the list in the KB article.

So I stumbled upon port 903... According to the list both the vCenter server and any vSphere Client connect to an ESXi 4.1 host on that port to access a VM's remote console. However, when I checked the network connections on the vCenter server and my Windows desktop running the vSphere Client (with "netstat -an") I was not able to see any connection to an ESXi host's port 903, even when I opened multiple VM consoles. Instead it was obvious that port 902 is used for console connections.

This made me really curious, so I logged on to an ESXi host (in Tech Support Mode) and checked the open network connections there. In ESXi you use the command "esxcli network connection list" for that, which produces an output quite similar to the netstat output (with classic ESX the netstat command is still available in the service console).
This command also lists all ports that are open in LISTEN mode, meaning there is a process waiting for connections on that port. But there was no listening process for port 903, and that means that nothing would be able to connect to that port!
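The checks are quick to reproduce - the first command on the host in Tech Support Mode, the second one on the Windows side:

      esxcli network connection list | grep LISTEN
      netstat -an | findstr :903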

I opened a support request with VMware asking for clarification on the mysterious port 903 and was very curious about their answer. Of course, they quoted their own KB article first and insisted that the port was actually used for this and that, but finally - after raising the issue to engineering - they admitted that "ESXi does not use port 903".
A request was also made to update the KB article accordingly. So, by the time you read this it might already have been corrected to no longer include port 903, but the numerous third-party documents based on KB1012382 will take some more time to be updated ...

Bottom line: Information is good. Correct information is better. Try to verify it if it is really important to you.

A quick primer on Changed Block Tracking (CBT)

We are about to implement a new backup solution that is based on Symantec NetBackup 7, and - like any modern VMware backup solution - it leverages a very cool feature named Changed Block Tracking (CBT) that was introduced in vSphere 4.0 to enable efficient block-level incremental backups.

Since it has been around for a while there are numerous good articles about this topic (see references). I will not just reproduce them here, but summarize the most important key facts you need to know if you come into contact with it for the first time.

1. How does CBT work and what is it good for?
If CBT is enabled for a virtual disk the VMkernel will create an additional file (named ...-ctk.vmdk) in the same directory, in which it stores a map of all the virtual disk's blocks. Once a block is changed this is recorded in the map file. This way the VMkernel can easily tell a backup application which blocks of a file have changed since a certain point in time. The application can then perform an incremental backup by saving only these changed blocks.
CBT is also used by Storage VMotion that is able to move a virtual machine's disk files from one datastore to another while it is running.

2. How do you enable CBT?
CBT is enabled per virtual disk, and VMware's KB1031873 describes how to do this by editing a VM's advanced configuration parameters through the VI client. Unfortunately this requires the VM to be powered off. However, you can also change the setting while the VM is running by using an appropriate script like the one published here. To make the change effective you then need to perform a so-called stun/unstun cycle on the VM (i.e. power on/off, suspend/resume, create/delete snapshot).
It is important to know that CBT is not enabled by default, because it introduces a small overhead in virtual disk processing.
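For reference, these are the parameters in question (from KB1031873) - the first one for the VM, the second one per virtual disk to be tracked:

      ctkEnabled = "TRUE"
      scsi0:0.ctkEnabled = "TRUE"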

3. How do CBT and snapshots play together?
When you create a snapshot of a VM's virtual disk an additional ctk file is created for the delta disk file of the snapshot. Once this snapshot is deleted the delta ctk is merged with the base ctk, just like the delta disk is merged with the base disk.

4. Important notes and references

  • KB1020128: Changed Block Tracking (CBT) on virtual machines
  • KB1031873: Enabling Changed Block Tracking (CBT) on virtual machines
  • While an application backs up a VM using CBT the VM cannot be vMotioned: KB2001004
  • Inconsistency resolved in vSphere 4.0 U3 and vSphere 4.1: KB1021607
  • KB1031106: Virtual machine freezes temporarily during snapshot removal on an NFS datastore in a ESX/ESXi 4.1 host
  • Eric Siebert on CBT: A detailed introduction
  • Additions by Duncan Epping: Even more details...

How to hide unused FlexNICs

When I configured an HP Blade Enclosure with VirtualConnect modules for the first time I stumbled over an issue that probably has bothered most of the people doing this, especially if they run ESX(i) on the blade servers:

The BL620c G7 blade servers we are using have four built-in 10Gbit ports, and each of them can be partitioned into up to four so-called FlexNICs (or FlexHBAs for FCoE if you use them together with FlexFabric VirtualConnect modules like we do). The overall 10Gbit bandwidth of one port is split among its FlexNICs in a configurable way. You could e.g. have four FlexNICs with 2.5 Gbit each, two with 6 and 4 Gbit, or any combination of one to four FlexNICs with their bandwidth adding up to 10Gbit.
To the OS (e.g. VMware ESXi) that is installed on the blade server each FlexNIC appears as a separate PCI device. So an ESX(i) host installed on a BL620c G7 can have up to 16 NICs. Cool, eh?

However, we did not really want to use too much of that feature and divided the first two 10Gbit ports into a 4Gbit FlexHBA and a 6Gbit FlexNIC each. The third and fourth ports we even configured as single 10Gbit FlexNICs.

Now, the problem is that every 10Gbit port will show up as four PCI devices even if you have configured fewer than four FlexNICs for it. Even if you have not partitioned it at all, but use it as a single 10Gbit NIC, it will show up as four NICs with the unconfigured ones being displayed as disconnected!
In our case we ended up with ESXi seeing (and complaining about) 10 disconnected NICs. Since we monitor the blades with HP Insight Manager it also constantly warned us about the disconnected NICs.

So, we thought about a method to get rid of the unused FlexNICs. If we had Windows running directly on the blades this would have been easy: We would just disable the devices, and Windows (and also HP Insight Manager) would not be bothered by them. However, in ESX(i) you cannot just disable a device ... but you can configure it for "VMDirectPath":

PCI Passthrough configuration of a BL620c G7
This dialog can be found in the Advanced Hardware Settings of a host's configuration. What does it do?
With VMDirectPath you can make a host's PCI device available to a single VM. It will be passed through to the VM, and the guest OS will then be able to see and use that device in addition to its virtual devices.
This way it is possible to present a physical device to a VM that you normally would not be able to add.

In the dialog shown above you configure which devices are available for VMDirectPath (also called PCI Passthrough). You can then add all the selected devices to the hardware of individual VMs.
We really did not want to do the latter... but there is one desirable side effect of this configuration: a device that is configured for VMDirectPath becomes invisible to the VMkernel. And this is exactly what we wanted to achieve for the unused FlexNICs!

So we configured all unused FlexNICs for VMDirectPath, and they were no longer displayed as (disconnected) vmnics. If you want to do the same you need to know which PCI device each vmnic corresponds to. In the screenshot I posted you will notice that for some of the PCI devices the vmnic name is displayed in brackets, but not for all. So it can be hard to figure out which devices need to be selected, but it's worth it!
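The service console command esxcfg-nics -l can help with that mapping, because it lists every vmnic together with its PCI address. A trimmed-down sketch of its output (the PCI addresses, driver and MAC values here are invented for illustration):

# esxcfg-nics -l
Name    PCI           Driver  Link  Speed      Duplex  MAC Address        MTU   Description
vmnic0  0000:06:00.00 be2net  Up    10000Mbps  Full    00:17:a4:77:00:00  1500  ...
vmnic4  0000:06:00.04 be2net  Down  0Mbps      Half    00:17:a4:77:00:04  1500  ...

Match the PCI column against the device addresses shown in the passthrough dialog to identify the unused FlexNICs.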

Using Converter: Hot or cold clone?

Ulli Hankeln (aka continuum) recently started an interesting thread in the VMware communities asking what cloning method people prefer with VMware Converter.
Some time ago we started a fairly large server virtualization project (several hundred servers) and asked ourselves exactly this question. After some discussion we identified a set of evaluation criteria and judged both methods against them:

1. Risk of data loss
This is a clear point for cold cloning, because it makes the machine absolutely unavailable and unchangeable during the conversion process. With a hot clone it is possible that data on the machine changes during the conversion process and is then not replicated to the virtual clone.

2. Operations complexity
This, in contrast, is a clear point for hot cloning. The process is fully automated through the Converter GUI. For cold cloning you need to boot the server from a CD. If you are lucky you can do that remotely by using HP iLO or similar remote control adapters. But you will always stumble over some servers that you need to visit physically to mount the boot CD.

3. Operations success rate and possible errors
What can go wrong with a cold clone? You might not even be able to start it, because you cannot boot the server through iLO (or similar means) and do not have physical access to the system (at the time you need it) to mount the CD physically. If that is not a problem, the second possible error is that the boot CD does not include the necessary drivers for the conversion: you need a working mass storage driver and a working network driver, but the CD only includes a limited set of drivers that might not be suitable if the server you are going to virtualize has fairly modern hardware. It is possible to rebuild the boot CD and add additional drivers to it, but this is an extra step and needs extra time and effort.
What can go wrong with a hot clone? You will always be able to start the hot cloning, but in some situations it will fail halfway through or at the end. Possible errors are e.g. corrupt file systems or bugs in Converter. In such a case troubleshooting is difficult, although you will find some hints in the VMware KB. Fortunately this does not happen too often - in our experience in clearly fewer than 5% of all cases.
So, what works better? It very much depends on your setup, the types of hardware you have, the OSs you are virtualizing and the applications you are running in them.
For us hot cloning wins on this criterion, but we were not able to decide that until we had already started our virtualization project and made good progress with it.

4. VMware support
In recent releases of Converter (including 5.0, currently in beta) VMware has dropped support for cold cloning. I guess the reason is that they wanted to get rid of maintaining the boot CD and instead concentrate on improving the hot conversion process. And they really did that, e.g. by adding incremental data synchronization after the first data copy run.
So this is a win for hot cloning. Anyway, it is still possible to use cold cloning with the current releases of Converter, because they support importing the images of third-party cloning tools (like Symantec Ghost and Acronis TrueImage). Converter's reconfiguration step makes these images bootable in the virtual clone by injecting the necessary mass storage drivers.

Conclusions: In our project we have a "hot cloning first" policy. To mitigate the risk of data loss we stop all unneeded services on the source machine (like application and database services) before starting the process, and on machines that are used interactively (like Windows Terminal Servers) we disable logons, as shown in the sketch below.
Only if the hot clone fails do we try a cold clone as the next step.
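The freeze itself can be done with plain Windows on-board commands; a minimal sketch (the service name is just an example, stop whatever application services run on your source machine):

rem stop application and database services on the source machine
net stop "MSSQLSERVER"

rem on a Terminal Server: block new interactive logons
change logon /disable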

As stated earlier, your mileage may vary. What do you think? As always, comments are welcome.

VMware Converter 5.0 beta available

VMware has released a beta of the upcoming new version 5.0 of Converter. Check this community for details.
According to the preliminary release notes the new version will support the next version 5.0 of vSphere, but also earlier versions down to ESX 3.0 and vCenter 2.5.
But the most important and best news is that Converter 5 will properly handle disk alignment (see my previous post about the lack of this in earlier versions)!

Two things to know about VMware Converter

VMware Converter is a tool to convert physical machines (either online or via an existing backup image) to VMware virtual machines. It is available as a free stand-alone version and in a vCenter-integrated version.
Like others we are using it a lot for virtualizing existing physical servers, just because it's free and/or comes pre-installed with vCenter. It also does a pretty good job and is well supported by VMware, but ...
you should be aware of two issues when using Converter more than occasionally.

1. Windows 2000 support
With the latest versions (stand-alone Converter 4.3 and the vCenter 4.1-integrated one) VMware dropped support for converting Windows 2000 machines (see the notes about supported guest OSs in the Release Notes). The really bad thing about this is that it does not simply tell you so when you try to convert a Windows 2000 machine, but throws an error message about not being able to install the Converter agent on the target computer. It looks like it tries to install the Windows XP version of the agent, which fails.
At first this does not look like a big problem, because older versions of Converter still support Windows 2000. If you run vSphere 4.1 you can use the stand-alone Converter 4.0.1 to convert Windows 2000 machines by connecting to the vCenter 4.1 server or directly to an ESX(i) 4.1 host. We have done this a lot and it has always worked. However, if you look carefully at the Release Notes of Converter 4.0.1 you will notice that it only supports vSphere 4.0 as the virtualization platform, not vSphere 4.1.
We asked VMware support how we - as a vSphere 4.1 customer - are supposed to convert a Windows 2000 machine using Converter in a way that is fully supported by VMware. Here are the instructions (it's only one possible way, but you will get the idea):
a) Install an ESX(i) 4.0 host and add it to an existing vCenter 4.1 instance
b) Use the Stand-alone Converter 4.0.1 to connect to this ESX(i) 4.0 host and convert the Windows 2000 machine
c) Migrate the virtualized Windows 2000 machine to an ESX(i) 4.1 host (either cold or by VMotion)
That's a bit cumbersome, isn't it? Anyway, as stated above you can also use the stand-alone Converter 4.0.1 to connect directly to vSphere 4.1. It is not officially supported, but it seems to work quite well.

2. Disk alignment
If you care about storage performance then you want your VMFS volumes and your guest OS partitions to be aligned. There are a lot of good explanations about what disk alignment is and why it is important. My personal favorite is on Duncan Epping's blog.
Now, the big issue is that VMware Converter does not align the guest OS partitions in the target virtual machine. Although VMware has been pointing out the importance of disk alignment for a long time (see e.g. this ESX3 white paper), they have still - as of version 4.3 - not built this capability into their own Converter product.
So, if you are serious about disk performance and are planning for a large virtualization project you may want to consider alternatives to VMware Converter. There are other commercial products available that do proper disk alignment. One example is Quest vConverter.
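By the way, you can easily check whether the partitions of a Windows guest are aligned by looking at their starting offsets from inside the guest:

C:\> wmic partition get Name, StartingOffset

A StartingOffset of 32256 bytes (the classic 63-sector offset created by older Windows setup routines) indicates a misaligned partition; an offset that is a multiple of 64 KB (65536), e.g. the 1 MB (1048576) that Windows 2008 uses by default, is typically fine.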

Update (2011-09-01): Good news: today VMware released Converter 5.0, which is now able to do disk alignment!

Network troubleshooting, Part III: A real life example (Broadcom NICs dropping packets)

Recently we had a strange problem inside a Linux VM: an rsync job that was used to copy data from a local disk to an NFS-mounted share reproducibly failed during the data copy with a "broken pipe" error message.

Using the methods I wrote about in Part I and Part II of this little troubleshooting series (and some trial and error for sure) we found out that the issue would only occur if the VM was using a certain type of physical NIC, the HP NC371i (with a Broadcom BCM5709 chipset).
Later we also discovered corresponding messages in the VMkernel log like these:

... vmkernel: 36:02:06:55.923 cpu5:6816883)WARNING: Tso: 545: TSO packet with one segment.
... vmkernel: 36:02:06:56.325 cpu5:7129949)WARNING: Tso: 545: TSO packet with one segment.
... vmkernel: 36:02:06:57.128 cpu4:6816885)WARNING: Tso: 545: TSO packet with one segment.
... vmkernel: 36:02:06:57.128 cpu4:6816885)WARNING: LinNet: map_pkt_to_skb: This message has repeated 640 times: vmnic1: runt TSO packet (tsoMss=1448, frameLen=1514)

Enough evidence to open a support call with VMware... The outcome was that there is a known problem with the bnx2 driver (which is used for this type of NIC): it drops TSO packets that are below a certain minimum size it expects. The issue only occurs with some of the Broadcom chipsets that this driver can handle. The BCM5709 was not on the list before we opened our case, but it looks like it is affected as well.

By the way, TSO stands for TCP segmentation offload and is used to offload the necessary segmentation of large TCP packets to the NIC's hardware. A good thing, if it works flawlessly.
The obvious workaround is to disable TSO by using the appropriate driver options. You could disable it on the host's physical Broadcom NICs, but this would mean sacrificing the performance benefits of TSO for all VMs using these NICs.
We did not do that, because none of the other VMs had any problems with TSO. Instead we decided to disable TSO only inside the Linux VM that had the problem. This solved the issue for us.
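Inside a Linux guest TSO can be checked and turned off with ethtool (assuming the interface is eth0; making the change persistent across reboots is distribution specific):

# show the current offload settings of eth0
ethtool -k eth0

# disable TCP segmentation offload on eth0
ethtool -K eth0 tso off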

Network troubleshooting, Part II: What physical switch port does the pNIC connect to?

When you have found out what physical NIC (pNIC) a VM is actually using (see my previous post) you may want to check the external switch port that this pNIC connects to (Is it properly configured, what about the error counters?). Okay, what switch port do you need to check?
It is considered good data center practice to have every connection from every server's pNIC to the switch ports carefully documented. Do you? Do you trust the documentation? Is it up to date? If yes you are fine and can stop reading here...

If you want to be sure, and if you use Cisco switches in your data centers then there is a much more reliable way to track these connections: The Cisco Discovery Protocol (CDP). On Cisco devices this is enabled by default, and it periodically broadcasts interface information to the devices attached to its ports (like your ESX hosts).
By default ESX(i) (version 3.5 and above) will receive these broadcasts and display the information contained in them through the VI client. In the Hosts and Clusters view select Networking in the Configuration tab of a host. This will display your virtual switches with their physical up-links (vmnic0, vmnic1, etc.). Now click on the little speech bubble next to a physical adapter and a window like the following will pop up:

CDP information shown in the VI client

You can find a lot of useful information here. The Device ID is the name of the Cisco switch, and the Port ID shows the number/name of the switch module and the port number on that module. So you can tell your network admins exactly which switch port they need to check.

If CDP information is not available for a physical adapter the pop-up window will tell you this, too. Possible reasons: you don't use Cisco switches, you have CDP broadcasts disabled on them, or the ESX(i) host's interfaces are not in CDP listen mode.
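From the host's side you can verify and change the CDP mode of a vSwitch in the service console (vSwitch0 is just an example name here):

# show the current CDP status of vSwitch0 (down, listen, advertise or both)
esxcfg-vswitch -b vSwitch0

# enable both listening to and advertising CDP information
esxcfg-vswitch -B both vSwitch0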

For more detailed information on CDP and how to configure it in ESX see the VMware KB: Cisco Discovery Protocol (CDP) network information.

Network troubleshooting, Part I: What physical NIC does the VM use?

If you encounter a network issue in a VM (like bad performance or packet drops) a good first question to ask yourself is: Is this issue limited to the VM or can it be pinned to one of the host's physical NICs?
So, you need to find out which physical NIC (pNIC) the VM is actually using. In most environments this is not obvious, because the virtual switch that the VM connects to typically has multiple physical up-links (for redundancy) that are all active (to maximize bandwidth).

Unfortunately, it is not possible to find this out by using the VI client. It does not reveal this information, regardless of whether you use standard or distributed virtual switches.
You need to log in to the host that runs the VM (see the HowTos section for instructions) and run esxtop.
Press n to switch to the network view, and you will see a picture like this one:

Network view of esxtop
Find the VM's display name in the USED-BY column and then look at the corresponding TEAM-PNIC column. In this example the VM FRASINT215 uses vmnic1.
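In text form the relevant columns look roughly like this (the port ID, world ID and switch name are invented for illustration; only the VM name is taken from the example above):

PORT-ID   USED-BY          TEAM-PNIC  DNAME
16777221  1234:FRASINT215  vmnic1     vSwitch0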