ImageBuilder Deep Dive, Part 1: Building your own customized ESXi ISO

I know that my ESXi-Customizer script has become quite popular, and a lot of people use it to merge community-developed or commercial third-party hardware drivers into the ESXi installation ISO. It has some limitations though. Some of you might have stumbled over them already, and I have described them in the "Known issues" section. Probably the worst is that it cannot handle complex software bundles like ESXi patches (e.g. the ESXi 5.0 Update 1 package).

My recommendation for such cases is to use the VMware supplied and supported method to create your own customized installation ISOs: The ImageBuilder PowerCLI snapin. With this post I'm going to start a series of blog posts that will cover ImageBuilder in detail and will help you to make effective use of it.

This first post will cover the prerequisites and installation of PowerCLI and will introduce a script that will help you to get the task done. You won't need in-depth PowerShell knowledge for this, but it will definitely help if you are also interested in the remaining parts of the series, which will go into the details of the ImageBuilder cmdlets and finally refine the first script into a more universal solution.

Prerequisites and installation

Obviously you need a Windows computer with the current version (2.0) of PowerShell installed. PowerShell is Microsoft's state-of-the-art scripting language. It is already included with Windows 7 and Windows Server 2008 R2; for earlier versions of Windows it is available as part of the Windows Management Framework. If you are a serious Windows admin you have probably looked at it before. If not, this is a good opportunity to do so. You can start learning e.g. at Microsoft's Script Center, but this is not a prerequisite for now!
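
To confirm which PowerShell version a machine actually runs, the automatic variable $PSVersionTable can be queried. The following is only a small sketch; the helper name Test-MinimumPSVersion is made up for this example and is not part of PowerShell or PowerCLI.

```powershell
# Hypothetical helper (not part of PowerCLI): returns $true if the running
# PowerShell is at least the requested major version.
function Test-MinimumPSVersion {
    param([int]$Major = 2)
    # $PSVersionTable was introduced with PowerShell 2.0; on 1.0 it is empty
    if (-not $PSVersionTable) { return $false }
    return ($PSVersionTable.PSVersion.Major -ge $Major)
}

# Check for the version that this post assumes
Test-MinimumPSVersion -Major 2
```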

The functionality of PowerShell can be extended through so-called snapins. VMware provides such snapins to add functions (so-called cmdlets) that you can use to manage vCenter servers and ESX(i) hosts. The package is called PowerCLI, and once you have downloaded and installed the current version you are ready to go!

The following PowerShell script creates an ESXi 5.0 installation ISO with the current patch level, the HP Offline bundles and some HP drivers. I suggest that you do the following to walk through the explanations below:
  • Download a copy of the script to your computer
  • In Explorer right-click on the file and choose "Edit" from the context menu. This will open it in the PowerShell ISE (Integrated Scripting Environment) editor.
  • Within the ISE you can select single (or multiple) lines of the script and execute them separately from the rest.

[Figure: Run selected lines of code in the PowerShell ISE]

The first script
# Load the ImageBuilder Snapin
Add-PSSnapin VMware.ImageBuilder

# Reference the VMware ESXi base depot
$baseDepot = Add-EsxSoftwareDepot https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml

# Reference the HP VIBs depot 
$hpDepot = Add-EsxSoftwareDepot http://vibsdepot.hp.com

# List the VIB packages of HP depot
$hpDepot.Channels[0] | Get-EsxSoftwarePackage

# Reference downloaded HP driver offline bundles
$be2iscsi = Add-EsxSoftwareDepot "U:\HP-ESXi5-Drivers\be2iscsi-4.0.317.0-offline_bundle-469760.zip"
$be2net = Add-EsxSoftwareDepot "U:\HP-ESXi5-Drivers\be2net-4.0.355.1-offline_bundle-487292.zip"
$lpfc820 = Add-EsxSoftwareDepot "U:\HP-ESXi5-Drivers\lpfc820-8.2.2.105.36-offline_bundle-489567.zip"

# List available Imageprofiles sorted by creation date (newest first)
Get-EsxImageProfile | Sort-Object -Descending CreationTime | Format-List Name,CreationTime

# Create your own Imageprofile
$MyProfile = New-EsxImageProfile -CloneProfile ESXi-5.0.0-20120404001-standard -Name MyProfile -Description "ESXi-5.0.0-20120404001-standard + HP components"

# Add all the HP VIB packages to MyProfile
$hpDepot.Channels[0] | Get-EsxSoftwarePackage | Add-EsxSoftwarePackage -ImageProfile $MyProfile
$be2iscsi.Channels[0] | Get-EsxSoftwarePackage | Add-EsxSoftwarePackage -ImageProfile $MyProfile
$be2net.Channels[0] | Get-EsxSoftwarePackage | Add-EsxSoftwarePackage -ImageProfile $MyProfile
$lpfc820.Channels[0] | Get-EsxSoftwarePackage | Add-EsxSoftwarePackage -ImageProfile $MyProfile

# Export the Imageprofile into an installation ISO file
Export-EsxImageProfile -ImageProfile $MyProfile -ExportToIso -FilePath "U:\MyProfile.iso"

Line 2: This command will add the ImageBuilder snapin to the current PowerShell session. Please note: If you start your session with the PowerCLI shortcut on your desktop then the snapin will automatically be loaded and you can skip this line.
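
If the same script should run both from a plain PowerShell session and from the PowerCLI shortcut, line 2 can be guarded so that the snapin is only added when it is missing (adding a snapin that is already loaded produces an error). A minimal sketch; the wrapper name Import-ImageBuilderSnapin is made up for this example, while Get-PSSnapin and Add-PSSnapin are the standard cmdlets of Windows PowerShell 2.0:

```powershell
# Hypothetical wrapper: load the ImageBuilder snapin only if it is missing
function Import-ImageBuilderSnapin {
    # Get-PSSnapin lists the snapins that are loaded in the current session
    $loaded = Get-PSSnapin -Name VMware.ImageBuilder -ErrorAction SilentlyContinue
    if (-not $loaded) {
        Add-PSSnapin VMware.ImageBuilder
    }
}
```

Calling Import-ImageBuilderSnapin at the top of the script would then take the place of line 2.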

Line 5: The Add-EsxSoftwareDepot cmdlet adds an Online depot or a downloaded Offline bundle as a package source to the current ImageBuilder session. The VMware depot referenced here includes the base ESXi 5.0 sources and is needed in any case. Obviously adding an Online depot requires a working connection to the Internet. I could get this to work with a direct connection only, but not through a proxy server. If someone knows how to make this work with a proxy then please comment here!

Line 8: This adds an additional Online depot that was made available by HP and includes their Offline bundles for ESXi 5.0.

Line 11: A software depot object like $hpDepot that was created through the Add-EsxSoftwareDepot cmdlet can be organized in multiple channels (but mostly it's only one channel). By piping the first channel into the Get-EsxSoftwarePackage cmdlet we can list the software packages that are included in this depot. The output looks like this:
Name                     Version                        Vendor     Release Date    
----                     -------                        ------     ------------    
hpnmi                    2.0.11-434156                  hp         29.07.2011 20...
char-hpilo               500.9.0.0.9-1OEM.500.0.0.43... Hewlett... 07.10.2011 14...
hp-ams                   500.9.0.0-55.434156            Hewlett... 09.03.2012 23...
char-hpcru               5.0.0.8-1OEM.500.0.0.434156    Hewlett... 15.07.2011 17...
hpbootcfg                01-00.10                       Hewlett... 15.07.2011 16...
hpacucli                 9.0-24.0                       Hewlett... 02.02.2012 06...
hponcfg                  04-00.10                       Hewlett... 13.11.2011 02...
hp-smx-provider          500.02.10.61.43-434156         Hewlett... 18.11.2011 02...

This step is not really needed to build the new ISO. So you can safely skip it if you already know or are not interested in the contents of a depot.
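
If you do list a depot, the default table layout cuts off long version and vendor strings, as in the output above. One way around this is to fit the column widths to the data with Format-Table -AutoSize (as far as the console width allows). A small sketch; the helper name Format-PackageList is made up for this example:

```powershell
# Hypothetical helper: print package objects sorted by name, with column
# widths fitted to the data instead of the default fixed layout.
function Format-PackageList {
    $input |
        Sort-Object Name |
        Format-Table Name, Version, Vendor -AutoSize
}

# Inside an ImageBuilder session (not run here):
#   $hpDepot.Channels[0] | Get-EsxSoftwarePackage | Format-PackageList
```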

Lines 14-16: In these lines we add three additional depots, but this time these are not Online depots, but Offline bundles that we downloaded before. You can find the download links on my HP & VMware links page (see section ESXi 5.0 drivers for the HP Emulex 10GbE Converged Network Adapters). But please note: The files that you download from VMware's driver pages are in zip format, but they are not Offline bundles. They include Offline bundles though, so you need to extract the *offline_bundle*.zip files from the downloaded zip-files first. And of course you need to change the file paths used in the script to your own download directory.
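
Instead of hard-coding one variable per bundle, the extracted Offline bundles could also be collected from the download directory by their file name pattern. This is only a sketch under the assumption that all extracted files keep "offline_bundle" in their names; the helper name Get-OfflineBundlePath is made up for this example.

```powershell
# Hypothetical helper: return the full paths of all extracted Offline bundles
# (*offline_bundle*.zip) in a directory, sorted by file name.
function Get-OfflineBundlePath {
    param([string]$Directory)
    Get-ChildItem -Path $Directory -Filter '*offline_bundle*.zip' |
        Sort-Object Name |
        ForEach-Object { $_.FullName }
}

# Inside an ImageBuilder session (not run here), each path can be added as a
# depot in the same way as in lines 14-16 of the script:
#   $bundles = Get-OfflineBundlePath 'U:\HP-ESXi5-Drivers' |
#       ForEach-Object { Add-EsxSoftwareDepot $_ }
```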

Line 19: The base depot that we added in line 5 includes not only software packages, but also so-called image profiles. An image profile is a grouping of software packages out of the depot. In this case each image profile makes up a specific ESXi patch level. The Get-EsxImageProfile cmdlet will list all available image profiles. We pipe it through the Sort-Object cmdlet here to sort the output by creation date. The output will show the newest image profile first and looks like this:
Name         : ESXi-5.0.0-20120404001-no-tools
CreationTime : 16.03.2012 22:59:09

Name         : ESXi-5.0.0-20120404001-standard
CreationTime : 16.03.2012 22:59:09

Name         : ESXi-5.0.0-20120302001-no-tools
CreationTime : 16.03.2012 22:59:08

Name         : ESXi-5.0.0-20120302001-standard
CreationTime : 16.03.2012 22:59:08

Name         : ESXi-5.0.0-20120301001s-standard
CreationTime : 16.03.2012 22:59:08

Name         : ESXi-5.0.0-20120301001s-no-tools
CreationTime : 16.03.2012 22:59:08

Name         : ESXi-5.0.0-20111204001-standard
CreationTime : 31.10.2011 11:24:00

(... output shortened for readability ...)

Each image profile comes in two different flavors: The -standard one includes all packages that make up an ESXi 5 installation, the -no-tools version is the same without the VMware Tools package. We will choose the standard version in our example.

Line 22: The image profiles of the Online depot are read-only. Since we want to modify one of these and add more packages, we need to create a copy first. With the New-EsxImageProfile cmdlet we create a new image profile ($MyProfile) by cloning the newest standard image profile (ESXi-5.0.0-20120404001-standard in this case). We choose the newest one, because it represents the most recent patch level of ESXi! We also assign a new name to the cloned profile (this is mandatory) and a new description (optional).
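
Picking the newest standard profile by eye from the listing can also be scripted. The following is a minimal sketch, not part of the original script: the helper name Select-NewestStandardProfile is made up, and it assumes that profile names keep ending in -standard and that CreationTime reflects the patch order (profiles with identical time stamps would need an extra tie-breaker).

```powershell
# Hypothetical helper: read image profile objects from the pipeline, keep the
# "-standard" ones and return the one with the latest CreationTime.
function Select-NewestStandardProfile {
    $input |
        Where-Object { $_.Name -like '*-standard' } |
        Sort-Object CreationTime -Descending |
        Select-Object -First 1
}

# Inside an ImageBuilder session (not run here):
#   $newest    = Get-EsxImageProfile | Select-NewestStandardProfile
#   $MyProfile = New-EsxImageProfile -CloneProfile $newest -Name MyProfile
```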

Lines 25-28: In line 11 we already listed the software packages that are included in the $hpDepot. By piping the output of this command to the Add-EsxSoftwarePackage cmdlet we add all the included packages to our newly created image profile $MyProfile. And we do the same for the three Offline bundles that we added in lines 14 to 16.

Line 31: In the last line we use the Export-EsxImageProfile cmdlet to "export" our customized image profile into an ISO file. Now guess what: This ISO file is a complete ESXi 5.0 installation media! You can use it to install a machine with ESXi 5.0 with the current patch level and all the HP packages that you added before.

[Figure: Booting an ImageBuilder customized ESXi 5.0 installation ISO]

Wrap-up

In this first part of my ImageBuilder Deep Dive series I introduced a script that you can use to build an up-to-date ESXi 5.0 installation ISO with additional drivers. You learnt some basic terms, the most important cmdlets that are necessary to get the job done, and you will (hopefully) be able to modify the script and adapt it to your own needs.

In the next part I will introduce some more useful ImageBuilder cmdlets, and I will explain how you can integrate community-developed software packages into the installation media (a job that you can also do with my ESXi-Customizer script). Stay tuned!

New page: HP & VMware links

I had this plan for some time now: Providing a list with links to HP drivers and firmware downloads and useful documents for VMware ESX(i).

Why? Because it is so hard to find this stuff on the HP pages ... Currently they are revamping their web pages, and I thought maybe this is getting better now, and that they would finally begin to provide their downloads in a well structured way making it easy to find the stuff again and link to it. But they didn't - it looks like they are just changing the layout of the web pages and are trying to break their own ridiculous world record for the longest URLs.

Okay, enough grumbling. Here is the list. Of course it is far from complete and mainly covers the hardware that I'm using myself (that makes sure that I will keep it up to date), but I believe that everyone using VMware on HP hardware will find it useful.

Review: PHD Virtual Backup and Replication for VMware vSphere

One reason why I consider virtual servers to be superior to physical servers is that you have advanced hypervisor-based technologies and functionalities available that let you manage virtual servers more efficiently. A good example of this is doing backups.

PHD Virtual was among the first to provide a virtualization-specific backup solution. They use a Virtual Appliance approach without the need for physical hardware and make use of VMware's Changed Block Tracking (CBT) feature to enable efficient block level incremental backups. Their backup product has evolved since 2006 and is now available as PHD Virtual Backup and Replication v5.4. It is compatible with VMware vSphere 4.1 and 5.0 (I used both for testing) and also works with Citrix XenServer.

How does it work?

[Figure: PHD Virtual Backup architecture]

The main engine of the product is a Linux based Virtual Appliance that is deployed to the virtualization hosts that also run the virtual machines that are to be backed up. For easy scale-up you can have multiple copies of such Virtual Backup Appliances (VBA). They connect to a single ESX(i) host or to a vCenter server managing multiple hosts.

For backing up a VM the VBA first creates a snapshot for it through this connection. Creating a snapshot means that each virtual hard disk of the VM will be frozen into a read-only base disk, and all changes to the disk (done by the guest OS) will be recorded in an extra delta file. The VBA will then hot-add the read-only base disk to itself and do a block level backup of it to a backup store. The backup store can be a locally attached virtual disk or a network share that was exported via CIFS or NFS.

The very first backup will copy all blocks of the disk, but to minimize the overall backup data and time the VBA will use smart deduplication and compression algorithms. For subsequent backups of the same disk you can make use of CBT, and that means that only the blocks that were changed or added since the last backup will be stored. Every backup is a full backup though, because the unchanged blocks are also linked into each new backup set. This strategy is also known as incremental forever. Together with the applied deduplication and compression this is probably the most efficient backup method that you can implement these days.
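
The interplay of CBT, deduplication and "incremental forever" can be illustrated with a toy model. This is only a conceptual sketch and not PHD Virtual's implementation: disks are hashtables of block number to text, the list of changed blocks is passed in by hand instead of coming from CBT, and the names $blockStore, Get-BlockHash and New-BackupSet are made up for this example.

```powershell
# Content-addressed block store shared by all backup sets (deduplication):
# key = SHA-256 hash of a block, value = the block's data.
$blockStore = @{}

function Get-BlockHash {
    param([string]$Data)
    $sha   = [System.Security.Cryptography.SHA256]::Create()
    $bytes = [System.Text.Encoding]::UTF8.GetBytes($Data)
    return [System.BitConverter]::ToString($sha.ComputeHash($bytes))
}

function New-BackupSet {
    param(
        [hashtable]$Disk,         # block number -> block data (toy disk)
        [hashtable]$PreviousSet,  # block number -> hash; omitted for the first backup
        [int[]]$ChangedBlocks     # stand-in for what CBT would report
    )
    $set = @{}
    if ($null -ne $PreviousSet) {
        # Link all blocks of the previous set: nothing is read or written here
        foreach ($n in $PreviousSet.Keys) { $set[$n] = $PreviousSet[$n] }
        $toRead = $ChangedBlocks
    } else {
        # Very first backup: every block has to be read
        $toRead = @($Disk.Keys)
    }
    foreach ($n in $toRead) {
        $hash = Get-BlockHash $Disk[$n]
        if (-not $blockStore.ContainsKey($hash)) {
            $blockStore[$hash] = $Disk[$n]   # new content is stored only once
        }
        $set[$n] = $hash
    }
    return $set
}

# Toy run: blocks 1 and 2 have identical content
$disk = @{ 0 = 'boot'; 1 = 'data'; 2 = 'data'; 3 = 'logs' }
$full = New-BackupSet -Disk $disk                 # reads 4 blocks, stores 3
$disk[3] = 'logs+more'                            # the guest OS changes block 3
$incr = New-BackupSet -Disk $disk -PreviousSet $full -ChangedBlocks 3
```

After the second run $incr still describes all four blocks of the disk (so it can be restored on its own), although only one block was read and one new block was added to the store.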

After all (changed) blocks have been backed up the VBA will hot-remove the disk again from itself, and - as a last step - initiate the deletion of the VM's snapshot. Needless to say the VBA will of course also save the VM's configuration, so that it can later be completely re-created and restored from the backup set.

Ease of installation and configuration

One promise that PHD Virtual makes for its product is that you can have it up and running in 5 minutes. So, how long did it take me? Well, in fact it was less than 5 minutes!

The software download includes the VBA in OVF format and a Windows based management console. Within a minute I had imported the appliance into the vCenter server of my test environment. It is ~800 MB in size and inflates to 8 GB if thick provisioned. Before powering it on I attached an additional 100 GB disk to the VBA that I planned to use as backup storage. When powered on the VBA configured its network through DHCP (manually setting an IP address is also possible on the console).

While the import was running I also installed the management software on my workstation. A typical Windows setup wizard did the job with five mouse clicks and left me no chance to make a mistake.

Installation done! I then launched the management console and entered the name of the vCenter server and login credentials for it. The application started with a Dashboard view where the VBA was already listed and shown as waiting to be configured:

[Figure: PHD Virtual Backup console - first launch]

In the Configuration dialog you need to define some General Settings and assign a Backup Storage to the VBA:

[Figure: VBA General Settings]

Time sync through NTP should be properly set up in this dialog to ensure correct handling of backup schedules and time stamping of backup sets! The Hypervisor Credentials will be used by the VBA to connect to the vCenter server (or to a single ESX host). The number of Data Streams (i.e. the maximum number of simultaneous backup and restore jobs) is limited to 4 in the Trial version. In the Enterprise version the maximum is 8.

[Figure: VBA Backup Storage settings]

I selected the Attached Virtual Disk as Backup Storage Type in the next tab. Other choices are an NFS or CIFS share. By using local disks you can implement a LAN-free backup, eliminating any possible network bottlenecks. Backup data is compressed by default, but this option is also configurable here.

After saving the configuration a restart of the VBA is necessary for the settings to become effective. Actually these steps are enough to get you ready for your first backup job! To complete the picture here are the remaining configuration options:
  • Network settings: DHCP or manual IP address, DNS servers
  • Email: SMTP settings used for sending messages (of selectable severity levels)
  • Backup Retention: lets you define how long you want to keep backup sets. Out-dated sets will be automatically deleted to save backup storage space.
  • Replication and Connectors: See "Beyond Backup: DR Replication" below
  • Support options
A while later I noticed that the installation of the console software also included a plugin that nicely integrates the console's dialogs into the vSphere client. This integration saves you launching an extra program and entering your credentials a second time:

[Figure: PHD Virtual Backup - vSphere Client integration]

Let's back up!

At the beginning of this post I already explained how the backup process works technically. Scheduling it is easy with the management console: Obviously the first step is to select the VM(s) to back up. You can pick individual VMs here or a complete VM folder from the vCenter inventory view. If you schedule a job for a folder then it will also automatically apply to any VMs that are added later to this folder - a nice way to implement a "set and forget" backup schedule.

The next steps are to select a VBA that will do the job (if you have multiple VBAs in the same cluster) and the backup schedule: The choices are "Now", "Once", "Daily" and "Weekly" here. In the Options dialog you can define additional parameters for the job:

[Figure: Backup job options]

  • Verify backup: None (fastest), New blocks only (safe mode), All blocks (paranoia mode)
  • Backup powered off machines: Not by default
  • Set backups as archived: Allows for archived backups that will never be deleted regardless of retention time configuration
  • Quiesce the VM before backing up (Windows only): Will initiate a Windows Volume Shadow Copy Service (VSS) snapshot through the VMware Tools which allows for application consistent backups
  • Use Changed Block Tracking: Use the VMware CBT feature to identify disk blocks that were changed since the last backup
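
How the Backup Retention setting and the "archived" flag interact can be pictured as a simple filter: a backup set is a candidate for deletion only if it is older than the retention time and not marked as archived. This is a conceptual sketch only; the function name Select-ExpiredBackupSet and the properties Created and Archived are made up for this example and do not reflect the product's data model.

```powershell
# Hypothetical filter: return the backup sets that retention would delete
function Select-ExpiredBackupSet {
    param(
        [int]$RetentionDays,
        [datetime]$Now = (Get-Date)
    )
    $cutoff = $Now.AddDays(-$RetentionDays)
    # Archived sets are never returned, no matter how old they are
    $input | Where-Object { (-not $_.Archived) -and ($_.Created -lt $cutoff) }
}

# Stand-in objects for three backup sets of the same VM
$sets = @(
    (New-Object PSObject -Property @{ Name = 'vm1-old';      Created = [datetime]'2012-01-01'; Archived = $false }),
    (New-Object PSObject -Property @{ Name = 'vm1-archived'; Created = [datetime]'2012-01-01'; Archived = $true  }),
    (New-Object PSObject -Property @{ Name = 'vm1-recent';   Created = [datetime]'2012-04-10'; Archived = $false })
)

# With 30 days of retention, only 'vm1-old' qualifies for deletion
$expired = $sets | Select-ExpiredBackupSet -RetentionDays 30 -Now ([datetime]'2012-04-15')
```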
Once a backup job runs you can watch its progress in two different places: In the Jobs view of the Console program, and - using the vSphere client - on the console of the VBA:

[Figure: VBA console showing backup progress]

In both places you can also see statistics about how much data was backed up, how much was actually written, and the deduplication ratio.

I was very pleased with the speed of the incremental backups using CBT. Backing up a VM this way would often take only a couple of minutes. If you are not satisfied with the speed of backups, one thing to check is the CPU and memory load of the VBA(s). By default they have minimum resources (1 vCPU and 1 GB RAM) - upgrading them with more CPUs and RAM can significantly improve their throughput.

Backups are boring, Restores are exciting?!

When using traditional backup methods (with an agent running inside the guest OS and doing a file level backup) restoring a machine can be a time-consuming and thrilling experience. In this case a restore can be an error-prone process that needs to be carefully tested for each OS version and special applications like SQL that need special agents.

With the image based backup approach that PHD Virtual is using this is somewhat different. Usually you do not need to worry about a restored machine not being bootable, because the OS will always be in a consistent state. If the applications that run inside the VM benefit from the pre-snapshot Windows VSS quiescing (mentioned above) then they will also be in a consistent state after recovery. However, not all applications support this - if in doubt ask your vendor and do backup/restore tests to safeguard yourself from unpleasant surprises.

When doing a full restore with the backup console you have the following options available (apart from the inevitable selection of the VBA and the VMs' backup sets to restore):

[Figure: Restore job options]

  • You can change the VMs' names by appending a suffix to their display names (if you want to keep the original VMs)
  • Default VM Storage: Choose a datastore to store the recovered VM
  • Network Settings: The default is to use network settings from the backup. If the original VM is still around you may change the MAC address of the restored VM to avoid any conflicts. Finally you must choose a virtual port group to attach the restored VM to (Distributed Virtual Switches are also fully supported).
Once the restore job is started the VBA will recreate the VMs with the original configuration plus the selected restore settings, and their virtual disks will be filled with the backup data.

A special way of doing File Level Restore (FLR)

You might run into situations where you don't want a VM to be completely restored, but only some of its files or application records (e.g. for restoring files from a virtual file server or single mailbox items from a virtual Exchange server). This is also possible with PHD Virtual Backup, and it uses a quite unique approach to accomplish this.

After starting the File Recovery wizard you can select a single virtual disk from a VM's backup set and export that via iSCSI:

[Figure: Initiating a File Level Restore (FLR)]

In the Options dialog you can define a custom CHAP secret for mounting the iSCSI target (or accept system generated credentials). If you check the Add target to iSCSI initiator on this computer option then this will automatically mount the iSCSI target on the Windows machine that runs the Backup Console. A prerequisite for this is an installed iSCSI initiator. Since Windows 2008 an iSCSI initiator has been included in Windows; for earlier versions of Windows it is available as a free download from Microsoft.

The Restore iSCSI target can be mounted from an arbitrary computer (iSCSI initiators are available not only for Windows, but for virtually every Guest OS including Linux) by specifying the VBA's IP address for iSCSI discovery and the defined CHAP secrets for mounting the target.

The big advantage of this method is that the original virtual disk can be mounted directly from the backup as a block device - that means it will appear in the same way as the original disk, but with the data of the selected backup set. This way File Level Restore (FLR) is completely Guest OS and file system agnostic.

This unique FLR method enables not only a straightforward single file restore, but also interesting opportunities for seamless application item restore. On the PHD Virtual web site you can watch videos that demo e.g. the restore of a single SQL server table and a granular Exchange item recovery.

Beyond backup: DR replication

[Figure: DR Replication architecture]

Besides Backup and Restore the product also features a special Replication mode for Disaster Recovery (DR) from one site to another. As implemented, a Replication job is just a special case of a Restore job, with the following differences:
  • You can schedule replications to take place at defined intervals
  • A VBA can replicate a VM from a backup that was created by another VBA
The latter means that multiple VBAs can share a backup storage as a replication source. An easy way to accomplish this is to export the local backup storage of one VBA through CIFS. The replicating VBA (that would typically be part of another VMware cluster at a DR site) can mount this backup storage and replicate a VM from it into its own environment.

The initial replication can be "seeded" by transporting offline copies of the VMs physically to the DR site (in case there is only a low capacity WAN link between sites). Subsequent replications will only synchronize data that was changed since the last replication and will hence require only little bandwidth.

DR replication is a nice add-on of the Virtual Backup product, but the use cases are limited to non-critical workloads, because replication is not done directly, but only from backups. The RPO (Recovery Point Objective) time needed for business critical VMs (less than an hour) can hardly be met with this method. However, most people have already protected such workloads by using different technologies (e.g. online data mirroring, OS or application based clustering), so that they can still benefit from PHD Virtual's DR Replication.
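
Why replication from backups struggles with tight RPOs can be seen from a rough back-of-the-envelope model: in the worst case the copy at the DR site is as old as the backup interval plus the time it takes to finish the backup and to replicate it. The function name and the numbers below are purely illustrative assumptions, not measurements.

```powershell
# Rough model (assumption): worst-case age of the data at the DR site
function Get-WorstCaseRpoHours {
    param(
        [double]$BackupIntervalHours,   # time between two backups of the VM
        [double]$BackupDurationHours,   # time until a backup is complete
        [double]$ReplicationLagHours    # wait for and run the replication job
    )
    return $BackupIntervalHours + $BackupDurationHours + $ReplicationLagHours
}

# Illustrative numbers: daily backup, 30 minutes backup time, 1 hour lag
Get-WorstCaseRpoHours -BackupIntervalHours 24 -BackupDurationHours 0.5 -ReplicationLagHours 1
```

With a daily backup schedule the result is more than a day, so the backup interval itself would have to shrink well below an hour to get anywhere near the RPO of a business critical VM.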

Conclusion

The PHD Virtual Backup product offers a simple, robust and scalable architecture that fully benefits from the advantages intrinsic to virtualization (e.g. VMware DRS and HA). It combines efficient backup data handling with a unique and universal method of File Level Restore. Because it scales out simply by adding more appliances, it is suitable not only for small and medium business customers, but also for large enterprises.

The simple architecture has a drawback though if you need to handle a very large environment with thousands of VMs and dozens of backup appliances: Each VBA needs to be configured and managed separately, and you need to think very carefully about how to distribute all the backup jobs among the VBAs - in the end the backup schedules are static. Wouldn't it be good to have a central management instance that will configure, manage and monitor all VBAs and dynamically orchestrate the backup jobs among them? Wouldn't it be good to have at least programmatic interfaces to third party tools that could potentially take over these responsibilities?
It looks like PHD Virtual is aware of this issue and is going to address it: With the latest version 5.4 of the product they make an API and SDK available that can be used by customers to extend its functionality and integrate it into third party management workflows.

Availability

PHD Virtual's VMware vSphere Backup and Replication is available for download as a 15-day trial version. It is licensed per virtualization host (thank God not by CPU sockets ;-)). Purchases can be made through authorized resellers. According to PHD Virtual the API and SDK for the product are available on request to "anyone having a good use case for it".