Using haproxy as a PSC load balancer

When designing a vSphere 6.0 environment with multiple vCenter servers you will - in most cases - end up with the need to deploy external Platform Services Controllers (PSCs). If you are unsure what topology to choose then you should take a look at the PSC Topology Decision Tree that was recently published by VMware. It will guide you to the topology that suits your requirements best.

Since the PSC hosts the critical Single Sign-On (SSO) component, a specific requirement is to make an external PSC highly available so that you are still able to log on to vCenter even if one PSC fails. Currently the only supported way to implement a seamless automatic failover from a failing PSC to another one is to put multiple PSCs (of the same SSO domain and site) behind a load balancer. The process of properly configuring the load balancer and the vCenter servers behind it is quite complex, so most people refrain from it and just deploy a secondary PSC to which they manually re-point the vCenter servers if the primary one fails (as per KB2113917). But this is a manual process (although it can of course be automated, as William Lam explained in this post), and it requires a restart of all vCenter services, during which vCenter will be unavailable.

This is why I wanted to try out in the lab how complicated it really is to implement load balanced PSCs and how well they work. However, I did not have a supported load balancer available in the lab: currently only Citrix Netscaler, the F5 BIG-IP and VMware's own NSX-v are officially supported for vSphere 6.0. All of these are quite expensive options, and none of them is quick and easy to deploy. So I decided to try my luck with the standard Open Source load balancer: haproxy. It turned out that this works very well and can be implemented quite quickly. Here is how:

Re-pointing vCenter Server 5.5: A Survival Guide to KB2033620

vSphere 6.0 has been around for about a year now, but VMware's largest customers are usually one or even two versions behind. With the recent release of Update 2 it looks like the 6.0 version has gained the stability and maturity that enterprise customers are waiting for. This is the reason why I just did extensive testing of vSphere 5.5 to 6.0 upgrades in the lab.

The main challenge of such an upgrade is to transform your vCenter Single Sign-On (SSO) setup into a topology that is fully supported and not deprecated. With vSphere 6.0 the SSO component is now part of the new Platform Services Controller (PSC) role that can be separated from the remaining vCenter services. In fact VMware recommends doing this whenever you want to have two or more vCenter servers in the same SSO domain which is a prerequisite for the new Enhanced Linked mode. The separation of SSO was already possible with vSphere 5.5 (although only with the Windows version of vCenter, but not the VCSA 5.5), but I think most people wanted to keep it simple and installed all vCenter services on the same server. So if they have multiple vCenter servers installed in this way then they need to switch to one or more external PSCs now.

There are many ways and orders in which you can - on the one hand - upgrade all components to 6.0 and - on the other hand - switch to an external PSC/SSO model. But only a few of them are documented and supported by VMware. Their general recommendation is to transform into a supported topology first, and then do the upgrade of the PSCs and vCenter servers. KB2130433, for example, describes how to upgrade/migrate two vCenter 5.5 servers with embedded SSO into the same SSO domain. This and other migration scenarios involve re-pointing your vCenter 5.5 server to a newly installed external SSO 5.5 instance.

So when preparing the upgrade of a complex vSphere 5.5 environment to 6.0 you will sooner or later stumble over KB2033620 which describes how to do this re-pointing. Unfortunately this KB article and the tools that it refers you to are very poorly written and full of issues. Some of them are mentioned in the KB article itself with workarounds to follow, but a lot are not ... Here is a list of the most annoying issues with KB2033620 and how to fix them.

An important heads-up for users of the Embedded Host Client!

My ESXi Patch Tracker bot never sleeps, so when I woke up this morning it already greeted me with the message that VMware has released ESXi 6.0 Update 2 in the middle of the (European) night. As usual with Update releases vCenter was also updated.

For the record here are the most important URLs for you:
With this release there is some great news for the users of the Embedded Host Client, but also a caveat that you should be aware of.

The VIBMatrix has joined the ESXi Patch Tracker!

My ESXi Patch Tracker service is becoming more and more popular. Many of its users probably don't know that I had a very similar service long before: On the VIBMatrix pages I maintained tables of ESXi Patch releases with lists of all included VIB packages. The original reason why I created these tables more than three years ago was to prove that ESXi 5.x/6.x Patch bundles are cumulative.

Until today the VIBMatrix was a manually maintained Excel sheet that I imported to Google Docs whenever I added a new patch to it. These times are over: I'm happy to announce that the VIBMatrix is now also fully automated and integrated into the ESXi Patch Tracker service!

[Unsupported] Defeating the VCSA shell timeout

In my earlier post vCSA 6.0 tricks: shell access, password expiration and certificate warnings I showed how you can set the VCSA shell timeout to an effectively indefinite value using the command

   shell.set --enabled true --timeout 2147483647

at the appliance shell prompt. I was notified that in recent versions of the VCSA (probably since Update 1) this no longer works. The maximum timeout that the command will accept is now 86400 seconds (= 1 day). The VAMI interface of the VCSA 6 (that was added in Update 1) also allows enabling the shell and setting the shell timeout, but there the GUI also limits the input to a maximum of 1 day (or 1440 minutes).
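To put the two timeout values into perspective, here is a small Python sketch of the arithmetic. The numbers 2147483647 and 86400 are taken from the text above; the helper name `seconds_to_days` is made up for this illustration, and the observation that the old value equals the largest signed 32-bit integer says nothing certain about how the appliance stores the setting internally.

```python
# Illustrative arithmetic only; nothing here talks to a VCSA.
INT32_MAX = 2**31 - 1            # largest signed 32-bit integer
SECONDS_PER_DAY = 24 * 60 * 60   # 86400


def seconds_to_days(seconds: int) -> float:
    """Convert a timeout given in seconds to days (hypothetical helper)."""
    return seconds / SECONDS_PER_DAY


old_timeout = 2147483647   # value used with shell.set in the earlier post
new_max_timeout = 86400    # maximum accepted by recent VCSA versions

# The old value is exactly the signed 32-bit maximum.
print(old_timeout == INT32_MAX)

# Expressed in years (using 365.25 days per year) it is roughly 68 years.
print(seconds_to_days(old_timeout) / 365.25)

# The new maximum in minutes, matching the limit shown in the VAMI.
print(new_max_timeout // 60)
```

So the old setting was "effectively indefinite" in the sense of several decades, while the new limit forces the shell to time out after one day at most.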

When I checked my own VCSA (that was originally installed with 6.0 GA, then upgraded to Update 1 and Update 1b) I found that my old large timeout setting was still in place and functioning. That means even in the latest build of the VCSA 6 it is still possible to set an arbitrarily large shell timeout, just not through the appliance shell or the VAMI. So how do you do this?