
VMware vCenter Server Heartbeat

Introduction:

First of all, I’ve been very busy lately, so I haven’t had much time to spend on my blog. Too bad, because I really enjoy publishing articles like this. Anyway, over the past few months I made several attempts to convince my manager to acquire a license for vCenter Server Heartbeat, and I eventually succeeded.

Since we implemented our vSphere infrastructure, our environment has grown enormously, and about 15 datacenters are now connected to our vCenter Server. That makes a ‘single node’ vCenter Server a real single point of failure.

Why use vCenter Heartbeat?

There are several reasons why an organization should consider a Heartbeat setup for its VMware vSphere infrastructure:

  • Continuity of your Clusters (DRS)
    Without your vCenter Server, cluster features that depend on it, such as DRS, stop functioning.
  • Manageability
    With several datacenters in your vCenter Server, each with its own administrators, those administrators must be able to manage their virtual infrastructure at all times.
  • Backup
    If your backup software uses vCenter to index your virtual infrastructure as part of its backup process, it’s essential that your vCenter Server is reachable.

Installation Methods

There are several methods of deployment of your Heartbeat setup: V2V (Virtual to Virtual), V2P (Virtual to Physical) or P2P (Physical to Physical).

In our case we chose the P2P (Physical to Physical) solution, because I insisted that the secondary server be just as powerful as the primary in case a failover lasts longer than just a few hours.

vCenter Server Hardware

When you use the Physical to Physical solution, VMware insists that both of your vCenter Servers (primary and secondary) have similar hardware. In our case each vCenter Server consists of the following:

Brand/model: HP ProLiant DL360 G6
Processors: 2x Intel Xeon E5540 2.53GHz
Memory: 12GB DDR3 Registered (6GB per CPU)
Disk(s): 2x HP 72GB 15K Dual Port (RAID-1)

Design for vCenter Server HeartBeat

In our infrastructure we have several datacenters spread all over the country, connected to each other by a fiber (WAN) network. Two locations (sites) are really close to each other and are also connected by a separate fiber connection.

The advantage of this nearby location is that it uses the same physical network as our own site, and most of the VLANs designated for servers are available on both sites. That makes it ideal as a backup location.

When installing vCenter Heartbeat you can select whether you are installing on a WAN or a LAN infrastructure. Since this is really a LAN situation, we chose the LAN setup, primarily because it lets us use a single public IP address for our vCenter Server, which simplifies management.

For this setup I’m using our default Server VLAN to put our Public IP address in and I’ve created a separate VLAN for the Channel (heartbeat/synchronization) communication of vCenter Heartbeat.

I’ve simplified our infrastructure in the image below so its only focus is our vCenter Server Heartbeat setup. In this case the public IP address is 10.15.1.17, which is in a different subnet/VLAN than our Heartbeat Channel. On the Channel, the primary vCenter Server has 10.15.210.11 and the secondary has 10.15.210.12.

Heartbeat uses the Channel to synchronize the registry and filesystem of the Heartbeat nodes and to communicate between the primary and the secondary. The Channel is also used to check whether the other ‘node’ is still alive and, if it’s not, to initiate a failover.

Packet Filter (Neverfail)

As you might have seen in the design above, the primary and the secondary server have the same name and public IP address. There is a really simple explanation for this: when you choose to install Heartbeat with a LAN setup, it assumes both vCenter Servers will have the same IP address and name. This is because the setup on the primary node creates a System State backup, which you restore on the secondary node afterwards. From that moment on the secondary node is equal to the primary node in every way.

To prevent both the primary and the secondary server from having their public adapters active on the network at the same time, VMware has implemented a “Neverfail Packet Filter Driver”, which is installed on the public network adapters during the installation of Heartbeat.
The idea of the Packet Filter is really simple: Heartbeat disables the Packet Filter on the node that is currently active and keeps it enabled on the passive node. During a failover the Heartbeat software enables the Packet Filter on the node that becomes inactive and disables it on the node that is the active node from that moment on.
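
To make the role swap concrete, here is a minimal PowerShell sketch of the behaviour described above. It is purely a conceptual model and not Neverfail or Heartbeat code: the function names `New-HeartbeatPair` and `Invoke-ModelFailover` and the property names are made up for illustration, and it assumes PowerShell 3.0 or later.

```powershell
# Conceptual model only: this does not touch any real driver or service.
function New-HeartbeatPair {
    # The primary starts as the active node (filter off, visible on the network);
    # the secondary starts passive (filter on, hidden from the public network).
    @(
        [pscustomobject]@{ Node = 'Primary';   Active = $true;  PacketFilterEnabled = $false }
        [pscustomobject]@{ Node = 'Secondary'; Active = $false; PacketFilterEnabled = $true }
    )
}

function Invoke-ModelFailover {
    param([Parameter(Mandatory = $true)] [object[]] $Pair)
    foreach ($node in $Pair) {
        # Swap the roles; the filter is always the inverse of "active".
        $node.Active = -not $node.Active
        $node.PacketFilterEnabled = -not $node.Active
    }
    $Pair
}

# Hypothetical usage: after one failover the secondary is active and unfiltered.
Invoke-ModelFailover -Pair (New-HeartbeatPair) |
    Format-Table Node, Active, PacketFilterEnabled
```

The point of the model is the invariant: at any moment exactly one node has its filter disabled, so the shared name and public IP address never appear on the network twice.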

vCenter Databases

There are several ways to host your vCenter databases, and in our infrastructure we chose to host them on a separate, dedicated database server. This database server currently runs Microsoft SQL Server 2005 Standard, and we are going to migrate the databases to a Microsoft SQL Server 2008 Cluster soon, for the simple reason that if you make your front-end redundant it seems just as logical to do the same for your back-end.

Installing/implementing vCenter Server Heartbeat

Can you implement Heartbeat on an environment that is already running? Luckily VMware really thought of this: yes, that is very well possible. If it weren’t, implementing Heartbeat would be a real hassle.

The Heartbeat documentation comes in two documents: the Quick Setup guide and the Reference Guide. Together these documents contain most of the steps you need to take to install Heartbeat, so there is no need for me to describe each step. In my opinion the Quick Setup guide leaves out some detail, so I would suggest using the Reference Guide.

In the global implementation steps below I indicate which server each action applies to by starting the line with “Primary”, “Secondary” or “Both”:

  • Secondary: make sure the hardware (CPU, Memory, Disks) is similar to your primary server
  • Secondary: install the exact same operating system as you have on your primary server. In our case this is Windows Server 2008 x64 Standard. Give it a temporary IP (DHCP if you will) and a bogus hostname. This will be overwritten when Heartbeat sets up your secondary node with your primary node’s data.
  • Both: also make sure iLO is configured on both the primary and the secondary server, so you can still reach the servers during some of the installation steps
  • Both: make sure the Windows Update level on both servers is the same.
  • Both: in case you are using Windows Server 2008 you need to install some features on the server before you can start the installation: Backup, Backup-Features and Backup-Tools.
  • Both: Very important, installing Heartbeat will NOT work properly when you have NIC Teaming enabled. If you want to use NIC Teaming: set it up when the whole Heartbeat setup is finished!
  • Primary: install Heartbeat and follow the steps on the screen. Make sure you have some storage space available on the network for the temporary backup files; the backup, which contains the identity of your primary server, is transferred to your secondary through this location. Important: make sure your secondary server is able to reach this location.
  • Secondary: once you have completed all of the steps on your primary node, the setup will tell you to continue on your secondary node.
  • Secondary: don’t forget to manually change your Channel IP to the proper IP you have reserved for the secondary node after the Heartbeat setup rebooted your server!
  • Both: this applies to both servers, but has to be configured only once in your Heartbeat configuration. You need to provide the vCenter Service Plugin of Heartbeat with a service account that has enough access on your vCenter Server to monitor whether it’s up or down. If you run vCenter Server under a service account, like I do, I would suggest using that account for this purpose.
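
The “same Windows Update level” check from the list above can be partly automated. The following is a minimal sketch, assuming PowerShell 3.0 or later; the function name `Compare-UpdateLevel` and the host names in the usage comment are hypothetical, and `Get-HotFix` may not report every kind of update, so treat the result as a first indication rather than proof.

```powershell
function Compare-UpdateLevel {
    param(
        [Parameter(Mandatory = $true)] [AllowEmptyCollection()] [string[]] $PrimaryHotFixIds,
        [Parameter(Mandatory = $true)] [AllowEmptyCollection()] [string[]] $SecondaryHotFixIds
    )
    # Hotfix IDs present on one server but not on the other.
    $missingOnSecondary = @($PrimaryHotFixIds   | Where-Object { $SecondaryHotFixIds -notcontains $_ })
    $missingOnPrimary   = @($SecondaryHotFixIds | Where-Object { $PrimaryHotFixIds   -notcontains $_ })

    [pscustomobject]@{
        MissingOnSecondary = $missingOnSecondary
        MissingOnPrimary   = $missingOnPrimary
        InSync             = ($missingOnSecondary.Count -eq 0 -and $missingOnPrimary.Count -eq 0)
    }
}

# Hypothetical usage (replace the placeholder host names with your own):
# $p = (Get-HotFix -ComputerName 'vc-primary').HotFixID
# $s = (Get-HotFix -ComputerName 'vc-secondary-temp').HotFixID
# Compare-UpdateLevel -PrimaryHotFixIds $p -SecondaryHotFixIds $s
```

Run it before starting the Heartbeat installation on the primary; anything listed under `MissingOnSecondary` should be installed on the secondary first.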

Final words

Some of you are eager to try this baby out. Well, that’s possible: you can use vCenter Server Heartbeat in trial mode for 60 days. The only problem is that you need to be able to download it from the VMware download site. I’m pretty sure that if you contact customer service they will provide you with a link to do so. The great thing about running the trial is that if you don’t like it, or your trial expires without you entering a permanent license, you are still able to uninstall it in a nice and easy way and continue with your single vCenter Server.

As I am really curious about your experiences with vCenter Heartbeat: please reply or comment to this post to share your installation experiences/issues.

Vizioncore Releases vRanger v4.1 DPP

Vizioncore finally released the new version of vRanger. The reason why I’m really happy with that is that the issue that vRanger 4.0 DPP crashed Virtual Center at startup has been resolved!

Everyone who had or still has this problem is advised to upgrade vRanger to get a properly working environment and to stop worrying about whether Virtual Center will crash at startup.

The new version is 4.1 DPP and can be found on the Vizioncore Download Site.

Other fixes, changes and new features can be found in the Release Notes of vRanger.

Vizioncore vRanger 4.0 Pro Crashes vCenter 2.5 Update 3, 4 and 5

Update (09/25/2009):

Vizioncore’s new version of vRanger (4.1.0 build 11581) has solved this issue. If you had experienced these issues you are advised to upgrade as soon as Vizioncore has released this new version officially!

Update (08/28/2009):
Vizioncore announced that the new version, in which I suppose this problem is fixed, will be released mid-September.

Original Twitter-message:

Vizioncore vRanger Pro 4.1 DPP to be released in Mid-Sept. More info @ VMworld 2009 and VirtualVizion http://tinyurl.com/vc-vvizion

Recently I have been having problems with a crashing Virtual Center. The problems started with version 2.5 Update 4 and were caused by vRanger Pro 4.0 DPP. Please keep on reading for the complete story.

Problem occurrence and symptoms:

In short, this is how the problems first occurred: our DBA installed SP3 on our SQL 2005 Server and rebooted the SQL Server afterwards. Normally that wouldn’t be much of a problem, but in this case our Virtual Center wouldn’t start anymore. The service starts and crashes after 5 seconds. The vpxd.log shows a Win32_Exception and some debug data, and that’s it.

Attempts to solve the problem:

At first the problems appeared to have been caused by the update on the SQL Database Server, so these are the steps I’ve taken in my attempts to solve the problem.

  • Reinstalled VC against our production database: same problem
  • Reinstalled VC against our production database on another SQL Server with SQL Server 2005 SP2 on it: same problem
  • Reinstalled VC with a clean database and reconfigured our entire environment, this time with VC 2.5 Update 5 instead of Update 4: same problem after a reboot of the VC
  • Cleanly installed a new server with a fresh Windows 2003 install and all recent updates, and installed VC 2.5 Update 5 again with a clean database: the same problem occurred after rebooting the VC server

We repeated the last step probably about 2 times and it was really making me desperate for a solution because we have a pretty big environment.

The actual problem:

Out of desperation I created a Support Request with VMware. They asked me to disable vRanger, if we had it running in our environment, because it was known to cause problems with Virtual Center.

What had already raised questions for me before this incident was that the new vRanger constantly kept an open connection to Virtual Center; as soon as you killed that session from Virtual Center, the vRanger service immediately reconnected. So it appeared to me that during the startup of the Virtual Center service, vRanger constantly tried to connect and initiate something or retrieve data, which caused the Virtual Center service to crash right after startup.

Resolution:

After disabling the vRanger service –which wasn’t installed on our Virtual Center server itself by the way- we held our breath and restarted our Virtual Center server, and surprisingly the service started flawlessly.
To make sure it really was vRanger we re-enabled the vRanger service and restarted the Virtual Center service and it crashed immediately.

After disabling the vRanger service VC functioned flawlessly again.

Final note:

The road to the solution has taken around two weeks. It was horrible. Finally the solution is known. By the way, previous versions of vRanger do NOT cause these problems!

Update (helpful reaction from Vizioncore, see comments below for original message):

We are working very diligently to get this fix in the hands of our users, we have a fix that already out of our development process and is now in the hands of our QA team. You can also reference the following Knowledgebase article here http://www.vizioncore.com/support/knowledgebase/index.php and search for KB 00000296 about this issue. This KB will also be updated shortly for a root cause once the issue is fully cleared by development and QA, for those that are interested in the technical details about the issue. Thanks again for everyone’s support! And we hope to get everyone taken care of very soon!

PowerCLI: Reset CD-drives using PowerShell

As most of you know, and have probably experienced from time to time: when a Virtual Machine’s CD-drive is connected to an ISO file on one of your Datastores, or to the physical drive of your ESX host, migrating that Virtual Machine with VMotion will not work.

Normally this isn’t really a problem, except when you put an ESX host in Maintenance mode: Virtual Center will simply not tell you why the Maintenance mode process hangs, or even times out after 15 minutes, for no obvious reason. Most of the time the cause is a Virtual Machine with a CD-drive connected to an ISO file. A waste of time if you ask me.

So to prevent this from happening I’ve written a simple PowerShell one-liner/script to disconnect these CD-drives from the ISO files or physical drives and set them to Client drives, which is fine for VMotion:

# Detach ISO files and host devices from the CD-drives of every VM on one ESX host
Get-VM -Location (Get-VMHost "your.esx.host") | Get-CDDrive |
Where-Object { $_.IsoPath -or $_.HostDevice } |
Set-CDDrive -NoMedia -Confirm:$false

Instead of executing this just on one host you can also execute this for your entire cluster:

# Same action for every VM in a cluster
Get-VM -Location (Get-Cluster "Your Cluster Name") | Get-CDDrive |
Where-Object { $_.IsoPath -or $_.HostDevice } |
Set-CDDrive -NoMedia -Confirm:$false

Or of course, by Datacenter:

# Same action for every VM in a datacenter
Get-VM -Location (Get-Datacenter "Your Datacenter Name") | Get-CDDrive |
Where-Object { $_.IsoPath -or $_.HostDevice } |
Set-CDDrive -NoMedia -Confirm:$false

It’s as easy as that, now there will be no Virtual Machine interrupting your VMotions anymore and you can put your ESX hosts in maintenance mode without any problems ;)
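
If you would rather see which drives are affected before changing anything, the filter condition can be pulled into a small helper and used in a report-only pipeline. This is a sketch under assumptions: the function name `Test-CDDriveBlocksVMotion` is made up for illustration, it assumes PowerShell 3.0 or later, and the commented usage needs PowerCLI with an existing Connect-VIServer session.

```powershell
function Test-CDDriveBlocksVMotion {
    param([Parameter(Mandatory = $true, ValueFromPipeline = $true)] $CDDrive)
    process {
        # Same condition as the one-liners above: an ISO path or a host device is set.
        [bool]($CDDrive.IsoPath -or $CDDrive.HostDevice)
    }
}

# Report only, nothing is changed (cluster name is a placeholder):
# Get-VM -Location (Get-Cluster "Your Cluster Name") | Get-CDDrive |
#     Where-Object { Test-CDDriveBlocksVMotion $_ } |
#     Select-Object Parent, IsoPath, HostDevice
```

Once the report looks right, swapping the final `Select-Object` for `Set-CDDrive -NoMedia -Confirm:$false` gives the same result as the one-liners.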

Cheers!