Test Lab: Virtualisation of Exchange 2007 CCR Cluster using VMWARE – Part 1

by Andy Grogan on December 3, 2008


As part of my research into migrating from Exchange 2003 to Exchange 2007, it has recently become obvious that my test lab needed to change. I am migrating using the same hardware, which for the mailbox server role is clustered, so I really needed to move away from a single Exchange 2007 server with all the roles installed, gain a feel for separate roles on different boxes and, most importantly, learn how to cluster Exchange 2007.

If you have read my previous article [here] about some of my initial thoughts on the migration, you will have seen that I plan to use a method called “Bunny Hopping”. I have financial constraints on this migration (i.e. I have to use my existing hardware), so I have purchased a server which I intend to add to my Exchange organisation and then move the mailboxes from one cluster onto it. I will then rebuild the old cluster with Windows 2003 x64 and Exchange 2007, move the mailboxes back, and repeat the process for the other clusters. It is long winded, I know, but the organisation is not prepared to replace what was £70,000 of new hardware 24 months ago, and I can kind of see the reasoning.

I have an existing test lab based around Microsoft Virtual Server, but I do not believe it is suitable for testing clustered instances of Exchange 2007 for a number of reasons. Some of the most important are:

  • Although Virtual Server R2 can be installed on x64 hardware, its major weakness is that it cannot support x64 guest systems, which makes it unsuitable for Exchange 2007. The alternative would be the 32-bit evaluation version, which I did not want to work with: it isn’t a like-for-like architecture, and its overall performance, even in quite a well-specified lab, is not very good as it has not been optimised for 32-bit environments.
  • My existing test lab has a number of systems running within it, and the host platform is a pair of dual core Xeons with 6 GB of RAM. This might sound like a lot, but consider one domain controller (minimum of 512 MB of RAM), two cluster nodes (minimum of 1024 MB of RAM each), a Hub Transport (1024 MB of RAM) and an Edge Transport (1024 MB of RAM). Add the overhead for the host O/S and the other lab environments and I am not going to get much out of testing within it.
  • I believe that for a product like Exchange 2007 you really need a dedicated lab just for the Exchange servers to get real benefits.
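The memory arithmetic in the second point can be checked with a quick sketch. It assumes every per-guest figure above is in MB and applies to each guest individually, and the 1024 MB held back for the host O/S is a number picked purely for illustration, not one taken from the lab.

```python
# Minimum RAM per guest in MB, read from the figures quoted above.
GUEST_MIN_RAM_MB = {
    "domain controller": 512,
    "cluster node 1": 1024,
    "cluster node 2": 1024,
    "hub transport": 1024,
    "edge transport": 1024,
}

HOST_RAM_MB = 6 * 1024  # the existing lab host has 6 GB of RAM


def remaining_ram_mb(host_ram_mb, guests, host_os_reserve_mb):
    """Return the RAM left once the guests and a host O/S reserve are subtracted."""
    return host_ram_mb - sum(guests.values()) - host_os_reserve_mb


# Assumption for illustration only: 1024 MB reserved for the host O/S.
guests_total = sum(GUEST_MIN_RAM_MB.values())
left_over = remaining_ram_mb(HOST_RAM_MB, GUEST_MIN_RAM_MB, host_os_reserve_mb=1024)
print(f"Exchange lab guests need at least {guests_total} MB")
print(f"Left for the other lab environments: {left_over} MB")
```

Even with every guest at its bare minimum, very little is left over for the systems already running in the lab.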

I will still use my current Virtual Server test environment to host the domain controllers for the Exchange 2007 lab. The following is an example schematic of how my virtual lab should look by the end of this series:

 

This has left me in the situation where I needed a new server which can in the first instance be used as a Lab server, and in the second instance be used for the “Bunny” hop of the live system when I come around to performing the upgrade.

Like many of us, I cannot just go and buy a server for this role, I needed to justify the expenditure – especially for a project that has no money allocated to it for hardware so I spoke to my bosses using the following arguments as justification for the additional server;

  • You cannot in-place upgrade from Exchange 2003 to 2007 using the same x32 hardware – therefore we can either place ALL of our mailboxes on one Cluster whilst we upgrade and take the performance hit or get an interim server.
  • We do not have a test lab in which to prototype the new Exchange 2007 setup – therefore most of the knowledge that would be used in upgrading the production system would be guesswork.

My bosses accepted my arguments as they wish to go in the direction of Exchange 2007 as part of the company’s overall business strategy. The only proviso was that the funding for the additional server had to come from an under-spend in an existing project budget. As luck would have it, one of my previous mail-related projects had come in significantly under budget, so I set out looking for a server powerful enough to act initially as a test lab server and then become part of the Exchange organisation during the migration.

In the end I settled on an HP DL580 G4 (pictured below) as my weapon of choice, with 20 GB of RAM and four dual core Intel Xeon 3.00 GHz processors with 4 MB cache and Hyper-Threading.

[Image: HP DL580 G4]

For the storage, rather than make use of the internal capacity of the DL580, we purchased the unit with a pair of HP PCI-X 2 Gb FCA2214DC HBAs which can be routed into our IBM DS8100 storage array. For the purposes of the test lab we added four 72 GB SAS SFF hard disks, which I have configured as a pair of RAID 0+1 arrays.

As mentioned in the opening paragraph, my initial stages of testing (which began about 5 months ago) centred on running a single instance of Exchange with all the roles (with the exception of Unified Messaging and Edge Transport) on the same box. This let me get a feel for Exchange 2007, linking my knowledge of previous versions into what is undoubtedly a major change in the way Exchange administrators need to think. I am now at the stage where I wish to make the most of the DL580 and expand the testing, so I decided to un-install the test version of Exchange 2007 that I had (see here for one issue that I ran into during the un-installation process) and then install a virtualisation product on the DL580. This would allow me to create multiple instances of Exchange servers, separate out the roles and, most importantly, develop my understanding of the changes to Exchange clustering scenarios.

With the above in mind, this pretty much brings us up to date with where I am on this road to migration. What I would like to share with you from now on is how I have worked through the process of virtualising Exchange, the problems that I have run into along the way and how I fixed them.

Getting Windows 2003 x64 into a Virtual Machine

On the surface you would think that this is a pretty straightforward task, especially with the hardware that I have been working with, but I ran into a number of problems just trying to virtualise Windows 2003 x64. The following is a rundown of how I constructed my virtual machines, the issues that I found and their resolutions:

Choosing your Virtualisation Software

Normally I would prefer to use MS Virtual Server R2 SP1 in migration labs as I have found that it is very easy to use. However, VS has a major restriction: although it will run on x64 hardware, it does not support x64 guests. This is madness if Microsoft wishes to compete in the virtualisation arena with the likes of Xen and VMware. I understand that Windows 2008 will offer x64 guest support, but with some of the features that have been pulled from virtualisation in Windows 2008 I do not believe it can compete as it stands.

Due to the above reasons I chose VMWARE Server 1.0.0.3 as it is FREE (after registration) and does support x64 guest systems.

I installed the product on the DL 580 and was ready to go in about 10 minutes – however this is where I ran into my first problem;

During the installation of Windows 2003 Enterprise x64, the guest would give an error message claiming that Windows setup had detected that the guest’s processors were 32-bit only, rather than x64.

Initially I was stumped by this, as the four CPUs in this machine are all EM64T enabled (otherwise how could I have installed Windows 2003 x64 as the host?). What I had missed was that the CPUs all support Intel’s virtualisation technology (IVT), which is disabled by default on the HP DL580, and VMWARE does not like this. So a tip is to enable IVT in the BIOS of the HOST if you are using an HP machine (or double check whether your hardware vendor enables it by default).

Building the Nodes

When building a cluster in any virtualisation product I like to build an initial node which contains all of the prerequisites (in this instance for Exchange 2007) but leave it in a workgroup, which makes it easier when it comes to cloning the VM.

The following is an overview of how the first node is configured – this takes into account the VMWARE configuration, changes to the operating system and software that I installed on the node when Windows setup had completed.

Virtual Hardware (Both Nodes):

Computer Name: x64EXND01

Memory: 2048 MB

Hard Disk 1 (System C:) SCSI Bus [0:0] = 9 GB

Hard Disk 2 (Page P:) SCSI Bus [0:1] = 2 GB

Hard Disk 3 (Binaries B:) SCSI Bus [0:2] = 6 GB

Hard Disk 4 (Exchange DB X:) SCSI Bus [0:3] = 4 GB

Ethernet0 = Bridged

Ethernet1 = Custom (Translates to VMNET2)

Processors = 2

Below is a snapshot of the above configuration in the VMWARE management console;

Now some of you will have noticed that I have not added any form of shared disks, nor made any changes to the VMWARE VMX file that would allow me to use clustering within the virtual environment. There is a very good reason for this, and it relates to the type of clustering that I wish to use with the Exchange servers. There will be more on this later.

Virtual Software (In order of Installation):

Windows 2003 x64 Enterprise

VMWARE Tools

Windows 2003 Support tools (mainly for ADSI Edit)

Windows 2003 x64 IIS (with COM+ network access, IIS Common Files, Internet Information Services and World Wide Web Publishing)

Microsoft .NET Framework 2.0 (x64)

Microsoft .NET Framework 2.0 (x64) (KB926776)

Windows 2003 x64 SP2 (Includes MMC 3.0) with the latest Windows updates

Download and install NEWSID from: http://www.microsoft.com/technet/sysinternals/Security/NewSid.mspx

Configuration Changes Made to the Guest :

Moved the Page File from C:\ to P:\ (Corresponds to SCSI Bus 0:1) – giving it a size of 1052 MB

Copied the Exchange 2007 Binaries to the Binaries Partition (B:)

Assigned a LAN IP address to the Ethernet adapter0 (10.0.1.1)

Assigned a Cluster address to the Ethernet adapter1 (VMNET2) = 192.168.1.1

Opened up REGEDIT and changed the following;

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\TCPIP\Parameters\

Value: EnableRSS (a REG_DWORD value); changed its data from 1 to 0

The above value can cause problems after Windows 2003 SP2 is installed (for example, RDP connection issues and communication issues with Domain Controllers), as Windows 2003 SP2 installs a new component called the “Scalable Networking Pack” which contains RSS (Receive Side Scaling). For more information see http://support.microsoft.com/kb/936594
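For anyone who would rather script this change than click through REGEDIT on every node, the same edit can be expressed as a .reg file. The following is only a sketch that builds the file text: the function name is hypothetical, and it assumes the value sits directly under the Tcpip\Parameters key as described above.

```python
REG_HEADER = "Windows Registry Editor Version 5.00"
TCPIP_PARAMETERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
)


def build_dword_reg_file(key_path, value_name, value):
    """Build .reg file text that sets one REG_DWORD value under key_path."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a REG_DWORD must fit in 32 bits")
    lines = [
        REG_HEADER,
        "",
        f"[{key_path}]",
        # DWORD data is written as eight hexadecimal digits.
        f'"{value_name}"=dword:{value:08x}',
        "",
    ]
    # .reg files conventionally use Windows line endings.
    return "\r\n".join(lines)


# Turn Receive Side Scaling off by setting EnableRSS to 0.
reg_text = build_dword_reg_file(TCPIP_PARAMETERS_KEY, "EnableRSS", 0)
print(reg_text)
```

The resulting text can be saved to a file on the guest and imported there, which keeps the change identical across the base image and any clones.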

Cloning the image

Ok, now that I have a working Windows 2003 server image it is now time to clone it (with a view to creating the second node of the cluster) – the following is the process that I have used for a number of years with VMWARE.

Firstly I ensured that the source Windows 2003 virtual machine was shut down, which ensures that I have full access to all of the files that make up the machine. I then copied the following VMWARE files into another directory (which I called X64EXND02):

NVRAM

*.vmdk

*.log

*.vmsd

*.vmx
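The copy itself is easy to script if you clone machines often. The following is a minimal sketch rather than the exact process I used: the function name is made up, and the pattern list is an assumption built from the files named above together with the .vmx configuration file that is later added to VMWARE Server.

```python
import shutil
from pathlib import Path

# Assumed set of files that make up the virtual machine (see the note above).
VM_FILE_PATTERNS = ("nvram", "*.vmdk", "*.log", "*.vmsd", "*.vmx")


def clone_vm_files(source_dir, target_dir, patterns=VM_FILE_PATTERNS):
    """Copy the matching files of a powered-off VM from source_dir to target_dir.

    Returns the names of the files that were copied.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for item in sorted(source.glob(pattern)):
            if item.is_file():
                # copy2 preserves timestamps along with the file contents.
                shutil.copy2(item, target / item.name)
                copied.append(item.name)
    return copied
```

Calling clone_vm_files with the folder of the first node as the source and a new X64EXND02 folder as the target would reproduce the manual copy, provided the source machine has been shut down first.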

Now some people will be curious as to why I did not use SYSPREP. My main reason is that this method is quicker; although it is not supported by Microsoft, using NewSID (which ironically is now a Microsoft product) still works.

I then added the copied VMX file to VMWARE Server. When doing so I received a prompt asking whether to keep VMWARE’s unique identifier for the virtual machine or to create a new one. I made sure that at this point I chose to create a new ID.

At this stage I still needed to ensure that my original machine was indeed closed down (otherwise you can get IP address and NETBIOS conflicts), then I powered on my second node and waited for it to boot.

When the system had booted I ran the newSID tool (remember the software which I installed above on the base image).

1. Run newSID

When you run NewSID you are presented with the following welcome screen:

[Screenshot: newsid1]

I clicked on the “Next” button which displays the following dialog box;

[Screenshot: newsid2]

I chose the “Random SID” option and clicked “Next”, which then displays the following screen:

[Screenshot: newsid3]

 

At this point I chose to rename the computer as part of the SID change process. This is where the “cloning” aspect comes into its own: I end up with a computer which is identical in every way to my source machine, but with a different name and SID.

When I had chosen my new name I clicked on “Next” which displayed the following dialog;

[Screenshot: newsid4]

Upon clicking the “Next” button NewSID will complete the required changes to the virtual node and then reboot the machine.

2. Configure new IP settings for the machine

When the machine has rebooted I am in the position of being able to configure the machine IP settings which (bearing in mind this is the second node of the cluster) will be slightly different to that of the first node;

Change the LAN IP address for ethernet adapter0 = 10.0.1.2

Change the cluster IP address for ethernet adapter1 (VMNET2) = 192.168.1.2

At this stage, however, I needed to add some additional settings to the IP configuration to prepare the machine for being joined to my test lab’s domain, so I added:

Primary DNS Server: 10.0.1.20 (the IP of my Virtual DC & DNS Server)

DNS Suffix: infrastructure.local
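Before joining anything to the domain it is worth sanity checking the addressing plan for the two nodes. The sketch below uses the addresses given in this article; the /24 prefix length is an assumption (I have not stated the subnet masks), and the function name is made up for the example.

```python
import ipaddress

# Addresses from the article; the /24 prefix length is an assumption.
NODES = {
    "x64EXND01": {"public": "10.0.1.1/24", "heartbeat": "192.168.1.1/24"},
    "x64EXND02": {"public": "10.0.1.2/24", "heartbeat": "192.168.1.2/24"},
}
DNS_SERVER = ipaddress.ip_address("10.0.1.20")


def check_addressing(nodes, dns_server):
    """Return a list of problems found in the addressing plan (empty if none)."""
    problems = []
    interfaces = {
        name: {role: ipaddress.ip_interface(addr) for role, addr in roles.items()}
        for name, roles in nodes.items()
    }
    for role in ("public", "heartbeat"):
        # Every node should sit on the same network for a given role.
        networks = {ifaces[role].network for ifaces in interfaces.values()}
        if len(networks) != 1:
            problems.append(f"{role} interfaces are not on one common network")
        addresses = [ifaces[role].ip for ifaces in interfaces.values()]
        if len(set(addresses)) != len(addresses):
            problems.append(f"duplicate {role} address")
    for name, ifaces in interfaces.items():
        # The public and heartbeat networks must be kept apart.
        if ifaces["public"].network.overlaps(ifaces["heartbeat"].network):
            problems.append(f"{name}: public and heartbeat networks overlap")
        if dns_server not in ifaces["public"].network:
            problems.append(f"{name}: DNS server is not on the public network")
    return problems


print(check_addressing(NODES, DNS_SERVER) or "addressing plan looks consistent")
```

A forgotten address change after cloning shows up here as a duplicate, which is cheaper to find than an IP conflict when both nodes are powered on.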

I then joined the machine (2nd node) to my test lab’s domain (Infrastructure.local – or NETBIOS: Infra).

At this point I am in the position where I can power on the first node of my cluster (bearing in mind that the IP, NETBIOS name and SID are now different – and there is no shared storage) so I can finish the configuration in preparation for joining it to the domain.

When my first node was powered up I made the same DNS changes as mentioned above and joined the node to the Infrastructure.local domain.

Network Changes to be made on both nodes;

  • Configuring the Heartbeat connection;

At this stage I have two network controllers configured on each virtual machine. One is bridged over the host’s LAN connection (which gives me access to my public network) and the other is connected to VMWARE VMNET2 (a private network that VMWARE encapsulates). Both networks are common to each node.

In order to comply with clustering best practices, I configured the VMNET2 connection (or Heartbeat) as follows:

Logging on to the Virtual Machine, from the desktop I right clicked on the “My Network Places” icon and selected “Properties”.

When the network properties window opened I right clicked on the connection that I know is mapped to the VMNET2 device and clicked “Properties”– which displayed the following Dialog box;

Here I un-ticked the “File and Print Sharing for Microsoft Networks” – I then clicked on the “Internet Protocol (TCP/IP)” option and clicked on the “Properties” button which displayed the following dialog box;

From the dialog box above I clicked on the “Advanced” button which opened the following dialog box;

Here I clicked on the “DNS” tab and ensured that the “Append parent suffixes of the primary DNS suffix” and the “Register this connection’s addresses in DNS” were both un-ticked.

I then clicked on the “WINS” tab, which changed the view to the following;

Here I un-ticked “Enable LMHOSTS lookup” and selected “Disable NetBIOS over TCP/IP”, then clicked on the OK button, which returned me to the network adapter properties, so I clicked OK again.

  • Adjusting the Bindings Order;

Again in line with clustering best practice, it is important to adjust the binding order of the network adapters on your nodes so that the public connection has precedence over the Heartbeat. The following are the changes that I made to comply:

Situation Review;

On my Virtual Machines, from the desktop I right clicked on the “My Network Places” icon and selected “Properties”.

From the Advanced menu I clicked on the “Advanced Settings” option (see below);

Clicking on the “Advanced Settings” option displayed the following dialog;

Under the “Connections” list, I selected the public connection for my server and then clicked on the “UP” arrow to change the order of preference.

At this stage I now have;

Two configured nodes, each with a matching configuration in terms of hard disks, memory and CPU power.

Each node has a different NETBIOS name (x64EXND01 & x64EXND02) and different IP addresses (for both public and private network interfaces), and both have been joined to the test lab’s domain (Infrastructure.local).

Both Heartbeat interface cards should be communicating on VMNET2 and should be able to ping each other by the 192.x.x.x addresses.

Some Questions answered;

You may have noticed that none of the disks configured on each machine are connected in any way to a shared SCSI bus under VMWARE, and that I have omitted a shared drive for the Quorum resource. Have I gone mad? Am I planning to cover this in part 2 of the article? No: I have always been mad, and I would like to cover it here, because I have much more exciting stuff to cover in the next adventure. Just to let you in on what is coming next: in the lab I am planning to make use of Exchange 2007 CCR clustering technology, which means the type of clustering that I plan to use is based around the “Majority Node Set” (MNS). For some people this type of cluster might introduce some new concepts, particularly if you are traditionally used to shared storage based quorums. The following, although written from my own perspective, might help:

 

The majority node set is a quorum resource, just as in traditional clustering; however, a key difference is that the data is stored on multiple disks. Each cluster node stores configuration data on a local disk (hence the local disk configuration in VMWARE) which it can access when it starts up.

In a default configuration, cluster data is stored in %systemroot%\cluster\.

The cluster service ensures that the cluster data stored on the MNS is kept up to date across each node in the cluster. If changes are made to the cluster configuration, the changes are copied to the local disks of the nodes in the cluster. A modification is only considered to have been stored once it has been made on a majority of the nodes that are part of the cluster. This ensures that most nodes in the cluster have current configuration data.

Using an MNS configuration changes how the cluster service starts on each member node. The cluster service is set to start automatically when the node boots up. On the first node the cluster service will try to join a cluster; since there is no cluster available during boot this will fail and the cluster service will pause. However, the cluster service is configured to retry the operation every minute (by default), and it will succeed when a majority of member nodes are up.

When the cluster service has started up on a majority of nodes, the cluster resources will be brought online. If the cluster loses what can be considered a majority of nodes – the cluster services will stop on all nodes which will take all resources offline.

In a normal server cluster, the resources and the quorum can continue as long as at least one of the nodes can access and take charge of the quorum disk and associated resources. However, a cluster running as a majority node set will only start or continue running if a majority of the nodes are up and running and communication between them is maintained.
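The majority rule described above is simple enough to express in a few lines. This is only my own illustration of the arithmetic, not anything taken from the cluster service itself, and the function names are made up.

```python
def majority(total_nodes):
    """Smallest number of nodes that forms a majority of total_nodes."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def has_quorum(nodes_up, total_nodes):
    """True if enough nodes are up for a majority node set cluster to run."""
    return nodes_up >= majority(total_nodes)


for total in (2, 3, 4, 5):
    tolerated = total - majority(total)
    print(f"{total} nodes: majority is {majority(total)}, "
          f"can lose {tolerated} and keep running")
```

Note what this means for a two node cluster: the majority of two is two, so on its own a two node majority node set cannot survive the loss of either node. As I understand it, that is the gap the File Share Witness is there to fill.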

Ok, so why does this help?

As an MNS cluster can function with local disks, the shared storage aspect (where shared disks are typically viewed as a “single point of failure”) is eliminated, as resource data is replicated across the member nodes.

Also, nodes need only exist in the same subnet; they are no longer limited to the same physical proximity because the shared storage has been removed. For example, if you currently (like me) have a two node Active/Passive cluster using a shared array, you can take one node and the shared array to another site in the same subnet (using the bunny hop method), provide some local storage for the remaining node which is identical to the other, and then use MNS to replicate that data over the WAN (as long as it is within the same subnet; this is due to change in Windows 2008).

Basically, where you had two nodes and shared storage on the same site, you can have local storage per node, on different sites (using the same IP subnet) using a MNS cluster – which when combined with Exchange 2007 CCR and log shipping (see part 2) results in a far more resilient setup.

Next Articles;

  • Configuring a MNS Cluster
  • Configuring File Share Witness & Installing an Exchange 2007 CCR, with the problems I ran into
