I have for a little while wanted to put together a set of articles which attempt to explore the mindset of CCR clustering when being put to work in a production environment. Essentially my line of thinking (at least on the surface as this is a huge subject) is to go through some of the thought processes which might lead you (as an Exchange Admin) to the point where you are happy with the concepts of CCR clustering and are sure that it is for you within your Exchange Environment.
This article is along the same lines (although a little more detailed) as the article that I did on Hub Transport Specification which can be found here: http://www.telnetport25.com/component/content/article/31-exchange-2007–hub-transport-/122-suggested-hub-transport-hardware-config-for-exchange-2007-installations-of-5000-users.html
As mentioned this is a very large subject and contains quite a bit of varied theory as well as the usual technical stuff that I like to throw in (for example I have a few CCR construction scripts that I would like to include in a later part).
In the main the reasons for this series are:
- I have been working on a large production roll out of Exchange CCR (I know I always seem to be working on a large deployment of CCR into production) – but I have been considering (almost cliché) what it is all about really – does CCR really have to be the overall option when you really consider your overall Infrastructure environment?
- I totally trashed my CCR lab by accident and decided that it was not worth restoring from backup (I had made so many changes to the CCR configuration it was probably better to start from scratch), therefore I developed some scripts that make the process of deploying CCR on Windows 2008 much more straightforward
- I have been comparing some of my earlier posts about CCR to Microsoft’s TechNet deployment guidelines to ensure that my own personal thoughts were in line with general best practices
For information (and I include them as they are a darn good read) – the Microsoft articles are now the basis for much of my meanderings on CCR (as well as some good ole life experiences)
- http://technet.microsoft.com/en-us/library/dd425009.aspx – the Installation check list for Windows 2008 CCR Clustering
- http://technet.microsoft.com/en-us/library/bb124521.aspx – Introduction to CCR deployment and an overview of the technology
- http://technet.microsoft.com/en-us/library/bb629714.aspx – Introduction to CCR deployment on Windows 2008; this topic contains the meat and potatoes of CCR deployment on Windows 2008
As I have said in past articles I have covered CCR within a LAB environment, however in this article I would like to cover some of the scenarios you should be considering in production. The following is purely from a technical perspective and does not deal with any of the “people issues” (bear in mind that the concept of CCR is quite new, and a number of admins who are familiar with the more traditional clustering models can find it quite alien).
Don’t get me wrong I am not suggesting that people are not moving with the times, I am just stating that the architecture change is fundamental – especially if you are considering Windows 2008 Failover Clustering to accommodate your CCR installation.
One additional thing that I would also like to add is that some of the hardware configurations that I suggest might seem completely surreal and put you off CCR clustering in general (or indeed you might think that I have been taking crack) – but we need to consider that not everybody has access to huge amounts of funds for their mail environment.
First things first – the hardware – the environment and striving for resilience:
I think that it is reasonable to assume that one of the key factors in choosing CCR as your method for deploying Exchange 2007 is that people want to maintain a level of high availability and protect your data. Now CCR offers this in many ways – but depending on your deployment and indeed options within your data centre there are a number of external factors which will dictate the uptime and availability that you can get.
These factors I have broken down into the following headings:
- Choice of Hardware for CCR
- Choice of O/S for CCR
- Number of Data Centres and indeed the features contained within
We all have our own favourite hardware vendors – my own personal favourite in terms of server hardware is HP.
However irrespective of which hardware vendor you choose there are obviously some key specifications which are vendor agnostic – for example; the amount of memory, processor configuration, dual power supplies, number of network interface cards.
All of the above mentioned items in terms of capacity and performance tend not to vary between servers and manufacturers; most chassis will accept 32 GB of RAM (the maximum amount of memory that Microsoft Recommends within a mailbox server – you can install more), and again most chassis will accommodate a pair of quad core processors or even now six cores.
Where we tend to run into trouble (or actually more of a dilemma) is how much disk capacity a server chassis can handle, and indeed the RAID configuration. This needs to be offset against best practices for database and transaction log RAID levels and LUN design, and the number of storage groups and databases that you will have, against the all important issue of cost and practicality.
All of us typically work to a budget, and especially in recent times with the Global Economy slowing down the people who sign off on the expenditure are less willing to commit large expense to hardware. As system admins it is important to argue the case that cutting corners now can (and will) lead to costly issues in the future – there will be a temptation to get your projects through by compromising in areas which will lead to issues later on – please, if you can, don’t do it!
A good example of an area where I have seen many companies finding savings is within the Disk Subsystem and the choice of RAID level for Exchange.
There has been much talk in many forums about the best RAID level to assign to your databases – the general rule of thumb for a database recommendation is RAID 10 (1+0) – as it offers excellent performance for both reads and writes as well as excellent performance during a rebuild or when a disk has failed. The down side to RAID 10 is its poor space utilisation, and many companies are finding it hard to justify up to a 50% loss in RAW space from using RAID 10 – the following table is taken from: http://msexchangeteam.com/archive/2007/01/15/432199.aspx which gives an overview of the main RAID levels that people consider with Exchange.
|Speed||Capacity Utilization||Rebuild Performance||Disk Failure Performance|
Interestingly enough the table suggests that RAID 6 has good capacity utilisation – I have found that it can be roughly proportional to that of RAID 10 with up to 50% space wastage (depending on the number of drives).
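To put some rough numbers on the space argument, here is a minimal Python sketch of usable capacity per RAID level. It assumes equal-sized disks and the textbook overheads (mirroring for RAID 10, one disk of parity for RAID 5, two for RAID 6); the function name `usable_fraction` is purely illustrative and real controllers may reserve a little extra space.

```python
def usable_fraction(level: str, disks: int) -> float:
    """Fraction of raw capacity left for data, assuming equal-sized disks."""
    if level == "RAID 10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return 0.5  # every disk has a mirror partner
    if level == "RAID 5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) / disks  # one disk's worth of parity
    if level == "RAID 6":
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (disks - 2) / disks  # two disks' worth of parity
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    for n in (4, 6, 8):
        row = {lvl: round(usable_fraction(lvl, n), 2)
               for lvl in ("RAID 10", "RAID 5", "RAID 6")}
        print(n, "disks:", row)
```

With only four disks RAID 6 gives back no more space than RAID 10, which is why the saving only starts to show as the array grows.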
I have found that people are wrestling with the fact that most single server units (of up to 4U) which can be used for CCR often can only hold 16 internal disks (although the new ML 370 G6 can hold 24 SFF disks). If you need more than 16 disks you are faced with either moving to the next level up in server technology (which means that you can be paying up to £12,000 – £15,000 per CCR node) or looking at a SAN / SCSI / iSCSI storage drawer per CCR node.
Now there is nothing wrong with the concept of a storage drawer per CCR node – however when balancing that against the price of a server + a storage drawer you are getting near that £12,000 + price tag again. Some organisations have opted for, or might be considering, making use of an existing SAN within their environment for both their CCR nodes.
Whilst a good cost offset (by making use of an existing SAN) this approach builds in an inherent single point of failure which is depicted below:
The above very simply shows basic redundancy all the way through the infrastructure until you get to the SAN – if both nodes are replicating data to the same SAN – then, down the line you could have a problem.
It is understandable that finance directors and the like wish to leverage existing technologies if it represents a cost saving, especially if there has been a recent investment in a large SAN, and indeed a lot of companies will state – “but our SAN is reliable to five nines” – but from my own painful experience nothing is reliable to five nines if there is only one of them (your availability is then based upon luck).
I have seen state of the art SANs fail and be out of action for a couple of days whilst the company finds out that the maintenance contract is based upon a 24hr response – not a fix, or that the part that is needed is in Nova Scotia and will arrive à la camel transport in three days from Tibet (wasn’t that a film?….. no sorry that was Seven Years in Tibet).
For production environments I strongly recommend that you give each CCR node its own DEDICATED storage despite the above financial temptations.
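To show how little room a five nines (99.999%) claim leaves, here is a small Python sketch of the arithmetic. It only converts an availability figure into minutes of downtime per year and back again; the two day outage is the example from the text above, not a measured figure.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability)


def availability_after_outage(outage_minutes: float) -> float:
    """Availability achieved over a year that contained a single outage."""
    return 1 - outage_minutes / MINUTES_PER_YEAR


if __name__ == "__main__":
    print(f"Five nines allows {allowed_downtime_minutes(0.99999):.2f} minutes per year")
    two_days = 2 * 24 * 60
    print(f"A two day outage leaves {availability_after_outage(two_days):.3%} for the year")
```

A single two day outage on a shared SAN uses up several hundred years' worth of a five nines downtime allowance.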
Getting back to RAID levels and why there might be an urge to “cut corners” the summary reasons that I have seen which cause issues here are:
- Amount of space within the server enclosure for disks
- The recommended RAID level for Exchange Databases results in large amounts of space being wasted
To understand this situation: as mentioned above, I have used the figure of 16 disks as a typical maximum number of disks which you can fit into a single (cost effective) CCR node enclosure (based upon a HP ML370G5), where the disks are SFF 300GB SAS, the RAID level for Transaction Logs is RAID 1 and the level for the Databases is RAID 10 (1+0) – given that, consider the following diagram:
Given that the server has 16 disks, take two away for the O/S and Exchange Binaries = 14 disks remaining.
A Storage Group should use x 2 disks (RAID 1) for the transaction logs and a Database Drive should use a minimum of x 4 disks (RAID 10) for the databases = 6 disks, therefore 8 disks remaining.
Add in another Storage Group and DB and you have only 2 disks remaining!
Herein lies the headache – for your investment in a CCR node with 16 disks, with 32GB of RAM and using best practices for the disk sub system – you will get two Storage Groups and two Databases out of your server!
Now there is an argument to be had which states that having too many Storage Groups is a bad thing, and indeed it is true that the more SGs that you have, the more memory the server will consume, but one would expect most larger companies to have at least 4 SGs per server.
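The disk arithmetic above can be sketched in a few lines of Python. The defaults (2 disks for the O/S and binaries, 2 for a RAID 1 log volume, 4 for a RAID 10 database volume) are the figures used in this article; the function name `plan_disks` is just illustrative.

```python
def plan_disks(total_disks: int, os_disks: int = 2,
               log_disks: int = 2, db_disks: int = 4) -> tuple[int, int]:
    """Return (storage groups that fit, disks left over) for one chassis."""
    available = total_disks - os_disks      # disks left after the O/S mirror
    per_storage_group = log_disks + db_disks
    return divmod(available, per_storage_group)


if __name__ == "__main__":
    for chassis in (16, 24):
        groups, spare = plan_disks(chassis)
        print(f"{chassis} disks: {groups} storage groups, {spare} disks spare")
```

For a 16 disk chassis this gives two storage groups with two disks spare, and even a 24 disk chassis only stretches to three storage groups on the same layout.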
What do you do with those remaining disks?
Well there are a couple of suggestions – but, I have to admit they might not be very attractive:
- Depending on your environment and if you can get away with it – you can use these remaining disks for Public Folders (but this will only really help in the case of one node as Public Folder Databases are not replicated via CCR).
- You could also consider using these disks for an SCR target – but you would not be getting the space / RAID ratios which the production source Database would have.
- You could use them as a restore LUN
- You might consider using them for service mailboxes – such as Journaling or Operations Monitoring
- Use them as a dedicated Page File volume (although 300GB might be overkill)
So what do you do? – all of the above seems like doom and gloom if you are working to a budget, or indeed have been looking at implementing CCR without SAN storage (or iSCSI) per node.
Well the good news is that the above is meant to get you thinking about the type of hardware you want before you actually go out and buy it – I have seen a number of places which, when I have been asked to look at their deployment hardware, have said “nope this is it, and this is what we have to work with” – and then get upset when I say – “no way!” when I find out that their user populace does not fit the hardware specification.
To be honest in a lot of scenarios the ML370 (or another manufacturer’s equivalent) is an ideal CCR Cluster node depending on the number of users and the storage limits and activity profiles that you wish to give them. If for example you have an organisation with around 1200 mailboxes – and wish to implement CCR then it is ideal!
Key things to remember BEFORE you buy your production CCR hardware are:
- Consider your user demographic in relation to your storage needs – for example, if you are going to give everyone a 2GB mailbox allowance and you have 3000 users, you need to be prepared for the storage demands and I/O requirements. If 2% of your users hit 2GB within your database, that represents 60 mailboxes @ 2GB = 120GB. Bear in mind that the recommended maximum size for a CCR database is 200GB, so you will have to adapt your hardware solution accordingly
- Make sure that you get an idea of your storage and RAID requirements by using the Exchange 2007 Mailbox Server Role Storage Requirements Calculator during your planning phase – you should be aiming for RAID 10 as the utopian Database level – RAID 5 is acceptable but you need to offset that against the performance drop
- Remember that one of the golden rules is performance over space – you might be tempted by RAID 5 or 6 for your Databases to get the space – but this will impact upon your performance
- Establish your memory Profile (see the next section) according to your user load, and storage group design
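As a rough illustration of the mailbox arithmetic in the first point, the Python sketch below works out the space taken by users at quota and how many databases a given amount of mail needs if each is capped at 200GB. It is a back of the envelope check only; it ignores white space, deleted item retention and log volumes, which the storage calculator does account for, and the function names are illustrative.

```python
import math


def storage_at_quota_gb(users: int, quota_gb: float, fraction_at_quota: float) -> float:
    """Space consumed by the share of users who have filled their mailbox."""
    mailboxes = round(users * fraction_at_quota)
    return mailboxes * quota_gb


def databases_needed(total_gb: float, max_db_gb: float = 200) -> int:
    """Databases required if each is capped at the recommended size."""
    return math.ceil(total_gb / max_db_gb)


if __name__ == "__main__":
    print(storage_at_quota_gb(3000, 2, 0.02), "GB used by the 2% at quota")
    worst_case_gb = 3000 * 2  # every one of the 3000 users filling a 2GB mailbox
    print(databases_needed(worst_case_gb), "databases in the worst case")
```

The worst case is the figure to compare against the number of storage groups and databases your chosen chassis can actually hold.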
Other Hardware Considerations
I started this article with a pretty blasé statement about RAM and Processors for your CCR nodes, but they are very important considerations. I guess that was dismissive as they are the simplest components to be certain about – certainly from a specification and cost perspective.
Most servers these days come with a minimum of x 2 dual core processors (anything less is not economically worth buying) – quad core processors are now pretty much becoming the de facto in standard builds with 6 core processors now arriving on the market at the highest end.
Considering Microsoft’s recommendations for processors for a Mailbox Server (which can be applied to a CCR node), you should be looking for a minimum of x 2 physical dual core processors (4 cores in practice); if you have purchased a server with a pair of quad cores you have hit the maximum recommended for a single mailbox server (as Microsoft points out – 8 cores is not a limit, just the largest amount where a perceivable performance benefit can be seen).
From a memory perspective this needs to be specified according to your user load, profile, number of Storage Groups and the CORRECT configuration of your Page File – I would say that if you are running a CCR node with a medium user profile where you have 1200 users and more than 5 storage groups then you should be looking at a minimum of 12 GB of RAM with a page file of 12298 MB (Memory + 10MB) – the full best practice guidance for memory configurations can be found here: http://technet.microsoft.com/en-us/library/bb738124.aspx
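The page file figure quoted above is simply installed memory in MB plus 10MB, which a one-line Python helper makes explicit (the function name is illustrative):

```python
def page_file_mb(ram_gb: int) -> int:
    """Page file size in MB using the 'Memory + 10MB' rule quoted above."""
    return ram_gb * 1024 + 10


if __name__ == "__main__":
    for ram in (8, 12, 16, 32):
        print(f"{ram} GB RAM -> {page_file_mb(ram)} MB page file")
```

For 12 GB of RAM this gives the 12298 MB used in the example.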
It is also a good idea to factor redundancy into the CCR nodes that you purchase – for example Dual Memory Boards, Dual Power Supplies and the like to ensure that you have the most basic levels of redundancy covered.
Putting the Hardware Together
Moving slightly back to disk configurations I would like to look at the options that you have when actually putting your production CCR nodes together physically.
Given that I have been using the ML 370 G5 in my examples it makes sense to base the following on that model – most of what I am discussing here can be applied to other vendors.
When talking about disks at Hardware level it can be broken down into the following areas:
- Disk Type (Speed, type [SAS or SATA])
- Disk Cages (or enclosures) and RAID Controller
Disk Type (Speed, type [SAS or SATA])
In terms of Exchange – and the types of disk to use, again you are faced with a plethora of options with the main protagonists being either Enterprise SATA, SCSI or SFF SAS drives (small form factor Serial Attached SCSI).
As a personal recommendation I would still at this stage (Exchange 2010 will really change the face of your storage designs) use either SCSI or SAS. Enterprise SATA is a good choice for smaller installations (of up to around 500 heavy users) but if you have a lot of users (500 and up) I have found that SAS performance still has the edge (depending on the options that you purchase – see below) over other technologies.
With SAS you have the options of Dual Channel, full duplex with bi-directional data transfer (which can double your data transfer rates) plus the consideration that SAS makes use of a point to point connection scheme to the host controller (whilst SCSI is a shared bus scheme) – again increasing the throughput of the drives.
You should also factor in the speeds of your drives – the higher the speed the greater the performance, however you need to be aware that generally speaking the larger the drive, the slower the RPM.
From time to time you will see recommendations that 15,000 RPM is the recommended drive speed for Exchange, however finding large enough drives which run at that speed can cost a lot of money.
However I would argue that if you are using a good RAID level (such as RAID 10), combined with a healthy amount of battery-backed write cache (BBWC) on your RAID controller and using the SAS specifications above, you will be ok and the lost I/O will be marginal – given a scenario of 1200 users (perhaps more) with a medium / heavy user profile.
Disk: QUICK TIP:
When buying a server (or indeed SAN) with a large number of disks (for example 16 or more) have you checked to see if the serial numbers / batch numbers are consecutive?
Due to a rather nasty experience that I heard about involving a SAN and 8 disks all from the same batch with sequential numbering all failing within days of one another – I always ask my suppliers for disks that are from non-consecutive batches – call me mad but things do happen!
Disk Cages (or enclosures) and RAID controller
Many servers of around 4U (or greater) are shipped in their default configuration with a single drive cage which supports around 8 drives (again referring as an example to my old work horse the ML 370 G5) – but have the capacity to add a second cage (some vendors supply such servers with two cages already).
With all of this “caginess”, depending on the budget that you have and the level of local server redundancy that you wish to achieve, you will probably opt for x 2 RAID controllers. The RAID controller that you purchase will dictate the level of resilience you can achieve between them – using the ML370 example the default RAID controller is a HP P400 – this will support a single cage – but no interoperability between cages.
For example even with a pair of P400 controllers, you could not configure Cage 1 / Disk 1 to be a mirror of Cage 2 / Disk 1 – if you wish to achieve such a goal then you will need to purchase a pair of more fully featured RAID controllers which will allow for Cage interoperability.
However you might think that the default P400 is adequate for your needs given the fact that you have another pair of P400s in the other node which can take over if you lost one in the primary node – with this being the case your RAID configuration might look something like the following:
Choice of O/S for CCR
If you are planning your Exchange CCR implementation right now, then I would seriously suggest that Windows 2008 is the chosen O/S. You will be benefitting from the latest release of Windows with a number of optimisations that the O/S brings that Exchange 2007 can take advantage of (for example the enhanced SMB Protocol for Log Shipping between nodes). With the release of SP2 for Windows 2008 I would argue that the O/S is now more mature.
Another good reason for going to Windows 2008 is that you will eliminate worries over future supportability – Windows 2003 will go into extended support at some point – and you cannot in place upgrade a Windows 2003 server which has an Exchange installation on it as per http://msexchangeteam.com/archive/2007/10/04/447188.aspx so it is perhaps best not to build in what might become an integral legacy issue.
It also goes without saying that if you wish to implement a geographical CCR solution which spans different IP Subnets you will need to be using Windows 2008.
Data Centre Considerations
What has your Data Centre really got to do with your CCR installation? Well I would argue quite a bit when you really begin to analyse the subject.
Initially CCR was not designed as a geographic node clustering solution – however with Windows 2008 and Exchange 2007 SP1 you can span your cluster nodes across different subnets – therefore giving the possibility of nodes in different physical sites. As I mentioned in the beginning of this article “the key factors in choosing CCR as your method for deploying Exchange 2007 is that people want to maintain a level of high availability and protect your data” – this statement can be used to introduce the point that your installation is only as strong as its weakest link.
Admittedly the further you work back in the availability chain the less likely a failure will happen, but the cost rises proportionally – you will need to decide where you and your organisation wish to sit – I have put together the following examples:
CCR Cluster – same Data Centre – same rack – what you will get: [Low Cost – Higher Risk]
- Database resiliency (you will have a reasonably up-to-date copy of your production database on another server which can be brought online automatically without serious disruption to customers), as each database should be stored on dedicated storage for the node
- Remember if your Database becomes inconsistent then the backup copy will not remain up to date as the log shipping / replay process should fail
- Protection against hardware / software failure on the primary CCR node
- Simplified updates and service packs
CCR Cluster – same Data Centre – different racks – what you will get: [Low Cost – Reducing Risk]
As per above but also:
- Different racks would imply at the very least different power supplies within the room – therefore if one node goes down then the other stays online – of course having the nodes in different racks also protects against those little “accidents” – surely I am not the only person to have accidentally pulled the power to a rack (okay then maybe I am the only person to pull power to a rack by accident)
- Again by placing the nodes in different racks you have the option of taking advantage of disparate networking within the room – this could include a backup connection for the CCR Log Shipping process
CCR Cluster – nodes in different Data Centres – what you will get: [Higher Cost – Further Reduced Risk]
As per the first point but with the following benefits
- Protection against fires, theft, damage on a single site (at least in terms of the mail system)
- Separate power between sites
- Independent communications channels both at a local and remote level
- Using Windows 2008 you can have the CCR nodes on different subnets
- Providing that your accompanying configuration is correct you can sustain the loss of a full site, but maintain e-mail communications and access for your company (more on this below)
The following chart is a visual representation of the offset of cost in relation to availability:
Now I will not claim that the above scenarios are exhaustive, but they are designed to give you an idea of what you might weigh up when trying to eliminate risk in your environment.
If you want to eliminate as many points of failure as possible and you also have the budget – the following should get your head going – but it is important to remember the Availability to Cost Ratio – the further back you eliminate risk – the less open you are to failure – but the more money you will spend! – This should be considered as a possible example where the investment would be quite high, but the risk would be reduced:
Two Data Centres which are at least 7 miles from one another (well I say 7 miles – less might be sufficient and is probably more practical – the key thing is that each site needs independent power and Telco access – as well as not being under a flight path, on a known flood plain, or at high risk of civil unrest or nuclear attack)
- Two Data Centres each on geographically separate power supplies – ALL Tier 3 standard Data Centres will have diverse Power back to the Electrical Supplier (at sub station level) – higher tiers have diverse power from different suppliers.
- Each of the two Data Centres has geographically separate communication provider network links – again ideally from different suppliers (or Telcos)
- Each of the two Data Centres will have redundant Routing, Switching and firewalls, as well as redundant power to each rack from two separate distribution boards in the room
- Data Centre 1 has CCR node 1
- Data Centre 1 has HT 1
- Data Centre 1 has CAS 1 (Load Balanced with the CAS in Data Centre 2)
- Data Centre 2 has CCR node 2
- Data Centre 2 has HT 2
- Data Centre 2 has CAS 2 (Load Balanced with the CAS in Data Centre 1)
- Each CCR Node has its own storage – this can be accomplished via internal disk or each node having its own storage attached via Fibre or SCSI drawers
- Each Exchange Server in the configuration has redundant components (power supplies, NICs, memory boards, processors)
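One way to sanity check a layout like the one listed above is to write it down as data and ask whether every Exchange role is still present after losing any one site. The Python sketch below does only that; the role names, the `design` dictionary and the function `survives_site_loss` are illustrative assumptions, and it says nothing about power, networking, storage or the rest of the accompanying configuration.

```python
REQUIRED_ROLES = {"CCR node", "Hub Transport", "Client Access"}

# The two data centre layout described above, one set of roles per site.
design = {
    "Data Centre 1": {"CCR node", "Hub Transport", "Client Access"},
    "Data Centre 2": {"CCR node", "Hub Transport", "Client Access"},
}


def survives_site_loss(sites: dict[str, set[str]]) -> bool:
    """True if every required role is still present after losing any one site."""
    for lost in sites:
        remaining = set().union(
            *(roles for name, roles in sites.items() if name != lost)
        )
        if not REQUIRED_ROLES <= remaining:
            return False
    return len(sites) > 1  # a single site can never survive its own loss


if __name__ == "__main__":
    print("Survives loss of one site:", survives_site_loss(design))
```

Removing the Hub Transport or CAS from either site in `design` makes the check fail, which is the weakest link point made earlier.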
Summary for this Part:
I hope that you have enjoyed this part as much as I have enjoyed writing it. I know that it is a bit of a ramble – but I hope that it gets people thinking about their CCR deployments and what they want to achieve; perhaps some of you will now not go with CCR – perhaps some of you will now change the way in which you look at it. Remember that CCR does not have to be used in isolation – you can combine it with SCR for example to further improve recovery times but reduce other costs.
In the next part I would like to further expand on effective cost cutting and go through some recommendations for recovery – I will then move onto builds and scripts which might help you along the way