produced for the March 2000 Microsoft Enterprise site.
price of reliability
Clustering is becoming a mainstream solution to the reliability challenge.
Is it right for you?
A recurring nightmare for many IT managers these days is the phrase 24x7x365.
Thanks to the Web, globalization and other trends, IT professionals must
be able to provide high-availability guarantees for their organizations'
order processing, e-mail and other applications.
High availability refers to a system's capacity to recover from inevitable
failures. It assumes that every device can fail and provides mechanisms
to recover from that failure automatically and quickly. One of the most
common technologies for providing high availability is clustered processors.
Clusters are two or more tightly connected independent servers that share
common storage. In the event that one server fails, another server in
the cluster automatically restarts the application that had been running
on the failed server, a process that can take 20 to 30 minutes.
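The failover behavior described above can be sketched as a toy model. This is only an illustration, not how any particular cluster product works: the Node and Cluster classes, the node names and the "least-loaded survivor" rule are all hypothetical, and real cluster software also deals with heartbeats, shared-storage arbitration and restart time, none of which is modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One server in the cluster (hypothetical model)."""
    name: str
    healthy: bool = True
    apps: list = field(default_factory=list)


class Cluster:
    """Toy cluster: nodes share storage, so any healthy node can restart any app."""

    def __init__(self, node_names):
        self.nodes = {name: Node(name) for name in node_names}

    def start(self, app, node_name):
        """Run an application on the named node."""
        self.nodes[node_name].apps.append(app)

    def fail(self, node_name):
        """Mark a node as failed and restart its apps on a surviving node."""
        failed = self.nodes[node_name]
        failed.healthy = False
        survivors = [n for n in self.nodes.values() if n.healthy]
        if not survivors:
            raise RuntimeError("no healthy node left to take over")
        # Assumption for this sketch: the least-loaded survivor takes over.
        target = min(survivors, key=lambda n: len(n.apps))
        target.apps.extend(failed.apps)
        failed.apps = []
        return target.name

    def where(self, app):
        """Return the name of the node currently running the app, or None."""
        for node in self.nodes.values():
            if app in node.apps:
                return node.name
        return None


cluster = Cluster(["node-a", "node-b"])
cluster.start("order-processing", "node-a")
takeover = cluster.fail("node-a")
print(f"order-processing restarted on {takeover}")
```

In a two-node cluster the surviving server is the only candidate, so the takeover rule matters only once a cluster grows beyond two nodes.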
The concept of clustering for high availability, however, has expanded
to encompass a range of technologies beyond a strict server cluster. It
now includes load balancing and various initiatives to eliminate single
points of failure through the entire system environment, from the power
source through the network links to the storage.
As a result, the decision to deploy clustering actually consists of a
complex series of considerations revolving around the importance of a
given application and an organization's tolerance for losses resulting
from system downtime. The solution will likely be different for each application
within the organization's portfolio. For example, an application that
generates batch reports overnight and can easily tolerate a few hours
of downtime may need no high-availability capabilities at all, while an
enterprise resource planning (ERP) system may benefit from a basic two-node
cluster.
An organization can also create high availability for a busy Web site
through effective load balancing of the multiple servers required to handle
the level of activity. A high-volume online transaction processing (OLTP)
application, by comparison, may require a costly, comprehensive, fault-tolerant
solution that ensures 99.999% availability.
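To put a figure such as 99.999% in perspective, it helps to convert an availability percentage into the downtime it permits over a year. The short sketch below does that arithmetic. It assumes a 365-day year and counts all downtime, planned or unplanned, and the function name is illustrative only. At 99.999%, the allowance works out to roughly 5.3 minutes per year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_minutes_per_year(availability_percent):
    """Return the yearly downtime, in minutes, allowed by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = (100 - availability_percent) / 100
    return MINUTES_PER_YEAR * unavailable_fraction


for pct in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% availability allows about {minutes:.1f} minutes of downtime per year")
```

Each additional "nine" cuts the allowable downtime by a factor of 10, which is one way to see why the most demanding availability targets call for the most comprehensive solutions.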
In essence, there is no single solution that works for every organization
or every application within the organization. Rather, the clustering decision
will be based on a combination of considerations that will vary with each
application. To provide IT professionals with some specific guidance about
when they should consider clustering, The Enterprise developed a Rule
of Thumb decision tree and estimating tool. The Rule of Thumb is based
on information provided by Harvey Hindin, vice
president at D.H. Brown Associates, of Port Chester, N.Y.; Dan Kusnetzky,
program director at International Data Corp. (IDC), of Framingham, Mass.;
and Donna Scott, vice president/research director at Gartner Group, of
Stamford, Conn. We are grateful for their support.
D.H. Brown is a research and consulting firm that provides in-depth analysis
of computing technologies to support engineering, manufacturing, design
and open systems. For more information about D.H. Brown, go to its Web
site.
IDC is a leader in providing research and analysis services on IT products
and services. For more information about IDC, go to its Web site: www.idc.com.
Gartner Group is a research and consulting firm that addresses a broad
range of technologies. For more information about Gartner Group, go to
its Web site: www.gartner.com.