Achieving The Next Level in High Availability Systems
Increasing availability to five "9"s and beyond for computing requires
several considerations beyond the computing hardware and software
platforms. Although the actual computer hardware and software are
critical components in achieving high availability, one of the most
important areas that is often overlooked at the computer system level
is a reliable and highly available power source. This requirement is
either assumed, or it is requested that the end user ensure that a
reliable power source is available 24/7. Because high availability
computing is primarily found in the converging market where computer
telephony and telecommunications have come together, the power
infrastructure is typically robust and has been designed from both a
high reliability and high availability perspective. It may seem
redundant to mention both high reliability and high availability;
however, the two are distinct properties that depend on each other.
You may have a very
reliable system that seldom fails; however, when it does fail and takes
your system down for hours, the result can be devastating and can
reduce overall availability to less than five "9"s. Conversely, a highly
available system with redundant hot swap capability and no
downtime during a failure is not necessarily a reliable system. A
highly available system may not incur downtime when hot swap
redundant modules are replaced; however, if modules are replaced
continuously, the chance that a redundant module fails during a
replacement increases with each failure.
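
The distinction between reliability and availability can be made concrete with the standard steady-state relation, availability = MTBF / (MTBF + MTTR). The following is a minimal sketch of that arithmetic. The MTBF and MTTR figures are hypothetical values chosen for illustration; they are not measurements of any IBUS or other real system.

```python
# Illustrative only: every MTBF/MTTR figure below is a hypothetical
# assumption, not data from a real system.

HOURS_PER_YEAR = 8760.0
FIVE_NINES = 0.99999


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected downtime per year implied by an availability figure."""
    return (1.0 - avail) * HOURS_PER_YEAR * 60.0


# Very reliable, slow to repair: fails about once in ten years, down 4 hours.
reliable_slow_repair = availability(mtbf_hours=87_600.0, mttr_hours=4.0)

# Less reliable, fast to repair: fails about yearly, restored in 3 minutes.
frequent_fast_repair = availability(mtbf_hours=8_760.0, mttr_hours=0.05)

cases = [
    ("five nines target", FIVE_NINES),
    ("reliable, slow repair", reliable_slow_repair),
    ("less reliable, fast repair", frequent_fast_repair),
]
for name, avail in cases:
    minutes = downtime_minutes_per_year(avail)
    print(f"{name:28s} {avail:.7f}  {minutes:6.2f} min/year")
```

Under these assumed figures, the system that fails ten times less often still falls short of five "9"s because each outage lasts hours, while the system that fails more often stays above it because each outage is short.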
Merging the availability
and reliability of the power system and the computing system is
sometimes difficult because these systems are normally the
responsibility of two different system engineers. IBUS has brought
together a unique combination of both power and computing system
engineers who combine their respective knowledge from their areas of
expertise to define a new level of reliability and availability for the
complete system.
Achieving five "9"s
availability with any system is challenging enough; going beyond that
requires an entirely different way of thinking. Developing an
independent five "9"s computer system and powering it with an
independently developed five "9"s power system will not necessarily
accomplish the objective of going beyond that level of availability. An
integrated understanding of both the power and computing systems will
be required to achieve higher levels of availability.
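
One way to see why two independently developed five "9"s systems do not add up to five "9"s is to treat the computer as depending on its power system in series. The sketch below assumes the two fail independently and that the computer is down whenever either one is down; both are simplifying assumptions made for this illustration rather than claims from the text.

```python
# Illustrative sketch: assumes independent failures and a strict series
# dependency (the computer is down whenever its power system is down).

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def series_availability(*parts: float) -> float:
    """Availability of a chain in which every part must be up."""
    result = 1.0
    for part in parts:
        result *= part
    return result


def downtime_minutes_per_year(avail: float) -> float:
    """Expected downtime per year implied by an availability figure."""
    return (1.0 - avail) * MINUTES_PER_YEAR


power_system = 0.99999      # five nines on its own
computing_system = 0.99999  # five nines on its own

combined = series_availability(power_system, computing_system)

print(f"each system alone: {downtime_minutes_per_year(power_system):.2f} min/year")
print(f"combined:          {downtime_minutes_per_year(combined):.2f} min/year")
print(f"combined availability: {combined:.7f}")
```

Under these assumptions the combined figure is roughly 0.99998, about twice the yearly downtime of either system alone, so the pair falls below five "9"s rather than going beyond it.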
A Variety of Power Sources
The convergence of computer telephony and telecommunications has
created challenges for computing hardware, firmware and software. Much
attention has been focused in these areas to ensure high availability
fail over schemes and hot swap capabilities. Power is often an
afterthought that someone else worries about, addressed only by
providing redundant power supplies within the computing system. This
convergence has also created challenges in providing reliable, available
power for these computing systems. Achieving the next level of system
availability requires going beyond supplying redundant power
supplies and feeding them from a power source that someone else worries
about.
The primary power source
can be a highly available Dual DC source with a maintained and
monitored DC power plant that will reliably provide backup power for 4
hours or more at a central office of a telecommunications service
provider. The power source can be a Dual AC source consisting of two
primary AC sources from a utility, or of a utility primary and a local AC
source, such as a generator with a UPS bridge, at a customer premises
site. The UPS may be a centralised parallel redundant large UPS or a
decentralised non-redundant small UPS. The power source can be a single
feed AC source at a co-located ISP premises, with or without a UPS. Each
of these power sources provides different levels of reliability and
availability.
Depending on the point of
view, one may be considered more reliable or available than another. A
centralised UPS advocate can provide data showing that a large UPS with
redundant modules is more reliable and available than a plethora
of small decentralised UPSs. While this may be true from a theoretical
standpoint, some practical considerations call it into question.
A centralised UPS is typically hundreds of feet from
the actual critical load. While the wire, panel boards, PDUs, and
circuit breakers that cover this distance are very reliable, if they do
fail, they typically cannot be repaired in a few minutes by hot swap. These
intermediate power distribution components may or may not be redundant. If
they are not, then even if you maintain power from a second source, you have lost
redundancy, and therefore availability. Additionally, the reliability
of every connected load becomes an issue if the failure of one
can cause a fault clearing device (breaker, fuse) or protection circuit
on the UPS to open to clear the fault. Also, the repair and maintenance
time required for a large UPS is longer than for smaller decentralised UPSs.
With redundancy, the system may remain completely available during these
maintenance times, provided the service person doesn't make a mistake.
A decentralised UPS scheme
gets the power protection closer to the critical load, removing some
uncertainties. However, there would be more UPS units, and the overall
reliability when you add up all the systems may be lower. Even if the
reliability is lower, the availability may be greater for several
reasons. One is redundancy at the lower level, which is a popular
design today and normally allows a much faster swap and repair than a
larger UPS system. Also, depending on the computer system scheme, a
failure in a decentralised scheme may affect only a small
portion of the overall computer system, and that portion may be redundant as well.
One of the drawbacks of decentralised UPSs is that they consume
rack space that could otherwise generate revenue by holding additional
computing power.
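
The trade-off described above can be sketched with the usual series and parallel availability formulas. Everything numeric here is a hypothetical assumption chosen to illustrate the argument (module MTBF and MTTR, the distribution path figures, the count of 20 racks). Failures are assumed independent, and the conclusion changes if the assumed numbers change.

```python
# Illustrative sketch: all figures are hypothetical assumptions, and
# failures are treated as independent.
from math import prod

HOURS_PER_YEAR = 8760.0


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series(*parts: float) -> float:
    """Every part must be up."""
    return prod(parts)


def parallel(*parts: float) -> float:
    """At least one part must be up (full redundancy)."""
    return 1.0 - prod(1.0 - part for part in parts)


def failures_per_year(mtbf_hours: float, units: int) -> float:
    """Expected number of module failures per year across all units."""
    return units * HOURS_PER_YEAR / mtbf_hours


# Centralised: two redundant large modules, then a non-redundant
# distribution path (wire, panel boards, PDUs, breakers) that is very
# reliable but slow to repair.
large_module = availability(mtbf_hours=100_000.0, mttr_hours=8.0)
distribution_path = availability(mtbf_hours=500_000.0, mttr_hours=6.0)
centralised = series(parallel(large_module, large_module), distribution_path)
centralised_events = failures_per_year(100_000.0, units=2)

# Decentralised: 20 racks, each with a redundant pair of small hot swap
# modules next to the load.
small_module = availability(mtbf_hours=50_000.0, mttr_hours=0.5)
decentralised_per_rack = parallel(small_module, small_module)
decentralised_events = failures_per_year(50_000.0, units=40)

print(f"centralised availability:      {centralised:.7f}")
print(f"decentralised (per rack):      {decentralised_per_rack:.10f}")
print(f"module failures/year, central: {centralised_events:.2f}")
print(f"module failures/year, decentr: {decentralised_events:.2f}")
```

With these assumed figures, the decentralised scheme sees far more module failures per year (lower reliability) yet higher availability at each rack, because the non-redundant distribution path dominates the centralised result.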
Meeting The Challenge
In order to provide a next-generation high availability computer
system, IBUS proposes to treat power source availability as an
integral part of the design, and to provide high availability choices for
power without
depending on external batteries in a box. In addition, it is important
to get the protection as close to the source as possible without using
valuable rack space.
There are several
challenges to be overcome to service a market with a variety of power
sources and a need for high availability, and there is no single
solution. In fact, five different solutions that can be provided in one
computer server chassis (currently the nFUZION 8U systems) have been
defined and are in development.
Two of the solutions
feature dual cross feed redundant hot swap DC or AC modules. These
modules fit in a 3U slot and directly interface with the IBUS cPCI
backplane. They provide interconnection to a Dual DC source or a Dual AC
source with cross feed redundancy to redundant DC or AC power supplies.
These modules also provide fault isolation, loss detection and hot swap
capability. This enables high reliability and availability at the
computer power interface when there is a highly reliable and available
power source.
The Dual AC can be used
with a hot swap DC boost converter module and a hot swap battery module
housed in a special compartment in the airflow path
of the IBUS patented hot swap multi-fan cooling module integrated into
the 8U chassis. This multi-fan assembly provides the required cooling
for both the high density SBCs and the batteries for an integrated UPS.
This integrated UPS approach provides highly available and reliable
back-up close to the critical load without the use of valuable rack
space.
When looking at a UPS on
the motherboard, the IBUS design may not be considered a UPS at all. Using
Maxwell PowerCache solid-state energy storage technology in conjunction
with a DC regulator at the bus voltage of the microprocessor enables
short-term back-up of the motherboard microprocessor without batteries.
This protection does not provide back-up to the entire computer system,
but it does provide power back-up as close to the primary source as
possible and at least provides a short-term back-up of the intelligence
of a system without batteries.
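
To give a feel for what "a short period" means with capacitor-based storage, the sketch below estimates hold-up time from the usable energy in a capacitor bank, E = 1/2 C (V_start^2 - V_min^2). The capacitance, voltages, load and regulator efficiency are hypothetical values picked for the arithmetic; they are not PowerCache or IBUS specifications.

```python
# Illustrative sketch: all component values are hypothetical assumptions,
# not specifications of any real product.


def holdup_seconds(
    capacitance_f: float,
    v_start: float,
    v_min: float,
    load_w: float,
    efficiency: float = 0.9,
) -> float:
    """Seconds a capacitor bank can carry a constant-power load.

    Usable energy is 1/2 * C * (v_start^2 - v_min^2). The regulator is
    assumed to drop out below v_min and to convert at a fixed efficiency.
    """
    if v_min >= v_start:
        raise ValueError("v_min must be below v_start")
    usable_joules = 0.5 * capacitance_f * (v_start**2 - v_min**2)
    return usable_joules * efficiency / load_w


# Hypothetical case: 10 F bank discharging from 5 V to 3 V into a 20 W load.
seconds = holdup_seconds(capacitance_f=10.0, v_start=5.0, v_min=3.0, load_w=20.0)
print(f"estimated hold-up time: {seconds:.1f} s")
```

Under these assumptions the back-up lasts a few seconds, which fits the role described above: carrying the processor through a brief interruption rather than backing up the entire system.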
The IBUS approach to high
availability systems is not a traditional design. It is an integrated
approach that takes advantage of understanding both the power and
computing side of high availability and provides power protection
options that can utilise external dual power sources or internal power
back-up sources.