Understanding High Availability

High availability is technology that provides network-wide resilience and so increases IP network availability. Network applications cross several network segments: the Enterprise Backbone, Enterprise Edge, and Service Provider Edge, through to the Service Provider Core. Every segment must be resilient enough to recover quickly, so that faults are transparent to users and network applications.

Redundancy

A redundant design can use several mechanisms to prevent single points of failure:

Geographic diversity and path diversity are often included.
Dual devices and links are common, as shown in Figure 5-2.
Dual WAN providers are common.
Dual data centers are sometimes used, especially for large companies and large e-commerce sites.
Dual collocation facilities, dual phone central office facilities, and dual power substations can be implemented.

Redundant design must trade off cost versus benefit. It takes time to plan redundancy and verify geographic diversity of service providers. Additional links and equipment cost money to purchase and maintain. These options must be balanced against risks, costs of downtime, and so on.
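The cost-versus-benefit trade-off can be made concrete with basic availability arithmetic. The sketch below is illustrative only: the function names and the 99.9 percent link figure are assumptions, not values from this article, and the parallel formula assumes that failures are independent, which shared conduits or power feeds can violate.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def series(*availabilities):
    """Availability of a chain where every component must work."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def parallel(*availabilities):
    """Availability of redundant components where any one is sufficient.

    Assumes independent failures (an idealization).
    """
    all_fail = 1.0
    for a in availabilities:
        all_fail *= (1.0 - a)
    return 1.0 - all_fail


def downtime_minutes_per_year(availability):
    """Expected yearly downtime implied by an availability figure."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# Hypothetical figures for illustration only.
single_link = 0.999
dual_links = parallel(single_link, single_link)
print(f"single link: {downtime_minutes_per_year(single_link):.1f} min/year")
print(f"dual links:  {downtime_minutes_per_year(dual_links):.1f} min/year")
```

Comparing the two downtime figures against the cost of the second link and the cost of an hour of outage gives a rough basis for the redundancy decision.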

Technology

Several Cisco routing continuity options exist, such as Cisco Nonstop Forwarding (NSF) and Stateful Switchover (SSO), and graceful restart capabilities improve availability. Techniques also exist to detect failure and trigger failover to a redundant device. These techniques include service monitoring with Cisco IOS IP Service Level Agreements (SLA) and Object Tracking.
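The general idea behind probe-driven failover can be sketched in a few lines. This is a conceptual model only, not Cisco's implementation of IP SLA or Object Tracking: the class name, the thresholds, and the example next-hop addresses are all hypothetical. A tracked object goes down after several consecutive failed probes and returns to up after several consecutive successes, and the forwarding decision follows its state.

```python
from dataclasses import dataclass


@dataclass
class TrackedObject:
    """Hypothetical tracked object driven by probe results."""
    down_threshold: int = 3   # consecutive failures before declaring down
    up_threshold: int = 2     # consecutive successes before declaring up
    is_up: bool = True
    _failures: int = 0
    _successes: int = 0

    def record_probe(self, success: bool) -> bool:
        """Update counters from one probe result and return the new state."""
        if success:
            self._successes += 1
            self._failures = 0
            if not self.is_up and self._successes >= self.up_threshold:
                self.is_up = True
        else:
            self._failures += 1
            self._successes = 0
            if self.is_up and self._failures >= self.down_threshold:
                self.is_up = False
        return self.is_up


def select_next_hop(track: TrackedObject, primary: str, backup: str) -> str:
    """Use the primary path while the tracked object is up, else the backup."""
    return primary if track.is_up else backup


# Example run with made-up probe results and documentation-range addresses.
track = TrackedObject()
for result in [True, False, False, False, True, True]:
    track.record_probe(result)
    print(result, select_next_hop(track, "192.0.2.1", "198.51.100.1"))
```

Requiring several consecutive results before changing state is one way to avoid flapping between paths on a single lost probe; the right thresholds depend on how quickly failover must occur.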

People

In the Prepare, Plan, Design, Implement, Operate, and Optimize (PPDIOO) methodology, the people component is vitally important, too. Staff work habits and skills can impact high availability. For example, attention to detail enhances high availability, whereas carelessness hurts availability.

Processes

Sound, repeatable processes can lead to high availability. Continual process improvement as part of the PPDIOO methodology plays a role in achieving high availability.

Organizations should build repeatable processes in the following ways:

By documenting change procedures for repeated changes (for example, Cisco IOS Software upgrades)
By documenting failover planning and lab testing procedures
By documenting the network implementation procedure so that the process can be revised and improved the next time components are deployed

Organizations should use labs appropriately, as follows:

Lab equipment should accurately reflect the production network.
Failover mechanisms should be tested and understood.
New code should be systematically validated before deployment.

Because staff members tend to ignore processes that consume a lot of time or appear to be a waste of time, organizations also need meaningful change controls, such as the following:

Test failover and all changes before deployment.
Plan well, including planning rollbacks in detail.
Conduct a realistic and thorough risk analysis.
Perform regular capacity management audits.
Track and manage Cisco IOS versions.
Track design compliance as recommended practices change.
Develop plans for disaster recovery and continuity of operations.
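As one small illustration of tracking software versions, the sketch below compares a device inventory against a list of approved releases per platform. The function name, hostnames, platform labels, and version strings are all hypothetical; in practice the inventory would come from whatever management system the organization uses.

```python
def find_noncompliant(inventory, approved):
    """Return devices whose software version is not approved for their platform.

    inventory: dict of hostname -> (platform, version)
    approved:  dict of platform -> set of approved version strings
    """
    return {
        host: (platform, version)
        for host, (platform, version) in inventory.items()
        if version not in approved.get(platform, set())
    }


# Made-up data for illustration only.
approved_versions = {
    "access-switch": {"release-A"},
    "core-switch": {"release-B", "release-C"},
}
device_inventory = {
    "sw-access-01": ("access-switch", "release-A"),
    "sw-access-02": ("access-switch", "release-old"),
    "sw-core-01": ("core-switch", "release-C"),
}

for host, (platform, version) in find_noncompliant(
        device_inventory, approved_versions).items():
    print(f"{host}: {platform} runs unapproved version {version}")
```

A platform that has no approved list at all is reported as noncompliant, which errs on the side of surfacing untracked devices.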

Tools

Organizations are starting to monitor service and component availability. With proper failover, services should continue operating when single components fail. Without component monitoring, a failure to detect and replace a failed redundant component can lead to an outage when the second component subsequently fails.
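The difference between service availability and component availability can be shown with a minimal sketch. The names below are hypothetical and the health data is made up; the point is that a service with one failed redundant component is still up, but should be reported as degraded so that the component is replaced before its partner fails.

```python
from enum import Enum


class ServiceState(Enum):
    HEALTHY = "healthy"     # every redundant component is up
    DEGRADED = "degraded"   # service is up, but redundancy is lost
    DOWN = "down"           # no component is up


def assess_service(component_up):
    """Classify a service from the status of its redundant components.

    component_up: dict of component name -> bool (True means up).
    Returns (state, sorted list of failed component names).
    """
    failed = sorted(name for name, ok in component_up.items() if not ok)
    if len(failed) == len(component_up):
        return ServiceState.DOWN, failed
    if failed:
        return ServiceState.DEGRADED, failed
    return ServiceState.HEALTHY, failed


# Made-up status for a pair of redundant distribution switches.
state, failed = assess_service({"dist-sw-1": True, "dist-sw-2": False})
print(state.value, failed)
```

Monitoring only the service would report this case as fine; monitoring the components exposes the lost redundancy while there is still time to act on it.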

Network diagrams help in planning and in fixing outages more quickly.
Documentation explaining how and why the network design evolved helps capture knowledge that can be critical when a different person needs to make design changes.
Key addresses, VLANs, and servers should be documented.
Documentation tying services to applications and virtual and physical servers can be incredibly useful.

High Availability and Failover Times
