Another bright Sunday morning, and I was trying to concentrate on a Cisco whitepaper on designing highly available networks when I came across a short conclusion to its Best Practices for High-Availability Network Design. I found it worth sharing, so I am posting brief excerpts from the document:
Cisco has developed a set of best practices for network designers to ensure high availability of the network. The five-step Cisco recommendations are:
Analyze technical goals and constraints
Technical goals include availability levels, throughput, jitter, delay, response time, scalability requirements, introduction of new features and applications, security, manageability, and cost. Investigate constraints, given the available resources. Prioritize goals, and lower expectations where doing so still meets business requirements. Prioritize constraints in terms of the greatest risk or impact to the desired goal.
Determine the availability budget for the network
Determine the expected theoretical availability of the network. Use this information to determine the availability of the system to help ensure the design will meet business requirements.
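To make the idea of an availability budget a little more concrete, here is a minimal sketch of how theoretical availability is commonly estimated. It is not taken from the Cisco document: it assumes component failures are independent, and the component names and availability figures are hypothetical values chosen only for illustration.

```python
from math import prod


def series_availability(components):
    """Availability of a chain where every component must be up."""
    return prod(components)


def parallel_availability(components):
    """Availability of redundant components where any one being up is enough."""
    return 1 - prod(1 - a for a in components)


# Hypothetical per-component availabilities (illustration only).
access_switch = 0.9995
core_router = 0.9999
wan_link = 0.999

# A single path: access switch -> core router -> one WAN link.
single_path = series_availability([access_switch, core_router, wan_link])

# The same path with two redundant WAN links instead of one.
redundant_wan = parallel_availability([wan_link, wan_link])
redundant_path = series_availability([access_switch, core_router, redundant_wan])

print(f"single path:    {single_path:.6f}")
print(f"redundant path: {redundant_path:.6f}")
```

Under these assumptions the chain is always less available than its weakest component, while adding redundancy at the weakest point raises the figure for the whole path.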
Create application profiles for business applications
Application profiles help align network service goals with application or business requirements by comparing application requirements, such as performance and availability, with realistic network service goals or current limitations.
Define availability and performance standards
Availability and performance standards set the service expectations for the organization.
Create an operations support plan
Define the reactive and proactive processes and procedures used to achieve the service level goal. Determine how the maintenance and service process will be managed and measured. Each organization should know its role and responsibility for any given circumstance. The operations support plan should also include a plan for spare components.
To achieve 99.99-percent availability (often referred to as “four nines”), the following problems must be eliminated:
- Inevitable outage for hardware and software upgrades
- Long recovery time for reboot or switchover
- No tested hardware spares available on site
- Long repair times because of a lack of troubleshooting guides and process
- Inappropriate environmental conditions
To achieve 99.999-percent availability (often referred to as “five nines”), you also need to eliminate these problems:
- High probability of failure of redundant modules
- High probability of more than one failure on the network
- Long convergence for rerouting traffic around a failed trunk or router in the core
- Insufficient operational control
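To put the two targets in perspective, the percentages can be converted into an allowed downtime per year. The sketch below is my own arithmetic rather than part of the whitepaper; it assumes a 365-day year, and the function name is just an illustrative choice.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def allowed_downtime_minutes(availability_percent):
    """Maximum downtime per year, in minutes, for a given availability percentage."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


for label, target in [("four nines", 99.99), ("five nines", 99.999)]:
    minutes = allowed_downtime_minutes(target)
    print(f"{label} ({target}%): {minutes:.2f} minutes of downtime per year")
```

That works out to roughly 52.6 minutes a year for four nines and about 5.3 minutes a year for five nines, which is why a single long reboot or a slow reroute can use up the entire yearly budget.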