Understanding Five Nines of Availability

Whenever someone demands 100% uptime, or the highest possible level of uptime, Five Nines (99.999%) is what they typically get as the SLA commitment. Let's discuss this long-pending debate today, starting with a question:

How much downtime does Five Nines allow per year?

Well, 99.999% uptime means downtime of no more than about five minutes and fifteen seconds per year (0.001% of the 525,600 minutes in a year is roughly 5.26 minutes). But there is a long-running debate about another aspect of downtime: should scheduled maintenance be counted or not? Some people say that because it is scheduled it should not count as downtime, and others disagree. I believe that to understand an uptime SLA, we first need to understand a few terms:

Reliability is the probability that a product can perform a required function for a given time interval. Reliability is generally used to describe the quality of a product through mean time between failures (MTBF) data provided by the equipment vendor. MTBF is the average time a component takes to transition from an operational state to a failure state.

Availability, on the other hand, is the share of total time a system is up and functioning properly to accomplish its mission, usually expressed as a percentage. When talking about five nines, availability is what you are interested in. Bear in mind, however, that reliability is also an important contributing factor. Those who prefer the MTBF approach would suggest the following formula for availability:


 

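Availability = MTBF / (MTBF + MTTR)

where MTTR is the mean time to repair, i.e. the average time needed to restore a failed component to service.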
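To make the MTBF approach concrete, here is a minimal sketch in Python. It assumes the commonly used relation availability = MTBF / (MTBF + MTTR), with MTTR being the mean time to repair. The function name and the device figures are hypothetical, chosen only for illustration.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as a fraction between 0 and 1.

    Assumes availability = MTBF / (MTBF + MTTR), both given in hours.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical device: fails on average every 50,000 hours
# and takes 4 hours on average to repair.
a = availability(50_000, 4)
print(f"Availability: {a * 100:.4f}%")
```

With these made-up figures the result is roughly 99.992%, which meets four nines but not five. Shortening the repair time improves availability just as lengthening the time between failures does.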

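Let's have a look at the downtime allowed per year for various committed uptime percentages (based on a 365-day year):

Uptime committed     Maximum downtime per year
99%                  3.65 days
99.9%                8.76 hours
99.99%               52.56 minutes
99.999%              5.26 minutes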
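The downtime budget for any committed uptime percentage follows from simple arithmetic. Below is a minimal sketch in Python, assuming a 365-day year and counting every second of outage as downtime (scheduled or not). The function and constant names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day year, no leap day


def downtime_seconds_per_year(uptime_percent: float) -> float:
    """Maximum downtime per year, in seconds, allowed by an uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return SECONDS_PER_YEAR * (100 - uptime_percent) / 100


for pct in (99.0, 99.9, 99.99, 99.999):
    seconds = downtime_seconds_per_year(pct)
    print(f"{pct}% uptime -> {seconds / 60:.2f} minutes of downtime per year")
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why the jump from four nines to five is so demanding in practice.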

 

Managers have since come to understand that these are merely theoretical calculations and that network uptime depends on various other factors. Even so, this approach sets the expectation of network availability, and ultimately affects how the network is designed and what type of equipment must be purchased. With this approach, however, network managers debate what counts toward downtime (the yardstick of a five-nines system). Therefore, it is common to see network managers referencing standards documents or network characteristics when trying to clarify their expectations, in addition to the actual five-nines availability requirement. For example, the Telcordia GR-512-CORE document is used as a reference for downtime, or a 50-millisecond (ms) reroute capability is used to measure resiliency. Unfortunately, such references might not actually express the real expectation, and as cloud computing evolves, SLAs are likely to become more stringent and the market more competitive.
