What Is Server Fault Tolerance?



by tony1kenobi on July 7, 2008

Fault tolerance is a term that is used quite a lot when looking at the design of mission-critical systems. So what do we mean when we say that a system is fault tolerant?

Let's assume that we are running a website that gets lots of traffic every day. In order to handle all of this traffic, the system designer decides that we need 5 servers, and we will use a load balancer to distribute web page requests across the 5 servers. Each server has been sized so that it can handle 30% of the requests that arrive.
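To make the load balancer idea concrete, here is a minimal sketch in Python. It assumes a simple round-robin policy, which is only one of several ways a real load balancer might share out requests. The server names and the distribute function are made up for illustration.

```python
from collections import Counter
from itertools import cycle


def distribute(requests, servers):
    """Pair each request with a server, taking the servers in round-robin order."""
    if not servers:
        raise ValueError("at least one server is required")
    rotation = cycle(servers)  # endlessly repeats web1, web2, ... web5, web1, ...
    return [(request, next(rotation)) for request in requests]


# Hypothetical names for the 5 servers in the example.
servers = [f"web{i}" for i in range(1, 6)]

# Ten incoming requests, numbered 0 to 9, stand in for real web page requests.
assignments = distribute(range(10), servers)

# Count how many requests each server received.
counts = Counter(server for _, server in assignments)

if __name__ == "__main__":
    for server in servers:
        print(server, counts[server])
```

With this policy every server gets an equal share of the requests, which is why each one can be sized for a fixed percentage of the total.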

This means that we only need 3 and a bit servers to handle all the requests (in practice 4, since we can't run a fraction of a server). So why has the designer gone for 5? Is it a waste of money? Well, in this case the web company actually makes a ton of money from the traffic to just one server, so the fact that more than 3 servers are busy handling requests is good news for the chief financial officer.

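So the reason that the designer has some extra capacity is to introduce fault tolerance into the system. This means that if one server fails, the remaining 4 servers can still handle 4 x 30% = 120% of the requests, so there will be no impact on the system and all web page requests will be handled without any loss of money to the business. If two servers failed at the same time, the remaining 3 could only handle 90% of the requests, so this design tolerates one failure but not two.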
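The arithmetic behind this can be checked with a few lines of Python. This is just a sketch using the numbers from the example (5 servers, each sized for 30% of the requests). The function names are made up for illustration.

```python
def servers_needed(capacity_percent: int) -> int:
    """Smallest whole number of servers whose combined capacity reaches 100%."""
    if capacity_percent <= 0:
        raise ValueError("capacity_percent must be positive")
    # Ceiling division with integers: 100 / 30 is 3 and a bit, so round up to 4.
    return -(-100 // capacity_percent)


def failures_tolerated(total_servers: int, capacity_percent: int) -> int:
    """How many servers can fail before the rest can no longer carry the full load."""
    return max(0, total_servers - servers_needed(capacity_percent))


if __name__ == "__main__":
    print("Servers needed:", servers_needed(30))
    print("Failures tolerated with 5 servers:", failures_tolerated(5, 30))
```

Working in whole percentages keeps the rounding exact, which matters when the capacity divides evenly into 100 (for example, servers sized at 25% each).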

Introducing fault tolerance by having extra servers is expensive, so the company needs to be sure that it is worth the money. This is typically the responsibility of the capacity planner, who can use their knowledge, and usually special software, to work out how many servers are required to handle a particular forecasted load on the system.
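As a rough illustration of the kind of sum a capacity planner does, here is a small Python sketch. It is far simpler than real capacity planning software, and the figures (a forecast of 1000 requests per second and servers that each handle 300 requests per second) are assumptions invented for this example, as are the function and parameter names.

```python
import math


def servers_to_provision(forecast_rps: float, per_server_rps: float,
                         spare_servers: int = 1) -> int:
    """Servers needed for the forecast load, plus spares kept for fault tolerance."""
    if forecast_rps < 0 or per_server_rps <= 0 or spare_servers < 0:
        raise ValueError("load and spares must be non-negative, capacity positive")
    # Round up, because a fraction of a server cannot be bought.
    base = math.ceil(forecast_rps / per_server_rps)
    return base + spare_servers


if __name__ == "__main__":
    # Hypothetical forecast: 1000 requests per second, 300 per server, 1 spare.
    print("Servers to provision:", servers_to_provision(1000, 300, 1))
```

The number of spares is the business decision: each extra spare costs the price of a server, and buys the ability to survive one more failure without losing traffic.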
