Microservices Architectures - What is Fault Tolerance?


In this article, we discuss an important property of microservices, called fault tolerance.

You will learn

  • What is Fault Tolerance?
  • Why is fault tolerance important in microservices architecture?
  • How do you achieve fault tolerance?

Cloud and Microservices Terminology

This is the last article in a series of six articles on terminology used with cloud and microservices:

What Is Fault Tolerance?

Microservices need to be extremely reliable.

When we build a microservices architecture, there are a large number of small microservices, and they all need to communicate with one another.

Lets consider the following example:

image info

Let’s say Microservice5 is down at some point of time.

All the other microservices are directly or indirectly dependent on it, so they all go down as well.

The solution to this problem is to have a fallback in case of failure of a microservice. This aspect of a microservice is called fault tolerance.

Implementing Fault Tolerance with Hystrix

A popular framework used to implement fault tolerance is Hystrix, a Netflix open source framework. Here is a code example of the same:


	@GetMapping("/fault-tolerance-example")
	@HystrixCommand(fallbackMethod="fallbackRetrieveConfguration")
	public LimitConfiguration retrieveConfiguration() {
		throw new RuntimeException("Not Available");
	}

	public LimitConfiguration fallbackRetrieveConfiguration() {
		return new LimitConfiguration(999, 9);
	} 

Hystrix enables you to specify the fallback method for each of your service methods. If the method throws an exception, what should be returned to the service consumer?

Here, if retrieveConfiguration() fails, then fallbackRetrieveConfiguration is called, which returns a hardcoded LimitConfiguration instance:

image info

Hystrix And Alerts

With Hystrix, you can also configure alerts at the back-end. If a service starts failing continuously, you can send alerts to the maintainance team.

Hystrix is not a silver bullet

Using Hystrix and fallback methods is appropriate for services that handle non critical information.

However, it is not a silver bullet.

Consider for instance, a service that returns the balance of a bank account. You cannot provide a default hardcoded value back.

Using sufficient redundancy

It is important to design critical services in a fail safe manner. It is important to build enough redundancy into the system to ensure that the services do not fail.

Have sufficient testing

It is important to test for failure. Bring a microservice down. See how your system reacts.

Chaos Monkey is a good example from Netflix.

Do check out our video on this:

image info

Summary

In this article, we discussed about fault tolerance. We saw how fault tolerance is essential in microservices architecture. We then saw how it can be implemented at the code level using frameworks such as Hystrix.

Related Posts

Spring Boot Tutorials for Beginners

At in28Minutes, we are creating a number of tutorials with videos, articles & courses on Spring Boot for Beginners and Experienced Developers. This resources will help you learn and gain expertise at Spring Boot.

Introduction To Aspect Oriented Programming and Cross Cutting Concerns

Software applications are built in layers. There is common functionality that is sometimes needed across layers - logging, performance tracing etc. How do you implement these common features?

Programming Basics - Introduction To Object Oriented Programming

Object oriented programming (OOP) is all about thinking in terms of objects. Let's dig deeper.

Programming Basics - Five Things To Think About While Programming

You would obviously want to write code that meets your core requirements and provide good performance - choosing right data structures and algorithms to use is the fundamental part of programming. What are the other things that you need to worry about? Here are five things that we think are essential.

Asynchronous communication with queues and microservices - A perfect combination?

In this article, we throw some light on what asynchronous messaging is all about and discuss why you should consider it for your microservices architectures.

Microservice Best Practice - Build an Archetype

In this article, we focus on learning why creating proper archetypes is important for successful microservices architecture.

Microservice Architecture Best Practices - Messaging Queues

In this article, we discuss why Messaging queues are needed, and how they form the cornerstone of communication in microservices architectures.

Microservice Best Practice - Why do you build a Vertical Slice?

In this article, we look at what is a vertical slice, and why we build it. We also discuss the best practices involved in building vertical slices.

Microservices Architectures - Event Driven Approach

In this article, we talk about event driven approach, in the context of microservices architectures. We also discuss what are the advantages of using an event driven approach.

The 12 Factor App - Best Practices In Cloud Native Applications and Microservices

In order that an application be deployed in the cloud and enjoy features such as auto scaling, it first needs to be cloud native. In this article, we have a close look at the best practices for cloud native applications, popularly known as The 12 Factor App.