How To Implement Chaos Engineering For Microservices Using Istio

“Embrace Failures. Chaos and failures are your friends not enemies.” A microservice ecosystem is going to fail at some point. The question is not whether you will fail, but whether you will notice when you do. The difference is between a failure that takes down all of your services and affects every user, and one that affects only a few users and that you can fix on your own schedule.

Chaos Engineering is the practice of intentionally introducing faults and failures into your microservice architecture to test the resilience and stability of your system. Istio can be a great tool for this. Let’s have a look at how Istio makes it easy.

For more information on how to set up Istio, and on what virtual services and gateways are, please have a look at the following blog post: how to set up Istio on GKE.

Fault Injection With Istio

Generating HTTP 503 Error

Currently, traffic to the recommendation service is automatically load balanced between its two pods.

Now let’s inject a fault using a virtual service that returns HTTP 503 errors for 30% of the traffic served by these pods.
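A minimal sketch of such a virtual service follows. The host name recommendation, the 503 status, and the 30% figure come from the text above; the resource name, the namespace, and the API version are assumptions and should be adjusted to match your cluster and Istio release.

```yaml
# Sketch only: name, namespace, and apiVersion are assumptions.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation        # hypothetical resource name
  namespace: tutorial         # hypothetical namespace
spec:
  hosts:
  - recommendation            # service whose traffic receives the fault
  http:
  - fault:
      abort:
        httpStatus: 503       # status code returned for aborted requests
        percentage:
          value: 30.0         # abort roughly 30% of requests
    route:
    - destination:
        host: recommendation  # remaining requests are routed normally
```

A manifest like this would typically be saved to a file and applied with kubectl apply -f. Because the abort is decided per request, the observed error rate over a small number of requests will only approximate 30%.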

To test whether it is working, curl the customer microservice endpoint and check the output.

You will see 503 errors on approximately 30% of the requests reaching the recommendation service.

To restore normal operation, delete the virtual service created above, for example with kubectl delete virtualservice followed by its name.

Delay

Now if you hit the endpoint URL of the above service in a loop, you will see delays on some of the requests.
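A delay fault of this kind can be sketched with the same virtual service structure, using a delay block instead of an abort. The 7 second delay and the 50% share below are illustrative assumptions, as are the resource name, namespace, and API version; none of these values come from the original setup.

```yaml
# Sketch only: delay duration, percentage, name, namespace, and
# apiVersion are illustrative assumptions.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation        # hypothetical resource name
  namespace: tutorial         # hypothetical namespace
spec:
  hosts:
  - recommendation
  http:
  - fault:
      delay:
        fixedDelay: 7s        # assumed delay added before forwarding
        percentage:
          value: 50.0         # assumed share of requests to delay
    route:
    - destination:
        host: recommendation
```

Requests that are not selected for the delay pass through unchanged, which is why only some iterations of the loop appear slow.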

Retry

To cope with transient failures like the ones injected above, you can configure retries on the affected services.

Now any request coming to recommendation will be tried 3 times before it is considered failed.
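A retry policy along these lines might look like the following sketch. The value 3 comes from the text; the per-try timeout, resource name, namespace, and API version are assumptions. Note that in the Istio reference the attempts field is described as the number of retries allowed for a request, so it is worth checking how the initial try is counted in your Istio version.

```yaml
# Sketch only: perTryTimeout, name, namespace, and apiVersion are assumptions.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation        # hypothetical resource name
  namespace: tutorial         # hypothetical namespace
spec:
  hosts:
  - recommendation
  http:
  - route:
    - destination:
        host: recommendation
    retries:
      attempts: 3             # retries allowed per request
      perTryTimeout: 2s       # assumed time limit for each individual try
```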

Timeout

With a timeout, the caller waits only N seconds for a response before failing and giving up.
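As a sketch, a timeout can be set on the HTTP route of a virtual service. Here N is assumed to be 1 second purely for illustration; the resource name, namespace, and API version are likewise assumptions.

```yaml
# Sketch only: the 1s value, name, namespace, and apiVersion are assumptions.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation        # hypothetical resource name
  namespace: tutorial         # hypothetical namespace
spec:
  hosts:
  - recommendation
  http:
  - route:
    - destination:
        host: recommendation
    timeout: 1s               # assumed N: give up if no response within 1 second
```

Combining a timeout like this with a delay fault longer than the timeout is one way to check that callers fail fast instead of hanging.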

Conclusion

Originally published at https://www.velotio.com.

Velotio Technologies is an outsourced software and product development partner for technology startups & enterprises.