---
title: "Traefik CircuitBreaker Documentation"
description: "The HTTP circuit breaker in Traefik Proxy prevents stacking requests to unhealthy Services, which would otherwise result in cascading failures. Read the technical documentation."
---
# CircuitBreaker
Don't Waste Time Calling Unhealthy Services
{: .subtitle }

![CircuitBreaker](../../assets/img/middleware/circuitbreaker.png)

The circuit breaker protects your system from stacking requests to unhealthy services, which would otherwise result in cascading failures.

When your system is healthy, the circuit is closed (normal operations).
When your system becomes unhealthy, the circuit opens, and the requests are no longer forwarded, but instead are handled by a fallback mechanism.

To assess if your system is healthy, the circuit breaker constantly monitors the services.

!!! note ""

    The CircuitBreaker only analyzes what happens _after_ its position within the middleware chain. What happens _before_ has no impact on its state.

!!! important

    Each router gets its own instance of a given circuit breaker.

    One circuit breaker instance can be open while the other remains closed: their state is not shared.

    This is the expected behavior: we want you to be able to define what makes a service healthy without having to declare a circuit breaker for each route.

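As a rough illustration of the per-router behavior described above, the following file-provider sketch attaches one circuit breaker definition to two routers. The router names, host rules, service name, and server URL are hypothetical and only serve the example.

```yaml
# Hypothetical routers, rules, and service: adjust to your own setup.
http:
  routers:
    api:
      rule: "Host(`api.example.com`)"
      service: backend
      middlewares:
        - latency-check   # this router gets its own circuit breaker instance
    admin:
      rule: "Host(`admin.example.com`)"
      service: backend
      middlewares:
        - latency-check   # same definition, separate state

  middlewares:
    latency-check:
      circuitBreaker:
        expression: "LatencyAtQuantileMS(50.0) > 100"

  services:
    backend:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:8080"
```

With this layout, the circuit on the `api` router can open while the one on the `admin` router stays closed, even though both reference the same middleware.
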
## Configuration Examples

```yaml tab="Docker & Swarm"
# Latency Check
labels:
  - "traefik.http.middlewares.latency-check.circuitbreaker.expression=LatencyAtQuantileMS(50.0) > 100"
```

```yaml tab="Kubernetes"
# Latency Check
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: latency-check
spec:
  circuitBreaker:
    expression: LatencyAtQuantileMS(50.0) > 100
```

```yaml tab="Consul Catalog"
# Latency Check
- "traefik.http.middlewares.latency-check.circuitbreaker.expression=LatencyAtQuantileMS(50.0) > 100"
```

```yaml tab="File (YAML)"
# Latency Check
http:
  middlewares:
    latency-check:
      circuitBreaker:
        expression: "LatencyAtQuantileMS(50.0) > 100"
```

```toml tab="File (TOML)"
# Latency Check
[http.middlewares]
  [http.middlewares.latency-check.circuitBreaker]
    expression = "LatencyAtQuantileMS(50.0) > 100"
```

## Possible States
There are three possible states for your circuit breaker:

- Closed (your service operates normally)
- Open (the fallback mechanism takes over your service)
- Recovering (the circuit breaker tries to resume normal operations by progressively sending requests to your service)

### Closed
While the circuit is closed, the circuit breaker only collects metrics to analyze the behavior of the requests.

At specified intervals (`checkPeriod`), the circuit breaker evaluates `expression` to decide if its state must change.

### Open
While open, the fallback mechanism takes over the normal service calls for a duration of `FallbackDuration`.
The fallback mechanism returns an `HTTP 503` (or `ResponseCode`) to the client.
After this duration, the circuit breaker enters the recovering state.

### Recovering
While recovering, the circuit breaker sends a linearly increasing number of requests to your service (for `RecoveryDuration`).
If your service fails during recovery, the circuit breaker opens again.
If the service operates normally during the entire recovery duration, then the circuit breaker closes.

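The timing of these three states can be tuned together. The sketch below uses the file provider and assumes the lower camel case keys `checkPeriod`, `fallbackDuration`, and `recoveryDuration`; the values are arbitrary and chosen only to make each phase easy to tell apart.

```yaml
# Illustrative values only; they are not recommendations.
http:
  middlewares:
    latency-check:
      circuitBreaker:
        expression: "LatencyAtQuantileMS(50.0) > 100"
        # Closed: the expression is evaluated at this interval.
        checkPeriod: "200ms"
        # Open: the fallback answers for this long before recovery starts.
        fallbackDuration: "30s"
        # Recovering: traffic is ramped back up over this period.
        recoveryDuration: "1m"
```

Under these assumed values, a tripped circuit would serve the fallback for 30 seconds, then spend up to one minute progressively restoring traffic before closing again.
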
## Configuration Options
### Configuring the Trigger
You can specify an `expression` that, once matched, opens the circuit breaker and applies the fallback mechanism instead of calling your services.

The `expression` option can check three different metrics:

- The network error ratio (`NetworkErrorRatio`)
- The status code ratio (`ResponseCodeRatio`)
- The latency at a quantile in milliseconds (`LatencyAtQuantileMS`)

#### `NetworkErrorRatio`
If you want the circuit breaker to open when the ratio of network errors exceeds 30%, the `expression` is `NetworkErrorRatio() > 0.30`.

#### `ResponseCodeRatio`
You can configure the circuit breaker to open based on the ratio of a given range of status codes.

The `ResponseCodeRatio` accepts four parameters: `from`, `to`, `dividedByFrom`, and `dividedByTo`.

The operation that will be computed is sum(`from` -> `to`) / sum(`dividedByFrom` -> `dividedByTo`).

!!! note ""

    If sum(`dividedByFrom` -> `dividedByTo`) equals 0, then `ResponseCodeRatio` returns 0.

    `from` is inclusive, `to` is exclusive.

For example, the expression `ResponseCodeRatio(500, 600, 0, 600) > 0.25` will trigger the circuit breaker if more than 25% of the requests returned a 5XX status (amongst the requests that returned a status code from 0 to 5XX).

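To make the ratio concrete, here is a small worked example expressed as comments in a file-provider snippet. The middleware name and the request counts are invented for illustration.

```yaml
http:
  middlewares:
    server-errors:   # hypothetical middleware name
      circuitBreaker:
        # Suppose the breaker has observed 200 responses:
        #   150 with a 2XX status, 10 with a 4XX status, 40 with a 5XX status.
        #
        # Numerator:   responses in [500, 600) = 40
        # Denominator: responses in [0, 600)   = 200
        # Ratio:       40 / 200 = 0.20, which is not > 0.25, so the circuit stays closed.
        #
        # If the 5XX count were 60 out of the same 200 responses:
        # Ratio:       60 / 200 = 0.30, which is > 0.25, so the circuit opens.
        expression: "ResponseCodeRatio(500, 600, 0, 600) > 0.25"
```

Because `to` is exclusive, the range `500, 600` covers status codes 500 through 599.
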
#### `LatencyAtQuantileMS`
You can configure the circuit breaker to open when a given proportion of your requests become too slow.

For example, the expression `LatencyAtQuantileMS(50.0) > 100` opens the circuit breaker when the median latency (quantile 50) exceeds 100ms.

!!! note ""

    You must provide a floating point number (with the trailing .0) for the quantile value.

#### Using Multiple Metrics
You can combine multiple metrics using operators in your `expression`.

Supported operators are:

- AND (`&&`)
- OR (`||`)

For example, `ResponseCodeRatio(500, 600, 0, 600) > 0.30 || NetworkErrorRatio() > 0.10` triggers the circuit breaker when more than 30% of the requests return a 5XX status code, or when the ratio of network errors exceeds 10%.

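A combined expression is configured like any other. The sketch below shows one middleware per logical operator using the file provider; the middleware names and the thresholds in the second expression are assumptions made for the example.

```yaml
http:
  middlewares:
    # OR: opens if either condition holds.
    errors-or-network:   # hypothetical middleware name
      circuitBreaker:
        expression: "ResponseCodeRatio(500, 600, 0, 600) > 0.30 || NetworkErrorRatio() > 0.10"

    # AND: opens only if both conditions hold at the same time.
    slow-and-failing:    # hypothetical middleware name
      circuitBreaker:
        expression: "LatencyAtQuantileMS(95.0) > 500 && NetworkErrorRatio() > 0.10"
```

Quoting the expression keeps YAML from misreading characters such as `|`, `&`, and `>`.
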
#### Operators
Here is the list of supported operators:

- Greater than (`>`)
- Greater than or equal to (`>=`)
- Less than (`<`)
- Less than or equal to (`<=`)
- Equal (`==`)
- Not Equal (`!=`)

### Fallback mechanism
The fallback mechanism returns an `HTTP 503 Service Unavailable` to the client instead of calling the target service.
The status code can be changed with the `ResponseCode` option; the rest of this behavior cannot be configured.

### `CheckPeriod`
_Optional, Default="100ms"_

The interval between successive checks of the circuit breaker condition (when in standby state).

### `FallbackDuration`
_Optional, Default="10s"_

The duration for which the circuit breaker will wait before trying to recover (from a tripped state).

### `RecoveryDuration`
_Optional, Default="10s"_

The duration for which the circuit breaker will try to recover (once it is in the recovering state).

### `ResponseCode`
_Optional, Default="503"_

The status code that the circuit breaker will return while it is in the open state.
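
As a minimal sketch, overriding the fallback status code could look like the following. It assumes the option is spelled `responseCode` in the file provider, following the same lower camel case pattern as `checkPeriod`, and the value `502` is an arbitrary choice for illustration.

```yaml
http:
  middlewares:
    latency-check:
      circuitBreaker:
        expression: "LatencyAtQuantileMS(50.0) > 100"
        # Assumed key name; returned to clients while the circuit is open.
        responseCode: 502
```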