Load Shedding reference guide
Load shedding is the practice of detecting service overload and rejecting requests.
In Quarkus, the quarkus-load-shedding extension provides a load shedding mechanism.
Use the Load Shedding extension
To use the load shedding extension, you need to add the io.quarkus:quarkus-load-shedding extension to your project:
Maven:

<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-load-shedding</artifactId>
</dependency>

Gradle:

implementation("io.quarkus:quarkus-load-shedding")
No configuration is required, though the possible configuration options are described below.
The load shedding algorithm
The load shedding algorithm has 2 parts:
- overload detection
- priority load shedding (optional)
Overload detection
To detect whether the current service is overloaded, an adaptation of TCP Vegas is used.
The algorithm starts with 100 allowed concurrent requests. For each request, it compares the number of current requests with the allowed limit and if the limit is exceeded, an overload situation is signalled.
If the limit is not exceeded, or if priority load shedding determines that the request should not be rejected (see below), the request is allowed. When it finishes, its duration is compared with the lowest duration seen so far to estimate a queue size. If the queue size is lower than alpha, the current limit is increased, but only up to a given maximum, by default 1000. If the queue size is greater than beta, the current limit is decreased. Otherwise, the current limit is kept intact.
Alpha and beta are computed by multiplying configurable constants by the base-10 logarithm of the current limit.
After some number of requests, which can be modified by configuring the probe factor, the lowest duration seen is reset to the last seen duration of a request.
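The limit adjustment described above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the class name, the queue size formula (borrowed from classic TCP Vegas), the step size of 1, the floor of 1 on the logarithm, and the factor values used in the usage note are all assumptions. Probing (resetting the lowest duration) is left out for brevity.

```java
/** Illustrative sketch of a Vegas-style concurrency limit; not the extension's code. */
final class VegasLimitSketch {
    private final int maxLimit;
    private final double alphaFactor; // configurable constant behind alpha
    private final double betaFactor;  // configurable constant behind beta
    private int limit;
    private long lowestDurationNanos = Long.MAX_VALUE;

    VegasLimitSketch(int initialLimit, int maxLimit, double alphaFactor, double betaFactor) {
        this.limit = initialLimit;
        this.maxLimit = maxLimit;
        this.alphaFactor = alphaFactor;
        this.betaFactor = betaFactor;
    }

    /** Overload is signalled when the in-flight count exceeds the current limit. */
    boolean isOverloaded(int inFlightRequests) {
        return inFlightRequests > limit;
    }

    /** Adjusts the limit after an allowed request has finished. */
    void onRequestFinished(long durationNanos) {
        if (durationNanos <= 0) {
            return; // ignore unusable samples
        }
        lowestDurationNanos = Math.min(lowestDurationNanos, durationNanos);

        // Assumed estimate: the share of the duration spent queueing, scaled by the limit.
        double queueingShare = 1.0 - (double) lowestDurationNanos / durationNanos;
        int queueSize = (int) Math.ceil(limit * queueingShare);

        // Assumed guard: keep the logarithm at 1 or more so very small limits can still grow.
        double log = Math.max(1.0, Math.log10(limit));
        double alpha = alphaFactor * log;
        double beta = betaFactor * log;

        if (queueSize < alpha) {
            limit = Math.min(maxLimit, limit + 1); // assumed step of 1, capped at the maximum
        } else if (queueSize > beta) {
            limit = Math.max(1, limit - 1);        // assumed step of 1, never below 1
        }
        // Otherwise the limit is kept intact.
    }

    int currentLimit() {
        return limit;
    }
}
```

With an initial limit of 100 and assumed factors of 3 and 6, alpha is 6 and beta is 12. A request that matches the lowest duration seen yields a queue size of 0 and raises the limit, while a request that takes twice as long yields a queue size of about half the limit and lowers it.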
Priority load shedding
If an overload situation is signalled, priority load shedding is invoked.
By default, priority load shedding is enabled, which means a request is only rejected if the current CPU load is high enough. To determine whether a request should be rejected, 2 attributes are considered:
- request priority
- request cohort
There are 5 statically defined priorities and 128 cohorts, which amounts to 640 request groups in total.
After both priority and cohort are assigned to a request, a request group number is computed: group = priority * num_cohorts + cohort.
Then, the group number is compared to a simple cubic function of current CPU load, where load is a number between 0 and 1: num_groups * (1 - load^3).
If the group number is higher, the request is rejected; otherwise it is allowed even in an overload situation.
If priority load shedding is disabled, all requests are rejected in an overload situation.
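The group computation and the cubic threshold described above can be sketched as follows. The class and method names are hypothetical, and the numbering of priorities from 0 (most important) to 4 (least important) is an assumption made so that the formula yields groups 1 to 640; the extension itself may represent priorities differently.

```java
/** Illustrative sketch of the priority load shedding decision; names are hypothetical. */
final class PriorityShedSketch {
    static final int NUM_PRIORITIES = 5;
    static final int NUM_COHORTS = 128;
    static final int NUM_GROUPS = NUM_PRIORITIES * NUM_COHORTS; // 640

    /**
     * priorityIndex: 0 = most important, 4 = least important (assumed numbering).
     * cohort: a number between 1 and 128.
     */
    static int group(int priorityIndex, int cohort) {
        return priorityIndex * NUM_COHORTS + cohort;
    }

    /** Cubic function of CPU load, where cpuLoad is between 0 and 1. */
    static double threshold(double cpuLoad) {
        return NUM_GROUPS * (1.0 - cpuLoad * cpuLoad * cpuLoad);
    }

    /** A request is rejected only if its group number is higher than the threshold. */
    static boolean shouldReject(int priorityIndex, int cohort, double cpuLoad) {
        return group(priorityIndex, cohort) > threshold(cpuLoad);
    }
}
```

For example, at a CPU load of 0.5 the threshold is 640 * (1 - 0.125) = 560, so under this numbering a request in group 320 is allowed, while a request in group 640 is rejected. Because the function is cubic, the threshold stays close to 640 at low load and drops steeply only as the load approaches 1.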
Customizing request priority
Priority is assigned by an io.quarkus.load.shedding.RequestPrioritizer.
There are 5 statically defined priorities in the io.quarkus.load.shedding.RequestPriority enum: CRITICAL, IMPORTANT, NORMAL, BACKGROUND and DEGRADED.
By default, if no request prioritizer applies, the priority is assumed to be NORMAL.
There is one default prioritizer, which assigns the priority of CRITICAL to requests to the non-application endpoints.
It declares no @Priority.
It is possible to define custom implementations of the RequestPrioritizer interface.
The implementations must be CDI beans, otherwise they are ignored.
The CDI rules of typesafe resolution must be followed.
That is, if multiple implementations exist with different @Priority values and some of them are @Alternative beans, only the alternatives with the highest priority value are retained.
If no implementation is an alternative, all implementations are retained and are sorted in descending @Priority order (highest priority value comes first).
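To illustrate the ordering described above, here is a minimal, self-contained sketch of how a sorted list of prioritizers might be consulted. It deliberately does not use the real io.quarkus.load.shedding types: SketchPriority, PrioritizerSketch, PathPrefixPrioritizer and PrioritizerChain are hypothetical stand-ins, the order() method stands in for the @Priority value, and the "first applicable prioritizer wins" rule is an assumption. Check the extension's Javadoc for the actual interface methods.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Stand-in for the five statically defined priorities. */
enum SketchPriority { CRITICAL, IMPORTANT, NORMAL, BACKGROUND, DEGRADED }

/** Hypothetical stand-in for a request prioritizer; requests are reduced to a path. */
interface PrioritizerSketch {
    int order();                          // stands in for the @Priority value
    boolean appliesTo(String path);
    SketchPriority priority(String path);
}

/** Example prioritizer that applies to every path starting with a given prefix. */
final class PathPrefixPrioritizer implements PrioritizerSketch {
    private final int order;
    private final String prefix;
    private final SketchPriority priority;

    PathPrefixPrioritizer(int order, String prefix, SketchPriority priority) {
        this.order = order;
        this.prefix = prefix;
        this.priority = priority;
    }

    @Override public int order() { return order; }
    @Override public boolean appliesTo(String path) { return path.startsWith(prefix); }
    @Override public SketchPriority priority(String path) { return priority; }
}

/** Consults prioritizers in descending order; falls back to NORMAL if none applies. */
final class PrioritizerChain {
    private final List<PrioritizerSketch> sorted;

    PrioritizerChain(List<PrioritizerSketch> prioritizers) {
        sorted = new ArrayList<>(prioritizers);
        sorted.sort(Comparator.comparingInt(PrioritizerSketch::order).reversed());
    }

    SketchPriority resolve(String path) {
        for (PrioritizerSketch p : sorted) {
            if (p.appliesTo(path)) {
                return p.priority(path); // assumed: first applicable prioritizer wins
            }
        }
        return SketchPriority.NORMAL;
    }
}
```

With a catch-all prioritizer of order 1 and a more specific one of order 10, the specific one is consulted first, so a narrow rule can override a broad default.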
Customizing request cohort
Cohort is assigned by an io.quarkus.load.shedding.RequestClassifier.
There are 128 statically defined cohorts, with the lowest number being 1 and the highest number being 128.
The classifier should return a number in this interval; if it does not, the number is adjusted automatically.
There is one default classifier which assigns a cohort based on a hash of the remote IP address and current time, such that an IP address changes its cohort roughly every hour.
It declares no @Priority.
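As a rough illustration of the two ideas above, the following sketch shows one way to keep a cohort inside the 1 to 128 interval and one way to derive a cohort from an IP address and the current hour. The class and method names are hypothetical. Clamping is an assumption, since the text only says the number is "adjusted automatically", and the hash shown here is not the default classifier's actual hash. In this simplified version every address changes cohort at the same hour boundary.

```java
import java.util.Objects;

/** Illustrative sketch of cohort assignment; not the extension's code. */
final class CohortSketch {
    static final int MIN_COHORT = 1;
    static final int MAX_COHORT = 128;
    private static final long HOUR_MILLIS = 3_600_000L;

    /** Assumed adjustment: values outside the interval are clamped to its bounds. */
    static int clamp(int cohort) {
        return Math.max(MIN_COHORT, Math.min(MAX_COHORT, cohort));
    }

    /** Maps an IP address and the current hour to a cohort between 1 and 128. */
    static int cohortFor(String remoteIp, long epochMillis) {
        long hour = epochMillis / HOUR_MILLIS;       // changes once per hour
        int hash = Objects.hash(remoteIp, hour);     // may be negative
        return Math.floorMod(hash, MAX_COHORT) + MIN_COHORT;
    }
}
```

Mixing the time into the hash means that no single client is permanently stuck in a cohort that is shed first.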
It is possible to define custom implementations of the RequestClassifier interface.
The implementations must be CDI beans, otherwise they are ignored.
The CDI rules of typesafe resolution must be followed.
That is, if multiple implementations exist with different @Priority values and some of them are @Alternative beans, only the alternatives with the highest priority value are retained.
If no implementation is an alternative, all implementations are retained and are sorted in descending @Priority order (highest priority value comes first).
Limitations
The load shedding extension currently only applies to HTTP requests, and is heavily skewed towards request/response network interactions. This means that gRPC, WebSocket and other kinds of streaming over HTTP are not supported. Other "entrypoints" to Quarkus applications, such as messaging, are not supported either.
Further, the load shedding implementation is currently rather basic and not heavily tested in production. Improvements may be necessary.
Configuration reference
Unresolved directive in load-shedding-reference.adoc - include::{generated-dir}/config/quarkus-load-shedding.adoc[]