How to implement multi-dimensional rate limiting with ai-token-ratelimit? #3745

rhh777 · 2026-04-23T07:28:02Z

Currently, a single ai-token-ratelimit instance can only be configured with one rule. I want to limit traffic by both the consumer and a business key carried in a request header: reject the request when either the consumer dimension or the header key reaches its threshold; otherwise, increment the token counts for both dimensions. How should I configure this? Should I chain two ai-token-ratelimit instances?

CH3CHO · 2026-04-23T07:58:45Z

Maintainer

CC @hanxiantao

0 replies

hanxiantao · 2026-04-23T08:10:04Z

Maintainer

The ai-token-ratelimit instance can only be configured with one r

A single rule can already match on multiple dimensions. As I understand it, you want requests rejected when either the consumer or a business key in the header reaches its threshold:

rule_name: default_rule
rule_items:
  - limit_by_header: x-ca-key # rate limit by request header
    limit_keys:
      - key: 102234
        token_per_minute: 10
      - key: 308239
        token_per_hour: 10
  - limit_by_consumer: '' # rate limit by consumer
    limit_keys:
      - key: consumer1
        token_per_second: 10
      - key: consumer2
        token_per_hour: 100
redis:
  service_name: redis.static

See whether this meets your needs. For details, see the AI Token rate limiting plugin configuration docs: https://higress.io/docs/latest/user/plugins/ai/api-consumer/ai-token-ratelimit/

4 replies

rhh777 Apr 23, 2026
Author

With this configuration, if a request matches the x-ca-key item (say x-ca-key 102234, which belongs to consumer1), the token count for that x-ca-key in Redis increases, but the token count for consumer1 stays unchanged. That doesn't seem to fit our scenario: when the count for x-ca-key 102234 increases, we want the token count of its consumer, consumer1, to increase as well. Is my understanding of the current behavior correct?

hanxiantao Apr 23, 2026
Maintainer

Sorry, I overlooked that. Currently, once one item under a rule is matched, the remaining items are not evaluated, so the only option is to chain two ai-token-ratelimit instances on the same route.
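The chained setup suggested here could look roughly like the sketch below. It assumes both ai-token-ratelimit instances are attached to the same route and accept the same configuration schema as the example earlier in this thread. The two YAML documents stand for two separate plugin instances, not one file. The rule names `header_rule` and `consumer_rule` are illustrative; they are kept distinct on the assumption that the rule name helps separate the counters in Redis. How the instances are ordered, and whether both count tokens from the same response, should be checked against the plugin documentation.

```yaml
# Instance 1 (illustrative rule name): limits by the x-ca-key request header.
rule_name: header_rule
rule_items:
  - limit_by_header: x-ca-key
    limit_keys:
      - key: 102234
        token_per_minute: 10
      - key: 308239
        token_per_hour: 10
redis:
  service_name: redis.static
---
# Instance 2 (illustrative rule name): limits by consumer.
rule_name: consumer_rule
rule_items:
  - limit_by_consumer: ''
    limit_keys:
      - key: consumer1
        token_per_second: 10
      - key: consumer2
        token_per_hour: 100
redis:
  service_name: redis.static
```

Because each instance holds only one dimension, a request for x-ca-key 102234 made by consumer1 would match one item in each instance, so neither match short-circuits the other.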

rhh777 Apr 23, 2026
Author

Are there plans to support multiple rules (a rule_group) within a single plugin instance? A request would have to pass the checks of every rule in the group (item matching within each rule stays as it is today), and exceeding any one limit would reject it. If the request passes, every matched rule would be counted in the response phase.

hanxiantao Apr 23, 2026
Maintainer

https://github.com/alibaba/higress/blob/fb8e1ef33f755cab0525aec8589ecad9003b827d/plugins/wasm-go/extensions/ai-token-ratelimit/main.go#L161-L192

The Redis rate-limiting logic is currently callback-based. If a request hits two rate-limit rules at once, Redis has to be written twice, which is awkward to handle in the plugin's logic.
