diff --git a/docs/user-guide/protocols/http.md b/docs/user-guide/protocols/http.md
index 7c4adf161..cbe707d04 100644
--- a/docs/user-guide/protocols/http.md
+++ b/docs/user-guide/protocols/http.md
@@ -92,6 +92,7 @@ Supported hints:
 | `append_mode` | Boolean | `false` | Enables [append-only mode](/reference/sql/create.md#create-an-append-only-table) for the table, which disables deduplication by primary key and supports duplicate rows. For InfluxDB line protocol writes, an explicit `append_mode=true` hint creates the table with `append_mode = 'true'` and `merge_mode = 'last_row'`. |
 | `merge_mode` | String | None | Sets the [merge mode](/reference/sql/create.md#create-a-table-with-merge-mode) for the table, e.g. `last_non_null`, `last_row`. For auto-created InfluxDB line protocol tables, this hint takes precedence over [`influxdb.default_merge_mode`](/user-guide/deployments-administration/configuration.md), which defaults to `last_non_null`. When `append_mode` is enabled, only `last_row` is allowed. |
 | `physical_table` | String | None | Specifies the physical table name for the [metric engine](/contributor-guide/datanode/metric-engine.md). |
+| `query.enable_remote_dynamic_filter_pushdown` | Boolean | `true` | Enables remote dynamic filter pushdown for SQL queries. Set it to `false` to disable Frontend-to-Datanode dynamic filter propagation for the current request. See [Remote dynamic filter pushdown](/user-guide/query-data/sql.md#remote-dynamic-filter-pushdown). |
 | `skip_wal` | Boolean | `false` | Skips WAL (Write-Ahead Log) writes for the table. |
 | `sst_format` | String | None | Sets the SST (Sorted String Table) file format for the table. Valid values: `flat`, `primary_key`. |
 | `trace_table_partitions` | Int | None | Override default partition number (16) of trace tables. Set to `0` or `1` to disable partitioning. |
diff --git a/docs/user-guide/query-data/sql.md b/docs/user-guide/query-data/sql.md
index c652b7336..b29a388d4 100644
--- a/docs/user-guide/query-data/sql.md
+++ b/docs/user-guide/query-data/sql.md
@@ -242,6 +242,35 @@ SELECT * FROM monitor ORDER BY ts ASC;
 SELECT * FROM monitor ORDER BY ts DESC;
 ```
 
+## Remote dynamic filter pushdown
+
+GreptimeDB enables remote dynamic filter pushdown by default for distributed SQL queries.
+It is mainly used by distributed join queries.
+When a join builds a runtime dynamic filter from one side of the join, the Frontend can propagate that filter to Datanodes so scans on the other side may prune data earlier.
+Some other operators, such as TopK (`ORDER BY ... LIMIT`), may also produce dynamic filters in eligible plans.
+
+This is a best-effort performance optimization and does not change query results.
+It only applies when the distributed query plan contains a dynamic filter that can be sent to remote scans.
+If the query plan has no dynamic filter, or if the filter cannot be encoded or applied safely, GreptimeDB runs the query without remote dynamic filter pushdown.
+
+To disable this optimization for one HTTP SQL request, set the `query.enable_remote_dynamic_filter_pushdown` hint to `false`:
+
+```bash
+curl -X POST \
+-H 'Content-Type: application/x-www-form-urlencoded' \
+-H 'x-greptime-hints: query.enable_remote_dynamic_filter_pushdown=false' \
+--data-urlencode "sql=SELECT m.* FROM monitor m JOIN host_info h ON m.host = h.host WHERE h.region = 'us-west'" \
+http://localhost:4000/v1/sql
+```
+
+The option defaults to `true`.
+Setting it to `false` disables Frontend-to-Datanode remote dynamic filter propagation for the current query only.
+It also applies to standalone deployments when the query is executed through the local Frontend-to-Datanode region query path.
+Currently, there is no persistent Frontend or Datanode configuration option for changing this default.
+It does not disable local dynamic filter optimizations inside one execution node.
+
+For more information about request hints, see [HTTP hints](/user-guide/protocols/http.md#hints).
+
 ## `CASE` Expression
 
 You can use the `CASE` statement to perform conditional logic within your queries.
diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/protocols/http.md b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/protocols/http.md
index f0c919a1b..9f71ad71d 100644
--- a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/protocols/http.md
+++ b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/protocols/http.md
@@ -86,6 +86,7 @@ x-greptime-hint-key2: value2
 | `append_mode` | Boolean | `false` | 启用表的 [append-only 模式](/reference/sql/create.md#创建-append-only-表),该模式禁用按主键去重,支持重复行。对于 InfluxDB 行协议写入,显式设置 `append_mode=true` hint 时,会使用 `append_mode = 'true'` 和 `merge_mode = 'last_row'` 创建表。 |
 | `merge_mode` | String | 无 | 设置表的 [merge 模式](/reference/sql/create.md#创建带有-merge-模式的表),例如 `last_non_null`、`last_row`。对于通过 InfluxDB 行协议自动创建的表,该 hint 优先于 [`influxdb.default_merge_mode`](/user-guide/deployments-administration/configuration.md) 配置;该配置默认值为 `last_non_null`。启用 `append_mode` 时,仅允许使用 `last_row`。 |
 | `physical_table` | String | 无 | 指定 [metric 引擎](/contributor-guide/datanode/metric-engine.md)的物理表名。 |
+| `query.enable_remote_dynamic_filter_pushdown` | Boolean | `true` | 为 SQL 查询启用远程动态过滤下推。设置为 `false` 可为当前请求关闭 Frontend 到 Datanode 的动态过滤传播。请参阅[远程动态过滤下推](/user-guide/query-data/sql.md#远程动态过滤下推)。 |
 | `skip_wal` | Boolean | `false` | 跳过表的 WAL(Write-Ahead Log)写入。 |
 | `sst_format` | String | 无 | 设置表的 SST(Sorted String Table)文件格式。可选值:`flat`、`primary_key`。 |
 | `trace_table_partitions` | Int | None | 自定义 Trace 表的默认分区数(16)。设置为 `0` 或 `1` 时禁用分区。 |
diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/query-data/sql.md b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/query-data/sql.md
index 927a5837d..6114136f1 100644
--- a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/query-data/sql.md
+++ b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/query-data/sql.md
@@ -228,6 +228,35 @@ SELECT * FROM monitor ORDER BY ts ASC;
 SELECT * FROM monitor ORDER BY ts DESC;
 ```
 
+## 远程动态过滤下推
+
+GreptimeDB 默认会为分布式 SQL 查询启用远程动态过滤下推。
+它主要用于分布式 Join 查询。
+当 Join 的一侧在运行时构建出动态过滤条件时,Frontend 可以将该过滤条件传播到 Datanode,让另一侧的扫描更早地裁剪数据。
+其他一些算子,例如 TopK(`ORDER BY ... LIMIT`),在符合条件的执行计划中也可能产生动态过滤条件。
+
+这是一个尽力而为的性能优化,不会改变查询结果。
+它只会在分布式查询计划中包含可发送到远端扫描的动态过滤条件时生效。
+如果查询计划中没有动态过滤条件,或者该过滤条件无法被安全编码或应用,GreptimeDB 会在没有远程动态过滤下推的情况下执行查询。
+
+要为单个 HTTP SQL 请求关闭该优化,可以将 `query.enable_remote_dynamic_filter_pushdown` hint 设置为 `false`:
+
+```bash
+curl -X POST \
+-H 'Content-Type: application/x-www-form-urlencoded' \
+-H 'x-greptime-hints: query.enable_remote_dynamic_filter_pushdown=false' \
+--data-urlencode "sql=SELECT m.* FROM monitor m JOIN host_info h ON m.host = h.host WHERE h.region = 'us-west'" \
+http://localhost:4000/v1/sql
+```
+
+该选项默认值为 `true`。
+设置为 `false` 后,只会为当前查询关闭 Frontend 到 Datanode 的远程动态过滤传播。
+当查询通过本地 Frontend 到 Datanode 的 region query 路径执行时,该选项也适用于 standalone 部署。
+当前没有用于修改该默认值的持久化 Frontend 或 Datanode 配置项。
+它不会关闭单个执行节点内部的本地动态过滤优化。
+
+有关请求 hints 的更多信息,请参阅 [HTTP hints](/user-guide/protocols/http.md#hints)。
+
 ## `CASE` 表达式
 
 你可以使用 `CASE` 表达式在查询中执行条件逻辑。