[yang] Add SONiC YANG model and frrcfgd plumbing for FIB route filtering #27192

Open
kalash-nexthop wants to merge 2 commits into sonic-net:master from nexthop-ai:add-sonic-yang-model-and-frrcfgd

@kalash-nexthop kalash-nexthop commented May 6, 2026

Contributor

Implements the HLD posted here: sonic-net/SONiC#2367

Add a new SONiC YANG module that models FIB route filtering: the policy that decides which routes from each routing protocol are installed into the FIB (and therefore APPL_DB and the hardware ASIC). It is backed by FRR's ip|ipv6 protocol <PROTO> route-map <NAME> (see https://docs.frrouting.org/en/stable-7.5/zebra.html#zebra-route-filtering). A deny-match flows through nexthop_active_check -> rib_process and short-circuits rib_install_kernel, so fpmsyncd never sees the prefix and APPL_DB / hardware are never programmed for it. The route remains in zebra's RIB.

Each CONFIG_DB entry binds a route-map to a (vrf, addr_family, protocol) tuple. addr_family reuses sonic-types:ip-family (IPv4/IPv6), a must expression enforces AFI/protocol compatibility (ospf/rip/eigrp are IPv4-only; ospf6/ripng are IPv6-only), and route_map is a mandatory leafref into ROUTE_MAP_SET.
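The AFI/protocol compatibility rule can be illustrated outside YANG. The following is a minimal Python sketch of the same check, not the must expression from the model itself; the function and constant names are hypothetical and only the protocol groupings come from the description above.

```python
# Hypothetical mirror of the YANG `must` guard; names are illustrative only.
IPV4_ONLY = {"ospf", "rip", "eigrp"}   # protocols that only make sense for IPv4
IPV6_ONLY = {"ospf6", "ripng"}         # protocols that only make sense for IPv6


def afi_protocol_compatible(addr_family, protocol):
    """Return True when the (addr_family, protocol) pair is allowed."""
    if addr_family == "IPv4":
        return protocol not in IPV6_ONLY
    if addr_family == "IPv6":
        return protocol not in IPV4_ONLY
    # Anything else is outside sonic-types:ip-family.
    return False
```

Under this sketch, "IPv4" with "ospf6" is rejected, while protocol-agnostic values such as "bgp" or "static" pass for either address family.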

Both modes are wired end-to-end, boot and runtime:

  • separated mode: boot rendering via dockers/docker-fpm-frr/frr/zebra/zebra.fib_route_filter.conf.j2 (included from zebra.conf.j2), plus runtime translation via a new FibRouteFilterMgr in bgpcfgd that subscribes to FIB_ROUTE_FILTER and pushes the matching FRR command through vtysh.
  • unified mode: same boot include from frr.conf.j2 (sonic-frr-mgmt-framework), plus runtime translation in frrcfgd.py (fib_route_filter_handler) routed via mgmtd because the FRR CLI is DEFPY_YANG.

The boot template is the source of truth for warm reboot / fresh boot; the managers are the runtime mutation hooks. Net effect: a sonic-db-cli write to FIB_ROUTE_FILTER takes effect immediately in either mode — no config_reload required.

Why I did it

SONiC has YANG coverage for route-maps, prefix-lists, community sets, etc., but no schema for "limit which routes from this protocol are programmed into the FIB / APPL_DB / hardware." Today the only way to configure ip protocol <proto> route-map <name> is to hand-edit zebra.conf (or frr.conf in unified mode). This PR adds a CONFIG_DB-driven schema so the feature can be configured through the standard SONiC flow, plus the conversion layer to render and runtime-update the corresponding FRR config.

The primary operator use case is reducing hardware route capacity pressure: routes from a chosen protocol can be selectively excluded from the FIB without affecting the BGP/OSPF/etc. control plane.

Work item tracking
  • Microsoft ADO (number only):

How I did it

New YANG module src/sonic-yang-models/yang-models/sonic-fib-route-filter.yang:

  • Table FIB_ROUTE_FILTER keyed on vrf_name | addr_family | protocol.
  • vrf_name: union of the literal default and a leafref into sonic-vrf.
  • addr_family: sonic-types:ip-family (IPv4 / IPv6); rendered into the FRR ip / ipv6 CLI keyword via the same {'IPv4':'ip','IPv6':'ipv6'} lookup the prefix-list and route-map templates already use.
  • protocol: enum (any, bgp, connected, eigrp, isis, kernel, nhrp, ospf, ospf6, rip, ripng, static, table) — same set as FRR's frr-route-types.
  • route_map: mandatory leafref into sonic-route-map:ROUTE_MAP_SET.

Conversion layer:

  • Separated mode boot rendering: new include dockers/docker-fpm-frr/frr/zebra/zebra.fib_route_filter.conf.j2, pulled into zebra.conf.j2 after zebra.interfaces.conf.j2. Renders empty when the table is missing; emits a single vrf <N> / ... / exit-vrf block per non-default VRF (rows for the same VRF grouped) and bare lines for the default VRF.
  • Separated mode runtime apply: new FibRouteFilterMgr in src/sonic-bgpcfgd/bgpcfgd/managers_fib_route_filter.py, registered in main.py next to the other route-policy managers. Subscribes to CONFIG_DB / FIB_ROUTE_FILTER; on SET pushes ip|ipv6 protocol <proto> route-map <name> (wrapped in vrf <N> / ... / exit-vrf for non-default VRFs) via cfg_mgr.push_list; on DEL pushes the matching no ... form using a per-key state cache so the prior route_map field is not needed. Idempotent re-sets short-circuit (no churn on ConfigDBConnector replay). del_handler checks the tracked-key fast-exit before the AFI guard so a DEL for an unsupported-AFI key (which set_handler would have rejected on the way in) is a quiet skip rather than a spurious error.
  • Unified mode: same boot include pulled into src/sonic-frr-mgmt-framework/templates/frr/frr.conf.j2, plus fib_route_filter_handler in frrcfgd.py that: (a) seeds fib_route_filter_state from CONFIG_DB at init, (b) translates SET / UPDATE / DELETE events into vtysh, (c) advances state only on __run_command success so failures are retryable, (d) is registered against mgmtd in TABLE_DAEMON because the FRR CLIs are DEFPY_YANG and unified mode dispatches them through mgmtd's northbound (going to ['zebra'] directly returns "Unknown command").

Example

Filter BGP-learned IPv4 routes and OSPFv3 IPv6 routes in the default VRF, and filter both static and BGP routes in Vrf_red:

CONFIG_DB:

{
    "FIB_ROUTE_FILTER": {
        "default|IPv4|bgp":    { "route_map": "RM_FROM_BGP" },
        "default|IPv6|ospf6":  { "route_map": "RM_FROM_OSPF6" },
        "Vrf_red|IPv4|static": { "route_map": "RM_STATIC_V4" },
        "Vrf_red|IPv4|bgp":    { "route_map": "RM_BGP_RED" }
    }
}

Corresponding FRR CLI (rendered into zebra.conf / frr.conf and pushed via vtysh):

ip protocol bgp route-map RM_FROM_BGP
ipv6 protocol ospf6 route-map RM_FROM_OSPF6
!
vrf Vrf_red
 ip protocol static route-map RM_STATIC_V4
 ip protocol bgp route-map RM_BGP_RED
exit-vrf
!
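The mapping from the CONFIG_DB rows to the FRR lines in this example can be sketched in Python. This is an illustration under stated assumptions, not the Jinja2 template from the PR: the function name is hypothetical, it buckets rows in a single pass rather than mirroring the template's structure, and the placement of the "!" separators is inferred from the rendered example only.

```python
AFI_TO_CLI = {"IPv4": "ip", "IPv6": "ipv6"}  # lookup quoted in the PR description


def render_fib_route_filter(table):
    """Render FIB_ROUTE_FILTER rows (key -> fields) into FRR config text."""
    default_lines = []
    per_vrf = {}  # vrf name -> list of lines, in first-seen order
    for key, row in table.items():
        vrf, afi, proto = key.split("|")
        line = "{} protocol {} route-map {}".format(
            AFI_TO_CLI[afi], proto, row["route_map"])
        if vrf == "default":
            default_lines.append(line)       # bare lines for the default VRF
        else:
            per_vrf.setdefault(vrf, []).append(line)

    out = list(default_lines)
    if default_lines:
        out.append("!")
    for vrf, lines in per_vrf.items():
        out.append("vrf " + vrf)             # one block per non-default VRF
        out.extend(" " + l for l in lines)
        out.append("exit-vrf")
        out.append("!")
    return "\n".join(out)
```

An empty or missing table yields an empty string, which matches the stated behaviour that the include renders empty when FIB_ROUTE_FILTER is absent.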

This maps onto FRR's own YANG list filter-protocol under /frr-vrf:lib/vrf[name=N]/frr-zebra:zebra/ (frr-zebra.yang). The SONiC schema flattens FRR's VRF-augmented tree into a single multi-key CONFIG_DB table and tightens the type system (enum protocol, IPv4/IPv6 AFI only, AFI/protocol must-guard).

How to verify it

The yang-models test harness (src/sonic-yang-models/tests/yang_model_tests/test_yang_model.py) globs yang-models/*.yang and every fixture under tests/yang_model_tests/tests/, so the new model + tests run automatically in CI. Expected outcomes:

  • Positive: FIB_ROUTE_FILTER_VALID, FIB_ROUTE_FILTER_VALID_NAMED_VRF load cleanly.
  • Negative: LeafRef (NON_EXIST_RT_MAP), InvalidValue x2 (NON_EXIST_VRF, INVALID_PROTOCOL), Mandatory (MISSING_RT_MAP), Must x5 (AFI_PROTO_MISMATCH_V4_OSPF6, V6_OSPF, V4_RIPNG, V6_RIP, V6_EIGRP).

frrcfgd unit tests run in src/sonic-frr-mgmt-framework; the CI builds the wheel and runs them. 17 FRF tests pass alongside the existing suite.

test_zebra_frr_fib_route_filter in src/sonic-config-engine/tests/test_frr.py verifies separated-mode template rendering against the committed sample output.

bgpcfgd unit tests run in src/sonic-bgpcfgd; CI builds the wheel and runs them. tests/test_fib_route_filter_mgr.py contributes 22 tests covering: the exact wire format of set / del commands (default and named VRFs, IPv4 / IPv6); set_handler over default-VRF v4 / v6, named-VRF wrapping, idempotent re-set, upsert with a different route-map, malformed-key and unknown-AFI rejection, tuple-key acceptance; del_handler over tracked / untracked / malformed / unknown-AFI cases; per-(vrf, afi, proto) tracking independence; and the set -> del -> re-set lifecycle (verifies _applied eviction-then-re-registration so a re-set after delete re-pushes rather than idempotent-skipping).

Which release branch to backport (provide reason below if selected)

  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Tested branch (Please provide the tested image version)

Description for the changelog

Add FIB_ROUTE_FILTER CONFIG_DB table and YANG schema, plus frrcfgd handler and zebra/frr.conf templates: bind a route-map to (vrf, addr_family, protocol) so chosen routes are filtered out before FIB / APPL_DB / hardware install. Renders to FRR's ip|ipv6 protocol <PROTO> route-map <NAME>.

Link to config_db schema for YANG module changes

Configuration.md entry for FIB_ROUTE_FILTER will be added in a follow-up doc PR.

A picture of a cute animal (not mandatory but encouraged)

@mssonicbld

Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@kalash-nexthop force-pushed the add-sonic-yang-model-and-frrcfgd branch 2 times, most recently from cdfa544 to 9393580 on May 6, 2026 04:30
@kalash-nexthop changed the title from "Add SONiC YANG model and frrcfgd plumbing for FIB route filtering" to "[yang] Add SONiC YANG model and frrcfgd plumbing for FIB route filtering" on May 6, 2026
@mssonicbld

Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@kalash-nexthop force-pushed the add-sonic-yang-model-and-frrcfgd branch from 9393580 to 4f84140 on May 7, 2026 21:55
@mssonicbld

Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld

Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@kalash-nexthop

Contributor Author

/azp run Azure.sonic-buildimage

@azure-pipelines

Commenter does not have sufficient privileges for PR 27192 in repo sonic-net/sonic-buildimage

This adds a new SONiC YANG module and CONFIG_DB table that models
FIB route filtering — the policy that decides which routes from each
routing protocol are installed into the FIB (and therefore APPL_DB and
the hardware ASIC). It is backed by FRR's
`ip|ipv6 protocol <PROTO> route-map <NAME>` (zebra route filtering;
docs.frrouting.org/en/stable-7.5/zebra.html#zebra-route-filtering).

In FRR, a deny-match on this route-map flows from
`zebra_route_map_check()` -> `nexthop_active_check()` -> `rib_process()`
and short-circuits `rib_install_kernel()`, so the rib_update hook never
fires for the route. As a result fpmsyncd never sees the prefix and
APPL_DB / hardware are never programmed for it. The route does remain
in zebra's RIB, so the BGP / OSPF / IS-IS control plane is unaffected
and the route can still be advertised to peers — only the dataplane
install side is filtered.

The primary operator use case is reducing hardware route-table
pressure: routes from a chosen protocol can be selectively excluded
from the FIB without affecting the BGP / OSPF / etc. control plane.

YANG (sonic-fib-route-filter.yang)
==================================

  container FIB_ROUTE_FILTER
    list FIB_ROUTE_FILTER_LIST
      key  vrf_name addr_family protocol
      leaf vrf_name      union { 'default' | leafref VRF_LIST }
      leaf addr_family   sonic-types:ip-family   ('IPv4' | 'IPv6')
      leaf protocol      enum { any, bgp, connected, eigrp, isis,
                                kernel, nhrp, ospf, ospf6, rip,
                                ripng, static, table }
      leaf route_map     leafref ROUTE_MAP_SET   (mandatory)
      must               rejects AFI-incompatible protocol pairs:
                           ipv4 + (ospf6 | ripng),
                           ipv6 + (ospf | rip | eigrp)

The schema reuses sonic-types:ip-family rather than introducing a
local enum, matching the pattern in sonic-bgp-prefix-list.yang.
`addr_family` is rendered into the FRR `ip` / `ipv6` CLI keyword via
the `{'IPv4':'ip','IPv6':'ipv6'}` lookup that
bgpd.conf.db.pref_list.j2 and bgpd.conf.db.route_map.j2 already use.

Conversion layer
================

Two render paths, both honored:

* Separated mode (docker-fpm-frr): new include
    dockers/docker-fpm-frr/frr/zebra/zebra.fib_route_filter.conf.j2
  pulled into zebra.conf.j2 after zebra.interfaces.conf.j2. The
  template renders empty when the table is missing; otherwise it
  groups rows by VRF in two passes so multiple rows that share a
  non-default VRF render as a single `vrf <N> / ... / exit-vrf`
  block (instead of one block per row).

* Unified mode (sonic-frr-mgmt-framework): the same include is pulled
  into templates/frr/frr.conf.j2 for boot rendering. Runtime updates
  go through a new fib_route_filter_handler in frrcfgd.py. The
  handler:
    - seeds fib_route_filter_state from CONFIG_DB at init
    - on SET / UPDATE / DELETE, builds the vtysh command
      ([vrf <N>] ip|ipv6 protocol <PROTO> route-map <NAME>
      [exit-vrf]) and pushes it
    - advances state only after `__run_command` returns True so a
      transient FRR failure is retryable and the idempotent
      `prev_rm == route_map` guard does not silently skip retries
    - is registered against ['mgmtd'] in TABLE_DAEMON because the FRR
      CLI is DEFPY_YANG and unified mode dispatches it through
      mgmtd's northbound; targeting ['zebra'] directly returns
      'Unknown command'.
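The "advance state only on success" invariant described in the list above can be sketched as follows. This is a hypothetical Python stand-in, not the frrcfgd code: the class name and the injected run_command callable (returning True on success, playing the role of `__run_command`) are assumptions, and the delete form is assumed to be `no ip|ipv6 protocol <PROTO>`.

```python
AFI_TO_CLI = {"IPv4": "ip", "IPv6": "ipv6"}


class FibRouteFilterStateSketch:
    """Hypothetical state tracker: key -> route_map last accepted by FRR."""

    def __init__(self, run_command, initial=None):
        self.run_command = run_command     # callable(list_of_lines) -> bool
        self.state = dict(initial or {})   # seeded from CONFIG_DB at init

    @staticmethod
    def _wrap(vrf, line):
        # Non-default VRFs are wrapped in vrf <N> ... exit-vrf.
        return [line] if vrf == "default" else ["vrf " + vrf, line, "exit-vrf"]

    def on_set(self, key, route_map):
        vrf, afi, proto = key.split("|")
        if self.state.get(key) == route_map:
            return True                    # idempotent: already applied
        line = "{} protocol {} route-map {}".format(
            AFI_TO_CLI[afi], proto, route_map)
        if not self.run_command(self._wrap(vrf, line)):
            return False                   # state untouched, so a retry re-pushes
        self.state[key] = route_map
        return True

    def on_delete(self, key):
        if key not in self.state:
            return True                    # nothing tracked, nothing to undo
        vrf, afi, proto = key.split("|")
        line = "no {} protocol {}".format(AFI_TO_CLI[afi], proto)
        if not self.run_command(self._wrap(vrf, line)):
            return False
        del self.state[key]
        return True
```

Because the cache is written only after run_command reports success, a failed push followed by the same SET is not mistaken for an idempotent replay.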

The handler relies on FRR's upsert behavior for this CLI
(NB_OP_MODIFY on the ./route-map leaf) — emitting a fresh
`ip protocol X route-map Y` replaces any prior binding for the same
(vrf, afi, proto), so no preceding `no ... route-map <prev>` is
needed.

Example
=======

CONFIG_DB:

    {
      "FIB_ROUTE_FILTER": {
        "default|IPv4|bgp":    { "route_map": "RM_FROM_BGP"   },
        "default|IPv6|ospf6":  { "route_map": "RM_FROM_OSPF6" },
        "Vrf_red|IPv4|static": { "route_map": "RM_STATIC_V4"  },
        "Vrf_red|IPv4|bgp":    { "route_map": "RM_BGP_RED"    }
      }
    }

Rendered FRR CLI:

    ip protocol bgp route-map RM_FROM_BGP
    ipv6 protocol ospf6 route-map RM_FROM_OSPF6
    !
    vrf Vrf_red
     ip protocol static route-map RM_STATIC_V4
     ip protocol bgp route-map RM_BGP_RED
    exit-vrf
    !

Tests
=====

* sonic-yang-models tests/yang_model_tests:
    2 positive cases (default VRF, named VRF) and 10 negative cases
    (LeafRef on missing route-map, InvalidValue on unknown VRF and
    out-of-enum protocol, Mandatory on missing route_map, and 5
    Must violations covering every AFI/protocol incompatibility).
    The new YANG file is registered in setup.py and seeded in
    sample_config_db.json so the wheel-build validator passes.

* sonic-frr-mgmt-framework tests/test_config.py:
    17 unit tests (test_frf_*) covering SET / SET-update / DELETE /
    DELETE-untracked / idempotent-set / tuple-key / malformed-key /
    unsupported-AFI / missing-route-map / table_handler_list
    registration; plus 4 failure-path tests that pin the invariant
    'state must only advance when vtysh actually succeeds.'

* sonic-config-engine tests/test_frr.py:
    test_zebra_frr_fib_route_filter renders zebra/zebra.conf.j2 with
    a 4-row FIB_ROUTE_FILTER fixture that exercises both default-VRF
    grouping and multi-row per-VRF grouping, comparing byte-for-byte
    against zebra_frr_fib_route_filter.conf.

Signed-off-by: Kalash Nainwal <kalash@nexthop.ai>

The original FIB_ROUTE_FILTER patch (yang + frrcfgd + zebra.conf.j2)
landed unified-mode runtime apply through frrcfgd, but in separated
mode the binding only takes effect at config_reload time because
zebra.conf is rendered from the new Jinja2 template at boot. This
adds a small bgpcfgd manager that closes that asymmetry: a SET / DEL
on FIB_ROUTE_FILTER produces the matching FRR command live, without
a reload, in either mode.

FibRouteFilterMgr
=================

managers_fib_route_filter.py subscribes to CONFIG_DB / FIB_ROUTE_FILTER
and is registered in main.py next to the other route-policy managers.
The CONFIG_DB key shape is vrf|addr_family|protocol (e.g.
"default|IPv4|bgp"); rows carry a single mandatory field, route_map.
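As a rough illustration of the key shape, a parser along these lines would accept both the string form and a tuple form. The helper name is hypothetical, and it checks shape only (three non-empty parts); AFI validation is left to the handlers.

```python
def parse_frf_key(key):
    """Split a FIB_ROUTE_FILTER key into (vrf, addr_family, protocol).

    Returns None for malformed keys (wrong part count or an empty field).
    """
    if isinstance(key, (tuple, list)):
        parts = tuple(key)               # defensive: tuple-style key
    else:
        parts = tuple(key.split("|"))    # usual "vrf|addr_family|protocol" string
    if len(parts) != 3 or not all(parts):
        return None
    return parts
```

For example, "default|IPv4|bgp" splits into ("default", "IPv4", "bgp"), while "default|IPv4" and "default||bgp" are rejected.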

Emitted vtysh commands:

  set, default VRF:
      ip|ipv6 protocol <proto> route-map <name>

  set, named VRF:
      vrf <name>
       ip|ipv6 protocol <proto> route-map <name>
      exit-vrf

  delete (both VRF forms):
      no ip|ipv6 protocol <proto>           (optionally wrapped)

These match the lines rendered at boot by
zebra.fib_route_filter.conf.j2 in the parent patch, so the runtime and
boot paths converge on the same FRR running-config. The boot path
stays the source of truth for warm reboot / fresh boot; the manager is
just the runtime mutation hook.
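The command shapes listed above can be expressed as two small builders. This is a sketch under assumptions: the function names and signatures are hypothetical (the patch's own _build_set_cmds / _build_del_cmds are not shown here), and the one-space indent inside the vrf block simply follows the listing above.

```python
AFI_TO_CLI = {"IPv4": "ip", "IPv6": "ipv6"}


def _wrap_vrf(vrf, line):
    """Wrap a line in vrf <name> / exit-vrf unless it targets the default VRF."""
    if vrf == "default":
        return [line]
    return ["vrf " + vrf, " " + line, "exit-vrf"]


def build_set_cmds(vrf, afi, proto, route_map):
    """Lines that bind route_map to (vrf, afi, proto)."""
    line = "{} protocol {} route-map {}".format(AFI_TO_CLI[afi], proto, route_map)
    return _wrap_vrf(vrf, line)


def build_del_cmds(vrf, afi, proto):
    """Lines that remove the binding; the route-map name is not needed."""
    line = "no {} protocol {}".format(AFI_TO_CLI[afi], proto)
    return _wrap_vrf(vrf, line)
```

Keeping the VRF wrapping in one helper means the set and delete paths cannot drift apart in how they scope a command.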

addr_family carries the sonic-types:ip-family values ('IPv4'/'IPv6')
and is mapped to FRR's 'ip'/'ipv6' CLI keyword via the same
{'IPv4':'ip','IPv6':'ipv6'} dict that bgpd.conf.db.pref_list.j2,
bgpd.conf.db.route_map.j2, the new zebra.fib_route_filter.conf.j2, and
the frrcfgd fib_route_filter_handler already use.

Per-key state cache
===================

A small dict (_applied: frf_key -> last route_map) lets:

  - DEL emit the matching 'no ip|ipv6 protocol <proto>' without
    needing the prior route_map field (the CONFIG_DB row is already
    gone by the time del_handler runs);
  - idempotent re-sets short-circuit, so a no-op CONFIG_DB replay
    (e.g. ConfigDBConnector reconnect) does not churn vtysh.

set_handler only mutates _applied after cfg_mgr.push_list buffers the
command; the actual vtysh write happens later at Runner commit. A
commit failure leaves _applied ahead of FRR state, the same trade-off
the other bgpcfgd managers carry.

del_handler short-circuits on untracked keys before validating the
AFI, so a DEL for an unsupported-AFI key (which set_handler would have
rejected on the way in, so it can never be in _applied) is a quiet
skip rather than a spurious error log.

FRR's `ip|ipv6 protocol <proto> route-map <name>` is upsert-style:
re-binding the same (vrf, afi, proto) tuple replaces any prior
route-map without an intervening `no`. The manager relies on that to
avoid emitting a separate `no ... route-map <prev>` line on route-map
change.
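The cache behaviour described in this section can be sketched as below. It is a simplified, hypothetical model rather than the manager from the patch: RecordingCfgMgr stands in for bgpcfgd's cfg_mgr and only records buffered commands, and the boolean returned by set_handler merely signals accept or reject here, which may not match bgpcfgd's own return convention.

```python
class RecordingCfgMgr:
    """Stand-in for cfg_mgr; records buffered command lists, never commits."""

    def __init__(self):
        self.pushed = []

    def push_list(self, cmds):
        self.pushed.append(list(cmds))


class FibRouteFilterCacheSketch:
    """Hypothetical manager core: per-key cache of the last buffered route-map."""

    AFI_TO_CLI = {"IPv4": "ip", "IPv6": "ipv6"}

    def __init__(self, cfg_mgr):
        self.cfg_mgr = cfg_mgr
        self._applied = {}  # "vrf|afi|proto" -> last route_map buffered

    @staticmethod
    def _wrap(vrf, line):
        return [line] if vrf == "default" else ["vrf " + vrf, " " + line, "exit-vrf"]

    def set_handler(self, key, data):
        parts = key.split("|")
        route_map = data.get("route_map")
        if len(parts) != 3 or not all(parts) or not route_map:
            return False                     # malformed key or missing route_map
        vrf, afi, proto = parts
        if afi not in self.AFI_TO_CLI:
            return False                     # unsupported AFI never enters _applied
        if self._applied.get(key) == route_map:
            return True                      # idempotent re-set: no vtysh churn
        line = "{} protocol {} route-map {}".format(
            self.AFI_TO_CLI[afi], proto, route_map)
        self.cfg_mgr.push_list(self._wrap(vrf, line))
        self._applied[key] = route_map       # recorded once the command is buffered
        return True

    def del_handler(self, key):
        if key not in self._applied:
            return                           # quiet skip, before any AFI validation
        vrf, afi, proto = key.split("|")
        line = "no {} protocol {}".format(self.AFI_TO_CLI[afi], proto)
        self.cfg_mgr.push_list(self._wrap(vrf, line))
        del self._applied[key]               # evict so a later set re-pushes
```

In this model a set, delete, re-set sequence buffers three command lists, whereas a repeated set with the same route-map buffers only one.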

Tests
=====

tests/test_fib_route_filter_mgr.py — 22 unit tests grouped into:

  * TestBuildCmds: locks the exact wire format of _build_set_cmds /
    _build_del_cmds for default and named VRFs, IPv4 and IPv6.
  * TestSetHandler: default-VRF v4 / v6, named-VRF wrapping,
    idempotent re-set, upsert with a different route-map, missing
    route_map field rejected, malformed keys (too-few parts,
    empty-field) rejected, unknown addr_family rejected, tuple-key
    accepted (defensive — swsscommon delivers strings today).
  * TestDelHandler: tracked default-VRF and named-VRF deletes,
    untracked-key noop, malformed-key noop, unknown-AFI noop.
  * TestMultipleRows: independent per-(vrf, afi, proto) tracking;
    deleting one row leaves siblings intact.
  * TestLifecycle: set -> del -> re-set must re-push (not
    idempotent-skip) — locks the eviction-then-re-registration
    semantic of _applied.

  python -m pytest src/sonic-bgpcfgd/tests/test_fib_route_filter_mgr.py
  -> 22 passed.

Signed-off-by: Kalash Nainwal <kalash@nexthop.ai>

@kalash-nexthop force-pushed the add-sonic-yang-model-and-frrcfgd branch from 0fb6dfd to 47c7f68 on June 18, 2026 19:02
@mssonicbld

Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
