[yang] Add SONiC YANG model and frrcfgd plumbing for FIB route filtering #27192
Open
kalash-nexthop wants to merge 2 commits into
This adds a new SONiC YANG module and CONFIG_DB table that models
FIB route filtering — the policy that decides which routes from each
routing protocol are installed into the FIB (and therefore APPL_DB and
the hardware ASIC). It is backed by FRR's
`ip|ipv6 protocol <PROTO> route-map <NAME>` (zebra route filtering;
docs.frrouting.org/en/stable-7.5/zebra.html#zebra-route-filtering).
In FRR, a deny-match on this route-map flows from
`zebra_route_map_check()` -> `nexthop_active_check()` -> `rib_process()`
and short-circuits `rib_install_kernel()`, so the rib_update hook never
fires for the route. As a result fpmsyncd never sees the prefix and
APPL_DB / hardware are never programmed for it. The route does remain
in zebra's RIB, so the BGP / OSPF / IS-IS control plane is unaffected
and the route can still be advertised to peers — only the dataplane
install side is filtered.
The primary operator use case is reducing hardware route-table
pressure: routes from a chosen protocol can be selectively excluded
from the FIB without affecting the BGP / OSPF / etc. control plane.
YANG (sonic-fib-route-filter.yang)
==================================
container FIB_ROUTE_FILTER
  list FIB_ROUTE_FILTER_LIST
    key vrf_name addr_family protocol
    leaf vrf_name     union { 'default' | leafref VRF_LIST }
    leaf addr_family  sonic-types:ip-family ('IPv4' | 'IPv6')
    leaf protocol     enum { any, bgp, connected, eigrp, isis,
                             kernel, nhrp, ospf, ospf6, rip,
                             ripng, static, table }
    leaf route_map    leafref ROUTE_MAP_SET (mandatory)
    must              rejects AFI-incompatible protocol pairs:
                        ipv4 + (ospf6 | ripng),
                        ipv6 + (ospf | rip | eigrp)
The schema reuses sonic-types:ip-family rather than introducing a
local enum, matching the pattern in sonic-bgp-prefix-list.yang.
`addr_family` is rendered into the FRR `ip` / `ipv6` CLI keyword via
the `{'IPv4':'ip','IPv6':'ipv6'}` lookup that
bgpd.conf.db.pref_list.j2 and bgpd.conf.db.route_map.j2 already use.
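The AFI/protocol rule enforced by the `must` statement above can be illustrated with a minimal Python sketch. The function name `afi_protocol_compatible` and the two sets are hypothetical names chosen for illustration; the actual constraint is expressed in YANG, not in Python.

```python
# Illustrative only: mirrors the AFI/protocol pairs the YANG `must`
# statement is described as rejecting. Names here are hypothetical.
IPV4_ONLY = {"ospf", "rip", "eigrp"}   # rejected when addr_family is IPv6
IPV6_ONLY = {"ospf6", "ripng"}         # rejected when addr_family is IPv4


def afi_protocol_compatible(addr_family: str, protocol: str) -> bool:
    """Return True when the (addr_family, protocol) pair is allowed."""
    if addr_family == "IPv4":
        return protocol not in IPV6_ONLY
    if addr_family == "IPv6":
        return protocol not in IPV4_ONLY
    return False  # only 'IPv4' and 'IPv6' are valid ip-family values
```

Protocols that exist for both families (for example `bgp`, `static`, `connected`) pass for either value of `addr_family`.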
Conversion layer
================
Two render paths, both honored:
  * Separated mode (docker-fpm-frr): new include
    dockers/docker-fpm-frr/frr/zebra/zebra.fib_route_filter.conf.j2
    pulled into zebra.conf.j2 after zebra.interfaces.conf.j2. The
    template renders empty when the table is missing; otherwise it
    groups rows by VRF in two passes so multiple rows that share a
    non-default VRF render as a single `vrf <N> / ... / exit-vrf`
    block (instead of one block per row).
  * Unified mode (sonic-frr-mgmt-framework): the same include is pulled
    into templates/frr/frr.conf.j2 for boot rendering. Runtime updates
    go through a new fib_route_filter_handler in frrcfgd.py. The
    handler:
      - seeds fib_route_filter_state from CONFIG_DB at init
      - on SET / UPDATE / DELETE, builds the vtysh command
        ([vrf <N>] ip|ipv6 protocol <PROTO> route-map <NAME>
        [exit-vrf]) and pushes it
      - advances state only after `__run_command` returns True so a
        transient FRR failure is retryable and the idempotent
        `prev_rm == route_map` guard does not silently skip retries
      - is registered against ['mgmtd'] in TABLE_DAEMON because the FRR
        CLI is DEFPY_YANG and unified mode dispatches it through
        mgmtd's northbound; targeting ['zebra'] directly returns
        'Unknown command'.
The handler relies on FRR's upsert behavior for this CLI
(NB_OP_MODIFY on the ./route-map leaf) — emitting a fresh
`ip protocol X route-map Y` replaces any prior binding for the same
(vrf, afi, proto), so no preceding `no ... route-map <prev>` is
needed.
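The "advance state only on success" rule described for the handler can be sketched as follows. This is a simplified model, not the frrcfgd code: the class `RetryableBindingState`, its `apply` method, and the injected `run_command` callable are hypothetical stand-ins for the handler, its state dict, and `__run_command`.

```python
# Simplified model of the retry-safe state update. All names are
# hypothetical; `run_command` stands in for a vtysh push that returns
# True on success and False on failure.
class RetryableBindingState:
    def __init__(self, run_command):
        self.run_command = run_command
        self.state = {}  # (vrf, afi, proto) -> last successfully applied route-map

    def apply(self, key, route_map, cmds):
        # Idempotent guard: skip only if this exact binding already succeeded.
        if self.state.get(key) == route_map:
            return True
        if not self.run_command(cmds):
            # State is left untouched, so a later retry is not skipped
            # by the guard above.
            return False
        self.state[key] = route_map
        return True
```

If state were recorded before the push, a failed push followed by the same SET would hit the idempotent guard and never reach FRR; ordering the update after the success check avoids that.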
Example
=======
CONFIG_DB:
{
    "FIB_ROUTE_FILTER": {
        "default|IPv4|bgp":    { "route_map": "RM_FROM_BGP" },
        "default|IPv6|ospf6":  { "route_map": "RM_FROM_OSPF6" },
        "Vrf_red|IPv4|static": { "route_map": "RM_STATIC_V4" },
        "Vrf_red|IPv4|bgp":    { "route_map": "RM_BGP_RED" }
    }
}
Rendered FRR CLI:
ip protocol bgp route-map RM_FROM_BGP
ipv6 protocol ospf6 route-map RM_FROM_OSPF6
!
vrf Vrf_red
ip protocol static route-map RM_STATIC_V4
ip protocol bgp route-map RM_BGP_RED
exit-vrf
!
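The mapping from the CONFIG_DB example to the rendered CLI above can be approximated in Python. The real rendering is done by the Jinja2 template; `render_fib_route_filter` below is a hypothetical function that only mimics the described grouping (default-VRF lines first, then one block per non-default VRF), and the absence of indentation inside the `vrf` block follows the output as shown here, which is an assumption about the template's exact whitespace.

```python
# Hypothetical Python equivalent of the template's grouping logic.
AFI_KEYWORD = {"IPv4": "ip", "IPv6": "ipv6"}


def render_fib_route_filter(table):
    """Render FIB_ROUTE_FILTER rows into FRR CLI lines, grouped by VRF."""
    by_vrf = {}
    for key, row in table.items():
        vrf, addr_family, protocol = key.split("|")
        line = "{} protocol {} route-map {}".format(
            AFI_KEYWORD[addr_family], protocol, row["route_map"])
        by_vrf.setdefault(vrf, []).append(line)

    lines = []
    # First pass: default-VRF bindings are emitted as bare lines.
    default_lines = by_vrf.pop("default", [])
    if default_lines:
        lines.extend(default_lines)
        lines.append("!")
    # Second pass: one vrf/exit-vrf block per non-default VRF.
    for vrf, vrf_lines in by_vrf.items():
        lines.append("vrf {}".format(vrf))
        lines.extend(vrf_lines)
        lines.append("exit-vrf")
        lines.append("!")
    return lines
```

An empty table yields an empty list, matching the statement that the template renders empty when the table is missing.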
Tests
=====
  * sonic-yang-models tests/yang_model_tests:
    2 positive cases (default VRF, named VRF) and 10 negative cases
    (LeafRef on missing route-map, InvalidValue on unknown VRF and
    out-of-enum protocol, Mandatory on missing route_map, and 5
    Must violations covering every AFI/protocol incompatibility).
    The new YANG file is registered in setup.py and seeded in
    sample_config_db.json so the wheel-build validator passes.
  * sonic-frr-mgmt-framework tests/test_config.py:
    17 unit tests (test_frf_*) covering SET / SET-update / DELETE /
    DELETE-untracked / idempotent-set / tuple-key / malformed-key /
    unsupported-AFI / missing-route-map / table_handler_list
    registration; plus 4 failure-path tests that pin the invariant
    'state must only advance when vtysh actually succeeds.'
  * sonic-config-engine tests/test_frr.py:
    test_zebra_frr_fib_route_filter renders zebra/zebra.conf.j2 with
    a 4-row FIB_ROUTE_FILTER fixture that exercises both default-VRF
    grouping and multi-row per-VRF grouping, comparing byte-for-byte
    against zebra_frr_fib_route_filter.conf.
Signed-off-by: Kalash Nainwal <kalash@nexthop.ai>
The original FIB_ROUTE_FILTER patch (yang + frrcfgd + zebra.conf.j2)
landed unified-mode runtime apply through frrcfgd, but in separated
mode the binding only takes effect at config_reload time because
zebra.conf is rendered from the new Jinja2 template at boot. This
adds a small bgpcfgd manager that closes that asymmetry: a SET / DEL
on FIB_ROUTE_FILTER produces the matching FRR command live, without
a reload, in either mode.
FibRouteFilterMgr
=================
managers_fib_route_filter.py subscribes to CONFIG_DB / FIB_ROUTE_FILTER
and is registered in main.py next to the other route-policy managers.
The CONFIG_DB key shape is vrf|addr_family|protocol (e.g.
"default|IPv4|bgp"); rows carry a single mandatory field, route_map.
Emitted vtysh commands:
  set, default VRF:
    ip|ipv6 protocol <proto> route-map <name>
  set, named VRF:
    vrf <name>
    ip|ipv6 protocol <proto> route-map <name>
    exit-vrf
  delete (both VRF forms):
    no ip|ipv6 protocol <proto>        (optionally wrapped)
These match the lines rendered at boot by
zebra.fib_route_filter.conf.j2 in the parent patch, so the runtime and
boot paths converge on the same FRR running-config. The boot path
stays the source of truth for warm reboot / fresh boot; the manager is
just the runtime mutation hook.
addr_family carries the sonic-types:ip-family values ('IPv4'/'IPv6')
and is mapped to FRR's 'ip'/'ipv6' CLI keyword via the same
{'IPv4':'ip','IPv6':'ipv6'} dict that bgpd.conf.db.pref_list.j2,
bgpd.conf.db.route_map.j2, the new zebra.fib_route_filter.conf.j2, and
the frrcfgd fib_route_filter_handler already use.
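The command shapes listed above can be sketched as two small builders. The function names `build_set_cmds` and `build_del_cmds` and their signatures are assumptions made for illustration; the manager's own helpers may take different arguments or format whitespace differently.

```python
# Hypothetical builders for the vtysh command lists described above.
AFI_KEYWORD = {"IPv4": "ip", "IPv6": "ipv6"}


def _wrap_in_vrf(vrf, line):
    """Wrap a single CLI line in vrf/exit-vrf unless it targets the default VRF."""
    if vrf == "default":
        return [line]
    return ["vrf {}".format(vrf), line, "exit-vrf"]


def build_set_cmds(vrf, addr_family, protocol, route_map):
    line = "{} protocol {} route-map {}".format(
        AFI_KEYWORD[addr_family], protocol, route_map)
    return _wrap_in_vrf(vrf, line)


def build_del_cmds(vrf, addr_family, protocol):
    # The delete form does not name the route-map.
    line = "no {} protocol {}".format(AFI_KEYWORD[addr_family], protocol)
    return _wrap_in_vrf(vrf, line)
```

For example, `build_set_cmds("Vrf_red", "IPv4", "static", "RM_STATIC_V4")` produces the three-line `vrf Vrf_red` / `ip protocol static route-map RM_STATIC_V4` / `exit-vrf` sequence.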
Per-key state cache
===================
A small dict (_applied: frf_key -> last route_map) lets:
  - DEL emit the matching 'no ip|ipv6 protocol <proto>' without
    needing the prior route_map field (the CONFIG_DB row is already
    gone by the time del_handler runs);
  - idempotent re-sets short-circuit, so a no-op CONFIG_DB replay
    (e.g. ConfigDBConnector reconnect) does not churn vtysh.
set_handler only mutates _applied after cfg_mgr.push_list buffers the
command; the actual vtysh write happens later at Runner commit. A
commit failure leaves _applied ahead of FRR state, the same trade-off
the other bgpcfgd managers carry.
del_handler short-circuits on untracked keys before validating the
AFI, so a DEL for an unsupported-AFI key (which set_handler would have
rejected on the way in, so it can never be in _applied) is a quiet
skip rather than a spurious error log.
FRR's `ip|ipv6 protocol <proto> route-map <name>` is upsert-style:
re-binding the same (vrf, afi, proto) tuple replaces any prior
route-map without an intervening `no`. The manager relies on that to
avoid emitting a separate `no ... route-map <prev>` line on route-map
change.
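The cache behavior described in this section (idempotent re-set, upsert on route-map change, quiet skip for untracked deletes, re-push after delete) can be modeled with a short sketch. `FibRouteFilterCache` and the injected `push_list` callable are hypothetical; the sketch assumes well-formed string keys and omits the malformed-key handling the real manager performs.

```python
# Hypothetical model of the per-key state cache. `push_list` stands in
# for the buffering call; it receives a list of CLI lines.
AFI_KEYWORD = {"IPv4": "ip", "IPv6": "ipv6"}


class FibRouteFilterCache:
    def __init__(self, push_list):
        self.push_list = push_list
        self._applied = {}  # frf_key -> last route_map

    @staticmethod
    def _wrap(vrf, line):
        if vrf == "default":
            return [line]
        return ["vrf {}".format(vrf), line, "exit-vrf"]

    def set_handler(self, key, data):
        vrf, addr_family, protocol = key.split("|")
        route_map = data.get("route_map")
        if addr_family not in AFI_KEYWORD or not route_map:
            return False  # rejected rows never enter _applied
        if self._applied.get(key) == route_map:
            return True  # idempotent replay: nothing pushed
        line = "{} protocol {} route-map {}".format(
            AFI_KEYWORD[addr_family], protocol, route_map)
        self.push_list(self._wrap(vrf, line))  # upsert, no preceding 'no'
        self._applied[key] = route_map         # recorded once buffered
        return True

    def del_handler(self, key):
        if key not in self._applied:
            return  # untracked key: quiet skip, checked before any AFI validation
        vrf, addr_family, protocol = key.split("|")
        line = "no {} protocol {}".format(AFI_KEYWORD[addr_family], protocol)
        self.push_list(self._wrap(vrf, line))
        del self._applied[key]  # eviction, so a later set re-pushes
```

Because `del_handler` evicts the key, a set after a delete is not mistaken for an idempotent replay and is pushed again.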
Tests
=====
tests/test_fib_route_filter_mgr.py has 22 unit tests grouped into:
  * TestBuildCmds: locks the exact wire format of _build_set_cmds /
    _build_del_cmds for default and named VRFs, IPv4 and IPv6.
  * TestSetHandler: default-VRF v4 / v6, named-VRF wrapping,
    idempotent re-set, upsert with a different route-map, missing
    route_map field rejected, malformed keys (too-few parts,
    empty-field) rejected, unknown addr_family rejected, tuple-key
    accepted (defensive; swsscommon delivers strings today).
  * TestDelHandler: tracked default-VRF and named-VRF deletes,
    untracked-key noop, malformed-key noop, unknown-AFI noop.
  * TestMultipleRows: independent per-(vrf, afi, proto) tracking;
    deleting one row leaves siblings intact.
  * TestLifecycle: set -> del -> re-set must re-push (not
    idempotent-skip); locks the eviction-then-re-registration
    semantic of _applied.
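The key-shape cases listed under TestSetHandler and TestDelHandler (too-few parts, empty field, tuple key) can be illustrated with a small parser. `parse_frf_key` is a hypothetical helper, not a function named in this patch, and it checks only the key's shape; address-family validation is assumed to happen separately.

```python
# Hypothetical key parser for the vrf|addr_family|protocol key shape.
def parse_frf_key(key):
    """Return (vrf, addr_family, protocol), or None if the key is malformed."""
    # Keys normally arrive as 'vrf|afi|proto' strings; tuples are
    # accepted defensively.
    parts = list(key) if isinstance(key, tuple) else key.split("|")
    if len(parts) != 3:
        return None  # too few or too many parts
    if not all(parts):
        return None  # an empty field such as 'default||bgp'
    vrf, addr_family, protocol = parts
    return vrf, addr_family, protocol
```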
python -m pytest src/sonic-bgpcfgd/tests/test_fib_route_filter_mgr.py
-> 22 passed.
Signed-off-by: Kalash Nainwal <kalash@nexthop.ai>
Implements the HLD posted here: sonic-net/SONiC#2367
Add a new SONiC YANG module that models FIB route filtering: the policy that decides which routes from each routing protocol are installed into the FIB (and therefore APPL_DB and the hardware ASIC). Backed by FRR's `ip|ipv6 protocol <PROTO> route-map <NAME>` (see https://docs.frrouting.org/en/stable-7.5/zebra.html#zebra-route-filtering); a deny-match flows through `nexthop_active_check` -> `rib_process` and short-circuits `rib_install_kernel`, so fpmsyncd never sees the prefix and APPL_DB / hardware are never programmed for it. The route remains in zebra's RIB.

Each CONFIG_DB entry binds a route-map to a `(vrf, addr_family, protocol)` tuple; `addr_family` reuses `sonic-types:ip-family` (IPv4/IPv6); a `must` expression enforces AFI/protocol compatibility (ospf/rip/eigrp IPv4-only, ospf6/ripng IPv6-only); `route_map` is a mandatory leafref into `ROUTE_MAP_SET`.

Both modes are wired end-to-end, boot and runtime:

* Separated mode: `dockers/docker-fpm-frr/frr/zebra/zebra.fib_route_filter.conf.j2` (included from `zebra.conf.j2`), plus runtime translation via a new `FibRouteFilterMgr` in `bgpcfgd` that subscribes to `FIB_ROUTE_FILTER` and pushes the matching FRR command through vtysh.
* Unified mode: `frr.conf.j2` (sonic-frr-mgmt-framework), plus runtime translation in `frrcfgd.py` (`fib_route_filter_handler`) routed via mgmtd because the FRR CLI is `DEFPY_YANG`.

The boot template is the source of truth for warm reboot / fresh boot; the managers are the runtime mutation hooks. Net effect: a `sonic-db-cli` write to `FIB_ROUTE_FILTER` takes effect immediately in either mode, with no `config_reload` required.

Why I did it
SONiC has YANG coverage for route-maps, prefix-lists, community sets, etc., but no schema for "limit which routes from this protocol are programmed into the FIB / APPL_DB / hardware." Today the only way to configure `ip protocol <proto> route-map <name>` is to hand-edit `zebra.conf` (or `frr.conf` in unified mode). This PR adds a CONFIG_DB-driven schema so the feature can be configured through the standard SONiC flow, plus the conversion layer to render and runtime-update the corresponding FRR config.

The primary operator use case is reducing hardware route capacity pressure: routes from a chosen protocol can be selectively excluded from the FIB without affecting the BGP/OSPF/etc. control plane.
Work item tracking
How I did it
New YANG module `src/sonic-yang-models/yang-models/sonic-fib-route-filter.yang`:

* `FIB_ROUTE_FILTER` keyed on `vrf_name | addr_family | protocol`.
* `vrf_name`: union of the literal `default` and a leafref into `sonic-vrf`.
* `addr_family`: `sonic-types:ip-family` (IPv4/IPv6); rendered into the FRR `ip` / `ipv6` CLI keyword via the same `{'IPv4':'ip','IPv6':'ipv6'}` lookup the prefix-list and route-map templates already use.
* `protocol`: enum (any, bgp, connected, eigrp, isis, kernel, nhrp, ospf, ospf6, rip, ripng, static, table), the same set as FRR's `frr-route-types`.
* `route_map`: mandatory leafref into `sonic-route-map:ROUTE_MAP_SET`.

Conversion layer:

* Separated mode, boot: `dockers/docker-fpm-frr/frr/zebra/zebra.fib_route_filter.conf.j2`, pulled into `zebra.conf.j2` after `zebra.interfaces.conf.j2`. Renders empty when the table is missing; emits a single `vrf <N> / ... / exit-vrf` block per non-default VRF (rows for the same VRF grouped) and bare lines for the default VRF.
* Separated mode, runtime: `FibRouteFilterMgr` in `src/sonic-bgpcfgd/bgpcfgd/managers_fib_route_filter.py`, registered in `main.py` next to the other route-policy managers. Subscribes to `CONFIG_DB / FIB_ROUTE_FILTER`; on SET pushes `ip|ipv6 protocol <proto> route-map <name>` (wrapped in `vrf <N> / ... / exit-vrf` for non-default VRFs) via `cfg_mgr.push_list`; on DEL pushes the matching `no ...` form using a per-key state cache so the prior `route_map` field is not needed. Idempotent re-sets short-circuit (no churn on ConfigDBConnector replay). `del_handler` checks the tracked-key fast-exit before the AFI guard so a DEL for an unsupported-AFI key (which set_handler would have rejected on the way in) is a quiet skip rather than a spurious error.
* Unified mode: `src/sonic-frr-mgmt-framework/templates/frr/frr.conf.j2`, plus `fib_route_filter_handler` in `frrcfgd.py` that: (a) seeds `fib_route_filter_state` from CONFIG_DB at init, (b) translates SET / UPDATE / DELETE events into vtysh, (c) advances state only on `__run_command` success so failures are retryable, (d) is registered against `mgmtd` in `TABLE_DAEMON` because the FRR CLIs are `DEFPY_YANG` and unified mode dispatches them through mgmtd's northbound (going to `['zebra']` directly returns "Unknown command").

Example
Filter BGP-learned IPv4 routes and OSPFv3 IPv6 routes in the default VRF, and filter both static and BGP routes in `Vrf_red`:

CONFIG_DB:

    {
        "FIB_ROUTE_FILTER": {
            "default|IPv4|bgp":    { "route_map": "RM_FROM_BGP" },
            "default|IPv6|ospf6":  { "route_map": "RM_FROM_OSPF6" },
            "Vrf_red|IPv4|static": { "route_map": "RM_STATIC_V4" },
            "Vrf_red|IPv4|bgp":    { "route_map": "RM_BGP_RED" }
        }
    }

Corresponding FRR CLI (rendered into `zebra.conf` / `frr.conf` and pushed via vtysh):

    ip protocol bgp route-map RM_FROM_BGP
    ipv6 protocol ospf6 route-map RM_FROM_OSPF6
    !
    vrf Vrf_red
    ip protocol static route-map RM_STATIC_V4
    ip protocol bgp route-map RM_BGP_RED
    exit-vrf
    !

This maps onto FRR's own YANG list `filter-protocol` under `/frr-vrf:lib/vrf[name=N]/frr-zebra:zebra/` (`frr-zebra.yang`). The SONiC schema flattens FRR's VRF-augmented tree into a single multi-key CONFIG_DB table and tightens the type system (enum protocol, IPv4/IPv6 AFI only, AFI/protocol `must`-guard).

How to verify it
The yang-models test harness (`src/sonic-yang-models/tests/yang_model_tests/test_yang_model.py`) globs `yang-models/*.yang` and every fixture under `tests/yang_model_tests/tests/`, so the new model + tests run automatically in CI. Expected outcomes:

* `FIB_ROUTE_FILTER_VALID`, `FIB_ROUTE_FILTER_VALID_NAMED_VRF` load cleanly.
* Negative cases: `LeafRef` (NON_EXIST_RT_MAP), `InvalidValue` x2 (NON_EXIST_VRF, INVALID_PROTOCOL), `Mandatory` (MISSING_RT_MAP), `Must` x5 (AFI_PROTO_MISMATCH_V4_OSPF6, V6_OSPF, V4_RIPNG, V6_RIP, V6_EIGRP).

frrcfgd unit tests run in `src/sonic-frr-mgmt-framework`; the CI builds the wheel and runs them. 17 FRF tests pass alongside the existing suite.

`test_zebra_frr_fib_route_filter` in `src/sonic-config-engine/tests/test_frr.py` verifies separated-mode template rendering against the committed sample output.

bgpcfgd unit tests run in `src/sonic-bgpcfgd`; CI builds the wheel and runs them. `tests/test_fib_route_filter_mgr.py` contributes 22 tests covering: the exact wire format of set / del commands (default and named VRFs, IPv4 / IPv6); set_handler over default-VRF v4 / v6, named-VRF wrapping, idempotent re-set, upsert with a different route-map, malformed-key and unknown-AFI rejection, tuple-key acceptance; del_handler over tracked / untracked / malformed / unknown-AFI cases; per-(vrf, afi, proto) tracking independence; and the set -> del -> re-set lifecycle (verifies `_applied` eviction-then-re-registration so a re-set after delete re-pushes rather than idempotent-skipping).

Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)
Description for the changelog
Add `FIB_ROUTE_FILTER` CONFIG_DB table and YANG schema, plus frrcfgd handler and zebra/frr.conf templates: bind a route-map to `(vrf, addr_family, protocol)` so chosen routes are filtered out before FIB / APPL_DB / hardware install. Renders to FRR's `ip|ipv6 protocol <PROTO> route-map <NAME>`.

Link to config_db schema for YANG module changes
`Configuration.md` entry for `FIB_ROUTE_FILTER` will be added in a follow-up doc PR.

A picture of a cute animal (not mandatory but encouraged)