Skip to content

[client] Surface DNS forwarder upstream failures via Extended DNS Errors#6441

Open
lixmal wants to merge 1 commit into
mainfrom
dns-forwarder-ede
Open

[client] Surface DNS forwarder upstream failures via Extended DNS Errors#6441
lixmal wants to merge 1 commit into
mainfrom
dns-forwarder-ede

Conversation

@lixmal

@lixmal lixmal commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Describe your changes

When a domain route's DNS forwarder fails to resolve a query, the client only saw a bare SERVFAIL and had to cross-reference the routing peer's logs to learn why. This attaches an Extended DNS Error (RFC 8914) to forwarder failure responses describing the class of failure, and surfaces it on the querying client so the cause is visible in one place.

  • Forwarder tags upstream-failure responses as timeout or generic failure with an EDE option, without exposing the upstream resolver address
  • Uses codes from the RFC 8914 Private Use range so they never collide with a real upstream EDE
  • DNS interceptor advertises EDNS0 to the forwarder, records any returned EDE in the response trace, and strips the OPT for clients that did not request EDNS0
  • Move the shared OPT-stripping helper and an EDE extractor into the resutil package

Issue ticket number and link

Stack

Checklist

  • Is it a bug fix
  • Is a typo/documentation fix
  • Is a feature enhancement
  • It is a refactor
  • Created tests that fail without the change (if possible)
  • This change does not modify the public API, gRPC protocols, functionality behavior, CLI / service flags, or introduce a new feature — OR I have discussed it with the NetBird team beforehand (link the issue / Slack thread in the description). See CONTRIBUTING.md.

By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.

Documentation

Select exactly one:

  • I added/updated documentation for this change
  • Documentation is not needed for this change (explain why)

Internal diagnostic surfaced in client trace logs; no user-facing configuration or API change.

Docs PR URL (required if "docs added" is checked)

Paste the PR link from https://github.com/netbirdio/docs here:

https://github.com/netbirdio/docs/pull/__

Summary by CodeRabbit

  • New Features

    • Added RFC 8914 Extended DNS Error support so upstream failures can return richer DNS error details when the client uses EDNS0.
    • When a client does not send EDNS0, the system still enables upstream EDE generation internally while ensuring responses don’t include EDNS0/OPT payloads.
  • Refactor

    • Centralized OPT stripping logic for consistent behavior across DNS resolution paths.
  • Tests

    • Expanded coverage to validate EDE presence/absence, correct error metadata, and proper failover behavior.

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 47ad3693-cdd8-4084-b86d-f426b41fb723

📥 Commits

Reviewing files that changed from the base of the PR and between f7f528c and c4337f4.

📒 Files selected for processing (7)
  • client/internal/dns/resutil/resolve.go
  • client/internal/dns/resutil/resolve_test.go
  • client/internal/dns/upstream.go
  • client/internal/dns/upstream_test.go
  • client/internal/dnsfwd/forwarder.go
  • client/internal/dnsfwd/forwarder_test.go
  • client/internal/routemanager/dnsinterceptor/handler.go
💤 Files with no reviewable changes (1)
  • client/internal/dns/upstream_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • client/internal/dns/resutil/resolve_test.go
  • client/internal/dnsfwd/forwarder_test.go
  • client/internal/dns/resutil/resolve.go
  • client/internal/routemanager/dnsinterceptor/handler.go
  • client/internal/dnsfwd/forwarder.go
  • client/internal/dns/upstream.go

📝 Walkthrough

Walkthrough

Adds two shared helpers (StripOPT, ExtractEDE) to the resutil package, removes the duplicate file-local stripOPT from upstream.go, extends the DNS forwarder to attach RFC 8914 Extended DNS Error options to SERVFAIL responses when the client used EDNS0, and updates the route manager's DNS interceptor to detect EDNS0, propagate EDE metadata, and strip OPT records from replies to non-EDNS0 clients.

Changes

EDNS/EDE shared helpers, forwarder attachment, and interceptor propagation

Layer / File(s) Summary
Shared StripOPT and ExtractEDE helpers with tests
client/internal/dns/resutil/resolve.go, client/internal/dns/resutil/resolve_test.go
StripOPT filters *dns.OPT records from a message's Extra section; ExtractEDE returns the first EDNS0_EDE option from a message's EDNS0 options. Both functions are tested with edge cases including absent OPT and OPT without EDE.
upstream.go refactor to resutil.StripOPT
client/internal/dns/upstream.go, client/internal/dns/upstream_test.go
Removes the file-local stripOPT function and replaces its two call sites with resutil.StripOPT. Deletes the now-redundant TestStripOPT from upstream tests and adds TestUpstreamResolver_NonRetryableEDEShortCircuits to verify SERVFAIL propagation, no failover, and no OPT leakage for non-EDNS0 clients.
DNS forwarder EDE attachment on upstream failure
client/internal/dnsfwd/forwarder.go, client/internal/dnsfwd/forwarder_test.go
Introduces private-use EDE info code constants for upstream timeout vs. generic failure. Updates handleDNSError to accept an EDNS0 flag and conditionally attach an EDNS0_EDE option (creating an OPT record if absent). Tests cover timeout, generic failure, and no-EDNS0 cases, asserting upstream address is not leaked in EDE text.
DNS interceptor EDNS0 detection and EDE propagation
client/internal/routemanager/dnsinterceptor/handler.go
In ServeDNS, records whether the client query included EDNS0 and if not, temporarily sets EDNS0 on the upstream request to enable EDE receipt. After upstream resolution, extracts EDE into response metadata and strips OPT from the reply for non-EDNS0 clients.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • netbirdio/netbird#6089: Directly modifies client/internal/dns/upstream.go EDNS0/OPT stripping and EDE-based upstream failover short-circuit logic, which this PR refactors to use the new shared resutil.StripOPT helper.

Suggested reviewers

  • pappz

Poem

🐇 Hop, hop through the DNS maze,
OPT records stripped in the haze,
EDE codes returned with care,
No upstream address leaks laid bare,
Shared helpers now keep things bright —
The rabbit approves this RFC delight! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 53.85% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title directly and clearly describes the main change: adding Extended DNS Error support to surface DNS forwarder upstream failures on the client side.
Description check ✅ Passed The description provides comprehensive details of changes, rationale, and implementation approach; it properly completes the required template with feature enhancement and test coverage selections.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dns-forwarder-ede

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@client/internal/dnsfwd/forwarder_test.go`:
- Around line 654-697: The test sets up mock expectations on mockResolver with
the On() call for LookupNetIP but never verifies those expectations were
actually met during test execution. After the assertions on the written response
(after the ExtraText checks), add an assertion to verify that the mockResolver
met its expected calls using the standard testify mock assertion method. This
ensures that call-path regressions where the resolver is not called as expected
will be caught by the test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bd6ccc8e-a009-48f2-8564-bb4ed47c29f1

📥 Commits

Reviewing files that changed from the base of the PR and between 08a2b63 and f7f528c.

📒 Files selected for processing (7)
  • client/internal/dns/resutil/resolve.go
  • client/internal/dns/resutil/resolve_test.go
  • client/internal/dns/upstream.go
  • client/internal/dns/upstream_test.go
  • client/internal/dnsfwd/forwarder.go
  • client/internal/dnsfwd/forwarder_test.go
  • client/internal/routemanager/dnsinterceptor/handler.go
💤 Files with no reviewable changes (1)
  • client/internal/dns/upstream_test.go

Comment thread client/internal/dnsfwd/forwarder_test.go
@lixmal lixmal force-pushed the dns-forwarder-ede branch from f7f528c to c4337f4 Compare June 16, 2026 10:42
@github-actions

github-actions Bot commented Jun 16, 2026

Copy link
Copy Markdown

Release artifacts

Built for PR head c4337f4 in workflow run #15896.

Artifact Link
All release artifacts Download
Linux packages Download
Windows packages Download
macOS packages Download
UI artifacts Download
UI macOS artifacts Download

GHCR images (amd64)

This comment is updated by the Release workflow. Artifact links expire according to the workflow retention policy.

@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant