fix: invalid telemetry backendref degrades gateway#9406

Open
vishwas-bm wants to merge 3 commits into envoyproxy:main from nokia:fix/9229-invalid-telemetry-backendref-degrades-gateway

vishwas-bm commented Jul 2, 2026

What type of PR is this?

Bug fix.

What this PR does / why we need it

An invalid backendRef in an EnvoyProxy's telemetry configuration (tracing, access log, or metrics) previously caused the translator to mark the Gateway as Accepted=False.

This PR changes the behavior in processProxyObservability:

  • Telemetry errors no longer reject the Gateway.
  • The individual errors are aggregated and added as a warning on the EnvoyProxy status.
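
The aggregation described in the list above might look roughly like the following. This is a minimal, self-contained sketch and not the code in this PR: `telemetryStep`, `collectTelemetryWarnings`, and `errBackendNotFound` are hypothetical names, and using `errors.Join` (Go 1.20+) to combine the failures is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errBackendNotFound is a hypothetical error used only for this illustration.
var errBackendNotFound = errors.New("backend not found")

// telemetryStep is a hypothetical stand-in for one telemetry processor
// (tracing, access log, or metrics). run returns an error when the
// feature's backendRef cannot be resolved.
type telemetryStep struct {
	name string
	run  func() error
}

// collectTelemetryWarnings runs every step without stopping early and
// returns all failures joined into one error (nil when nothing failed).
func collectTelemetryWarnings(steps []telemetryStep) error {
	var warnings []error
	for _, s := range steps {
		if err := s.run(); err != nil {
			// Record the failure and continue, so one bad backendRef
			// only affects its own telemetry feature.
			warnings = append(warnings,
				fmt.Errorf("invalid %s backendRefs in the referenced EnvoyProxy: %w", s.name, err))
		}
	}
	return errors.Join(warnings...)
}

func main() {
	steps := []telemetryStep{
		{name: "tracing", run: func() error { return errBackendNotFound }},
		{name: "access log", run: func() error { return nil }},
		{name: "metrics", run: func() error { return errBackendNotFound }},
	}
	if err := collectTelemetryWarnings(steps); err != nil {
		fmt.Println("warnings:")
		fmt.Println(err)
	}
}
```

Because the wrapped errors use `%w`, a caller could still test for a specific cause with `errors.Is` on the joined result.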

Which issue(s) this PR fixes

Fixes #9229


PR Checklist

  • Authorship & ownership: Coding agents / AI assistants are welcome, but I have reviewed every change, understand how and why it works, can explain and maintain it, and take full responsibility for this PR. I have not submitted generated output I do not understand.
  • DCO: All commits are signed off (git commit -s). See DCO: Sign your work.
  • API agreed first: If this PR contains API changes (changes under /api), the API was discussed and agreed before the implementation. The API change can be in a separate PR, or in the same PR, but the API must be agreed before implementation. N/A if this PR does not contain API changes.
  • Required checks pass: make generate gen-check, make lint, and the unit-test/coverage build pass. (Flaky e2e failures are not considered breakages, but gen-check, lint, and coverage MUST pass.)
  • Tests added/updated: New/changed code is covered by appropriate tests. N/A if this PR does not contain code changes.
  • Docs: User-facing changes update the docs, either in this PR or a follow-up PR. N/A if this PR does not contain user-facing changes.
  • Release notes: For any non-trivial change, added a release-note fragment under release-notes/current/<section>/<pr-number>-<slug>.md (see release-notes/current/README.md for sections and naming). N/A if this PR does not contain non-trivial changes.
  • Generated files committed: Ran make gen-check and committed the result if API/helm charts/modules changed.
  • Scope & compatibility: The PR is reasonably scoped (no unrelated changes) and preserves backward compatibility, or any breaking change is called out above and documented in release-notes/current/breaking_changes/.
  • Codex review: Requested a Codex review and addressed all of its comments.
  • Copilot review: Requested a Copilot review and addressed all of its comments.

An invalid tracing, access log, or metrics backendRef in the referenced
EnvoyProxy previously set the Gateway to Accepted=False
(InvalidParameters).

Telemetry misconfigurations are now added as a warning on the
EnvoyProxy status (Accepted=True, reason Accepted) and only the affected
telemetry feature is skipped, so the Gateway and its infrastructure keep
running.

Fixes envoyproxy#9229

Signed-off-by: vishwas-bm <b_m.vishwas@nokia.com>
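
As a rough illustration of the status described in this commit message (Accepted=True, reason Accepted, with the telemetry problems surfaced as a warning), the sketch below builds such a condition. The `condition` struct is a simplified, hypothetical mirror of a Kubernetes status condition, `acceptedWithWarnings` is an invented helper, and the message wording is a placeholder rather than the text this PR emits.

```go
package main

import (
	"fmt"
	"strings"
)

// condition is a simplified, hypothetical mirror of a Kubernetes status
// condition. The real type is metav1.Condition in k8s.io/apimachinery.
type condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// acceptedWithWarnings keeps the resource accepted and, when warnings
// exist, lists them in the condition message instead of rejecting.
// The message strings are placeholders chosen for this sketch.
func acceptedWithWarnings(warnings []string) condition {
	c := condition{
		Type:    "Accepted",
		Status:  "True",
		Reason:  "Accepted",
		Message: "accepted",
	}
	if len(warnings) > 0 {
		c.Message = "accepted with warnings: " + strings.Join(warnings, "; ")
	}
	return c
}

func main() {
	c := acceptedWithWarnings([]string{
		"invalid tracing backendRefs in the referenced EnvoyProxy",
	})
	fmt.Printf("%s=%s (%s): %s\n", c.Type, c.Status, c.Reason, c.Message)
}
```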
@vishwas-bm vishwas-bm requested a review from a team as a code owner July 2, 2026 09:41

netlify Bot commented Jul 2, 2026

Deploy Preview for cerulean-figolla-1f9435 ready!

Name Link
🔨 Latest commit 79e1321
🔍 Latest deploy log https://app.netlify.com/projects/cerulean-figolla-1f9435/deploys/6a47813f972c8b000828641f
😎 Deploy Preview https://deploy-preview-9406--cerulean-figolla-1f9435.netlify.app

@vishwas-bm vishwas-bm changed the title from "Fix/9229 invalid telemetry backendref degrades gateway" to "fix: invalid telemetry backendref degrades gateway" Jul 2, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7507206974

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 713 to +715
  xdsIR.AccessLog, err = t.processAccessLog(gwCtx, envoyProxy, resources)
  if err != nil {
-     status.UpdateGatewayStatusNotAccepted(gwCtx.Gateway, gwapiv1.GatewayReasonInvalidParameters,
-         fmt.Sprintf("Invalid access log backendRefs in the referenced EnvoyProxy: %v", err))
-     return
+     warnings = append(warnings, fmt.Errorf("invalid access log backendRefs in the referenced EnvoyProxy: %w", err))

[P2] Preserve access logs when only CEL matches are invalid

When an access log setting contains an invalid CEL expression in its matches, processAccessLog returns that error before building the IR. This new blanket handling treats the error as an invalid backendRef warning and leaves xdsIR.AccessLog nil, so a config with valid file/OTel sinks plus one bad match keeps the Gateway accepted but loses all access logging. The access-log API documents invalid CEL expressions as ignored, so this path should drop and report only the bad match, and reserve feature skipping for actual backendRef resolution failures.

codecov Bot commented Jul 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.20%. Comparing base (f8f0370) to head (7507206).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #9406   +/-   ##
=======================================
  Coverage   75.19%   75.20%           
=======================================
  Files         252      252           
  Lines       41093    41112   +19     
=======================================
+ Hits        30900    30917   +17     
- Misses       8090     8091    +1     
- Partials     2103     2104    +1     

vishwas-bm added a commit to nokia/envoyproxy-gateway that referenced this pull request Jul 3, 2026
… logs

Invalid CEL match expressions in an EnvoyProxy access log setting no longer
cause the whole access log feature to be skipped. Per the AccessLog API,
invalid CEL expressions are ignored: only the invalid expressions are dropped
while the valid expressions and the rest of the access log (file/OTel sinks)
are preserved. Feature skipping remains reserved for backendRef resolution
failures.

Addresses Codex review feedback on envoyproxy#9406.

Signed-off-by: vishwas-bm <b_m.vishwas@nokia.com>
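
The behavior described in this commit message (drop only the invalid expressions, keep the valid ones and the sinks) amounts to partitioning the match list. The sketch below shows that idea under stated assumptions: `splitMatches` and `toyValidator` are hypothetical names, and `toyValidator` is a placeholder that only checks for a non-blank string with balanced parentheses. It does not compile CEL and is not how this PR validates expressions.

```go
package main

import (
	"fmt"
	"strings"
)

// splitMatches partitions expressions into those accepted by isValid and
// those rejected, preserving the original order within each group.
func splitMatches(exprs []string, isValid func(string) bool) (valid, invalid []string) {
	for _, e := range exprs {
		if isValid(e) {
			valid = append(valid, e)
		} else {
			invalid = append(invalid, e)
		}
	}
	return valid, invalid
}

// toyValidator is a placeholder for real CEL compilation. It accepts any
// non-blank expression whose parentheses are balanced.
func toyValidator(expr string) bool {
	if strings.TrimSpace(expr) == "" {
		return false
	}
	depth := 0
	for _, r := range expr {
		switch r {
		case '(':
			depth++
		case ')':
			depth--
			if depth < 0 {
				return false
			}
		}
	}
	return depth == 0
}

func main() {
	exprs := []string{
		"response.code >= 400",
		"has(request.headers", // unbalanced, rejected by the toy check
	}
	valid, invalid := splitMatches(exprs, toyValidator)
	// Valid matches stay on the access log; invalid ones would be reported
	// as warnings rather than disabling the whole feature.
	fmt.Println("kept:", valid)
	fmt.Println("dropped:", invalid)
}
```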
@vishwas-bm vishwas-bm force-pushed the fix/9229-invalid-telemetry-backendref-degrades-gateway branch from 9455161 to 79e1321 July 3, 2026 09:30

Development

Successfully merging this pull request may close these issues.

Invalid tracing backendRef degrades Gateway, causing loss of load balancer address

1 participant