feat: add Browser RUM dashboard template #2413
Adds a "Browser RUM" template to the dashboards gallery for browser sessions instrumented with the HyperDX Browser SDK (or any OTel browser instrumentation emitting a rum.sessionId resource attribute):
- Performance Overview: page-view/session/error KPIs, Core Web Vitals (LCP/INP/CLS) p75, median/p75/p90 page-load percentiles, long tasks
- Page Views Breakdown: traffic by URL, browser, country, device size (derived from screen.xy)
- Errors section with tabs (overview, JS exceptions by message and by page, failing API calls)
- Six dashboard filters: Service, Environment, Service Version, Page URL, Browser, Country

Top Browsers / Top Countries tiles and the Browser/Country filters populate when the collector's useragent and geoip processors are on.
🦋 Changeset detected
Latest commit: 0f640f0
The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages.
🔴 Tier 4: Critical
Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.
Review process: Deep review from a domain expert. Synchronous walkthrough may be required.

Deep Review
✅ No critical issues found. This PR is a declarative dashboard template (
🟡 P2 -- recommended
🔵 P3 nitpicks (5)
Reviewers (6): correctness, maintainability, testing, project-standards, performance, learnings-researcher.

E2E Test Results
✅ All tests passed • 197 passed • 3 skipped • 1299s
Tests ran across 4 shards in parallel.
Trim the dashboard description to a single sentence to match the length and style of the existing runtime-metrics templates.
The browser is already captured out of the box: the OTel document-load instrumentation sets http.user_agent (navigator.userAgent) on documentLoad spans. The template was instead grouping on user_agent.name / user_agent.original, which require collector-side enrichment that isn't present by default, so Top Browsers came up empty against real data.
- Top Browsers now parses the browser name from SpanAttributes['http.user_agent'] in SQL (Edge/Opera/Firefox/Chrome/Safari/Other), scoped to spans carrying the UA. Works with no SDK or collector change.
- Removed the dashboard-level Browser filter: http.user_agent only exists on documentLoad spans, so a cross-tile filter keyed on it would zero out every non-documentLoad tile. It can return once the UA is promoted to a resource attribute (present on every span).

Country tile/filter still depend on the collector geoip processor, since the browser cannot determine the user's country.
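The ordered classification described in this commit can be sketched outside SQL as follows. This is a minimal illustration, not the template's actual query: the UA tokens (`Edg/`, `OPR/`, and so on) and the `parseBrowserName` helper are assumptions. The point it shows is that order matters, because Chromium-based Edge and Opera user agents also contain `Chrome/` and `Safari/`.

```typescript
// Display names listed in the commit message above.
type BrowserName = 'Edge' | 'Opera' | 'Firefox' | 'Chrome' | 'Safari' | 'Other';

// Assumed UA tokens, checked most-specific first. The template's SQL may
// match different substrings; these are illustrative only.
const UA_RULES: ReadonlyArray<[string, BrowserName]> = [
  ['Edg/', 'Edge'],
  ['OPR/', 'Opera'],
  ['Firefox/', 'Firefox'],
  ['Chrome/', 'Chrome'],
  ['Safari/', 'Safari'],
];

// Hypothetical helper: returns the first matching display name, else 'Other'.
function parseBrowserName(userAgent: string): BrowserName {
  for (const [token, name] of UA_RULES) {
    if (userAgent.includes(token)) return name;
  }
  return 'Other';
}
```

Checking `Chrome/` before `Edg/` would classify every Edge session as Chrome, which is why the specific tokens come first.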
The chart builder editor only renders a WHERE input bound to the per-series aggCondition (ChartSeriesEditor); the top-level `where` input renders solely for Search-type tiles (ChartEditorControls.tsx:148 vs :334). So builder tiles that stored their filter in top-level `where` showed an empty WHERE box even though the filter applied correctly in SQL (renderChartConfig reads config.where directly). This affected nearly every tile, not just Page Views; the earlier OR-vs-AND theory was a red herring.

Move each tile's filter from top-level `where` into the aggCondition of every select (clearing `where`). renderChartConfig promotes an all-selects aggCondition back into a real WHERE clause (renderChartConfig.ts:944,1019), so for a single shared condition the rendered query is result-identical (count() WHERE c == countIf(c) WHERE c, etc.) while the condition now shows in the editor.

Left unchanged: Errors over Time and Top Errored Sessions, which already use per-series aggConditions (their meaningful conditions already display; their top-level where is only the broad rum.sessionId scope).

Verified: dashboardTemplates schema test + app ci:lint pass; SQL result-equivalence confirmed by reading renderChartConfig's aggCondition promotion. Live editor click-through deferred (dev stack down).
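The move described in this commit can be sketched as a small transform over a tile config. The `SelectItem` and `TileConfig` shapes and the `moveWhereIntoAggConditions` helper are hypothetical simplifications, not the real SavedChartConfig types. Tiles that already carry a per-series aggCondition are returned untouched, mirroring the "Left unchanged" note above.

```typescript
// Hypothetical, simplified shapes; the real config has many more fields.
interface SelectItem {
  aggFn: string;
  valueExpression: string;
  aggCondition?: string;
  aggConditionLanguage?: 'lucene' | 'sql';
}

interface TileConfig {
  select: SelectItem[];
  where: string;
  whereLanguage: 'lucene' | 'sql';
}

// Copies the tile-level filter into every select and clears `where`.
function moveWhereIntoAggConditions(config: TileConfig): TileConfig {
  const hasFilter = config.where.trim() !== '';
  const hasPerSeries = config.select.some(s => (s.aggCondition ?? '') !== '');
  // Nothing to move, or the tile already uses per-series conditions.
  if (!hasFilter || hasPerSeries) return config;

  return {
    ...config,
    where: '',
    select: config.select.map(s => ({
      ...s,
      aggCondition: config.where,
      aggConditionLanguage: config.whereLanguage,
    })),
  };
}
```

Because every select ends up with the same condition, a renderer that promotes a shared aggCondition to a WHERE clause would produce the same rows as before, which is the equivalence the commit relies on.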
Wire up the table onClick row-action (SavedChartConfig.onClick, type
'search') on the tables whose grouped value reverses cleanly into a
search filter:
- Top Errored Sessions -> opens the session's spans
(rum.sessionId:"{{Session}}") — the client-side tracing drilldown
- Top URLs / Slowest Pages -> page views / doc loads for that URL
- Errors per Page -> errors for that URL
- Top JS Errors -> spans for that exception message
Each targets the Traces source by name ({ mode: 'id', id: 'Traces' });
the import flow auto-matches that to the user's mapped source and
rewrites it to the concrete ID (DBDashboardImportPage onClick mapping +
convertToDashboardDocument), so it stays portable. whereTemplate uses
Handlebars row-column variables. Skipped tiles whose group key can't be
reversed (Top Failing API Calls concat, Top Browsers/Countries/Device
derived buckets).
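How a whereTemplate turns a clicked row into a search filter can be sketched as below. This is a simplified stand-in for Handlebars, not the library itself or the app's implementation: `renderWhereTemplate` is a hypothetical helper, and the quote escaping is an assumption about what a safe substitution would need.

```typescript
// Replaces {{Column}} placeholders with the clicked row's value for that column.
// Simplified stand-in for Handlebars; only plain {{name}} variables are handled.
function renderWhereTemplate(
  template: string,
  row: Record<string, string>,
): string {
  return template.replace(/\{\{\s*([^{}]+?)\s*\}\}/g, (_match, key: string) => {
    const value = row[key] ?? '';
    // Assumed escaping so a value containing quotes cannot break the filter.
    return value.replace(/(["\\])/g, '\\$1');
  });
}

// Example row from a "Top Errored Sessions" table (values are made up).
const filter = renderWhereTemplate('rum.sessionId:"{{Session}}"', {
  Session: 'abc123',
});
console.log(filter); // rum.sessionId:"abc123"
```

This also shows why derived group keys are hard to reverse: the row value has to be something that can be matched directly against a stored attribute.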
Builder tables without an onClick fall back to buildTableRowSearchUrl, which derives the drilldown from config.where, now empty (filters moved to aggCondition), so those drilldowns lost their scope. And the derived group keys (browser/device/concat) don't reverse into a filter. There's no template-level way to disable a builder-table row action, so give the remaining tables a correct onClick instead:
- Top JS Errors: match the coalesced group value across exception.message / message / SpanName (it previously only matched exception.message, so e.g. an "unhandledrejection" row returned nothing).
- Top Browsers: substring-match the parsed name against http.user_agent.
- Top Countries: exact geo.country.name match.
- Top Failing API Calls: regroup by http.url so the row reverses; drill into fetch/xhr calls to that endpoint.
- Top Device Sizes: regroup by raw screen.xy so the row reverses; drill into documentLoad spans at that resolution.

Every table now has a working, scoped row action; the scope-less legacy fallback no longer fires.
Greptile Summary
This PR adds a new Browser RUM dashboard template to the dashboards gallery, covering Core Web Vitals, page-load percentiles, traffic breakdowns, and an errors section with JS exceptions and AJAX failure tracking. The change is purely declarative: a new JSON file registered in

Confidence Score: 4/5
Safe to merge with a minor fix recommended: the Top Browsers click-through produces empty results for Edge and Opera users. The template is declarative JSON with no runtime logic changes. The one concrete defect is in the Top Browsers onClick.whereTemplate: it substitutes human-readable display names that do not appear verbatim in modern Edge or Opera user-agent strings, so clicking those rows returns no results. Chrome, Firefox, and Safari click-throughs work correctly. The rest of the dashboard logic and session scoping is sound. The onClick.whereTemplate for rum-026 (Top Browsers) in browser-rum.json needs the Edge/Opera token mismatch addressed.

Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Browser RUM Template] --> B[Performance Overview]
A --> C[Page Views Breakdown]
A --> D[Errors Section]
B --> B1[KPIs: Page Views, Sessions, Load percentiles, Core Web Vitals]
B --> B2[Page Load Chart median p75 p90]
B --> B3[Page Views over Time + Long Tasks]
C --> C1[Top URLs]
C --> C2[Top Browsers — onClick broken for Edge and Opera]
C --> C3[Top Countries — requires geoip]
C --> C4[Top Device Sizes]
C --> C5[Slowest Pages p75]
C --> C6[Top Errored Sessions]
D --> D1[Overview: JS Errors KPI, AJAX Errors KPI, Errors over Time]
D --> D2[JS Exceptions: by message, by page]
D --> D3[API Failures: Top Failing API Calls]
A --> F[5 Filters: Service, Environment, Version, Page URL, Country]
Reviews (2): Last reviewed commit: "fix: scope AJAX error tiles to RUM sessi..."
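The Edge/Opera mismatch flagged in the summary can be reproduced with a short sketch. The sample user-agent strings, the `UA_TOKEN_BY_BROWSER` table, and both helper functions are illustrative assumptions rather than code from the template; they only show why substituting a display name into a substring match fails for Chromium-based Edge and Opera.

```typescript
// Sample modern UAs (illustrative). Note they contain "Edg/" and "OPR/",
// not the words "Edge" or "Opera".
const EDGE_UA =
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36 Edg/120.0.0.0';
const OPERA_UA =
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36 OPR/92.0.0.0';

// Assumed display-name -> UA-token table.
const UA_TOKEN_BY_BROWSER: Record<string, string> = {
  Edge: 'Edg/',
  Opera: 'OPR/',
  Firefox: 'Firefox/',
  Chrome: 'Chrome/',
  Safari: 'Safari/',
};

// What a display-name substitution effectively does.
function matchesByDisplayName(userAgent: string, displayName: string): boolean {
  return userAgent.includes(displayName);
}

// One possible fix: translate the display name back to its UA token first.
function matchesByToken(userAgent: string, displayName: string): boolean {
  const token = UA_TOKEN_BY_BROWSER[displayName];
  return token !== undefined && userAgent.includes(token);
}

console.log(matchesByDisplayName(EDGE_UA, 'Edge')); // false
console.log(matchesByToken(EDGE_UA, 'Edge')); // true
```

A token match alone is still broader than the parsed bucket, since Edge and Opera user agents also contain `Chrome/`, so a complete fix would likely need to exclude the more specific tokens as well.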
…ition

Code-review fixes for the Errors section:
1. AJAX Errors KPI (rum-008) and Top Failing API Calls (rum-013) had no rum.sessionId guard, so server-side fetch/xhr spans could inflate the counts relative to the rest of the dashboard. Add the SQL equivalent of the lucene rum.sessionId:* guard the sibling tiles use (ResourceAttributes['rum.sessionId'] != '').
2. The AJAX Errors KPI counted status>=400 OR error span status, while the "Errors over Time" AJAX series only counted error span status, so a 404 with no error status hit the KPI but not the chart. Align the chart's AJAX series to the same (more complete) definition so the KPI total and the chart line measure the identical event set.
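The two fixes amount to sharing one error definition between the KPI and the chart. A minimal sketch, assuming a hypothetical `AjaxSpan` shape and that an errored span status is encoded as the string `'Error'` (both are assumptions, not taken from the template):

```typescript
// Hypothetical, flattened view of a fetch/xhr span.
interface AjaxSpan {
  sessionId: string; // ResourceAttributes['rum.sessionId'], '' when absent
  httpStatus?: number; // HTTP response status, if recorded
  statusCode: string; // span status; 'Error' is an assumed encoding
  timestampMs: number;
}

// Single definition used by both the KPI and the time series.
function isAjaxError(span: AjaxSpan): boolean {
  const inRumSession = span.sessionId !== '';
  const failed = (span.httpStatus ?? 0) >= 400 || span.statusCode === 'Error';
  return inRumSession && failed;
}

function kpiTotal(spans: AjaxSpan[]): number {
  return spans.filter(isAjaxError).length;
}

// Counts errors per time bucket, keyed by bucket start in milliseconds.
function errorsOverTime(spans: AjaxSpan[], bucketMs: number): Map<number, number> {
  const buckets = new Map<number, number>();
  for (const span of spans.filter(isAjaxError)) {
    const bucket = Math.floor(span.timestampMs / bucketMs) * bucketMs;
    buckets.set(bucket, (buckets.get(bucket) ?? 0) + 1);
  }
  return buckets;
}
```

With one predicate, the sum over all chart buckets equals the KPI total by construction, and a 404 without an error span status is counted in both places.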
pulpdrew left a comment
Cool stuff, nice to see some newer dashboard features being exercised! Couple of suggestions
"config": {
  "name": "Median Page Load (ms)",
  "source": "Traces",
  "displayType": "number",
  "granularity": "auto",
  "alignDateRangeToGranularity": true,
  "select": [
    {
      "aggFn": "quantile",
      "level": 0.5,
      "valueExpression": "Duration / 1000000",
      "aggCondition": "SpanName:\"documentLoad\"",
      "aggConditionLanguage": "lucene"
    }
  ],
  "where": "",
  "whereLanguage": "lucene",
  "numberFormat": {
    "output": "number",
    "mantissa": 0,
    "thousandSeparated": true
  }
Is there a chance this will be the wrong precision for duration? Same thing elsewhere.
Instead, we could remove the divisor and numberFormat, so that duration format with the correct precision will be inferred:
"config": {
  "name": "Median Page Load",
  "source": "Traces",
  "displayType": "number",
  "granularity": "auto",
  "alignDateRangeToGranularity": true,
  "select": [
    {
      "aggFn": "quantile",
      "level": 0.5,
      "valueExpression": "Duration",
      "aggCondition": "SpanName:\"documentLoad\"",
      "aggConditionLanguage": "lucene"
    }
  ],
  "where": "",
  "whereLanguage": "lucene"
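The precision concern can be illustrated with a small sketch. It assumes `Duration` is stored in nanoseconds (consistent with the `/ 1000000` divisor yielding milliseconds); `formatDurationNs` is a hypothetical formatter written for this example, not the app's duration format.

```typescript
// Fixed unit with mantissa 0, as in the original tile: sub-millisecond
// values collapse to 0.
function asWholeMilliseconds(ns: number): number {
  return Math.round(ns / 1000000);
}

// Hypothetical adaptive formatter: picks the largest unit that fits.
function formatDurationNs(ns: number): string {
  const units: Array<[number, string]> = [
    [1e9, 's'],
    [1e6, 'ms'],
    [1e3, 'µs'],
  ];
  for (const [size, label] of units) {
    if (ns >= size) return `${(ns / size).toFixed(2)} ${label}`;
  }
  return `${ns} ns`;
}

console.log(asWholeMilliseconds(400000)); // 0
console.log(formatDurationNs(400000)); // 400.00 µs
console.log(formatDurationNs(1500000000)); // 1.50 s
```

Leaving the raw `Duration` in the valueExpression lets a unit-aware formatter make this choice per value instead of baking milliseconds into both the expression and the tile name.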
"config": {
  "name": "LCP p75 (ms)",
  "source": "Traces",
  "displayType": "number",
  "granularity": "auto",
  "alignDateRangeToGranularity": true,
  "select": [
    {
      "aggFn": "quantile",
      "level": 0.75,
This is interesting, we don't support p75 through the app, so this renders as an empty aggFn.
Do we need p75 or could we do p95? If p75, maybe we should try a custom aggregation to populate the dropdown correctly.
Sidenote, we should probably add a validation so we don't accept this during import, or expand support to custom quantile levels.
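The import-time validation suggested in the sidenote could look roughly like this. The supported-level list is an assumption made for the example (the thread only establishes that 0.75 is not selectable in the editor), and `findUnsupportedQuantiles` is a hypothetical helper.

```typescript
// Assumed set of quantile levels the editor dropdown can represent.
// Not taken from the HyperDX source.
const SUPPORTED_QUANTILE_LEVELS: ReadonlyArray<number> = [0.5, 0.9, 0.95, 0.99];

interface SelectLike {
  aggFn: string;
  level?: number;
}

// Returns the quantile levels in a tile that the editor could not display.
function findUnsupportedQuantiles(selects: SelectLike[]): number[] {
  return selects
    .filter(s => s.aggFn === 'quantile' && s.level !== undefined)
    .map(s => s.level as number)
    .filter(level => !SUPPORTED_QUANTILE_LEVELS.includes(level));
}

const unsupported = findUnsupportedQuantiles([
  { aggFn: 'quantile', level: 0.75 },
  { aggFn: 'count' },
]);
console.log(unsupported); // [ 0.75 ]
```

An importer could reject the template when this list is non-empty, or the editor could instead be extended to accept arbitrary levels, which are the two options raised in the comment.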
"groupBy": [
  {
    "valueExpression": "coalesce(nullif(SpanAttributes['http.url'], ''), nullif(SpanAttributes['page.url'], ''), nullif(SpanAttributes['location.href'], ''))",
    "alias": "URL"
  }
],

"w": 24,
"h": 8,
"config": {
  "name": "Top Errored Sessions",
A few of these tables could probably benefit from setting groupByColumnsOnLeft so they read more naturally
  "mode": "id",
  "id": "Traces"
},
"whereTemplate": "ResourceAttributes.rum.sessionId:\"{{Session}}\"",
Nice to see this getting used!
{
  "aggFn": "quantile",
  "level": 0.75,
  "valueExpression": "Duration / 1000000",
  "aggCondition": "SpanName:\"documentLoad\"",
  "aggConditionLanguage": "lucene",
  "alias": "Page Load p75 (ms)"
},
{
  "aggFn": "count",
  "valueExpression": "",
  "alias": "Views",
  "aggCondition": "SpanName:\"documentLoad\"",
  "aggConditionLanguage": "lucene"
}
We could add per-series numberFormats here to render the p75 as a duration and the count as a number

Summary
Adds a Browser RUM template to the dashboards gallery (Dashboards → Templates) for browser sessions instrumented with the HyperDX Browser SDK, or any OpenTelemetry browser instrumentation that emits a `rum.sessionId` resource attribute. It fills a gap: HyperDX ships a browser SDK but had no out-of-the-box RUM dashboard. The template is purely declarative JSON validated by the existing `dashboardTemplates` schema test; the only code change is registering it in `dashboardTemplates/index.ts`, plus a changeset.

The dashboard is organized into three sections: Performance Overview (page-view/session/error KPIs, Core Web Vitals LCP/INP/CLS p75, median/p75/p90 page-load percentiles, long tasks), Page Views Breakdown (traffic by URL, browser, country, and device size derived from `screen.xy`), and a tabbed Errors section (overview, JS exceptions by message and by page, failing API calls). It also defines five dashboard-level filters: Service, Environment, Service Version, Page URL, and Country.

Screenshots or video
Tab-1780519086079.webm
How to test locally or on Vercel
1. Dashboards → Templates → Browser RUM → Import, then map each tile/filter to your Traces source (auto-maps if a source named "Traces" exists).
2. Point `@hyperdx/browser` at your collector (or seed `webvitals` / `documentLoad` / fetch+xhr / exception spans carrying `rum.sessionId`). Verify the KPIs, Core Web Vitals, breakdown tables, and Errors tabs populate, and that the five filters apply.
3. Collector-enriched tiles populate only when the `useragent` and `geoip` processors are enabled (noted in the tile titles + dashboard description).

References