Skip to content

Upgrade authlib to 1.7.x and fix critical vulnerabilities#7711

Closed
wtfiwtz wants to merge 15 commits into
getredash:masterfrom
orchestrated-io:vuln-critical-2026-05
Closed

Upgrade authlib to 1.7.x and fix critical vulnerabilities#7711
wtfiwtz wants to merge 15 commits into
getredash:masterfrom
orchestrated-io:vuln-critical-2026-05

Conversation

@wtfiwtz
Copy link
Copy Markdown

@wtfiwtz wtfiwtz commented May 14, 2026

What type of PR is this?

Based off a different branch (release 26.3-based) I've upgraded most dependencies in the Docker container that have Critical vulnerabilities (detected by Docker Scout)

  • Refactor
  • Feature
  • Bug Fix
  • New Query Runner (Data Source)
  • New Alert Destination
  • Other

Description

How is this tested?

  • Unit tests (pytest, jest)
  • E2E Tests (Cypress)
  • Manually
  • N/A

Related Tickets & Documents

I'm in the process of testing this related branch - https://github.com/orchestrated-io/redash/commits/debug/v26.3.0p2/

Mobile & Desktop Screenshots/Recordings (if there are UI changes)

Comment thread Dockerfile Outdated
Comment thread Dockerfile Outdated
@wtfiwtz
Copy link
Copy Markdown
Author

wtfiwtz commented May 14, 2026

@eradman would you prefer Flask 3 upgrade included as well in this PR?
https://github.com/orchestrated-io/redash/commits/vuln-critical-2026-05-with-flask3/

@wtfiwtz
Copy link
Copy Markdown
Author

wtfiwtz commented May 15, 2026

This branch goes even further - https://github.com/orchestrated-io/redash/commits/vuln-critical-2026-05-with-flask3-patching/
However, it does remove some capabilities (unmaintained Python dependency, advocate - see orchestrated-io@8a0b422#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711L26) and disables some deadlocking JWT tests (to be resolved - example at orchestrated-io@8a0b422#diff-fe702eb4ce0f41059045e995535201a805bb92a7fbb162d4036087626c0a6c85R419).

Here's a similar ECR scan from the branch I mentioned above, which is equivalent but on the v2026.3 release - https://github.com/orchestrated-io/redash/tree/debug/v26.3.0p2

image

@wtfiwtz
Copy link
Copy Markdown
Author

wtfiwtz commented May 18, 2026

@eradman I'll tidy up this PR and then submit some more follow ups

@wtfiwtz
Copy link
Copy Markdown
Author

wtfiwtz commented May 19, 2026

AI-generated comment (Cursor, Claude Opus 4.7): authored by the assistant on behalf of @wtfiwtz to explain the test-stability work in 030b3f72e Werkzeug 3.x on Flask 2.x and a follow-up fix to tests/test_cli.py.

Context

Staying on Flask 2.x (instead of going to Flask 3) and bumping Werkzeug to 3.x surfaced two pre-existing test-suite bugs that only show up once the dependency versions move. They aren't security-relevant on their own, but they had to be fixed for make test to be green on this PR.

1. g._login_user / g.org leakage across requests in the test client

tests/__init__.py::BaseTestCase.setUp pushes a single app.app_context() for the whole test case and then makes many requests via self.app.test_client() under that one context:

self.app_ctx = self.app.app_context()
self.app_ctx.push()
...
self.client = self.app.test_client()

In Flask 2.x, g is bound to the app context, not the request. Flask-Login caches the resolved user on g._login_user, and Redash's current_org caches the resolved org on g.org. Because the context isn't popped between requests in tests, the user/org resolved by the first request leaks into every subsequent request in the same test method.

That is exactly what produced the failure pattern of "got 200 where we expected 403/404" across test_alerts.py, test_groups.py, test_queries.py, test_users.py, and test_visualizations.py — the second (unauthorized) user is being served as the first (authorized) one because Flask-Login returns the cached user.

The fix in redash/authentication/__init__.py::init_app is a TESTING-only @app.before_request hook that clears those two keys:

@app.before_request
def reset_request_g_cache():
    if current_app.config.get("TESTING"):
        g.pop("_login_user", None)
        g.pop("org", None)

Production behavior is unchanged — the hook is gated on current_app.config.get("TESTING") and only runs under the test client.

2. Moving API-key auth out of the session user_loader

The same commit also removed the eager api_key_load_user_from_request(request) call from @login_manager.user_loader def load_user(...). API-key auth is not being removed as a Redash feature — it still lives in request_loader (and get_user_from_api_key / api_key_load_user_from_request are unchanged):

def request_loader(request):
    user = None
    if settings.AUTH_TYPE == "hmac":
        user = hmac_load_user_from_request(request)
    elif settings.AUTH_TYPE == "api_key":
        user = api_key_load_user_from_request(request)
    ...

The reason to take it out of load_user is that load_user is Flask-Login's session user loader: it gets _user_id from the session cookie and should return the corresponding session user. Running the API-key loader there means an API key on the request silently overrides whichever user Flask-Login would have restored from the session. That changes authorization semantics on endpoints that intentionally rely on the session user — concretely it caused tests/handlers/test_queries.py::QueryRefreshTest::test_refresh_forbiden_with_query_api_key to return 200 instead of the expected 403. Keeping API-key resolution in request_loader preserves the clean separation between session auth and request auth.

3. tests/test_cli.py::DataSourceCommandTests::test_list expectation

redash/utils/__init__.py::json_dumps does not set separators=(",", ":"), so ConfigurationContainer.to_json() produces the standard spaced form ({"dbpath": "/tmp/test.db"}). The test expectation has been aligned back to that, matching json_dumps's actual default output. No production code change.

Result

With these three fixes the previously failing tests on vuln-critical-2026-05 all pass:

tests/test_cli.py .                                                      [  8%]
tests/handlers/test_alerts.py ..                                         [ 25%]
tests/handlers/test_groups.py .                                          [ 33%]
tests/handlers/test_queries.py ...                                       [ 58%]
tests/handlers/test_users.py ..                                          [ 75%]
tests/handlers/test_visualizations.py ...                                [100%]
======================= 12 passed, 58 warnings in 11.16s =======================

@wtfiwtz wtfiwtz marked this pull request as ready for review May 19, 2026 01:45
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 12 files

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread Dockerfile Outdated
@zachliu
Copy link
Copy Markdown
Contributor

zachliu commented May 28, 2026

I would approach solving these vulnerabilities by first categorizing them into two categories.

  1. Those that could be solved by simply upgrading libraries.
  2. Those that need actual code changes.

Using LLMs is good, but stuffing everything into a giant PR is still not preferable.

eg. #7712
2026-05-28_16-02

@eradman @arikfr @yoshiokatsuneo what do you think?

@zachliu
Copy link
Copy Markdown
Contributor

zachliu commented May 28, 2026

@wtfiwtz for better burning your tokens, could you kindly ask your agents to create smaller PRs and name them specifically such as Fixing Critical/High Vulnerability CVE-20xx-xxxxx. mushing everything together really creams my corn 😂

@zachliu zachliu self-requested a review May 28, 2026 20:23
@eradman
Copy link
Copy Markdown
Collaborator

eradman commented May 29, 2026

I would approach solving these vulnerabilities by first categorizing them into two categories.

  1. Those that could be solved by simply upgrading libraries.
  2. Those that need actual code changes.

That's a fine approach, but most importantly, solve one problem at a time.

@wtfiwtz wtfiwtz closed this Jun 1, 2026
@zachliu
Copy link
Copy Markdown
Contributor

zachliu commented Jun 1, 2026

@wtfiwtz Thanks for splitting these up and for your continued effort on the security front. it's appreciated!

However, the new PRs still bundle unrelated concerns together. The project's CONTRIBUTING.md is explicit:

"If you implement multiple unrelated features, bug fixes, or refactors please split them into individual pull requests."

Here's what I'm seeing:

#7718 - Titled as a base image upgrade, but also bumps Poetry 2.1.4 to 2.4.1, adds pip/setuptools/wheel upgrades, and adds core-js as a build fix (core-js@2 is EOL/unmaintained. The proper long-term fix is upgrading to core-js: 3, but that's a separate discussion)

#7719 - Titled as Flask/Werkzeug, but also includes boto3 1.28 to 1.43, snowflake-connector-python 3.x to 4.x, cryptography 43 to 48, python-dotenv, pre-commit, and new dependencies (azure-core, grpcio, h11,
httpcore, marshmallow).

#7720 - Mixes frontend dependency bumps with a markdown library migration to marked.

#7721 - Bundles a library replacement (advocate to champion) with an urllib3 upgrade.

The principle for security work across major OSS projects is one concern per PR:

  • Tightly coupled upgrades belong together (Flask + Werkzeug + Jinja2 + itsdangerous share a release ecosystem, that's fine as one PR)
  • Unrelated dependency bumps (boto3, snowflake, cryptography) each get their own PR
  • Code changes to adapt to a new API (authlib 1.x) get their own PR
  • Library replacements (advocate to champion) get their own PR
  • Build toolchain changes (Poetry, setuptools) are separate from OS base image changes
  • Frontend additions (core-js) don't belong in a Dockerfile PR

Why this matters practically: if snowflake-connector 4.x breaks queries, we revert that without touching Flask. If the authlib migration has a subtle OAuth bug, we isolate it without affecting boto3. If core-js causes a frontend build issue, we don't want to revert the trixie upgrade to fix it.

I know further splitting feels like busywork, but it genuinely makes each PR faster to review and safer to merge. A clean single-concern PR is an easy approval; a multi-concern PR sits in the queue because reviewers have to evaluate the blast radius of every change at once.

@wtfiwtz
Copy link
Copy Markdown
Author

wtfiwtz commented Jun 4, 2026

#7725 #7726 #7727 #7729 #7730 #7731 #7732 #7733 #7734 all ready to review and merge @zachliu

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants