
Add verbose CLI progress events#2673

Open
quanru wants to merge 5 commits into main from feat/shared-cli-verbose-progress


@quanru quanru commented Jun 12, 2026

Collaborator

Summary

  • Add a global --verbose CLI flag that prints readable progress while Midscene commands are running.
  • Make text-mode aiAct verbose output a clean human-readable timeline: start prompt, each plan's latest screenshot, planned reasoning, action details, action running/done status, timing, and final completion text.
  • Surface aiAct screenshot paths next to the report under midscene_run/report/screenshots/, so agents and humans can open the exact screenshot referenced by each plan step.
  • Preserve the generic verbose output for non-act commands and keep --verbose=jsonl as the structured machine-readable event stream.
  • Reuse one CLI screenshot file writer for final tool-result screenshots and verbose dump screenshots; file extensions follow the image MIME type/format instead of forcing .png.
  • Keep verbose payloads bounded: screenshot file paths are surfaced without base64, command args keep their nested shape, and dump updates use explicit minimal type guards.
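
As an illustration of the "extension follows the image MIME type" point above, here is a minimal sketch. It is not the code in packages/shared/src/cli/screenshot-file.ts: the helper names (`parseImageDataUrl`, `screenshotFileName`), the MIME table, and the assumption that screenshots arrive as base64 data URLs are all hypothetical.

```typescript
// Hypothetical sketch, not the PR's implementation.
// Assumption: the screenshot is available as a base64 data URL.
const MIME_TO_EXTENSION: Record<string, string> = {
  'image/png': 'png',
  'image/jpeg': 'jpeg',
  'image/webp': 'webp',
};

// Split "data:image/jpeg;base64,...." into its MIME type and payload.
function parseImageDataUrl(
  dataUrl: string,
): { mime: string; base64: string } | null {
  const match = /^data:(image\/[a-z0-9.+-]+);base64,(.*)$/i.exec(dataUrl);
  if (!match) return null;
  const [, mime = '', base64 = ''] = match;
  return { mime: mime.toLowerCase(), base64 };
}

// Pick the file extension from the MIME type; fall back to .png only when
// the input is not a recognizable image data URL.
function screenshotFileName(id: string, dataUrl: string): string {
  const parsed = parseImageDataUrl(dataUrl);
  const ext = parsed ? (MIME_TO_EXTENSION[parsed.mime] ?? 'png') : 'png';
  return `${id}.${ext}`;
}
```

Under these assumptions, a JPEG screenshot gets a `.jpeg` file name, which is consistent with the paths shown in the example output below, and only the file path (never the base64 payload) would need to appear in a verbose event.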

Example

Real stdout from ./packages/web-integration/bin/midscene-web --verbose act --prompt "click the blue Search button" against packages/web-integration/tests/ai/fixtures/input-test.html:

[Midscene][aiAct] Start: click the blue Search button
[Midscene][aiAct][Plan 1/20] Thinking with the latest screenshot: midscene_run/report/screenshots/6b5f797f-d7fd-45b3-a5fa-d2f555ca3d5f.jpeg
[Midscene][aiAct][Plan 1/20] Planned: Click the blue Search button
[Midscene][aiAct][Plan 1/20] Action: Tap "the blue Search button" at (893, 364), bbox=(832,340,954,389)
[Midscene][aiAct][Action] Running: Tap at (893, 364)
[Midscene][aiAct][Action] Done: Tap cost=891ms
[Midscene][aiAct][Plan 2/20] Thinking with the latest screenshot: midscene_run/report/screenshots/068dd6e3-8296-41ce-9494-95d74e6f3747.jpeg
[Midscene][aiAct][Plan 2/20] Planned: The user's instruction was to "click the blue Search button", and the previous action has already executed that click. The current screenshot shows the same interface with the S...
[Midscene][aiAct] Complete: The blue Search button has been clicked.
Action "act" completed.
Result: The blue Search button has been clicked.
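
For wrappers that only see the text timeline above, the bracketed prefixes can be split off with a small parser. This is a rough sketch based solely on the sample output; `parseTimelineLine` and `TimelineLine` are hypothetical names, and the text format is not a documented contract. For machine consumption, --verbose=jsonl is the intended structured stream.

```typescript
// Hypothetical sketch: split "[Midscene][aiAct][Plan 1/20] message" into
// its bracketed scopes and the remaining message text.
interface TimelineLine {
  scopes: string[];
  message: string;
}

function parseTimelineLine(line: string): TimelineLine | null {
  const scopes: string[] = [];
  let rest = line;
  // Consume leading "[...]" tags one at a time.
  for (;;) {
    const match = /^\[([^\]]+)\]/.exec(rest);
    if (!match) break;
    scopes.push(match[1] ?? '');
    rest = rest.slice(match[0].length);
  }
  // Lines without the "[Midscene]" prefix (e.g. the final "Result:" line)
  // are not timeline events.
  if (scopes[0] !== 'Midscene') return null;
  return { scopes, message: rest.trim() };
}
```

For example, `parseTimelineLine('[Midscene][aiAct][Action] Done: Tap cost=891ms')` yields the scopes `Midscene`, `aiAct`, `Action` and the message `Done: Tap cost=891ms`, while the trailing `Action "act" completed.` line yields `null`.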

Validation

  • pnpm exec biome check packages/shared/src/cli/verbose.ts packages/shared/src/cli/screenshot-file.ts packages/shared/src/mcp/tool-generator.ts packages/core/src/types.ts packages/core/src/agent/tasks.ts packages/shared/tests/unit-test/tool-generator.test.ts
  • pnpm --dir packages/shared exec vitest run tests/unit-test/tool-generator.test.ts -t "verbose"
  • pnpm exec nx test @midscene/shared
  • pnpm exec nx build @midscene/web
  • Real midscene-web run against local fixture: ./packages/web-integration/bin/midscene-web --verbose act --prompt "click the blue Search button"; verified stdout includes Start, per-plan screenshot paths, Planned, Action, Running, Done, and Complete lines.
  • pnpm run lint was run, but it fails locally on unrelated untracked files packages/web-integration/swipe-verify.cjs and packages/web-integration/swipe-verify2.cjs with lint/style/useSingleVarDeclarator; those files are not part of this PR.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 66136ee73c


const startedAt = Date.now();
emitCliVerboseEvent({
  event: 'command_start',
  args: compactCliVerboseArgs(handlerArgs),


P2: Redact sensitive CLI args from verbose events

When --verbose is used, the command_start event serializes all handlerArgs directly; for the computer RDP tools the accepted init args include password (packages/computer/src/mcp-tools.ts), so a command such as midscene-computer --verbose act --host ... --username ... --password ... prints the password in the structured JSON progress stream. Because these verbose logs are intended for progress collection and may be persisted by CI or wrappers, sensitive keys such as password/token/apiKey should be redacted before emitting the args payload.
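
One way the suggested redaction could look is sketched below. `redactSensitiveArgs`, the key pattern, and the `[REDACTED]` placeholder are illustrative assumptions and do not exist in the PR; the pattern errs on the side of over-redacting (for example, any key containing "token").

```typescript
// Hypothetical helper, not part of the PR. The key list is an assumption.
const SENSITIVE_KEY_PATTERN = /password|passwd|token|secret|api[-_]?key/i;
const REDACTED = '[REDACTED]';

// Return a copy of `value` in which every property whose key looks
// sensitive is replaced by a placeholder; nested objects and arrays keep
// their shape.
function redactSensitiveArgs(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redactSensitiveArgs(item));
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(
      value as Record<string, unknown>,
    )) {
      out[key] = SENSITIVE_KEY_PATTERN.test(key)
        ? REDACTED
        : redactSensitiveArgs(inner);
    }
    return out;
  }
  return value;
}
```

Assuming such a helper, the args in the command_start event could be passed through it before (or inside) `compactCliVerboseArgs`, so that neither the text output nor the jsonl stream ever receives the raw secret.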



cloudflare-workers-and-pages Bot commented Jun 12, 2026


Deploying midscene with Cloudflare Pages

Latest commit: 7ece567
Status: ✅  Deploy successful!
Preview URL: https://06d3ebb4.midscene.pages.dev
Branch Preview URL: https://feat-shared-cli-verbose-prog.midscene.pages.dev
