sort: do not panic on a non-UTF-8 --files0-from name #12773

Open
leeewee wants to merge 1 commit into uutils:main from leeewee:sort-fix-files0-from-non-utf8

leeewee commented Jun 11, 2026

Contributor

Fixes #12772

sort --files0-from=F decoded each NUL-separated filename with str::from_utf8(&line).expect(...), which panicked (exit 134) on any non-UTF-8 byte.
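For illustration only, a minimal sketch of why the old decode step could not handle such names. `decode_strict` is a hypothetical helper, not code from `sort`; it returns the `Result` that the old code unwrapped with `.expect(...)`.

```rust
use std::str::Utf8Error;

/// Hypothetical stand-in for the old decode step.
/// The old code called `.expect(...)` on this Result, so any `Err`
/// became a panic instead of a normal error.
fn decode_strict(raw: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(raw)
}
```

A name such as `[0x66, 0xff, 0x6f]` is a legal filename on Linux, but `decode_strict` returns `Err` for it because `0xff` never appears in valid UTF-8.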

Build the name as an OsString from the raw bytes via uucore::os_string_from_vec and run the -/empty-name checks on the bytes, so a non-UTF-8 name now fails to open gracefully (exit 2) like GNU instead of aborting. Other cases (-, empty, valid list) are unchanged.
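A rough sketch of the byte-level approach described above, not the actual patch. It uses the standard library's Unix-only `OsStringExt::from_vec` in place of `uucore::os_string_from_vec` (whose exact signature is not shown here), and `Entry` and `classify_entry` are hypothetical names. What `sort` does with a `-` or empty entry is outside the sketch.

```rust
use std::ffi::OsString;
use std::os::unix::ffi::OsStringExt; // Unix-only: OsString is a byte vector there

/// Hypothetical classification of one NUL-separated entry.
#[derive(Debug, PartialEq)]
enum Entry {
    Empty,
    Dash,
    Path(OsString),
}

/// Run the "-"/empty checks on the raw bytes, then wrap the bytes
/// in an OsString without any UTF-8 decoding, so this cannot panic.
fn classify_entry(raw: &[u8]) -> Entry {
    if raw.is_empty() {
        Entry::Empty
    } else if raw == b"-" {
        Entry::Dash
    } else {
        Entry::Path(OsString::from_vec(raw.to_vec()))
    }
}
```

With the name held as an `OsString`, a non-UTF-8 entry reaches the ordinary open call, and a missing file is reported as an I/O error rather than a panic.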

Added a Unix-gated regression test.

`sort --files0-from=F` decoded each NUL-separated filename with
`str::from_utf8(&line).expect(...)`, aborting on any non-UTF-8 byte — Linux
filenames are arbitrary byte strings. Carry the name as an `OsString` built from
the raw bytes (via `uucore::os_string_from_vec`) and do the "-"/empty checks on
the bytes, so a non-UTF-8 name flows into the normal open-failure path (exit 2)
like GNU instead of aborting.
@github-actions


GNU testsuite comparison:

Skip an intermittent issue tests/misc/io-errors (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/tail-n0f (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/cut/bounded-memory (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/retry (passes in this run but fails in the 'main' branch)

codspeed-hq Bot commented Jun 11, 2026


Merging this PR will improve performance by 4.3%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.


⚡ 3 improved benchmarks
✅ 320 untouched benchmarks
⏩ 46 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | ls_recursive_balanced_tree[(6, 4, 15)] | 52.3 ms | 49.6 ms | +5.53% |
| Simulation | ls_recursive_wide_tree[(10000, 1000)] | 34.3 ms | 33 ms | +3.95% |
| Simulation | single_date_now | 85.5 µs | 82.7 µs | +3.41% |

Comparing leeewee:sort-fix-files0-from-non-utf8 (2e84ad1) with main (220a8ec)

Footnotes

  1. 46 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Development

Successfully merging this pull request may close these issues.

sort panic (from_utf8().expect) on a non-UTF-8 filename in --files0-from
