examples: fix fraud case-study blog link + byte-hash reproducibility claim by ZhengyaoJiang · Pull Request #155 · WecoAI/weco-cli

ZhengyaoJiang · 2026-06-12T20:41:00Z

What

Two reader-facing fixes in the fraud-detection examples (strict + loose), found while empirically verifying the strict example end-to-end for the Vardera case study blog (WecoAI/landing#92).

Dead blog link (4 occurrences across examples/README.md and both fraud READMEs): weco.ai/blog/framing-the-problem → weco.ai/blog/framing-the-puzzle-for-autoresearch, the slug the case-study post actually ships with (WecoAI/landing#90).
Byte-hash reproducibility claim: prepare_data.py printed "Expected SHA-256 (matches the published case study parquets)" and the loose README said the baseline is "verifiable via the SHA-256s". Parquet bytes embed pandas/pyarrow writer metadata, so hashes don't survive version drift even when the data is logically identical. Verified 2026-06-12 on a fresh Windows venv (pandas 3.0.3, pyarrow 24.0.0): both hashes differ from the published ones while evaluate.py still prints auc_roc: 0.909132 (strict) and auc_roc: 0.910171 (loose) exactly. A reader who checks hashes would wrongly conclude their data prep failed. Replaced with the version-independent check — evaluate.py reproducing the README baseline — in both prepare_data.py copies (docstring + final print) and the loose README.
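The hash-drift point above can be illustrated with a small, self-contained sketch. It does not read or write real parquet files: `file_bytes`, `read_rows`, and the `toy-writer` version strings are hypothetical stand-ins, with a JSON payload playing the role of a file that records its writer in metadata (as a Parquet footer does with its `created_by` field). The only claim is that identical rows plus different writer metadata give different byte hashes.

```python
import hashlib
import json


def file_bytes(rows, writer_version):
    # Toy stand-in for a parquet file (hypothetical format): the serialized
    # bytes contain both the data and a field naming the writer version.
    payload = {"created_by": f"toy-writer {writer_version}", "rows": rows}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def sha256_hex(data: bytes) -> str:
    # Hash of the raw bytes, which is what a published SHA-256 would pin.
    return hashlib.sha256(data).hexdigest()


def read_rows(data: bytes):
    # Logical content only: the writer metadata is ignored.
    return json.loads(data.decode("utf-8"))["rows"]


rows = [{"amount": 12.5, "is_fraud": 0}, {"amount": 980.0, "is_fraud": 1}]
old = file_bytes(rows, "1.0")
new = file_bytes(rows, "2.0")

print(sha256_hex(old) == sha256_hex(new))  # False: the bytes differ
print(read_rows(old) == read_rows(new))    # True: the data is the same
```

A check that compares hashes fails here even though nothing is wrong with the data, which is the failure mode a reader would hit after a pandas/pyarrow upgrade.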

The two prepare_data.py files remain byte-identical to each other after the edit; both compile-checked.
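For readers who want to automate the version-independent check, a minimal sketch follows. It assumes only that evaluate.py prints a line of the form `auc_roc: <value>`, as quoted above; `parse_auc`, `matches_baseline`, and the tolerance are hypothetical helpers for illustration and are not part of the repository.

```python
import re

# Baseline AUCs as quoted in this PR description.
EXPECTED_AUC = {"strict": 0.909132, "loose": 0.910171}


def parse_auc(output: str) -> float:
    # Pull the number out of a line such as "auc_roc: 0.909132".
    match = re.search(r"auc_roc:\s*([0-9]*\.?[0-9]+)", output)
    if match is None:
        raise ValueError("no 'auc_roc: <value>' line found in output")
    return float(match.group(1))


def matches_baseline(output: str, variant: str, tol: float = 5e-7) -> bool:
    # The value is printed to six decimals, so half a unit in the last
    # printed place is used as the tolerance (an assumption of this sketch).
    return abs(parse_auc(output) - EXPECTED_AUC[variant]) <= tol


print(matches_baseline("auc_roc: 0.909132", "strict"))  # True
print(matches_baseline("auc_roc: 0.910171", "loose"))   # True
```

In practice the `output` string would be the captured stdout of evaluate.py, for example from `subprocess.run(..., capture_output=True, text=True)`.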

🤖 Generated with Claude Code

…y claim

Two reader-facing fixes in the fraud-detection examples:

1. The case-study blog link pointed at weco.ai/blog/framing-the-problem, which 404s; the post ships as framing-the-puzzle-for-autoresearch (landing PR #90). Fixed in examples/README.md and both fraud-detection READMEs (4 occurrences).
2. prepare_data.py promised byte-identical SHA-256 parquets, but parquet bytes embed pandas/pyarrow writer metadata, so hashes do not survive version drift. Verified 2026-06-12 on a fresh venv (pandas 3.0.3, pyarrow 24.0.0): hashes differ from the published ones while evaluate.py still prints auc_roc: 0.909132 (strict) / 0.910171 (loose) exactly. A reader checking hashes would wrongly conclude their data prep failed.

Replaced the hash claim in both prepare_data.py copies (docstring + final print) and the loose README with the version-independent check: evaluate.py reproducing the README baseline AUC.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

ZhengyaoJiang wants to merge 1 commit into main from vk/fraud-example-link-sha-fixes
