Rendered from docs/scoreboard/anthony-chaudhary/dos-kernel.md — the
Markdown in the repo is the source of truth; this page is generated by
scripts/build_incident_pages.py, never hand-edited.
How AI built anthony-chaudhary/dos-kernel
5 of 315 AI commit messages here claimed work the commit's own diff doesn't show (1.6%). The rest checked out. All five flags are subject-only doc claims with no contradicting effect: four are convention-driven empty re-stamp commits and one is a doc reflow. The receipts are below.
This is page #1 of the index, the self-grade: the scoreboard names no other repository before publishing its own verdict, and ours is deliberately not airbrushed to zero.
The check: an AI agent's commit message is just text it wrote; the diff is what git recorded. This page reports, for the AI-authored commits, whether each concrete claim in a message ("fix X", "add tests for Y") is backed by that commit's own diff. A message-vs-diff mismatch is never a correctness, honesty, or intent grade, only a note that a commit's words and its own diff disagree. Schema and the precise definition: docs/311.
As of
| Audited range | abe74e880309c98cdb38f3ac295218745ab9efeb → 1ffdaff70a3282d6ad90940438f09b1f1705a44d |
| Commits in range | 500 (the full visible history since the 2026-06-10 public seed) |
| Rendered | 2026-06-16 |
| Auditor | dos-kernel 0.27.0 at 1ffdaff — the range's own end commit, so the auditor and the audited history pin together; includes the #79/#81 fire-narrowing fixes |
| Tier | self |
| Attribution | all commits, author-neutral (the self page audits everything; foreign pages audit agent-attributed commits only) |
The verdict
| Commits | Checkable | Backed by the diff | Claimed, not shown (raw) | Skipped | Raw rate | Final grade |
|---|---|---|---|---|---|---|
| 500 | 315 | 310 | 5 | 185 | 1.6% | 5 of 315 (1.6%) |
Raw and adjudicated agree here: zero of the five flags is an auditor
artifact, so adjudication removed nothing. (Before the #79/#81
fire-narrowing landed in the auditor — 86f437f, in this same range — the
sweep carried additional artifact fires; that history, hand-adjudicated, is
in the methodology's false-positive section.)
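
The verdict row can be re-derived from three of its own numbers. A minimal arithmetic sketch in Python, using only figures from the table above (the variable names are illustrative, not part of dos-kernel):

```python
# Figures copied from the verdict table; names are illustrative only.
commits = 500            # commits in the audited range
skipped = 185            # commits with no checkable claim
claimed_not_shown = 5    # flags: a claim was made, the diff does not show it

checkable = commits - skipped             # commits carrying a checkable claim
backed = checkable - claimed_not_shown    # claims the diff backs
raw_rate = round(100 * claimed_not_shown / checkable, 1)  # percent, one decimal

print(checkable, backed, raw_rate)  # prints: 315 310 1.6
```

Because adjudication removed none of the five flags, the final grade uses the same numerator and denominator as the raw rate.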
By kind of claim
| Kind of claim | Backed by the diff | Claimed, not shown | Skipped |
|---|---|---|---|
| fix / add / remove (code) | 77 | 1 | 0 |
| tests | 9 | 0 | 0 |
| docs | 224 | 4 | 0 |
| no checkable claim (skipped) | — | — | 185 |
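
As a cross-check, the per-kind rows should sum back to the verdict totals. A small sketch, with the table's figures hard-coded and illustrative names:

```python
# (backed, claimed_not_shown, skipped) per kind, copied from the table above.
by_kind = {
    "fix / add / remove (code)": (77, 1, 0),
    "tests": (9, 0, 0),
    "docs": (224, 4, 0),
    # The table shows dashes in the first two columns; treated as 0 here.
    "no checkable claim (skipped)": (0, 0, 185),
}

backed = sum(row[0] for row in by_kind.values())
flagged = sum(row[1] for row in by_kind.values())
skipped = sum(row[2] for row in by_kind.values())

print(backed, flagged, skipped, backed + flagged + skipped)  # prints: 310 5 185 500
```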
The receipts — every flag, adjudicated
| Commit | Subject | Ruling | Rung | Rationale |
|---|---|---|---|---|
| 841d38d | fix(answers): keep the canonical ship-stamp on one line so the lockstep scan passes | CONFIRMED(unexplained) | human | A "fix:" subject whose diff only reflows two doc lines so a literal example subject sits intact on one line; no rendered guidance changed. Not a re-stamp (so not the convention class); a true non-effecting doc edit whose claim rests on subject text alone. The auditor is right to count it, and we leave it flagged rather than airbrush it: a doc "fix:" that the diff doesn't strongly witness is exactly the kind of flag this scoreboard is honest about keeping. |
| c956d2a | docs(plans): re-stamp under the full slug while docs/317 is contested (docs/317_duplicate-plan-number-disambiguation-plan P1) | CONFIRMED(convention) | human | A deliberate re-stamp under the full plan slug after a docs/317 plan-number collision, the first live firing of the very slug-or-nothing rule that plan shipped. The claim rests on subject text by design; the original ship SHA it points at is the witness. |
| 0843842 | docs(plans): re-stamp the work-account CLI verb post-renumber (docs/310 P3) | CONFIRMED(convention) | human | A deliberate empty commit: this workspace's re-stamp convention re-anchors a plan phase's ship-stamp after a plan-number collision, so the claim rests on subject text alone by design. The auditor is right to count it. |
| cc00bf1 | docs(plans): re-stamp the severity-gate wiring post-renumber (docs/310 P2) | CONFIRMED(convention) | human | Same convention, same renumber event. |
| bf05e27 | docs(plans): re-stamp the work-kind account leaf post-renumber (docs/310 P1) | CONFIRMED(convention) | human | Same convention, same renumber event. |
Four flags come from re-stamp events (the docs/310 and docs/317 plan-number collisions); the fifth is a doc reflow of two lines that changed no rendered guidance. The re-stamp convention itself is under design review: #80 proposes making re-stamps carry a plan-doc line so they witness themselves. If that lands, the re-stamp flags drop to zero the honest way: by changing the commits, never the auditor.
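
The four re-stamp flags are described above as empty commits. One way to check that by hand, independent of the auditor, is to ask git which paths a commit touched. The sketch below is an illustration under assumptions, not dos-kernel's own check: the helper names are hypothetical, and it relies only on `git diff-tree --no-commit-id --name-only -r`, which lists the paths changed by a non-merge commit.

```python
import subprocess


def changed_paths(name_only_output: str) -> list[str]:
    """Split `git diff-tree --name-only` output into a list of paths."""
    return [line for line in name_only_output.splitlines() if line.strip()]


def is_empty_commit(sha: str, repo: str = ".") -> bool:
    """Return True when the commit at `sha` changes no paths.

    Hypothetical helper: assumes a non-merge, non-root commit in a local clone.
    """
    result = subprocess.run(
        ["git", "-C", repo, "diff-tree", "--no-commit-id", "--name-only", "-r", sha],
        check=True, capture_output=True, text=True,
    )
    return not changed_paths(result.stdout)


# Example (run inside a clone of the repository):
# for sha in ("c956d2a", "0843842", "cc00bf1", "bf05e27"):
#     print(sha, is_empty_commit(sha))
```

If the receipts hold, the reflow commit 841d38d should list at least one path, which is why it sits outside the convention class.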
Reproduce it
The auditor version is pinned to the same commit the range ends on, so one checkout gives you both the tool and the history it graded:
    git clone https://github.com/anthony-chaudhary/dos-kernel.git && cd dos-kernel
    git checkout 1ffdaff70a3282d6ad90940438f09b1f1705a44d
    pip install -e .
    dos commit-audit --sweep --json --workspace . \
      abe74e880309c98cdb38f3ac295218745ab9efeb..1ffdaff70a3282d6ad90940438f09b1f1705a44d
A newer auditor over the same pinned range may count differently as fire-narrowing continues (each narrowing is a public issue, e.g. #79/#81); the as-of block above is what this page graded, with what.
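
For scripted reruns, the same steps can be driven from Python. This is a sketch under assumptions: the function names are hypothetical, the only `dos` flags used are the ones shown above, and the JSON the tool emits is returned unparsed because its schema is defined in docs/311 rather than on this page.

```python
import subprocess

# Pins copied from the as-of block.
RANGE_START = "abe74e880309c98cdb38f3ac295218745ab9efeb"
RANGE_END = "1ffdaff70a3282d6ad90940438f09b1f1705a44d"


def audit_argv(workspace: str = ".") -> list[str]:
    """Build the sweep command with the same flags as the steps above."""
    return [
        "dos", "commit-audit", "--sweep", "--json",
        "--workspace", workspace,
        f"{RANGE_START}..{RANGE_END}",
    ]


def head_is_pinned(repo: str = ".") -> bool:
    """Check that the checkout sits on the range's end commit."""
    head = subprocess.run(
        ["git", "-C", repo, "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    return head == RANGE_END


def run_audit(workspace: str = ".") -> str:
    """Run the sweep and return its raw JSON text (left unparsed on purpose)."""
    if not head_is_pinned(workspace):
        raise RuntimeError("checkout is not at the pinned auditor commit")
    done = subprocess.run(
        audit_argv(workspace), check=True, capture_output=True, text=True
    )
    return done.stdout
```

Refusing to run when HEAD is off the pin keeps a scripted rerun from silently grading the range with a newer auditor.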
Corrections
A contested flag gets re-adjudicated and the page re-rendered (the
docs/311 §3 path). Until the scoreboard-correction template ships
(docs/311 P4), open a plain issue naming this page and the SHA.
Methodology (what the witness reads, what it abstains on, where it has
been wrong): docs/scoreboard/methodology.md.