# Multi-Round Audit Report: cs-ai/deep-llm-layer-redundancy

Generated: 2026-06-05T10:02:33+00:00

## Verdict: needs work

## Submission readiness

- Status: blocked
- Requirement mode: draft audit with final-readiness blockers
- Blocker: No full-text supported claims found in brief.md; current claims are draft/preliminary only.
- Blocker: No full-text verified rows (0 of 5); target is at least 1.
- Blocker: 5 evidence row(s) are preliminary or abstract-derived and need full-text audit.
- Blocker: 2604.24938v2 uses result/comparison language while still preliminary; verify before final claims.
- Blocker: 2411.03513v1 uses result/comparison language while still preliminary; verify before final claims.
- Blocker: 2510.22228v1 uses result/comparison language while still preliminary; verify before final claims.
- Blocker: 2602.14649v1 uses result/comparison language while still preliminary; verify before final claims.
- Blocker: 5 preliminary-linked claim(s) remain; do not promote to final support.
- Blocker: Only 1 evidence row carries numerical results; at least 2 are required. Zero-signal submissions cannot be marked ready.
- Blocker: Correctness score 0.145 is below the Round 6 floor of 0.5; quotes weakly match PDF or claims weakly match quotes.
- Blocker: Demo proof score 0.0 below required floor 0.8 (0/5 claims independently re-verified against cached PDFs).
- Blocker: Demo proof verdict=fail.
- Blocker: Claim not independently re-verified: 2604.24938v2 (overlap=0.0, substring=False).
- Blocker: Claim not independently re-verified: 2411.03513v1 (overlap=0.0, substring=False).
- Blocker: Claim not independently re-verified: 2510.22228v1 (overlap=0.0, substring=False).
- Blocker: Claim not independently re-verified: 2406.07929v1 (overlap=0.0, substring=False).
- Blocker: Claim not independently re-verified: 2602.14649v1 (overlap=0.0, substring=False).
- Blocker: Correctness score 0.145 is below the Round 11 floor of 0.55; the claim→quote→PDF chain is too weak for product trust.
- Blocker: 5 evidence row(s) cannot be checked because cached PDFs are missing.
- Blocker: correctness detail: 2604.24938v2: no cached PDF — quote unverifiable
- Blocker: correctness detail: 2604.24938v2: missing source_quote/page/checked_at
- Blocker: correctness detail: 2411.03513v1: no cached PDF — quote unverifiable
- Blocker: correctness detail: 2411.03513v1: missing source_quote/page/checked_at
- Blocker: correctness detail: 2510.22228v1: no cached PDF — quote unverifiable

## Audit artifacts

- Run directory: `workspace/cs-ai/deep-llm-layer-redundancy/audit_runs/2026-06-05T10-02-33-00-00`
- Per-round JSON: `round_1.json` ... `round_13.json`
- Input hashes: `input_hashes.json`

| Round | Check | Verdict | Issues | Warnings |
| ---: | --- | --- | ---: | ---: |
| 1 | Ledger integrity | needs work | 0 | 1 |
| 2 | Evidence depth and numerical discipline | needs work | 0 | 6 |
| 3 | Paper quality and framing | pass | 0 | 0 |
| 4 | Coverage, taxonomy leakage, and missing-literature risk | pass | 0 | 0 |
| 5 | Claim calibration and submission readiness | needs work | 0 | 1 |
| 6 | Positive-signal floor | needs work | 0 | 2 |
| 7 | Academic format and scholarly correctness | pass | 0 | 0 |
| 8 | Demo and proof (independent re-verification) | needs work | 0 | 7 |
| 9 | Direction coherence (anti-boilerplate-leak) | pass | 0 | 0 |
| 10 | Research-value (gap/contradiction/surprise/recency) | pass | 0 | 0 |
| 11 | System correctness (claim→quote→PDF) | needs work | 0 | 7 |
| 12 | Cross-model reviewer committee | pass | 0 | 0 |
| 13 | Citation integrity (cited→cached metadata) | pass | 0 | 0 |
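
The table can be tallied mechanically. The sketch below hard-codes the verdicts and warning counts from the table above. It assumes, since the report does not state the rule, that any `needs work` round downgrades the overall verdict; `overall_verdict` is a hypothetical helper name, not part of the audit tool.

```python
from collections import Counter

# Verdicts copied from the round table, keyed by round number.
ROUND_VERDICTS = {
    1: "needs work", 2: "needs work", 3: "pass", 4: "pass",
    5: "needs work", 6: "needs work", 7: "pass", 8: "needs work",
    9: "pass", 10: "pass", 11: "needs work", 12: "pass", 13: "pass",
}

# Non-zero warning counts copied from the same table.
ROUND_WARNINGS = {1: 1, 2: 6, 5: 1, 6: 2, 8: 7, 11: 7}


def overall_verdict(verdicts):
    """Assumed aggregation rule: the worst round verdict wins."""
    values = set(verdicts.values())
    if "unsupported" in values:
        return "unsupported"
    if "needs work" in values:
        return "needs work"
    return "pass"


tally = Counter(ROUND_VERDICTS.values())
failing_rounds = sorted(r for r, v in ROUND_VERDICTS.items() if v != "pass")
total_warnings = sum(ROUND_WARNINGS.values())
```

Under these assumptions, 7 rounds pass and 6 need work (rounds 1, 2, 5, 6, 8, and 11), and the 24 warnings match the 24 `Blocker:` bullets listed under Submission readiness.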

## Evidence profile

- Filled evidence rows: 5
- Full-text verified rows: 0
- Preliminary / abstract-derived rows: 5
- Source-depth unclear rows: 0

## Round details

### Round 1: Ledger integrity — needs work

**Warnings**

- No full-text supported claims found in brief.md; current claims are draft/preliminary only.

**Notes**

- claim_rows=5
- supported_claims=0
- preliminary_claims=5
- filled_evidence_rows=5
- This round checks structure and status calibration: supported means full-text verified; preliminary-linked means traceable draft evidence.

### Round 2: Evidence depth and numerical discipline — needs work

**Warnings**

- No full-text verified rows (0 of 5); target is at least 1.
- 5 evidence row(s) are preliminary or abstract-derived and need full-text audit.
- 2604.24938v2 uses result/comparison language while still preliminary; verify before final claims.
- 2411.03513v1 uses result/comparison language while still preliminary; verify before final claims.
- 2510.22228v1 uses result/comparison language while still preliminary; verify before final claims.
- 2602.14649v1 uses result/comparison language while still preliminary; verify before final claims.

**Notes**

- full_text_verified=0/5
- preliminary_or_abstract=5/5
- unclear_source_depth=0/5

### Round 3: Paper quality and framing — pass

**Notes**

- paper=workspace/cs-ai/deep-llm-layer-redundancy/paper/main.md
- finding_sections=3
- filled_evidence_rows=5

### Round 4: Coverage, taxonomy leakage, and missing-literature risk — pass

**Notes**

- triage_rows=5
- claimed_evidence_rows=5
- target_categories=cs.AI, cs.CL, cs.LG
- Coverage gaps still require human/domain reviewer search beyond arXiv metadata.

### Round 5: Claim calibration and submission readiness — needs work

**Warnings**

- 5 preliminary-linked claim(s) remain; do not promote to final support.

**Notes**

- claim_rows=5
- supported_claims=0
- preliminary_claims=5
- draft_only_claims=0
- unsupported_claims=0

### Round 6: Positive-signal floor — needs work

**Warnings**

- Only 1 evidence row carries numerical results; at least 2 are required. Zero-signal submissions cannot be marked ready.
- Correctness score 0.145 is below floor 0.5; quotes weakly match PDF or claims weakly match quotes.

**Notes**

- numeric_result_rows=1/5 (floor=2)
- comparative_rows=2/5 (floor=1)
- unique_cited_papers=5 (floor=3)
- correctness_score=0.145 (floor=0.5)
- novelty_score=0.974 (floor=0.35)
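
The two warnings in this round follow directly from comparing each metric with its floor. A minimal sketch, assuming a value that meets its floor passes (consistent with the "need at least 2" wording); `below_floor` is a hypothetical helper, not the audit tool's own function.

```python
# (value, floor) pairs copied from the Round 6 notes.
METRICS = {
    "numeric_result_rows": (1, 2),
    "comparative_rows": (2, 1),
    "unique_cited_papers": (5, 3),
    "correctness_score": (0.145, 0.5),
    "novelty_score": (0.974, 0.35),
}


def below_floor(metrics):
    """Return the names of metrics whose value is strictly below its floor."""
    return [name for name, (value, floor) in metrics.items() if value < floor]


failed = below_floor(METRICS)
```

Only `numeric_result_rows` and `correctness_score` fall below their floors, which matches the two warnings listed for this round.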

### Round 7: Academic format and scholarly correctness — pass

**Notes**

- paper=workspace/cs-ai/deep-llm-layer-redundancy/paper/main.md
- abstract_words=109
- total_words=2510
- references_listed=5
- missing_format_sections=none

### Round 8: Demo and proof (independent re-verification) — needs work

**Warnings**

- Demo proof score 0.0 below required floor 0.8 (0/5 claims independently re-verified against cached PDFs).
- Demo proof verdict=fail.
- Claim not independently re-verified: 2604.24938v2 (overlap=0.0, substring=False).
- Claim not independently re-verified: 2411.03513v1 (overlap=0.0, substring=False).
- Claim not independently re-verified: 2510.22228v1 (overlap=0.0, substring=False).
- Claim not independently re-verified: 2406.07929v1 (overlap=0.0, substring=False).
- Claim not independently re-verified: 2602.14649v1 (overlap=0.0, substring=False).

**Notes**

- demo=workspace/cs-ai/deep-llm-layer-redundancy/paper/demo.py
- proof=workspace/cs-ai/deep-llm-layer-redundancy/paper/proof.json
- proof_score=0.0
- passed=0/5
- verdict=fail
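
The `overlap` and `substring` fields reported above can be read as two quote-matching signals. The report does not say how they are computed, so the sketch below is one plausible interpretation rather than the logic in `demo.py`: `overlap` is taken as the fraction of quote tokens found in the PDF text, and `substring` as whole-phrase containment after normalization. `verify_quote` and the sample strings are hypothetical.

```python
import re


def _tokens(text):
    """Lowercase alphanumeric tokens; punctuation and whitespace are dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def verify_quote(quote, pdf_text):
    """Score a quoted passage against extracted PDF text (assumed definitions)."""
    quote_tokens = _tokens(quote)
    pdf_tokens = _tokens(pdf_text)
    # A missing quote or a missing PDF yields the zero result seen in Round 8.
    if not quote_tokens or not pdf_tokens:
        return {"overlap": 0.0, "substring": False}
    vocabulary = set(pdf_tokens)
    hits = sum(1 for tok in quote_tokens if tok in vocabulary)
    # Pad with spaces so the phrase must match on whole-token boundaries.
    phrase = " " + " ".join(quote_tokens) + " "
    haystack = " " + " ".join(pdf_tokens) + " "
    return {"overlap": hits / len(quote_tokens), "substring": phrase in haystack}


# Made-up strings for illustration only; not taken from any cited paper.
missing_pdf = verify_quote("layers are redundant", "")
exact_match = verify_quote("Deep layers", "We find deep layers contribute less.")
partial_match = verify_quote("deep model layers", "deep layers")
```

With an empty PDF text the function returns `overlap=0.0` and `substring=False`, the same pattern all five claims show in this round, which would be consistent with the cached PDFs being absent rather than the quotes being wrong.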

### Round 9: Direction coherence (anti-boilerplate-leak) — pass

**Notes**

- direction_id=auto-deep-llms-layer
- family=deep-llm-layer-redundancy
- keywords_checked=10
- keyword_hits=8
- cross_family_leaks=0

### Round 10: Research-value (gap/contradiction/surprise/recency) — pass

**Notes**

- value_score=0.14 threshold=0.35
- gap_count=1 contradictions=0 surprises=0 recent_papers=3/5
- components gap=0.125 contradiction=0.0 surprise=0.0 recency=0.6
- ADVISORY: research-value 0.14 is below threshold 0.35. This direction mines few open problems, contradictions, or surprises in the current corpus. The check is not blocking (truth gates already passed), so the round is still reported as pass; use `--min-value 0.35` on produce-direction to hard-gate at production time, or `propose-directions` for higher-value angles.

### Round 11: System correctness (claim→quote→PDF) — needs work

**Warnings**

- Correctness score 0.145 below floor 0.55; claim→quote→PDF chain is too weak for product trust.
- 5 evidence row(s) cannot be checked because cached PDFs are missing.
- correctness detail: 2604.24938v2: no cached PDF — quote unverifiable
- correctness detail: 2604.24938v2: missing source_quote/page/checked_at
- correctness detail: 2411.03513v1: no cached PDF — quote unverifiable
- correctness detail: 2411.03513v1: missing source_quote/page/checked_at
- correctness detail: 2510.22228v1: no cached PDF — quote unverifiable

**Notes**

- correctness_score=0.145 floor=0.55
- rows_scored=5
- pdfs_missing=5
- quote_in_pdf_avg=0.0
- claim_support_avg=0.483
- locator_present_avg=0.0
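
The notes above allow a quick sanity check on the score. The report does not give the scoring formula, so this sketch assumes the correctness score is a weighted sum of the three averages. Because two of them are 0.0, the whole score would then come from `claim_support_avg`, and its implied weight can be backed out.

```python
# Values copied from the Round 11 notes.
correctness_score = 0.145
floor = 0.55
quote_in_pdf_avg = 0.0
claim_support_avg = 0.483
locator_present_avg = 0.0

# Assumption: score = w_quote * quote_in_pdf_avg
#                   + w_claim * claim_support_avg
#                   + w_locator * locator_present_avg
# With the first and last terms at zero, only w_claim can be recovered.
implied_claim_weight = correctness_score / claim_support_avg

# Best case under this assumption if claim support were perfect (1.0)
# while the PDF-dependent averages stay at zero.
best_case_without_pdfs = implied_claim_weight * 1.0
```

The implied weight is about 0.30. If that reading is right, even perfect claim support would leave the score near 0.30, still below the 0.55 floor, so caching the PDFs and filling in `source_quote`, `page`, and `checked_at` would be needed to clear this round.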

### Round 12: Cross-model reviewer committee — pass

**Notes**

- LLM disabled; cross-model jury skipped (deterministic baseline), so this pass reflects a skipped check rather than a reviewer decision.

### Round 13: Citation integrity (cited→cached metadata) — pass

**Notes**

- citations_checked=5
- fabricated=0 year_mismatch=0 title_drift=0
- cached_corpus_size=12
- This round is deterministic: it cross-checks printed citations against cached arXiv metadata only.
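
The three counters above suggest a per-citation classification against cached metadata. The sketch below is an illustration under assumptions, not the tool's implementation: the record fields (`id`, `year`, `title`), the 0.9 title-similarity threshold, the `check_citation` name, and the placeholder entries are all hypothetical.

```python
from difflib import SequenceMatcher


def check_citation(cited, cache, title_threshold=0.9):
    """Classify one printed citation against cached metadata (assumed schema)."""
    cached = cache.get(cited["id"])
    if cached is None:
        return "fabricated"  # cited ID has no cached metadata record
    if cited["year"] != cached["year"]:
        return "year_mismatch"
    similarity = SequenceMatcher(
        None, cited["title"].lower(), cached["title"].lower()
    ).ratio()
    if similarity < title_threshold:
        return "title_drift"  # printed title diverges from the cached title
    return "ok"


# Placeholder records for illustration; not real arXiv entries.
CACHE = {"0000.00001v1": {"year": 2024, "title": "A placeholder title about layers"}}

same = {"id": "0000.00001v1", "year": 2024, "title": "A placeholder title about layers"}
unknown = {"id": "0000.99999v1", "year": 2024, "title": "A placeholder title about layers"}
wrong_year = {"id": "0000.00001v1", "year": 2023, "title": "A placeholder title about layers"}
drifted = {"id": "0000.00001v1", "year": 2024, "title": "Completely different words"}

results = [check_citation(c, CACHE) for c in (same, unknown, wrong_year, drifted)]
```

A check of this kind confirms only that printed citations agree with cached metadata; it says nothing about whether the cited paper supports the claim, which is what Rounds 8 and 11 cover.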

## Interpretation

- `pass` means the deterministic audit found no structural, source-depth, taxonomy, or paper-quality warnings.
- `needs work` means the draft is traceable but still needs full-text/source-depth, taxonomy, or quality cleanup.
- `unsupported` means claims or required artifacts are missing or inconsistent enough to block trust.
- `submission_readiness=blocked` means the draft must not be treated as final or deployed as ready, even if it is useful as a transparent draft.
