Open Problems During Inference: An Evidence-Ledger Investigation

paper_id	claim	claim_status	evidence_status	source_depth	source_quote	page_or_section	taxonomy_fit	audit_status
2309.05605v3	We propose a lightweight memory injection method that can be employed to correct a multi-hop reasoning failure during inference.	preliminary-linked	has evidence row	full-text	by employing our method to inject the memory of 'The Great Barrier Reef' into the multi-hop prompt 'The largest coral reef system in the world is located off the coast of...' during inference, we increase the probability of the next token 'Australia' by 189%; refer to Fig. 3 for details.	Abstract	in-scope: LLM extractor confirmed direction match	needs work; filled but source-depth unclear; report=audit_report.md
2511.11834v1	VC effectively reflects performance degradation without requiring labeled data.	preliminary-linked	has evidence row	full-text	Our results reveal a strong negative correlation between classification accuracy and log(VC) (correlation ρ < −0.90 in most cases), suggesting that VC effectively reflects performance degradation without requiring labeled data.	Abstract	in-scope: LLM extractor confirmed direction match	needs work; filled but source-depth unclear; report=audit_report.md
1811.10649v1	We model the analog noise of neuromorphic circuits as additive and multiplicative Gaussian noise.	supported	has evidence row	full-text	the accuracy has been further increased to (99.5%, 89.1%, 89.6%) for the three datasets when noise power equals the signal power.	18	in-scope: LLM extractor confirmed direction match	needs work; full-text verified; report=audit_report.md
1807.06555v1	One of our contributions is to apply the noise injection method during both training and inference of RNNs to realize that the noisy computation problem in neuromorphic computing can be largely mitigated by this method.	preliminary-linked	has evidence row	full-text	Experiments on the MNIST dataset reveal that with the presence of noise during computation and for all test RNN architectures, including LSTMs and vanilla RNNs, validation accuracy can be improved from (12.5%, 10.5%, 15%) to over (98%, 92%, 94%), respectively.	9	in-scope: LLM extractor confirmed direction match	needs work; filled but source-depth unclear; report=audit_report.md
2403.02181v3	AdaInfer can achieve an average of 17.8% pruning ratio, and up to 43% on sentiment tasks, with nearly no performance drop (<1%)	preliminary-linked	has evidence row	full-text	Observation 1. Not all layers of LLMs are necessary during inference: Early Stopping works.	3.2	in-scope: LLM extractor confirmed direction match	needs work; filled but source-depth unclear; report=audit_report.md