Evidence-Ledger Synthesis of Retrieval-Augmented Generation

paper_id	claim	claim_status	evidence_status	source_depth	source_quote	page_or_section	taxonomy_fit	audit_status
2510.22344v1	We introduce a novel agentic RAG architecture centered on an Iterative Refinement loop.	preliminary-linked	has evidence row	full-text	On HotpotQA, it achieves an F1-score of 0.453—an absolute improvement of 8.3 points over the strongest iterative baseline—establishing a new state-of-the-art for this class of methods on these benchmarks.	Abstract	in-scope: LLM extractor confirmed direction match	needs work; filled but source-depth unclear; report=audit_report.md
2502.01113v3	We introduce a graph foundation model for retrieval augmented generation (GFM-RAG), powered by a novel query-dependent GNN to enable efficient multi-hop retrieval within a single step.	supported	has evidence row	full-text	This supports the opinion that GPT-4o-mini generally outperforms GPT-3.5-turbo in constructing high quality KG-index, which is crucial for the graph-enhanced retrieval.	1	in-scope: LLM extractor confirmed direction match	needs work; full-text verified; report=audit_report.md
2210.15133v1	The proposed ROM enables term importance information to help language model pre-training thus achieving better performance on multiple passage retrieval benchmarks.	supported	has evidence row	full-text	However, the language model trained by the random masking strategy is flawed. 3.3 Retrieval Oriented Masking As mentioned above, term importance is instructive for passage retrieval.	4.4 Evaluation Results	in-scope: LLM extractor confirmed direction match	needs work; full-text verified; report=audit_report.md
2002.08909v1	We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA).	preliminary-linked	has evidence row	full-text	we find that we outperform all previous methods by a significant margin (4-16% absolute accuracy)	Abstract	in-scope: LLM extractor confirmed direction match	needs work; filled but source-depth unclear; report=audit_report.md
2505.18906v2	This paper presents the first systematic mapping study of Federated RAG, covering literature published between 2020 and 2025.	supported	has evidence row	full-text	+12.7% QA accuracy (59.8 → 72.5) using SGX	Extended Resources and Comparative Synthesis	in-scope: LLM extractor confirmed direction match	needs work; full-text verified; report=audit_report.md