| 2602.07488v2 | We provide the first such theory in the case of data-limited scaling laws. | supported | has evidence row | full-text | Overall, this work unravels, for the first time, a direct link between the shape of neural scaling laws and the statistical structure of language itself. | 16 | in-scope: LLM extractor confirmed direction match | needs work; full-text verified; report=audit_report.md |
| 2509.24882v2 | We provide a sharp characterization of the excess risk achieved by empirical risk minimization for both diagonal linear networks and quadratic networks in the regime n, d ≫ 1 with p ≥ d, under a power-law design for the target function and varying regularization strength λ. | supported | has evidence row | full-text | Together, these results provide a comprehensive theoretical and empirical understanding of scaling laws for feature learning in simple network models. 1.2 Further Relevant work. Scaling laws: A large body of work has studied scaling laws in the lazy regime, where the features remain fixed. | 1.1 Main Results | in-scope: LLM extractor confirmed direction match | needs work; full-text verified; report=audit_report.md |
| 2605.26248v1 | A functional form that accurately models and extrapolates the scaling behaviors of deep neural networks as multiple dimensions all vary simultaneously. | supported | has evidence row | full-text | When compared to other functional forms for neural scaling, this functional form yields extrapolations of scaling behavior that are considerably more accurate on this set. 1 INTRODUCTION Training today’s state-of-the-art neural networks requires significant amounts of computational resources and training data. | 1 | in-scope: LLM extractor confirmed direction match | needs work; full-text verified; report=audit_report.md |
| 2411.17691v2 | We reveal that low-bit quantization favors undertrained LLMs but suffers from significant quantization-induced degradation (QiD) when applied to fully trained LLMs. | supported | has evidence row | full-text | The contributions of this work are threefold: • We reveal that low-bit quantization favors undertrained LLMs but suffers from significant quantization-induced degradation (QiD) when applied to fully trained LLMs. | Abstract | in-scope: LLM extractor confirmed direction match | needs work; full-text verified; report=audit_report.md |
| 2602.02593v1 | We propose a unified framework that conceptualizes learning as the progressive advancement of an Effective Frontier k⋆ in the rank space. | preliminary-linked | has evidence row | full-text | Theorem 3.3 (Universal Scaling Principle). Under Assumption 2.1∼2.2, if a resource R induces a effective frontier k⋆(R) (Definition 3.2), the reducible loss scales as: $\Delta L(R) \asymp k_\star(R)^{-(\alpha-1)}$. | Section 3 | in-scope: LLM extractor confirmed direction match | needs work; filled but source-depth unclear; report=audit_report.md |