| 2603.17771v2 | The abstract reports: Attention sinks and massive activations are recurring and closely related phenomena in Transformer models. | preliminary-linked | has evidence row | abstract | Attention sinks and massive activations are recurring and closely related phenomena in Transformer models. | abstract | in-scope: taxonomy category match | needs work; preliminary / abstract-derived; report=audit_report.md |
| 2504.20966v4 | The abstract reports: We introduce softpick, a rectified, not sum-to-one, drop-in replacement for softmax in transformer attention mechanisms that eliminates attention sink and massive activations. | preliminary-linked | has evidence row | abstract | We introduce softpick, a rectified, not sum-to-one, drop-in replacement for softmax in transformer attention mechanisms that eliminates attention sink and massive activations. | abstract | in-scope: taxonomy category match | needs work; preliminary / abstract-derived; report=audit_report.md |
| 2605.00968v1 | The abstract reports: Positional encoding plays a pivotal role in determining the extrapolation and generalization performance of wireless foundation models for channel state information (CSI) modeling, latent characterization, and task-specific prediction. | preliminary-linked | has evidence row | abstract | Positional encoding plays a pivotal role in determining the extrapolation and generalization performance of wireless foundation models for channel state information (CSI) modeling, latent characterization, and task-specific prediction. | abstract | in-scope: taxonomy category match | needs work; preliminary / abstract-derived; report=audit_report.md |
| 2402.17762v2 | The abstract reports: We observe an empirical phenomenon in Large Language Models (LLMs) -- very few activations exhibit significantly larger values than others (e.g., 100,000 times larger). | preliminary-linked | has evidence row | abstract | We observe an empirical phenomenon in Large Language Models (LLMs) -- very few activations exhibit significantly larger values than others (e.g., 100,000 times larger). | abstract | in-scope: taxonomy category match | needs work; preliminary / abstract-derived; report=audit_report.md |
| 2509.05218v2 | The abstract reports: Positional encoding mechanisms enable Transformers to model sequential structure and long-range dependencies in text. | preliminary-linked | has evidence row | abstract | Positional encoding mechanisms enable Transformers to model sequential structure and long-range dependencies in text. | abstract | in-scope: taxonomy category match | needs work; preliminary / abstract-derived; report=audit_report.md |