GenerativeAdapter On-The-Fly Adaptation
🔗 Source: arXiv
GENERATIVE ADAPTER: CONTEXTUALIZING LANGUAGE MODELS IN PARAMETERS WITH A SINGLE FORWARD PASS
🚀 Technical Novelty
- Mechanism: Trains a lightweight adapter generator network that maps the accumulated hidden states of the context to layer-wise additive delta weights for a frozen base LM. Producing these weights requires only forward passes, with no gradient steps on the base model.
- Nuance: Avoids both gradient-based fine-tuning and the KV-cache growth of standard prompting. The context is written into the model parameters on the fly, so inference cost stays constant regardless of input length.
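
The mechanism above can be illustrated with a minimal NumPy sketch. It is not the paper's architecture: the outer-product statistic, the low-rank projection, and the names `ContextAccumulator`, `G_out`, `G_in`, and `adapted_forward` are assumptions made for illustration, and the hidden states are random stand-ins rather than activations from a real LM. The point it shows is that the accumulated statistic, and therefore the generated delta, has a fixed size no matter how many context tokens are consumed.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, rank = 16, 4  # hypothetical sizes, chosen only for illustration

# Frozen weight of one linear projection in the base LM (never updated).
W_base = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)

# Hypothetical generator parameters. In practice these would be learned
# when the generator is trained; here they are random placeholders.
G_out = rng.standard_normal((d_model, rank)) / np.sqrt(d_model)
G_in = rng.standard_normal((d_model, rank)) / np.sqrt(d_model)


class ContextAccumulator:
    """Accumulates context hidden states into a fixed-size statistic."""

    def __init__(self, d_model: int) -> None:
        self.stat = np.zeros((d_model, d_model))
        self.count = 0

    def update(self, hidden_states: np.ndarray) -> None:
        # hidden_states: (num_tokens, d_model) for one chunk of context.
        self.stat += hidden_states.T @ hidden_states
        self.count += hidden_states.shape[0]

    def delta(self) -> np.ndarray:
        # Project the averaged statistic to a rank-limited additive delta.
        mean_stat = self.stat / max(self.count, 1)
        return (mean_stat @ G_out) @ G_in.T


def adapted_forward(x: np.ndarray, delta: np.ndarray) -> np.ndarray:
    # The base weight stays frozen; the delta is added at use time.
    return x @ (W_base + delta).T


# A short and a long context yield deltas of the same shape.
short_acc = ContextAccumulator(d_model)
short_acc.update(rng.standard_normal((8, d_model)))

long_acc = ContextAccumulator(d_model)
for _ in range(10):  # ten chunks of 100 tokens each
    long_acc.update(rng.standard_normal((100, d_model)))

delta_short, delta_long = short_acc.delta(), long_acc.delta()
y = adapted_forward(rng.standard_normal((2, d_model)), delta_long)
print(delta_short.shape, delta_long.shape, y.shape)
```

Because the statistic is a running sum, context can be fed in chunks as it arrives, and the cost of `adapted_forward` does not depend on how much context was accumulated.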
💡 Yield
- Achieves a 63.5% improvement in F1 over supervised fine-tuning on StreamingQA for 32K-token contexts.
- Outperforms the base model's accuracy on the MetaICL in-context learning evaluation across 26 diverse tasks.
- Reduces computation and memory costs by 4x compared with prompting on the full conversation history for user personalization.
- Maintains high fact recall on long documents while avoiding the storage overhead of KV caches.
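
To make the KV-cache point above concrete, here is a back-of-envelope sketch of how the two storage costs scale. Every number in it is a hypothetical configuration (32 layers, 32 KV heads, head dimension 128, 16-bit values, and a rank-16 factorized delta on 7 square projections per layer), not a figure from the paper, and it does not attempt to reproduce the reported 4x saving. It only shows that a KV cache grows linearly with context length while a generated adapter has a fixed size.

```python
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128,
                   bytes_per_elem=2):
    # Keys and values (the factor 2) are stored per layer, head, and token.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


def adapter_bytes(n_layers=32, n_proj_per_layer=7, d_model=4096, rank=16,
                  bytes_per_elem=2):
    # Assumption: each adapted projection is square and stores two
    # low-rank factors of shape (d_model, rank). No seq_len term appears.
    return n_layers * n_proj_per_layer * 2 * d_model * rank * bytes_per_elem


MIB = 1024 ** 2
for seq_len in (1_024, 8_192, 32_768):
    print(f"{seq_len:>6} tokens: "
          f"KV cache {kv_cache_bytes(seq_len) / MIB:,.0f} MiB, "
          f"adapter {adapter_bytes() / MIB:,.0f} MiB")
```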
⚠️ Limitations
- Requires extensive self-supervised pretraining of the adapter generator before it can be deployed.
- Performance may degrade on highly dynamic or out-of-distribution streaming contexts not captured during generator training.
- Restricted to updating only linear projection layers, leaving other LM components (e.g., normalization, embeddings) unadapted.
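
The last limitation can be pictured with a small sketch in which deltas are applied only to linear projection weights. The parameter names and the `_proj.weight` naming rule are hypothetical, chosen for illustration; a real model would use its own naming scheme.

```python
import numpy as np

# Hypothetical parameter names for a tiny model.
params = {
    "embed.weight": np.ones((10, 4)),
    "attn.q_proj.weight": np.ones((4, 4)),
    "attn.o_proj.weight": np.ones((4, 4)),
    "mlp.up_proj.weight": np.ones((8, 4)),
    "norm.weight": np.ones(4),
}


def is_linear_projection(name: str) -> bool:
    # Assumed naming convention, used here only to pick out projections.
    return name.endswith("_proj.weight")


def apply_deltas(params, deltas):
    """Return adapted parameters; non-projection weights pass through."""
    return {
        name: w + deltas[name]
        if is_linear_projection(name) and name in deltas else w
        for name, w in params.items()
    }


# Stand-in generator output: a delta is offered for every parameter,
# but only the linear projections actually receive one.
deltas = {name: 0.1 * np.ones_like(w) for name, w in params.items()}
adapted = apply_deltas(params, deltas)
changed = sorted(n for n in params
                 if not np.array_equal(params[n], adapted[n]))
print(changed)
```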