Context-to-LoRA Hypernetwork
🔗 Source: arXiv
SHINE: A Scalable In-Context Hypernetwork for Mapping Context to LoRA in a Single Pass
🚀 Technical Novelty
- Mechanism: Extracts multi-layer memory states from the backbone LLM and processes them through a lightweight, bidirectional M2P Transformer to generate full-rank LoRA adapters in one pass.
- Nuance: Eliminates the restrictive MLP bottlenecks and subset-layer generation of prior hypernetworks by leveraging global self-attention across all layers, fully exploiting the LLM’s pre-trained inductive bias for coherent weight coordination.
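The data flow described above can be sketched in a few lines of plain Python. This is a toy illustration under loud assumptions, not the paper's implementation: each layer's memory state is reduced to a single vector, one unmasked self-attention step stands in for the M2P Transformer, and all names (`generate_lora`, `bidirectional_attention`, `Wout`, the dimensions) are hypothetical. It only shows that one pass over per-layer states can emit an (A, B) pair for every layer at once.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def bidirectional_attention(states, Wq, Wk, Wv):
    """Single-head self-attention with no causal mask:
    every layer's state attends to every other layer's state."""
    Q = [matvec(Wq, s) for s in states]
    K = [matvec(Wk, s) for s in states]
    V = [matvec(Wv, s) for s in states]
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        out.append([sum(w[j] * V[j][i] for j in range(len(V))) for i in range(d)])
    return out

def generate_lora(memory_states, params, d_model, rank):
    """Map per-layer memory states to one (A, B) LoRA pair per layer in one pass."""
    mixed = bidirectional_attention(
        memory_states, params["Wq"], params["Wk"], params["Wv"]
    )
    adapters = []
    for h in mixed:
        flat = matvec(params["Wout"], h)  # length 2 * rank * d_model
        a_flat, b_flat = flat[: rank * d_model], flat[rank * d_model :]
        A = [a_flat[i * d_model : (i + 1) * d_model] for i in range(rank)]  # rank x d_model
        B = [b_flat[i * rank : (i + 1) * rank] for i in range(d_model)]     # d_model x rank
        adapters.append((A, B))
    return adapters

def lora_delta(A, B):
    """Weight update B @ A with shape d_model x d_model."""
    rank = len(A)
    return [
        [sum(B[i][k] * A[k][j] for k in range(rank)) for j in range(len(A[0]))]
        for i in range(len(B))
    ]

# Hypothetical toy sizes, chosen only so the example runs instantly.
rng = random.Random(0)
n_layers, d_mem, d_model, rank = 4, 8, 6, 2
params = {
    "Wq": rand_matrix(d_mem, d_mem, rng),
    "Wk": rand_matrix(d_mem, d_mem, rng),
    "Wv": rand_matrix(d_mem, d_mem, rng),
    "Wout": rand_matrix(2 * rank * d_model, d_mem, rng),
}
memory_states = [[rng.gauss(0.0, 1.0) for _ in range(d_mem)] for _ in range(n_layers)]
adapters = generate_lora(memory_states, params, d_model, rank)
```

Because the attention step is unmasked, the adapter emitted for any one layer depends on the memory states of all layers, which is the cross-layer coordination the bullet above contrasts with per-layer MLP generation.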
💡 Yield
- Matches In-Context Learning and surpasses SFT baselines on multi-hop QA tasks, while adapter generation takes only ~0.3s, a one-time cost that can be amortized over later queries.
- Scales consistently with both backbone LLM size and hypernetwork parameter count, showing no capacity saturation when trained on 6B tokens.
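To make the amortization point concrete, here is a minimal arithmetic sketch. It assumes the ~0.3s generation cost is paid once per context and then spread evenly over every query answered with the resulting adapter; the function name and the query counts are hypothetical, and no per-query inference time is modeled.

```python
def amortized_overhead(generation_time_s, n_queries):
    """Per-query share of a one-time adapter generation cost, in seconds."""
    if n_queries < 1:
        raise ValueError("n_queries must be at least 1")
    return generation_time_s / n_queries

# Hypothetical query counts; 0.3 is the generation time reported above.
for n in (1, 10, 100):
    print(f"{n} queries: {amortized_overhead(0.3, n):.4f} s of overhead per query")
```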
⚠️ Limitations
- Performance degrades on multi-turn conversations because the original context is discarded after adapter generation, which limits long-context reasoning in extended interactions.
- Requires extensive pretraining (6B tokens) and curated instruction-tuning data, posing high upfront computational costs for deployment.