MetaTPT Adaptive Prompt Tuning
🔗 Source: arXiv
MetaTPT: Meta Test-time Prompt Tuning for Vision-Language Models
🚀 Technical Novelty
- Mechanism: Dual-loop meta-learning framework that learns a self-supervised auxiliary task to generate dynamic, parameterized affine augmentations, and tunes learnable prompts by enforcing prediction consistency across the resulting views.
- Nuance: Replaces the fixed, handcrafted data augmentations used in prior test-time adaptation (TTA) methods such as TPT with differentiable, sample-specific transformations that adapt online to the features of the target domain.
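
To make the two ingredients above concrete, here is a minimal, standard-library Python sketch. It is not the paper's implementation. The four-parameter affine form (rotation, isotropic scale, translation), the function names, and the choice of loss (mean KL divergence from each view's prediction to the average prediction) are all assumptions made for illustration; this summary does not specify the paper's exact parameterization or objective.

```python
import math

def affine_matrix(theta, scale, tx, ty):
    """Build a 2x3 affine matrix from four scalars (hypothetical parameterization).

    In a learnable-augmentation setting these scalars would be the
    parameters updated per test sample instead of being fixed by hand.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [[scale * c, -scale * s, tx],
            [scale * s,  scale * c, ty]]

def apply_affine(matrix, point):
    """Map one (x, y) coordinate through the 2x3 affine matrix."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def consistency_loss(view_logits):
    """Mean KL(p_view || p_mean) over views (assumed loss, for illustration).

    The value is zero when every augmented view yields the same class
    distribution and grows as the views disagree.
    """
    probs = [softmax(logits) for logits in view_logits]
    n_views, n_classes = len(probs), len(probs[0])
    mean = [sum(p[j] for p in probs) / n_views for j in range(n_classes)]
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p in probs:
        total += sum(pj * math.log((pj + eps) / (mean[j] + eps))
                     for j, pj in enumerate(p))
    return total / n_views

if __name__ == "__main__":
    # A quarter-turn rotation with a small shift, applied to one coordinate.
    m = affine_matrix(math.pi / 2, 1.0, 0.1, 0.0)
    print(apply_affine(m, (1.0, 0.0)))
    # Two views that agree vs. two views that disagree.
    print(consistency_loss([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]))
    print(consistency_loss([[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]))
```

In a differentiable framework, a matrix of this shape would typically drive an image resampling step so that gradients of the consistency loss can reach the augmentation parameters. Under the dual-loop reading above, one loop would update those parameters and the other would update the prompts.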
💡 Yield
- Reports state-of-the-art performance on cross-dataset and domain generalization benchmarks across multiple vision-language models (CLIP, CoOp, MaPLe, MMRL).
- Ablations confirm learnable augmentations consistently outperform fixed ones, dual-loop optimization surpasses one-stage joint training, and online adaptation beats offline variants.
⚠️ Limitations
- Online per-sample meta-adaptation inherently increases inference latency and memory overhead compared to static augmentation methods; computational trade-offs are not explicitly quantified in the provided text.