Reflective Prompt Evolution
🔗 Source: arXiv
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
🚀 Technical Novelty
- Mechanism: Iteratively mutates the prompts of a compound AI system by having an LLM reflect on serialized natural-language trajectories (reasoning, tool calls, tool outputs), and uses multi-objective evolutionary search to maintain a Pareto front of diverse candidate prompts.
- Nuance: Bypasses GRPO's reliance on thousands of scalar-reward rollouts by using the LLM's native language priors to extract high-level rules from trajectories. Avoids greedy local optima by sampling parents stochastically from the Pareto front, which preserves diversity, rather than following single-path gradient updates.
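The selection-and-mutation loop described above can be sketched roughly as follows. This is a simplified illustration, not the paper's implementation: the names `pareto_wins`, `select_parent`, and `evolve_step` are hypothetical, `reflect` stands in for the reflection LLM call, `evaluate` stands in for running the system on the task instances, and the acceptance rule (keep the child if its total score beats the parent's) is an assumption made for brevity.

```python
import random
from typing import Callable, Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a scores at least as well as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_wins(scores: Sequence[Sequence[float]]) -> Dict[int, int]:
    """Map each non-dominated candidate to the number of instances it is best on.

    scores[c][i] is the score of candidate c on task instance i.
    """
    wins: Dict[int, int] = {}
    for i in range(len(scores[0])):
        best = max(row[i] for row in scores)
        for c, row in enumerate(scores):
            if row[i] == best:
                wins[c] = wins.get(c, 0) + 1
    # Drop candidates that another per-instance winner strictly dominates.
    return {
        c: n
        for c, n in wins.items()
        if not any(dominates(scores[o], scores[c]) for o in wins if o != c)
    }


def select_parent(scores: Sequence[Sequence[float]], rng: random.Random) -> int:
    """Sample a parent from the Pareto front, weighted by how many instances it wins."""
    wins = pareto_wins(scores)
    front = list(wins)
    return rng.choices(front, weights=[wins[c] for c in front], k=1)[0]


def evolve_step(
    pool: List[str],
    scores: List[List[float]],
    evaluate: Callable[[str], List[float]],  # hypothetical: runs the system, returns per-instance scores
    reflect: Callable[[str], str],           # hypothetical: LLM reads trajectories, proposes a new prompt
    rng: random.Random,
) -> bool:
    """One mutation step: pick a parent, propose a child, keep it if it improves."""
    parent = select_parent(scores, rng)
    child = reflect(pool[parent])
    child_scores = evaluate(child)
    # Assumed acceptance rule for this sketch: total score must beat the parent's.
    if sum(child_scores) > sum(scores[parent]):
        pool.append(child)
        scores.append(child_scores)
        return True
    return False
```

Because parents are sampled from every candidate that is best on at least one instance, a prompt that solves a single hard case can still be selected and refined, even when its aggregate score is lower than the current leader's.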
💡 Yield
- Outperforms GRPO by up to 20% (6% on average) across six benchmarks while using up to 35× fewer rollouts, and outperforms MIPROv2 by more than 10%.
- Demonstrates strong sample efficiency, requiring only dozens of reflection LLM calls per task to converge on robust, generalizable prompt instructions.
⚠️ Limitations
- The optimization phase depends on external reflection LLM calls, which add latency and API cost during prompt tuning.
- Primarily validated on text-heavy reasoning and instruction-following workflows; scalability to highly parameterized or non-textual domains remains untested.