Install this specific skill from the multi-skill repository:

```bash
npx skills add OmidZamani/dspy-skills --skill "dspy-simba-optimizer"
```
# Description
This skill should be used when the user asks to "optimize with SIMBA", "use Bayesian optimization", "optimize agents with custom feedback", mentions "SIMBA optimizer", "mini-batch optimization", "statistical optimization", "lightweight optimizer", or needs an alternative to MIPROv2/GEPA for programs with rich feedback signals.
# SKILL.md

```yaml
name: dspy-simba-optimizer
version: "1.0.0"
dspy-compatibility: "3.1.2"
description: This skill should be used when the user asks to "optimize with SIMBA", "use Bayesian optimization", "optimize agents with custom feedback", mentions "SIMBA optimizer", "mini-batch optimization", "statistical optimization", "lightweight optimizer", or needs an alternative to MIPROv2/GEPA for programs with rich feedback signals.
allowed-tools:
  - Read
  - Write
  - Glob
  - Grep
```
# DSPy SIMBA Optimizer

## Goal

Optimize DSPy programs with SIMBA's stochastic mini-batch ascent, which introspects on feedback signals to refine prompts and demonstrations.
## When to Use
- Need lighter-weight alternative to GEPA
- Have custom feedback metrics (not just accuracy)
- Agentic tasks with rich failure signals
- Budget-conscious optimization (fewer eval calls)
- Programs where few-shot examples aren't critical
## Related Skills
- Alternative optimizers: dspy-miprov2-optimizer, dspy-gepa-reflective
- Agent optimization: dspy-react-agent-builder
- Evaluation: dspy-evaluation-suite
## Inputs

| Input | Type | Description |
|---|---|---|
| `program` | `dspy.Module` | Program to optimize |
| `trainset` | `list[dspy.Example]` | Training examples |
| `metric` | `callable` | Returns `float` or `dspy.Prediction(score=..., feedback=...)` |
| `max_steps` | `int` | Number of optimization steps |
| `bsize` | `int` | Mini-batch size |
## Outputs

| Output | Type | Description |
|---|---|---|
| `optimized_program` | `dspy.Module` | SIMBA-optimized program |
## Workflow

### Phase 1: Understand SIMBA
SIMBA (Stochastic Introspective Mini-Batch Ascent):
- Iterative prompt optimization with mini-batch sampling
- Identifies challenging examples with high output variability
- Generates self-reflective rules or adds successful demonstrations
- Lighter than GEPA (no reflection LM)
- More flexible than Bootstrap (uses feedback)
Comparison:
- MIPROv2: Best accuracy, lots of data
- GEPA: Agentic systems, expensive
- SIMBA: Custom feedback, budget-friendly
- Bootstrap: Simplest, demo-based
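The "high output variability" idea above can be sketched without any LM calls: score the same example across several stochastic runs and flag the examples whose scores swing the most. This is an illustrative sketch with hypothetical run scores, not SIMBA's internal implementation:

```python
import statistics

# Hypothetical per-example scores from several temperature > 0 runs.
runs_per_example = {
    "easy": [1.0, 1.0, 1.0, 1.0],  # stable success -> not informative
    "hard": [1.0, 0.0, 0.7, 0.0],  # volatile -> worth introspecting on
    "lost": [0.0, 0.0, 0.0, 0.0],  # stable failure
}

def most_variable(runs: dict[str, list[float]], top_k: int = 1) -> list[str]:
    """Rank example ids by score variance across runs (highest first)."""
    ranked = sorted(runs, key=lambda k: statistics.pvariance(runs[k]), reverse=True)
    return ranked[:top_k]

print(most_variable(runs_per_example))  # -> ['hard']
```

Examples like `"hard"` are the ones where a new rule or demonstration is most likely to change the outcome, which is why they are worth spending optimization budget on.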
### Phase 2: Basic SIMBA Optimization

```python
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# Program to optimize
class QAPipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.generate(question=question)

# Metric (can return a bare score or a Prediction with score and feedback)
def qa_metric(example, pred, trace=None):
    correct = example.answer.lower() in pred.answer.lower()
    return 1.0 if correct else 0.0

# SIMBA optimizer
optimizer = dspy.SIMBA(
    metric=qa_metric,
    max_steps=10,  # Optimization iterations
    bsize=5,       # Mini-batch size
)

program = QAPipeline()
compiled = optimizer.compile(program, trainset=trainset)
compiled.save("qa_simba.json")
```
### Phase 3: SIMBA with Feedback Signals

SIMBA works best with rich feedback:

```python
import dspy

def detailed_metric(example, pred, trace=None):
    """Metric with a feedback signal."""
    expected = example.answer.lower()
    actual = pred.answer.lower()
    if expected == actual:
        return dspy.Prediction(score=1.0, feedback="Perfect match")
    elif expected in actual:
        return dspy.Prediction(score=0.7, feedback=f"Contains answer but verbose: '{actual}'")
    else:
        overlap = len(set(expected.split()) & set(actual.split()))
        if overlap > 0:
            return dspy.Prediction(score=0.3, feedback=f"Partial overlap: {overlap} words")
        return dspy.Prediction(score=0.0, feedback=f"No match. Expected '{expected}'")

optimizer = dspy.SIMBA(
    metric=detailed_metric,
    max_steps=20,  # Optimization iterations
    bsize=8,       # Mini-batch size
)

compiled = optimizer.compile(program, trainset=trainset)
```
### Phase 4: Production Agent Optimization

```python
import logging

import dspy

logger = logging.getLogger(__name__)

# Define tools as functions
def search(query: str) -> str:
    """Search the knowledge base for relevant information."""
    retriever = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
    results = retriever(query, k=3)
    return "\n".join([r['text'] for r in results])

def calculate(expr: str) -> str:
    """Evaluate Python expressions safely."""
    try:
        with dspy.PythonInterpreter() as interp:
            return str(interp.execute(expr))
    except Exception as e:
        return f"Error: {e}"

class ResearchAgent(dspy.Module):
    def __init__(self):
        super().__init__()
        self.agent = dspy.ReAct(
            "question -> answer",
            tools=[search, calculate],
        )

    def forward(self, question):
        return self.agent(question=question)

def agent_metric(example, pred, trace=None):
    """Rich metric for agent optimization."""
    expected = example.answer.lower().strip()
    actual = pred.answer.lower().strip() if pred.answer else ""
    # Exact match
    if expected == actual:
        return dspy.Prediction(score=1.0, feedback="Correct answer")
    # Partial match
    if expected in actual:
        return dspy.Prediction(score=0.7, feedback="Answer contains expected result")
    # Check key terms
    expected_terms = set(expected.split())
    actual_terms = set(actual.split())
    overlap = len(expected_terms & actual_terms)
    if overlap >= len(expected_terms) * 0.5:
        return dspy.Prediction(score=0.5, feedback="50%+ term overlap")
    return dspy.Prediction(score=0.0, feedback=f"Incorrect: expected '{example.answer}'")

def optimize_agent(trainset, devset):
    """Full SIMBA optimization pipeline."""
    dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))
    agent = ResearchAgent()

    # Baseline evaluation (Evaluate expects a float-returning metric)
    eval_metric = lambda ex, pred, trace=None: float(agent_metric(ex, pred, trace).score)
    evaluator = dspy.Evaluate(devset=devset, metric=eval_metric, num_threads=4)
    baseline = evaluator(agent)
    logger.info(f"Baseline: {baseline}")

    # SIMBA optimization
    optimizer = dspy.SIMBA(
        metric=agent_metric,
        max_steps=25,  # Optimization iterations
        bsize=6,       # Mini-batch size
    )
    compiled = optimizer.compile(agent, trainset=trainset)

    # Evaluate the optimized program
    optimized = evaluator(compiled)
    logger.info(f"SIMBA optimized: {optimized}")

    compiled.save("research_agent_simba.json")
    return compiled
```
## Configuration

```python
optimizer = dspy.SIMBA(
    metric=metric_fn,
    max_steps=20,                   # Optimization iterations (default: 8)
    bsize=32,                       # Mini-batch size (default: 32)
    num_candidates=6,               # Candidates per iteration (default: 6)
    max_demos=4,                    # Max demos per predictor (default: 4)
    temperature_for_sampling=0.2,   # Sampling temperature (default: 0.2)
    temperature_for_candidates=0.2, # Candidate selection temperature (default: 0.2)
)
```
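These parameters also bound the optimization cost. The back-of-envelope below is an assumption-laden sketch, not SIMBA's exact accounting: it assumes each step scores every candidate program on one mini-batch.

```python
def estimate_metric_calls(max_steps: int, bsize: int, num_candidates: int) -> int:
    """Rough estimate of metric invocations during compile, assuming each
    step evaluates each candidate program on a single mini-batch."""
    return max_steps * bsize * num_candidates

# With the configuration above: 20 steps x 32 examples x 6 candidates
print(estimate_metric_calls(20, 32, 6))  # -> 3840
```

Even as a rough figure, this makes the cost levers obvious: halving `bsize` or `max_steps` halves the evaluation budget.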
## Best Practices

- Use feedback signals - SIMBA benefits from `dspy.Prediction(score=..., feedback=...)` objects
- Balance parameters - Adjust `bsize` (default 32) and `max_steps` (default 8) based on dataset size
- Patience - SIMBA is slower than Bootstrap, faster than GEPA
- Custom metrics - Best for scenarios with nuanced scoring (not binary)
- Tune temperatures - Lower temperatures (0.1-0.3) for exploitation, higher (0.5-1.0) for exploration
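Because metrics may return either a bare float or a score-plus-feedback object, evaluators that expect plain scores need a small adapter. A self-contained sketch (using a stand-in dataclass for `dspy.Prediction`, which is an assumption here to avoid the dspy import):

```python
from dataclasses import dataclass

# Stand-in for dspy.Prediction(score=..., feedback=...) so the sketch
# runs without dspy installed; the real class comes from dspy.
@dataclass
class ScoredFeedback:
    score: float
    feedback: str = ""

def as_float(metric_result) -> float:
    """Accept either a bare number or an object carrying .score."""
    if hasattr(metric_result, "score"):
        return float(metric_result.score)
    return float(metric_result)

print(as_float(0.7))                                   # -> 0.7
print(as_float(ScoredFeedback(1.0, "Perfect match")))  # -> 1.0
```

This lets one rich metric serve both SIMBA (which can use the feedback string) and a plain-score evaluation loop.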
## Limitations
- Newer optimizer, less battle-tested than MIPROv2
- Requires thoughtful metric design (garbage in, garbage out)
- Not as thorough as GEPA for agent optimization
- Mini-batch sampling adds variance to results
- No automatic prompt reflection like GEPA
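The variance limitation above suggests a simple mitigation: evaluate the compiled program several times and report a mean with spread, rather than trusting a single run. A generic sketch (the run scores below are hypothetical, not real benchmark numbers):

```python
import statistics

def summarize_runs(scores: list[float]) -> tuple[float, float]:
    """Return (mean, sample stdev) over repeated evaluation runs."""
    return statistics.mean(scores), statistics.stdev(scores)

runs = [62.0, 58.5, 60.0, 61.5]  # hypothetical Evaluate scores (%)
mean, spread = summarize_runs(runs)
print(f"{mean:.2f} +/- {spread:.2f}")  # -> 60.50 +/- 1.58
```

Reporting spread alongside the mean makes it clear whether an apparent improvement from SIMBA exceeds run-to-run noise.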
# Supported AI Coding Agents

This skill follows the SKILL.md standard and works with all major AI coding agents.

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.