Implementing Intelligent LLM Routing in 5 Minutes
Quick tutorial to set up intelligent routing between GPT-4, Claude, and Gemini based on cost, quality, and latency. Copy-paste code included.
Optra AI Team
Engineering
Want to reduce your LLM costs by 30-40% with just a few lines of code? Intelligent routing is your answer. In this quick tutorial, we'll show you how to implement a production-ready routing system in less than 5 minutes.
What is Intelligent Routing?
Intelligent routing automatically selects the best LLM model for each request based on:
- Cost: how much each request costs across providers
- Quality: how capable a model needs to be for the task at hand
- Latency: how quickly a response comes back
5-Minute Implementation
Step 1: Install Dependencies
npm install openai @anthropic-ai/sdk @google/generative-ai
Step 2: Create the Router
interface RoutingDecision {
  provider: 'openai' | 'anthropic' | 'google'
  model: string
  reasoning: string
  estimatedCost: number
}
class IntelligentRouter {
  route(prompt: string): RoutingDecision {
    const complexity = this.analyzeComplexity(prompt)

    // Simple queries → cheap models
    if (complexity < 30) {
      return {
        provider: 'openai',
        model: 'gpt-3.5-turbo',
        reasoning: 'Low complexity, cost-optimized',
        estimatedCost: 0.0002
      }
    }

    // Medium complexity → balanced model
    if (complexity < 70) {
      return {
        provider: 'anthropic',
        model: 'claude-3-sonnet',
        reasoning: 'Medium complexity, balanced approach',
        estimatedCost: 0.0018
      }
    }

    // High complexity → premium model
    return {
      provider: 'openai',
      model: 'gpt-4-turbo',
      reasoning: 'High complexity, maximum quality',
      estimatedCost: 0.0200
    }
  }
  private analyzeComplexity(prompt: string): number {
    const lower = prompt.toLowerCase()
    let score = 0

    // Length factor
    if (prompt.length > 500) score += 30
    else if (prompt.length > 200) score += 15

    // Technical keywords
    const technicalKeywords = ['code', 'implement', 'algorithm', 'debug']
    if (technicalKeywords.some(kw => lower.includes(kw))) {
      score += 40
    }

    // Reasoning questions (case-insensitive)
    if (lower.includes('why') || lower.includes('how')) score += 20

    return Math.min(score, 100)
  }
}
Step 3: Use the Router
const router = new IntelligentRouter()

async function callLLMWithRouting(prompt: string) {
  const decision = router.route(prompt)
  console.log(`Routing to ${decision.model}: ${decision.reasoning}`)

  // Call the selected provider
  switch (decision.provider) {
    case 'openai':
      return callOpenAI(decision.model, prompt)
    case 'anthropic':
      return callAnthropic(decision.model, prompt)
    case 'google':
      return callGoogle(decision.model, prompt)
  }
}
Advanced Routing Strategies
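Step 3 calls provider helpers (callOpenAI, callAnthropic, callGoogle) that this tutorial doesn't define. A minimal sketch of one of them, assuming Node 18+ with the built-in fetch and an OPENAI_API_KEY environment variable (the buildChatRequest helper is ours, not part of any SDK):

```typescript
// Hypothetical helper assumed by Step 3. Uses the documented OpenAI
// chat completions REST endpoint directly so no SDK import is needed.
function buildChatRequest(model: string, prompt: string) {
  return {
    model,
    messages: [{ role: 'user', content: prompt }]
  }
}

async function callOpenAI(model: string, prompt: string): Promise<string> {
  const res = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`
    },
    body: JSON.stringify(buildChatRequest(model, prompt))
  })
  const data = await res.json()
  // Return the first completion's text, or an empty string if none came back
  return data.choices?.[0]?.message?.content ?? ''
}
```

The other two helpers follow the same shape against the @anthropic-ai/sdk and @google/generative-ai clients installed in Step 1.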
Cost-Based Routing
route(prompt: string, maxCost: number = 0.01): RoutingDecision {
  const candidates = [
    { provider: 'openai' as const, model: 'gpt-3.5-turbo', cost: 0.0002, quality: 0.85 },
    { provider: 'anthropic' as const, model: 'claude-3-sonnet', cost: 0.0018, quality: 0.92 },
    { provider: 'openai' as const, model: 'gpt-4-turbo', cost: 0.0200, quality: 0.95 }
  ]

  // Filter by budget, falling back to the cheapest model if nothing qualifies
  const affordable = candidates.filter(c => c.cost <= maxCost)
  const pool = affordable.length > 0 ? affordable : [candidates[0]]

  // Pick highest quality within budget
  const best = pool.sort((a, b) => b.quality - a.quality)[0]
  return {
    provider: best.provider,
    model: best.model,
    reasoning: `Best quality under $${maxCost} per request`,
    estimatedCost: best.cost
  }
}
Latency-Based Routing
route(prompt: string, maxLatency: number = 500): RoutingDecision {
  // Latency figures as above; per-request cost estimates are approximate
  const candidates = [
    { provider: 'anthropic' as const, model: 'claude-3-haiku', latency: 250, cost: 0.0003 },
    { provider: 'openai' as const, model: 'gpt-3.5-turbo', latency: 320, cost: 0.0002 },
    { provider: 'openai' as const, model: 'gpt-4-turbo', latency: 420, cost: 0.0200 }
  ]

  // Candidates are ordered fastest-first; fall back to the fastest if none qualify
  const pick = candidates.find(c => c.latency <= maxLatency) ?? candidates[0]
  return {
    provider: pick.provider,
    model: pick.model,
    reasoning: `Fastest model within ${maxLatency}ms target`,
    estimatedCost: pick.cost
  }
}
Testing Your Router
// Test with different query types
const testQueries = [
  "What's the capital of France?", // no complexity signals → gpt-3.5-turbo
  "Debug this TypeScript type error in my Express middleware", // technical keyword → claude-3-sonnet
  "Implement a production-ready JWT authentication flow with refresh tokens, explain how token rotation works and why it limits replay attacks, and include error handling for expired or malformed tokens in TypeScript." // long prompt + keyword + reasoning words → gpt-4-turbo
]

for (const query of testQueries) {
  const decision = router.route(query)
  console.log(`"${query}" → ${decision.model}`)
}
Common Pitfalls
- Hardcoded prices and model names go stale quickly; load them from config and revisit them as providers update their lineups.
- Naive substring checks misfire: "show" contains "how", so innocuous prompts can score as complex.
- Budget or latency filters can leave you with no candidates; always define a fallback model instead of returning undefined.
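One subtle bug in analyzeComplexity above: prompt.includes('how') also matches words like "show". A word-boundary version of the check avoids this; the matchesKeyword helper below is a sketch, not part of the tutorial's router:

```typescript
// Stricter replacement for the substring checks in analyzeComplexity:
// match whole words only, so "show" no longer triggers the 'how' heuristic.
function matchesKeyword(prompt: string, keyword: string): boolean {
  // \b marks a word boundary; the 'i' flag makes the match case-insensitive
  const pattern = new RegExp(`\\b${keyword}\\b`, 'i')
  return pattern.test(prompt)
}
```

Usage: matchesKeyword('Show me the file', 'how') is false, while matchesKeyword('How does this work', 'how') is true.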
Next Steps
Start with this simple implementation and iterate based on your actual usage patterns. Most teams see a 30-40% cost reduction right away.
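To iterate on thresholds with real data, it helps to log every routing decision and compare estimated spend against an all-premium baseline. A minimal sketch (the RoutingTracker class and the default baseline price are illustrative, not part of the tutorial's router):

```typescript
interface LoggedDecision {
  model: string
  estimatedCost: number
}

// Illustrative tracker: accumulates router decisions so you can compare
// estimated spend against routing every request to the premium model.
class RoutingTracker {
  private decisions: LoggedDecision[] = []

  log(decision: LoggedDecision): void {
    this.decisions.push(decision)
  }

  // Savings vs. sending every request to a model costing `premiumCost` per call
  savingsVsPremium(premiumCost: number = 0.02): number {
    const actual = this.decisions.reduce((sum, d) => sum + d.estimatedCost, 0)
    const baseline = this.decisions.length * premiumCost
    return baseline - actual
  }
}
```

Since RoutingDecision carries both model and estimatedCost, you can feed the router's output straight in: tracker.log(router.route(prompt)), then read tracker.savingsVsPremium() periodically.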