Building a Hybrid LLM Platform on EKS, Part 6: The Hybrid Router
Part 6 of our hands-on EKS series. We build a FastAPI router that sits in front of both vLLM and the Anthropic API, routes each request to the appropriate backend based on the requested model name and a set of complexity heuristics, and falls back to the Anthropic API while the local model is still cold-starting.
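
To make the routing idea concrete before we wire anything into FastAPI, here is a minimal sketch of the decision logic on its own. Everything named here is an assumption for illustration rather than the final implementation: the `claude-` prefix check, the `MAX_LOCAL_PROMPT_CHARS` threshold, the `COMPLEX_HINTS` keywords, and the `local_ready` flag (which we assume comes from some readiness probe against the vLLM pod).

```python
from dataclasses import dataclass
from enum import Enum


class Backend(str, Enum):
    LOCAL = "vllm"
    CLOUD = "anthropic"


@dataclass(frozen=True)
class RouteDecision:
    backend: Backend
    reason: str


# Hypothetical tuning knobs; real values would come from measurement.
CLOUD_MODEL_PREFIX = "claude-"
MAX_LOCAL_PROMPT_CHARS = 4000
COMPLEX_HINTS = ("prove", "refactor", "step by step")


def looks_complex(prompt: str) -> bool:
    """Crude complexity heuristic: very long prompts or certain keywords."""
    lowered = prompt.lower()
    too_long = len(prompt) > MAX_LOCAL_PROMPT_CHARS
    has_hint = any(hint in lowered for hint in COMPLEX_HINTS)
    return too_long or has_hint


def choose_backend(model: str, prompt: str, local_ready: bool) -> RouteDecision:
    """Pick a backend from the model name, local readiness, and the heuristic."""
    # 1. An explicitly requested cloud model always goes to the cloud.
    if model.startswith(CLOUD_MODEL_PREFIX):
        return RouteDecision(Backend.CLOUD, "cloud model requested by name")
    # 2. If the local model is still cold-starting, fall back to the cloud.
    if not local_ready:
        return RouteDecision(Backend.CLOUD, "local model not ready, falling back")
    # 3. Prompts that look complex are sent to the cloud.
    if looks_complex(prompt):
        return RouteDecision(Backend.CLOUD, "prompt judged complex")
    # 4. Everything else stays on the local vLLM backend.
    return RouteDecision(Backend.LOCAL, "default to local vLLM")
```

Keeping the decision in a pure function like this means it can be unit-tested without a running cluster; a FastAPI handler would only need to call `choose_backend(...)` and forward the request to whichever backend comes back.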