Agents and SLMs Are Eating the World (While LLMs Fight for Relevance)
Copyright: Sanjay Basu
Remember that Anthropic piece from December about building effective agents? The one that told us to keep things simple, avoid over-engineering, and that agents were just “LLMs using tools based on environmental feedback in a loop”? Well, eight months later, that advice feels both prescient and already dated.
The landscape has shifted dramatically. What we’re seeing now isn’t just the maturation of agentic systems. It’s a fundamental restructuring of how we think about AI deployment. And the big, general-purpose LLMs? They’re scrambling to justify their existence in ways we didn’t see coming.
The SLM Revolution Nobody Saw Coming (Except Everyone)
Here’s what’s changed: Small Language Models aren’t just “good enough” anymore; they’re often better for specific tasks. That routing pattern Anthropic described, where you send easy queries to smaller models? That’s become the entire playbook for a new generation of companies.
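The routing pattern is simple enough to sketch in a few lines. This is an illustrative toy, not a real API: `call_small_model`, `call_large_model`, and the keyword heuristic are all hypothetical stand-ins (production routers typically use a small classifier model rather than keyword matching).

```python
# Sketch of the routing pattern: a cheap difficulty check decides whether a
# query goes to a small specialized model or a large general-purpose one.
# All function names here are hypothetical stand-ins.

SIMPLE_INTENTS = {"refund", "hours", "shipping", "password"}

def classify(query: str) -> str:
    """Toy heuristic; a real router would use a small classifier model."""
    words = query.lower().split()
    if any(w in SIMPLE_INTENTS for w in words) and len(words) < 20:
        return "simple"
    return "complex"

def call_small_model(query: str) -> str:
    return f"[2B model] handled: {query}"

def call_large_model(query: str) -> str:
    return f"[frontier model] handled: {query}"

def route(query: str) -> str:
    # The whole point: most traffic never touches the expensive model.
    if classify(query) == "simple":
        return call_small_model(query)
    return call_large_model(query)

print(route("What is your refund policy?"))
print(route("Draft a multi-year supply chain risk analysis for our vendors."))
```

The economic win comes from the asymmetry: the router itself costs almost nothing per query, while every request it diverts away from the frontier model saves orders of magnitude in compute.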
We’re seeing 1–3B parameter models absolutely crushing it in narrow domains. Customer support? A fine-tuned 2B model knows your product documentation better than GPT-4o ever will. Code completion for your specific tech stack? That specialized model trained on your codebase beats Claude Opus hands down. The key insight isn’t that smaller is better; it’s that specialized is better, and smaller models are far more specializable. (I made a similar argument five months ago: https://www.linkedin.com/pulse/small-models-big-impact-sanjay-basu-phd-kl98c/.)
The economics are brutal for the big players. Running a 400B+ parameter model for simple classification tasks is like using a Ferrari for grocery runs. Companies have figured this out, and they’re voting with their wallets.
Agents Moving From Experiment to Infrastructure
That framework skepticism from the December article? Still valid, but the market has moved on. We’re not debating whether to use LangGraph or build from scratch anymore. We’re deploying agents that run entire business functions.
The orchestrator-worker pattern has evolved into something more sophisticated: hierarchical agent swarms where specialized SLMs handle specific subtasks while a larger model (sometimes) coordinates. But here’s the kicker. Even the orchestrator role is getting disrupted. Why use a massive LLM to coordinate when a fine-tuned 7B model with the right scaffolding does it better?
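The orchestrator-worker shape can be sketched minimally. Everything below is illustrative: the worker lambdas stand in for fine-tuned SLMs, and the hard-coded plan stands in for what a 7B-class coordinator would emit dynamically. None of this is a real framework API.

```python
# Sketch of the orchestrator-worker pattern: a coordinator decomposes a task
# into subtasks and dispatches each to a specialized "worker" (a stand-in
# for a fine-tuned SLM). Names and plans here are purely illustrative.

from typing import Callable

WORKERS: dict[str, Callable[[str], str]] = {
    "summarize": lambda text: f"summary({text[:20]}...)",
    "classify":  lambda text: "positive" if "great" in text else "neutral",
    "extract":   lambda text: " ".join(w for w in text.split() if w.istitle()),
}

def orchestrate(task: str, payload: str) -> dict[str, str]:
    """A coordinator model would produce this plan; here it's hard-coded."""
    plan = {
        "review_triage": ["classify", "summarize"],
        "entity_report": ["extract", "summarize"],
    }[task]
    # Run each subtask on its specialist and collect the results.
    return {step: WORKERS[step](payload) for step in plan}

print(orchestrate("review_triage", "The product is great but shipping was slow"))
```

The disruption mentioned above falls out naturally: nothing in the coordinator loop requires a frontier model, so the orchestrator slot is just another place where a small fine-tuned model can be swapped in.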
What’s really wild is how agents have made the jump from “interesting demo” to “critical infrastructure.” We’re seeing agents that:
- Run 24/7 monitoring operations with failure rates lower than human teams
- Manage entire codebases with minimal human oversight
- Handle complex multi-vendor negotiations in supply chains
- Execute trading strategies that adapt to market conditions in real time
The “trust” threshold Anthropic mentioned? We’ve blown past it. Not because agents got more trustworthy, but because we got better at building guardrails, checkpoints, and rollback mechanisms.
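The checkpoint-and-rollback idea is worth making concrete. This is a minimal sketch under assumed semantics: agent actions mutate a checkpointed state, and a validation guardrail can undo any change that violates an invariant. The class and function names are hypothetical.

```python
# Sketch of guardrails with checkpoint/rollback: every agent action runs
# against state that can be restored if validation fails. Illustrative only.

import copy

class GuardedState:
    def __init__(self, data: dict):
        self.data = data
        self._checkpoints: list[dict] = []

    def checkpoint(self) -> None:
        # Snapshot the state before an action touches it.
        self._checkpoints.append(copy.deepcopy(self.data))

    def rollback(self) -> None:
        # Restore the most recent snapshot.
        self.data = self._checkpoints.pop()

def apply_action(state: GuardedState, action, validate) -> bool:
    state.checkpoint()
    action(state.data)
    if not validate(state.data):
        state.rollback()  # guardrail tripped: undo the change
        return False
    return True

state = GuardedState({"balance": 100})
ok = apply_action(
    state,
    lambda d: d.update(balance=d["balance"] - 500),
    validate=lambda d: d["balance"] >= 0,
)
print(ok, state.data)  # overdraft rejected, state restored
```

This is the shape of the shift described above: the agent itself didn’t have to become more trustworthy, because the harness around it makes bad actions recoverable.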
The LLM Empire Strikes Back
So how are the Claude 4s and GPT-5s of the world responding? Three main strategies.
1. The “Kitchen Sink” Approach: Throw every capability at the wall. Multimodal everything. Code execution. Web browsing. Voice. Video. The bet is that convenience and integration win over performance. It’s the Microsoft Office strategy. Nobody uses all of Excel, but everyone needs some of it.
2. The “Reasoning” Play: This is where the big models are doubling down hardest. Complex, multi-step reasoning that requires holding massive contexts and making non-obvious connections. The stuff that genuinely needs 400B+ parameters. Think scientific research, complex legal analysis, strategic planning. The problem? The market for this is smaller than they hoped.
3. The “Platform” Pivot: Stop competing on model performance, start competing on ecosystem. OpenAI’s GPTs, Anthropic’s Claude Projects, Google’s Vertex AI, they’re all trying to become the AWS of AI, where the model is just one part of a larger platform play.
The Inconvenient Truth
Here’s what nobody wants to admit: most business problems don’t need artificial general intelligence. They need artificial specific intelligence. A model that can write poetry, code in 50 languages, and explain quantum physics is impressive, but your customer support team just needs something that knows your refund policy inside and out.
The December article got it right about keeping things simple. But “simple” now means a constellation of specialized models and agents, each doing one thing extremely well, rather than one massive model trying to do everything.
What’s Next?
We’re heading toward a bifurcated market. On one end, commodity SLMs and agents handling 90% of practical AI applications. On the other, a handful of frontier models competing for the remaining 10%: the genuinely hard problems that need massive scale and generalization. The companies that win won’t be the ones with the biggest models. They’ll be the ones who figure out how to orchestrate armies of specialized agents, each running on the smallest possible model that gets the job done.
That advice from December about finding the simplest solution? Still golden. It’s just that “simple” now means something very different than it did eight months ago.
The real question isn’t whether agents and SLMs are taking center stage. They already have. It’s whether the general-purpose LLMs can find a sustainable niche before they price themselves out of existence.
And of course, there’s the next wave: Physical AI, with artificial minds (LLMs?) embodied in agents.