
The AI Stack Every Indian Builder Is Debating Right Now

Loveneesh Dhir

Ask ten senior AI engineers at Indian product companies what stack they're running and you'll get eleven different opinions and a twenty-minute argument. Good. That means the field is actually moving.

The debates inside India's best AI teams right now are sharper and more opinionated than anything you'll find in a vendor blog or a conference talk. This is an attempt to lay out the live ones: the questions without clean answers yet, where genuinely smart people are landing in different places.

Build vs Buy vs Fine-tune

Everyone assumed this question would be settled by now. It isn't.

The "just use the API" camp is still strong for good reason. Frontier model capabilities are moving fast enough that two months of fine-tuning work often gets leapfrogged by the next API release. The ROI math is genuinely hard to justify in a lot of cases.

But the "we need to own the model" camp has stronger arguments than it did a year ago.

The live debate

Camp A: Fine-tuning is a trap. You're optimising for yesterday's ceiling.

Camp B: Prompt-engineering your way to production reliability at scale is a fantasy. At some point you have to own what you ship.

The honest answer nobody says out loud is that it depends on your use case, your team's actual ML depth, the quality of your data, and how stable your requirements are over the next year. The people worth listening to on this are the ones who've shipped both and can tell you specifically where each one broke.

RAG: Simple vs Sophisticated

RAG went from exciting research concept to production standard to overengineered mess in about eighteen months. Classic AI build cycle.

The basic pipeline: chunk documents, embed them, retrieve relevant chunks at query time, put them in context. It works. Works well enough that most teams built it, shipped it, and called it done. Then they hit production and found out that "works" and "works reliably with real user queries at scale" are very different things.
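The basic pipeline above can be sketched in a few dozen lines. This is a toy illustration, not a production recipe: the hash-based `embed` function is a stand-in for a real embedding model, and the fixed-size word chunker ignores the sentence and section boundaries a real chunker would respect.

```python
import hashlib
import math

def embed(text: str, dim: int = 64) -> list[float]:
    # Toy stand-in for an embedding model: hash each token into a
    # fixed-size vector, then L2-normalise. A real system would call
    # an embedding API or local model here.
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def chunk(doc: str, size: int = 40) -> list[str]:
    # Naive fixed-size chunking by word count.
    words = doc.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query: str, index: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    # Rank chunks by cosine similarity (dot product of unit vectors).
    q = embed(query)
    scored = sorted(index, key=lambda item: -sum(a * b for a, b in zip(q, item[1])))
    return [text for text, _ in scored[:k]]

# Index once at ingest time, retrieve at query time.
docs = ["refunds are processed within five business days",
        "the api rate limit is sixty requests per minute"]
index = [(c, embed(c)) for d in docs for c in chunk(d)]
context = retrieve("what is the rate limit", index, k=1)
```

Every line of this works in a demo. The production problems start when real queries don't share vocabulary with the right chunk, which is exactly the gap the debates below are about.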

What teams are actually arguing about

"Simple RAG is easy to build and hard to make good. Complex RAG is hard to build and hard to debug. Pick your suffering."

The teams shipping the best RAG in India right now share one thing: they've put serious work into evaluation infrastructure. They know exactly what their system gets wrong, how often, and why. That feedback loop is what separates production-grade from demo-grade. The architecture choices matter less than most people think.
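The evaluation infrastructure point is concrete enough to sketch. A minimal version is just a labelled set of queries run against the retriever, with failures kept for inspection; the `stub_retrieve` dictionary lookup below is a hypothetical stand-in for a real retriever.

```python
def retrieval_hit_rate(retrieve_fn, labeled):
    # labeled: list of (query, substring the retrieved text must contain).
    # Returns the hit rate plus the failing cases, because the failures
    # are the part the team actually reads.
    hits, failures = 0, []
    for query, expected in labeled:
        top = retrieve_fn(query)
        if expected in top:
            hits += 1
        else:
            failures.append((query, expected, top))
    return hits / len(labeled), failures

# Hypothetical retriever stub for illustration: keyword -> document.
fake_corpus = {"rate limit": "the api rate limit is sixty requests per minute"}
def stub_retrieve(query: str) -> str:
    return next((v for k, v in fake_corpus.items() if k in query), "")

score, failures = retrieval_hit_rate(stub_retrieve, [
    ("what is the rate limit", "sixty"),
    ("refund policy", "five business days"),
])
```

The machinery is trivial; the hard, valuable work is building and maintaining the labelled set.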

The Agent Question

The agent debate is the loudest one in the Indian AI builder community right now, and part of why it's so noisy is that the term means completely different things to different people.

For some teams it's a ReAct loop where a model reasons and uses tools. For others it's multi-agent orchestration with specialised sub-agents. For others it's a marketing label on a function-calling wrapper.
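The first of those meanings, a ReAct-style loop, fits in one function. Everything here is an illustrative assumption: the JSON action format is made up for the sketch, and the scripted callable stands in for a real LLM API call.

```python
import json

def react_loop(model, tools, question, max_steps=5):
    # Minimal ReAct-style loop: on each step the model either requests
    # a tool call or emits a final answer. `model` is any callable from
    # transcript -> JSON action string; in production it wraps an LLM.
    transcript = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        action = json.loads(model(transcript))
        if action["type"] == "final":
            return action["answer"]
        result = tools[action["tool"]](**action["args"])
        transcript.append({"role": "tool", "content": str(result)})
    raise RuntimeError("agent exceeded step budget")

# Scripted stand-in for the model so the loop runs offline.
def scripted_model(transcript):
    if transcript[-1]["role"] == "user":
        return json.dumps({"type": "tool", "tool": "add", "args": {"a": 2, "b": 3}})
    return json.dumps({"type": "final", "answer": transcript[-1]["content"]})

answer = react_loop(scripted_model, {"add": lambda a, b: a + b}, "what is 2+3?")
```

Note the step budget: even the simplest loop needs a hard bound, because an unbounded one is where the reliability arguments below begin.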

The questions that actually matter are about reliability and failure modes, not labels.

The thing nobody wants to say

Most "agent" systems in production are not actually agentic in any real sense. They're deterministic pipelines with LLM steps dressed up with agent vocabulary. And that's fine. A reliable pipeline that does one thing well will beat a flaky agent trying to do everything, every time.
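A "deterministic pipeline with LLM steps", in the sense above, is just ordinary code where only the individual steps call a model. A sketch, with a hypothetical support-ticket flow and a stub standing in for the completion call:

```python
def pipeline(llm, ticket: str) -> dict:
    # Fixed control flow: the code decides the sequence of steps, the
    # model only fills in each step. No planning, no loops, no tools.
    category = llm(f"Classify this support ticket in one word: {ticket}")
    summary = llm(f"Summarise in one sentence: {ticket}")
    reply = llm(f"Draft a {category} reply to: {summary}")
    return {"category": category, "summary": summary, "reply": reply}

# Stub model for illustration; returns canned answers per prompt prefix.
def stub_llm(prompt: str) -> str:
    if prompt.startswith("Classify"):
        return "billing"
    if prompt.startswith("Summarise"):
        return "customer was double charged"
    return "We are refunding the duplicate charge."

result = pipeline(stub_llm, "I was charged twice this month")
```

Every step is independently testable and the failure surface is bounded, which is precisely why this pattern keeps beating "real" agents in production.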

Observability: The Unglamorous Problem That Kills Products

The most underrated debate in the Indian AI builder community has nothing to do with models or architectures. It's about observability.

Traditional software observability (metrics, logs, traces) is necessary but nowhere near sufficient for AI systems. You can have perfect infrastructure monitoring and still have no idea why your LLM app is producing bad outputs for 8% of queries.
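Knowing that a bad-output number like that 8% even exists requires wrapping every model call in a trace record that can be graded after the fact. A minimal sketch, assuming an in-memory log and a hand-assigned quality flag (in practice the flag comes from user feedback or automated evals):

```python
import time
import uuid

call_log: list[dict] = []

def logged_call(llm, prompt: str) -> str:
    # Record every model call with a trace id. The `bad` flag starts
    # unknown and is filled in later by grading.
    record = {"trace_id": str(uuid.uuid4()), "ts": time.time(),
              "prompt": prompt, "output": llm(prompt), "bad": None}
    call_log.append(record)
    return record["output"]

def bad_output_rate() -> float:
    # Rate over graded calls only; ungraded calls are excluded.
    graded = [r for r in call_log if r["bad"] is not None]
    return sum(r["bad"] for r in graded) / len(graded) if graded else 0.0

# Example: make a few calls with a stub model, then grade them.
stub = lambda prompt: "ok"
for i in range(4):
    logged_call(stub, f"query {i}")
call_log[0]["bad"] = True
for record in call_log[1:]:
    record["bad"] = False
```

A real deployment would ship these records to a datastore rather than a Python list, but the shape of the discipline is the same: no record, no grading, no idea why 8% of queries go wrong.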

The teams that have shipped AI to production and kept it alive all share some version of the same practices.

Nobody gets excited about this work. There's no interesting architecture to show off. It's just engineering discipline applied to a non-deterministic system. And it's what separates a demo that impresses a conference from a product that users actually trust.

The Indian Context

One constraint runs through all of these debates: cost. By necessity, India's AI builders are more cost-conscious than teams at well-funded US startups, and that constraint is producing real engineering creativity.

The teams doing the most interesting work on inference optimisation, on smaller fine-tuned models, on caching strategies that cut API costs by 60 to 70 percent: a lot of them are Indian teams who couldn't afford to build the expensive way and had to find the smarter way instead.
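The simplest of those caching strategies is an exact-match cache keyed on a hash of the model and prompt. A sketch under that assumption; the class name and the savings you'd see are illustrative, and the bigger wins cited above usually come from normalisation or semantic matching layered on top:

```python
import hashlib

class PromptCache:
    # Exact-match response cache: identical (model, prompt) pairs are
    # served from memory instead of re-calling the API.
    def __init__(self):
        self.store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def complete(self, llm, model: str, prompt: str) -> str:
        key = hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        self.store[key] = llm(prompt)
        return self.store[key]

# Usage: repeated identical prompts hit the cache, not the API.
calls = {"n": 0}
def fake_llm(prompt: str) -> str:
    calls["n"] += 1
    return "cached answer"

cache = PromptCache()
for _ in range(3):
    out = cache.complete(fake_llm, "model-x", "what is the refund policy?")
```

Whether a cache like this saves 5% or 70% depends entirely on how repetitive your traffic is, which is why measuring the hit rate comes before celebrating the savings.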

That's not a handicap. It's a forcing function that produces leaner, more defensible systems. The Indian AI stack coming out of this period is going to look different from the American one. More cost-opinionated, more pragmatic, and probably more resilient for it.

These are the conversations happening inside the Cabal. If you're deep in any of this, you should be in the room.
