In 2018, Sony MAX came to us with a brief that sounded deceptively simple: build a Facebook Messenger chatbot for Bollywood fans.
I was running Cedex Technologies at the time — a software company I founded, focused on chatbots, AI, and new-age product experiences. We had built conversational products before. But this one was different.
The chatbot — which would become Filmykaant, the "Ultimate Bollywood Fan" — needed to answer questions about films, run trivia quizzes, share TV schedules, and keep fans hooked with gamified content. It needed to feel natural, fun, and distinctly Bollywood.
What we didn't fully appreciate at the start was what "natural" actually meant for this audience. It meant Hinglish. And Hinglish was a language that no NLP model in 2018 could reliably understand.
The Problem Nobody Had Solved
Hinglish is Hindi written in English script. It isn't a formal language with standardised grammar or consistent spelling. It's simply how hundreds of millions of people in India type — in WhatsApp messages, in comments, in search bars, in chatbots.
Here's what real user queries looked like:
"Yaar kal Sony MAX par kya aayega?"
(Friend, what's coming on Sony MAX tomorrow?)
"Bhai Amitabh ki best movie kaunsi hai?"
(Brother, what's Amitabh's best movie?)
In 2018, none of the major NLP platforms — not Dialogflow, not IBM Watson, not Wit.ai — had meaningful Hinglish support. There was no training data. No pre-built vocabulary. No existing model to fine-tune.
We had two options: build a rigid, keyword-based rule system and accept a stilted user experience, or build the NLP capability ourselves from scratch.
We chose the harder path.
Building the NLP Pipeline from Zero
The first challenge was data. You can't train a model without training examples, and there were none for Hinglish at scale.
We ran structured data collection exercises — surveys asking real Bollywood fans to type the kinds of questions they'd actually ask a chatbot. We gathered thousands of examples across different regions, age groups, and levels of Hindi fluency. Because Hinglish isn't one dialect; it's a spectrum. A 19-year-old in Mumbai types very differently from a 35-year-old in Lucknow.
From this corpus, we identified the core intent categories: film queries, actor queries, TV schedule queries, quiz participation, greetings and smalltalk, and a long tail of everything else.
Then came the painstaking part: hand-labelling every single utterance with intent tags and entity annotations — film title, actor name, broadcast date. No shortcut existed. We just did it.
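To make the labelling step concrete, here is roughly what one annotated utterance looked like as data. The field names and offsets below are illustrative, not the actual schema we used:

```python
# Illustrative annotation format for a single Hinglish utterance.
# Field names and values are hypothetical examples, not the real schema.
labelled_utterance = {
    "text": "Bhai Amitabh ki best movie kaunsi hai?",
    "intent": "actor_query",
    "entities": [
        # (start, end) are character offsets into "text"
        {"type": "actor_name", "value": "Amitabh", "start": 5, "end": 12},
    ],
}

# Sanity check: every entity's offsets must point at exactly its value
for ent in labelled_utterance["entities"]:
    assert labelled_utterance["text"][ent["start"]:ent["end"]] == ent["value"]
```

Keeping entities as character spans rather than bare strings matters once spellings vary: the span survives normalisation and re-tokenisation, while a string match may not.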
The model we built combined intent classification with entity extraction. Given the mixed-script, phonetically varied nature of Hinglish, a big chunk of the engineering work went into normalisation — handling spelling variations, abbreviations, and mid-sentence code-switching between Hindi and English.
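The normalisation idea can be sketched as a variant table that collapses many phonetic spellings into one canonical token before classification. The entries below are illustrative; the real coverage came from the survey corpus:

```python
import re

# Hypothetical variant table: several phonetic spellings map to one
# canonical token. Entries are illustrative examples only.
CANONICAL = {
    "kya": {"kya", "kyaa", "kia"},
    "kaunsi": {"kaunsi", "konsi"},
    "aayega": {"aayega", "ayega", "aaega"},
}

# Invert to a direct lookup: variant -> canonical form
VARIANT_TO_CANONICAL = {
    variant: canon for canon, variants in CANONICAL.items() for variant in variants
}

def normalise(utterance: str) -> str:
    """Lowercase, strip punctuation, and collapse known spelling variants."""
    tokens = re.findall(r"[a-z]+", utterance.lower())
    return " ".join(VARIANT_TO_CANONICAL.get(tok, tok) for tok in tokens)

print(normalise("Bhai konsi movie ayega?"))
# -> "bhai kaunsi movie aayega"
```

With variants collapsed this way, the downstream intent classifier sees a much smaller effective vocabulary, which matters when the training set is only thousands of examples.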
Designing for Failure (Gracefully)
Any NLP system built on a closed intent set will encounter queries it can't handle. The question isn't if users will go off-script — they always do. The question is what happens when they do.
We built a three-tier fallback architecture:
Tier 1 — Confident match: High-confidence intent result. The bot responds directly.
Tier 2 — Low-confidence match: Result returned below the confidence threshold. The bot asks a clarifying question before acting.
Tier 3 — No match: No reliable intent found. The bot acknowledges gracefully, offers a menu of popular actions, and — crucially — logs the query.
That last point mattered more than it might seem. Tier 3 wasn't treated as a failure mode. It was treated as a data collection mechanism. Every unhandled query was reviewed and, where appropriate, folded back into the training data. The model got meaningfully better over the deployment period because of this loop.
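The three tiers reduce to a small routing function over classifier confidence. This is a minimal sketch; the threshold values and return labels are assumptions, not our production numbers:

```python
from dataclasses import dataclass, field

@dataclass
class FallbackRouter:
    """Three-tier routing on classifier confidence. Thresholds illustrative."""
    confident: float = 0.80
    plausible: float = 0.40
    unhandled_log: list = field(default_factory=list)

    def route(self, query: str, intent: str, confidence: float) -> str:
        if confidence >= self.confident:
            return f"answer:{intent}"        # Tier 1: respond directly
        if confidence >= self.plausible:
            return f"clarify:{intent}"       # Tier 2: ask a clarifying question
        self.unhandled_log.append(query)     # Tier 3: log for review + retraining
        return "fallback_menu"

router = FallbackRouter()
print(router.route("kal kya aayega?", "schedule_query", 0.93))  # answer:schedule_query
print(router.route("amitabh?", "actor_query", 0.55))            # clarify:actor_query
print(router.route("shayari sunao", "unknown", 0.12))           # fallback_menu
print(router.unhandled_log)                                     # ['shayari sunao']
```

The key design point is that Tier 3 writes to a log as a side effect: the queries the model handles worst are exactly the ones that feed the next training cycle.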
The Product: Filmykaant
Technical capability is necessary, but it isn't sufficient. The character of the bot mattered just as much as its ability to parse queries.
We named the bot Filmykaant — a play on filmy (film-obsessed) and a classic Bollywood name. The persona was the "Ultimate Bollywood Fan": enthusiastic, trivia-obsessed, slightly dramatic, always ready with a famous dialogue.
The content we built around this persona included:
- A curated database of film and actor information
- Real-time TV schedule integration with Sony MAX's broadcast feed
- A Bollywood trivia quiz with daily challenges and a live leaderboard
- Gamified engagement mechanics: streaks, achievement badges, and competitive rankings
The quiz became the viral engine. Users challenged friends. Leaderboard positions got shared on social media. The daily format created a genuine reason to return every single day — which is the hardest thing to build in any consumer product.
The Numbers
At peak, Filmykaant handled:
| Metric | Number |
|---|---|
| Unique users (campaign period) | 2 million+ |
| Conversations per day (peak) | 1 million+ |
| Daily active users (sustained peak) | 25,000+ |
These weren't numbers we engineered for a press release. They were the outcome of a product that genuinely worked for its audience.
Facebook recognised Filmykaant as a Top 10 Global Chatbot and selected the project for the FBStart programme — their support initiative for the most promising Messenger applications worldwide.
What We Got Wrong
No honest case study skips the hard parts.
We underestimated the operational burden of content freshness. Film schedules change. Bollywood news moves fast. We had built the integration architecture, but keeping the content current required more ongoing effort than we'd scoped. In hindsight, we should have built better self-service tooling for the Sony MAX team from day one.
We over-indexed on the quiz initially. The quiz drove strong engagement numbers, but users who came for practical schedule information found the product too game-like. We recalibrated over time — but it would have been better to balance this in the original design.
The Tier 3 review process stayed manual for too long. Reviewing unhandled queries by hand was valuable, but it didn't scale. Building a semi-automated review pipeline earlier would have accelerated model improvement significantly. I should have prioritised that sooner.
What I Carry Forward
Filmykaant is one of the projects I'm most proud of in 12 years of building technology products.
Not because it was the most technically complex thing we built. But because it required us to solve a genuinely novel problem — a problem that no vendor, no open-source library, and no available dataset had addressed — and the solution we built served two million people well.
The lesson I've carried into every project since: the constraint is often the product. The absence of a Hinglish NLP model didn't mean we couldn't build the product. It meant we had to build the model first. And in doing so, we built something no competitor could easily replicate.
That's usually how the best products happen.
I'm Mahroof K — a technology leader with 12 years of experience delivering AI, chatbot, and digital product initiatives. I work with companies navigating complex technical builds, AI adoption, and product strategy as a consultant, Program Manager, or Product Manager.
If you're working on something where the constraint feels like a blocker — let's talk.