Build a Multilingual Chatbot for Hindi + English + Tamil (Without Spending Lakhs)

A Coimbatore textile exporter we work with serves Tamil-only retailers, Hinglish-typing wholesalers, and English-speaking export agents, sometimes all three in the same WhatsApp thread. Employing 14 multilingual support agents was costing them ₹6.8 lakh/month. We replaced 70% of that with a Claude + AI4Bharat IndicXlit pipeline that costs ₹38,000/month at 4,200 conversations/day. This post covers exactly how it works, where it breaks, and the cost math at 10k, 100k, and 1M conversations/month.

## TL;DR: what a multilingual Indian chatbot actually costs in 2026

Claude Haiku 4.5 + AI4Bharat IndicXlit (free, open-source) + WhatsApp Cloud API gets you a production Hindi/Tamil/Hinglish chatbot. Cost per conversation: ₹1.20 at 10k/month, ₹0.95 at 100k/month, ₹0.84 at 1M/month, all in. The biggest cost driver is WhatsApp marketing-message pricing (₹0.8631/message), not the LLM.
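To make the per-conversation figures concrete, here is a minimal sketch that multiplies them out into monthly totals. It uses only the three rates quoted above; the function and dictionary names are illustrative, and it does not attempt to interpolate between volumes.

```python
# Per-conversation rates quoted in the TL;DR (INR, all-in).
RATE_PER_CONVERSATION = {
    10_000: 1.20,
    100_000: 0.95,
    1_000_000: 0.84,
}


def monthly_cost(conversations: int) -> float:
    """Monthly spend in INR at one of the three quoted volumes."""
    return round(conversations * RATE_PER_CONVERSATION[conversations], 2)


for volume in RATE_PER_CONVERSATION:
    print(f"{volume:>9,} conversations/month -> ₹{monthly_cost(volume):,.0f}")
```

That works out to roughly ₹12,000, ₹95,000, and ₹8.4 lakh per month at the three volumes.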

Scheduled Indic Languages (AI4Bharat IndicTrans2)

- ₹0.84: cost per conversation at 1M/month
- 73%: Indian mobile users typing in Roman script
- 11M: parameters in the IndicXlit transliteration model

## Why this matters now (April 2026)

Three things shifted. Claude Sonnet 4.6 and Haiku 4.5 both now handle Hindi and Tamil at near-English quality on chat tasks, measurable on the IndicXNLI and Belebele benchmarks. AI4Bharat released IndicTrans2 with full Hugging Face inference support and RoPE-extended 2048-token context. And WhatsApp's January 2026 pricing shift from per-conversation to per-template-message dropped effective costs ~30% for utility-heavy flows. A multilingual SMB chatbot that cost ₹2 lakh/month to run in 2024 now costs ₹40,000.

## The actual hard problem: code-switching, not translation

Translation is the easy part. The hard part is Hinglish, a message like "Bhai 50 packets ka rate kya hai? Delivery Mumbai Bandra mein chahiye, urgent hai please." This is Hindi grammar in Roman script, with no Devanagari and with English nouns sprinkled in. Naïve language detection routes it to Hindi (because of "kya", "hai"), translation pipelines fail (the source isn't really Hindi), and English-only LLMs get half the meaning. The fix is a four-stage pipeline:

1. Script Detection. Character-level script classifier (Devanagari / Tamil / Latin / mixed). Runs in <5ms per message using simple Unicode-range buckets.
2. Transliteration (Roman → Native). AI4Bharat IndicXlit converts Roman-script Hindi/Tamil back to native script when needed. 11M-param transformer, runs on CPU at ~80ms/message.
3. Claude with Language Pin. The system prompt pins Claude to "answer in the same script and language the user wrote in"; Haiku 4.5 handles this natively for 14 Indic languages.
4. Tamil-Specific Post-Processing. Tamil agglutination still trips Claude on long compounds. We run an IndicNLP morphology check on outputs longer than 60 chars to catch joining errors.
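The four stages can be wired together roughly as follows. This is a sketch of the control flow only: the four callables are hypothetical stand-ins for the real stages (they are passed in as parameters), and `want_native_log` is an assumed flag for the "when needed" condition on transliteration.

```python
from typing import Callable, Optional

TAMIL_CHECK_MIN_CHARS = 60  # threshold quoted for the Tamil post-processing stage


def contains_tamil(text: str) -> bool:
    """True if any character falls in the Tamil Unicode block."""
    return any(0x0B80 <= ord(ch) <= 0x0BFF for ch in text)


def run_pipeline(
    message: str,
    detect_script: Callable[[str], str],   # stage 1 (hypothetical callable)
    transliterate: Callable[[str], str],   # stage 2 (hypothetical callable)
    ask_llm: Callable[[str], str],         # stage 3 (hypothetical callable)
    check_tamil: Callable[[str], str],     # stage 4 (hypothetical callable)
    want_native_log: bool = False,
) -> dict:
    script = detect_script(message)

    # Stage 2 is optional: only Roman-script input that we want stored natively.
    native_form: Optional[str] = None
    if want_native_log and script == "latin":
        native_form = transliterate(message)

    # Stage 3 always receives the user's original text, not the transliteration.
    answer = ask_llm(message)

    # Stage 4 runs only on long Tamil-script answers.
    if contains_tamil(answer) and len(answer) > TAMIL_CHECK_MIN_CHARS:
        answer = check_tamil(answer)

    return {"script": script, "native_form": native_form, "answer": answer}
```

Passing the stages in as callables keeps the orchestration testable with stubs before any model is loaded.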

## Cost at three SMB scales: the math your CFO actually wants

Breakeven against a single human support agent (~₹35k/mo total cost including infra) is reached at roughly 4,000 conversations/month. Below that, hiring is cheaper. Above 10k/month, the bot wins decisively and quality stops scaling with team headcount.

## The DIY walkthrough: code that actually works

We run this on a Hetzner CCX23 (₹2,400/month, with 4 vCPU for IndicXlit) running Ubuntu 24.04 + Python 3.11. You also need a WhatsApp Cloud API account (Meta-direct, no BSP markup) and an Anthropic API key.

### Step 1: install dependencies

bash

pip install anthropic==0.42.0 \
              transformers==4.45.0 \
              torch==2.5.0 \
              indic-nlp-library==0.92 \
              fastapi==0.115.0 \
              uvicorn==0.32.0 \
              httpx==0.27.2

### Step 2: script detection (fast, deterministic)

python

# script_detect.py
import unicodedata
from collections import Counter

SCRIPT_RANGES = {
    "devanagari": (0x0900, 0x097F),  # Hindi, Marathi, Sanskrit
    "tamil": (0x0B80, 0x0BFF),
    "telugu": (0x0C00, 0x0C7F),
    "bengali": (0x0980, 0x09FF),
    "latin": (0x0000, 0x024F),
}

def detect_script(text: str) -> str:
    counts = Counter()
    for ch in text:
        # Skip separators, numbers, punctuation, symbols and control chars,
        # so that "?" or "₹" is never counted as Latin.
        if unicodedata.category(ch)[0] in "ZNPSC":
            continue
        cp = ord(ch)
        for name, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                counts[name] += 1
                break
    if not counts:
        return "unknown"
    top, top_count = counts.most_common(1)[0]
    # Compare against the counted characters only. Using len(text) here would
    # include the skipped spaces and digits and mislabel plain Roman text as mixed.
    if top_count < sum(counts.values()) * 0.8:
        return "mixed"
    return top

# Example
print(detect_script("Bhai 50 packets ka rate?"))  # 'latin' -> Hinglish candidate
print(detect_script("भाई 50 पैकेट का रेट?"))   # 'devanagari'
print(detect_script("50 பேக்கட் விலை?"))      # 'tamil'
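A `'latin'` result only says the message is in Roman script; it could be English or Hinglish. One way to separate the two is a marker-word ratio, sketched below. The word list and the 0.2 threshold are illustrative assumptions, not values from our production setup, and a short list like this will miss plenty of real Hinglish.

```python
import re

# Illustrative, hand-picked Romanized Hindi function words (assumption, not exhaustive).
HINDI_MARKERS = {
    "hai", "kya", "ka", "ki", "ke", "mein", "nahi",
    "chahiye", "bhai", "kab", "kitna",
}


def hinglish_score(text: str) -> float:
    """Share of Roman-script words that are known Hindi markers (0.0 to 1.0)."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    return sum(word in HINDI_MARKERS for word in words) / len(words)


def is_hinglish_candidate(text: str, threshold: float = 0.2) -> bool:
    """Heuristic flag; the threshold is a guess to tune on real messages."""
    return hinglish_score(text) >= threshold


print(is_hinglish_candidate("Bhai 50 packets ka rate kya hai?"))
print(is_hinglish_candidate("What is the rate for 50 packets?"))
```

The flag is useful for analytics and routing; the reply itself can still go through Claude unchanged.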

### Step 3: IndicXlit transliteration (Roman → Devanagari/Tamil when needed)

python

# transliterate.py
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# One-time load (~400MB).
# Confirm this model ID resolves on the Hugging Face Hub before deploying.
# trust_remote_code=True executes code shipped in the model repo.
MODEL_NAME = "ai4bharat/indic-xlit-roman-to-indic"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME, trust_remote_code=True)

def roman_to_indic(text: str, target: str = "hin_Deva") -> str:
    """Convert Roman-script Hindi/Tamil into native script."""
    inputs = tokenizer(text, return_tensors="pt", src_lang="eng_Latn", tgt_lang=target)
    # generate() expects the encoded tensors as keyword arguments,
    # so the tokenizer output is unpacked with **.
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# We only call this when we want to embed/log native script
# Claude itself handles the Roman input fine in May 2026

For most flows we don't transliterate before sending to Claude: Haiku 4.5 understands Hinglish in Roman script natively. We transliterate only when we want to log queries in canonical form for analytics or to feed a search index.

### Step 4: the Claude system prompt that handles all three languages

python

# claude_handler.py
from anthropic import Anthropic
import os

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

SYSTEM_PROMPT = """You are a customer support assistant for [Client] textile exports.

Language rules (CRITICAL):
- Detect the user's language and script from their message.
- ALWAYS reply in the SAME language and SAME script the user used.
- If the user typed Hindi in Roman script ("Bhai kya rate hai"), reply in Hindi in Roman script.
- If the user typed Tamil in Tamil script, reply in Tamil in Tamil script.
- If the user mixes English + Hindi (Hinglish), match their mix proportion.
- NEVER translate the user's message back to them. Just answer.

Knowledge rules:
- Answer only from the catalog context provided below.
- If you don't know, say so in the user's language. Suggest they ask a human.
- Never invent prices, dates, or stock numbers.

Tone rules:
- Address the user as "bhai" / "ji" / "saar" matching their register.
- Keep replies under 60 words unless asked for detail.

Catalog context:
{context}

Conversation so far:
{history}"""

def reply(message: str, context: str, history: str = "") -> str:
    response = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=300,
        temperature=0.2,
        # The current user message is sent once, in `messages`. The earlier
        # version also pasted it into the system prompt with a trailing
        # "Assistant:", which the Messages API does not need.
        system=SYSTEM_PROMPT.format(context=context, history=history),
        messages=[{"role": "user", "content": message}],
    )
    return response.content[0].text
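The handler above passes history as a plain string. If you store turns as structured data instead, a small helper can turn them into the `messages` list the API expects. This is a sketch under assumptions: the `(role, text)` tuple format, the `build_messages` name, and the six-turn window are ours for illustration, and the merging of same-role turns is a conservative choice so the list always starts with a user turn and alternates roles.

```python
Turn = tuple[str, str]  # (role, text); role is "user" or "assistant"


def build_messages(turns: list[Turn], message: str, max_turns: int = 6) -> list[dict]:
    """Convert stored turns plus the new user message into a messages list."""
    recent = list(turns[-max_turns:]) if max_turns > 0 else []
    recent.append(("user", message))

    merged: list[dict] = []
    for role, text in recent:
        if role not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {role!r}")
        if merged and merged[-1]["role"] == role:
            # Two WhatsApp messages in a row from the same side become one turn.
            merged[-1]["content"] += "\n" + text
        else:
            merged.append({"role": role, "content": text})

    # Drop any leading assistant turns left over from truncation.
    while merged and merged[0]["role"] != "user":
        merged.pop(0)
    return merged
```

The result can be passed as `messages=` in place of the single-item list, with the `{history}` slot in the system prompt left empty.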

The "match the user's script" rule is the one that moves trust scores. Without it, Tamil retailers see English replies and bounce. With it, our Coimbatore client's CSAT moved from 3.4/5 (when we tested an English-only bot for two weeks) to 4.6/5 within the first month.

### Step 5: WhatsApp Cloud API webhook (the entry point)

python

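# webhook.py
from fastapi import FastAPI, Request
from script_detect import detect_script
from claude_handler import reply
import asyncio, httpx, os

app = FastAPI()
WHATSAPP_TOKEN = os.environ["WA_TOKEN"]
PHONE_ID = os.environ["WA_PHONE_ID"]

def retrieve_catalog_context(query: str) -> str:
    # Placeholder: swap in the RAG retrieval from the previous post.
    return ""

@app.post("/webhook")
async def webhook(req: Request):
    body = await req.json()
    value = body["entry"][0]["changes"][0]["value"]
    # Delivery/read status callbacks arrive without a "messages" key, and
    # media messages carry no "text" body. Acknowledge and skip both.
    messages = value.get("messages") or []
    if not messages or messages[0].get("type") != "text":
        return {"status": "ignored"}
    msg = messages[0]
    user_text = msg["text"]["body"]
    user_id = msg["from"]
    script = detect_script(user_text)  # not used below; available for routing or logging

    # Pull catalog context (RAG retrieval from previous post)
    context = retrieve_catalog_context(user_text)

    # reply() is a blocking SDK call, so run it off the event loop
    answer = await asyncio.to_thread(reply, user_text, context)

    # Send via WhatsApp Cloud API
    async with httpx.AsyncClient() as cli:
        await cli.post(
            f"https://graph.facebook.com/v20.0/{PHONE_ID}/messages",
            headers={"Authorization": f"Bearer {WHATSAPP_TOKEN}"},
            json={
                "messaging_product": "whatsapp",
                "to": user_id,
                "type": "text",
                "text": {"body": answer},
            },
        )
    return {"status": "ok"}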
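The POST handler above is not the whole webhook contract. Meta first verifies the URL with a GET request carrying `hub.mode`, `hub.verify_token`, and `hub.challenge`, and it signs each POST body in the `X-Hub-Signature-256` header. A minimal sketch of both checks as plain functions follows; the function names and the way you store the verify token and app secret are assumptions, and wiring them into FastAPI routes is left out.

```python
import hashlib
import hmac
from typing import Optional


def verify_subscription(params: dict, verify_token: str) -> Optional[str]:
    """Return the challenge to echo back if the GET handshake is valid, else None."""
    supplied = params.get("hub.verify_token", "")
    token_ok = hmac.compare_digest(supplied.encode(), verify_token.encode())
    if params.get("hub.mode") == "subscribe" and token_ok:
        return params.get("hub.challenge")
    return None


def signature_is_valid(raw_body: bytes, header_value: Optional[str], app_secret: str) -> bool:
    """Check the 'sha256=<hex>' header against an HMAC of the raw request body."""
    digest = hmac.new(app_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    expected = f"sha256={digest}"
    return hmac.compare_digest(expected.encode(), (header_value or "").encode())
```

The signature must be computed over the raw bytes of the request, before JSON parsing, so in FastAPI that means reading `await req.body()` rather than `await req.json()` for this step.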

You should now see a working end-to-end flow: customer sends Hinglish/Tamil/Hindi/English on WhatsApp → your server detects script → retrieves catalog context → Claude replies in matching language → reply lands back on WhatsApp.

## Common mistakes: five we keep cleaning up

Mistake 1: Routing transliterated text to "Hindi-only" LLMs. Older Indic-tuned models like XLM-R-Indic fail on code-switching. Claude Haiku 4.5 handles Hinglish natively. Use the latest commercial models for code-mixed input.

Mistake 2: Translating back to English internally, then re-translating to the user's language. Every round trip loses semantic precision. Process in the user's language end-to-end. Only translate when logging for analytics.

Mistake 3: Ignoring Tamil compound words. Tamil is agglutinative: "வாங்கிக்கொண்டிருக்கிறேன்" is a single word meaning "I am buying". LLMs sometimes break compounds mid-word in long answers. Run an IndicNLP morphology check on Tamil outputs over 60 chars.

Mistake 4: Skipping native-script output for older buyers. Some Hindi-speaking buyers above 50 prefer to see replies in Devanagari, even if they typed in Roman. Build a per-user preference store so a stored choice can override the script-matching rule. Don't rely on script detection alone.

Mistake 5: Paying WhatsApp marketing rates for utility flows. A "your order has shipped" message is a utility message (~₹0.115). If you send it via a marketing template, you pay ~₹0.86, which is 7.5x more. Classify the message category correctly before sending.
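To put a number on Mistake 5, the sketch below prices a batch of template messages by category using the per-message rates quoted in this post. The function names are illustrative, and the rates should be checked against Meta's current rate card before budgeting.

```python
# Per-template-message rates quoted in this post (INR); service replies are free.
META_RATE_INR = {
    "marketing": 0.8631,
    "utility": 0.115,
    "authentication": 0.115,
    "service": 0.0,
}


def template_cost(category: str, messages: int) -> float:
    """Cost in INR of sending `messages` templates in one category."""
    return round(META_RATE_INR[category] * messages, 2)


def misclassification_overpay(messages: int) -> float:
    """Extra INR paid when utility messages go out as marketing templates."""
    return round(template_cost("marketing", messages) - template_cost("utility", messages), 2)


print(template_cost("utility", 10_000))
print(template_cost("marketing", 10_000))
print(misclassification_overpay(10_000))
```

For 10,000 shipping notifications, that is ₹1,150 as utility versus ₹8,631 as marketing, an overpayment of ₹7,481.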

PII gotcha: WhatsApp messages on the Cloud API hit Meta servers — DPDP Act (2023) classifies them as cross-border data transfer. For Indian customers, ensure your privacy policy discloses Meta as a sub-processor. For high-sensitivity data (banking, medical), use WhatsApp's On-Premises API or move to a domestic-hosted channel.

## Real example: Coimbatore textile exporter, 4,200 conv/day

Our client ships sarees and home textiles to retailers across Tamil Nadu, Karnataka, and Maharashtra. WhatsApp is their primary order channel.

Before the bot: 14 multilingual support agents, ₹6.8L/month, 9–11 hour response time on weekends.

After the bot (April 2026 build): the bot handles 70% of conversations end-to-end, and 30% escalate to 4 senior agents. Total monthly cost: ₹40,000 (bot) + ₹1.4L (agents) = ₹1.8L. Savings: ₹5L/month. Response time: under 12 seconds for 92% of bot-handled conversations.

This pattern came from the same voice-AI work we do on our in-house product [TalkDrill](https://talkdrill.com), where multilingual conversation handling is the core problem.

You serve at least two of: Hindi, Tamil, Telugu, Bengali, Marathi, English speakers
You handle ≥4,000 conversations/month (below this, hire a person)
You have catalog / FAQ / policy content to ground answers (RAG corpus)
You have a WhatsApp Business account approved (or willingness to wait 7–14 days)
You added language-match rule to your system prompt and tested with 30 real Hinglish samples
You built a fallback to a human within 2 message turns for unresolved queries
You logged the first 500 conversations and reviewed them before going live
You set up a per-user language preference store for repeat buyers
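The per-user preference store from the last checklist item can start very small. The sketch below is an in-memory version, assuming the user ID is the WhatsApp sender number and that script names match the ones returned by script detection; the class and method names are hypothetical, and a real deployment would back this with a database so preferences survive restarts.

```python
class ScriptPreferenceStore:
    """Remembers an explicit reply-script choice per user (in-memory sketch)."""

    def __init__(self) -> None:
        self._prefs: dict[str, str] = {}

    def set_preference(self, user_id: str, script: str) -> None:
        self._prefs[user_id] = script

    def clear_preference(self, user_id: str) -> None:
        self._prefs.pop(user_id, None)

    def reply_script(self, user_id: str, detected_script: str) -> str:
        """Stored preference wins; otherwise mirror the detected script."""
        return self._prefs.get(user_id, detected_script)


store = ScriptPreferenceStore()
store.set_preference("919800000000", "devanagari")  # example number, not a real user
print(store.reply_script("919800000000", "latin"))
print(store.reply_script("919811111111", "latin"))
```

The returned script name can then be appended to the system prompt as an explicit instruction for that user.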

## FAQ

### Which LLM is best for Hindi and Tamil in May 2026?

For Hinglish and clean Hindi, Claude Haiku 4.5 and GPT-5.5 are roughly tied. For Tamil specifically, Claude Sonnet 4.6 has the edge on agglutinative outputs. Gemini 2.5 Flash is the cheapest but trips more often on Tamil compounds. Build with Haiku, route Tamil-only to Sonnet for important answers.

### Do I need AI4Bharat models if Claude handles Hindi natively?

For chat replies, no. For two specific cases, yes: (a) converting Roman-typed Hindi/Tamil to native script for storage/search/analytics, and (b) translating long-form content where you need a smaller domestic model for cost reasons. IndicTrans2 is the open-source benchmark for Indic-to-Indic translation.

### How does WhatsApp Cloud API pricing actually break down?

Meta charges per template message: ₹0.8631 for marketing, ₹0.115 for utility/authentication. Service replies (within the 24-hour customer window) are free. If you use a BSP like AiSensy, Wati, or Interakt, add ~12-26% markup on top of Meta's rates. For SMBs sending ≥50k messages/month, Meta-direct via Cloud API is cheaper than any BSP.

### Can the same bot handle 10 Indian languages without retraining?

Claude Haiku 4.5 handles Hindi, Bengali, Tamil, Telugu, Marathi, Gujarati, Punjabi, Kannada, Malayalam, and English at usable quality without any fine-tuning. The architecture stays identical; only the system prompt's "language match" rule and a language-specific catalog context change. Voice IO is a different problem; see our voice-IVR post.

### What's the right escalation rule from bot to human?

Three triggers: (1) Claude's response indicates uncertainty ("I don't have that info"); (2) sentiment classifier flags frustration; (3) user explicitly says "agent" / "human" / equivalent in their language. We keep a regex list of escalation phrases in 8 Indic languages. Escalation happens within 2 turns to avoid the "stuck talking to a bot" rage.

### How long does this build take end-to-end?

For a fluent Python team, 6 working days. Day 1-2: webhook + WhatsApp integration. Day 3: script detection + transliteration. Day 4: Claude prompt + RAG corpus. Day 5: eval set + escalation logic. Day 6: load test + soft launch on internal users. Two more days for production polish and on-call setup.

### Where can I see a working multilingual chatbot in the wild?

Our in-house English-speaking app [TalkDrill](https://talkdrill.com) (5,000+ active Indian users) uses the same multilingual conversation infrastructure, with voice on top. The retail chatbot pattern is a stripped-down version of that stack. We borrow telemetry tooling between the two projects.
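The three escalation triggers from the FAQ, plus the two-turn cap, can be sketched as a single decision function. Everything concrete here is an assumption for illustration: the phrase lists are a tiny sample rather than our 8-language regex list, the uncertainty markers are simple substrings, and `frustrated` stands in for whatever sentiment classifier you run.

```python
import re
from typing import Optional

# Illustrative samples only (assumption); a real list covers many more phrases.
EXPLICIT_REQUEST = re.compile(r"\b(agent|human|insaan|customer care)\b", re.IGNORECASE)
NATIVE_REQUEST_PHRASES = ("एजेंट", "इंसान", "மனிதர்")
UNCERTAINTY_MARKERS = ("i don't have that info", "pata nahi", "jaankari nahi")


def escalation_reason(
    user_text: str,
    bot_reply: str,
    frustrated: bool,
    unresolved_turns: int,
    max_unresolved_turns: int = 2,
) -> Optional[str]:
    """Return why this conversation should go to a human, or None to stay with the bot."""
    reply_lower = bot_reply.lower()
    if any(marker in reply_lower for marker in UNCERTAINTY_MARKERS):
        return "bot_uncertain"
    if frustrated:
        return "user_frustrated"
    asked_in_native = any(phrase in user_text for phrase in NATIVE_REQUEST_PHRASES)
    if EXPLICIT_REQUEST.search(user_text) or asked_in_native:
        return "explicit_request"
    if unresolved_turns >= max_unresolved_turns:
        return "turn_limit"
    return None
```

Native-script phrases are matched as plain substrings because regex word boundaries behave unreliably around Indic vowel signs.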

Want a Multilingual Support Bot for Your Indian Customers?

We build Hindi + Tamil + English (and 7 other Indic languages) WhatsApp chatbots for Indian SMBs in 14 working days. Typical project: ₹85,000–₹1,80,000 depending on integrations (Tally, Zoho, CRM). You own the code. Monthly run cost from ₹12,000 at 10k conversations.


Tags: Multilingual Chatbot, Hindi NLP, Tamil, AI4Bharat, Claude API, WhatsApp Business, Indic NLP


Khushi Singh

UI/UX Designer at Softechinfra focused on crafting intuitive, user-friendly digital experiences.
