Democratizing the Hinglish Lexicon.

HinglishKosh is an open-source linguistic project dedicated to documenting the vibrant, evolving landscape of Hinglish. We bridge the gap between formal Hindi and English by providing a structured, academically rigorous database for the colloquialisms that define modern South Asian communication.

Total Entries: 209k. From Hindi WordNet (IIT Bombay) and Wiktionary.
Synset Relations: 40k+. Broader, narrower, and same-synset links.
Data Sources: 2. Hindi WordNet (CFILT, IIT Bombay) · Wiktionary.

Our Methodology

Every entry in HinglishKosh comes from one of two authoritative datasets. The core is Hindi WordNet (153k entries), a comprehensive lexical database for Hindi with synset-based semantic relations, developed by CFILT at IIT Bombay. Wiktionary (56k entries) provides English definitions, usage examples, and etymological data.
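As a rough illustration of what synset-based relations look like, here is a minimal Python sketch. The synset ids, the `SYNSETS` and `RELATIONS` structures, and the helper names are all hypothetical and chosen for this example; they are not the project's actual schema or Hindi WordNet's format.

```python
# Hypothetical synset data; ids and members are illustrative only.
SYNSETS = {
    "syn_dog": ["कुत्ता", "श्वान"],
    "syn_animal": ["जानवर", "पशु"],
}

# Each link is (source synset, relation, target synset).
RELATIONS = [
    ("syn_dog", "broader", "syn_animal"),
    ("syn_animal", "narrower", "syn_dog"),
]


def related(synset_id, relation):
    """Return target synset ids linked from synset_id by the given relation."""
    return [dst for src, rel, dst in RELATIONS
            if src == synset_id and rel == relation]


def same_synset(word):
    """Return the other words that share a synset with word."""
    return [w for members in SYNSETS.values() if word in members
            for w in members if w != word]


print(related("syn_dog", "broader"))  # ['syn_animal']
print(same_synset("कुत्ता"))           # ['श्वान']
```

Storing links as plain triples keeps the three relation types (broader, narrower, same-synset) easy to load into a single SQL table later.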

Accessibility First

Every entry is available in three definition formats: English (for non-Hindi speakers), Hinglish (Roman-script Hindi, for diaspora readers and learners), and Hindi in Devanagari (the original scholarly definition). Offering all three lets readers use the dictionary in whichever language and script they are most comfortable with.
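One way to picture the three formats is as three fields on a single entry. The sketch below assumes a hypothetical `Entry` class with made-up field names and an illustrative sample entry; the real database columns may differ.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical entry shape; field names are assumptions for illustration."""
    headword: str
    definition_en: str        # English
    definition_hinglish: str  # Roman-script Hindi
    definition_hi: str        # Hindi in Devanagari


FORMAT_FIELDS = {
    "en": "definition_en",
    "hinglish": "definition_hinglish",
    "hi": "definition_hi",
}


def definition_for(entry, fmt):
    """Return the definition in the requested format ('en', 'hinglish', 'hi')."""
    if fmt not in FORMAT_FIELDS:
        raise ValueError(f"unknown format: {fmt!r}")
    return getattr(entry, FORMAT_FIELDS[fmt])


# Illustrative sample, not copied from the dataset.
sample = Entry(
    headword="जुगाड़",
    definition_en="a makeshift fix or workaround",
    definition_hinglish="kaam chalau upaay",
    definition_hi="काम चलाऊ उपाय",
)

print(definition_for(sample, "hinglish"))  # kaam chalau upaay
```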

TECH STACK

CLOUDFLARE PAGES · D1 DATABASE · FTS5 SEARCH · TAILWIND CDN · PYTHON · SQLITE

Built with Cloudflare Pages Functions for server-side rendering, D1 as the serverless SQL database with FTS5 full-text search, and Tailwind CSS via CDN for zero-build-step styling. The data pipeline, which handles enrichment and seeding, is written entirely in Python.
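To give a feel for FTS5 search, here is a small local sketch using Python's built-in `sqlite3` module against an in-memory database. The table name `entries_fts`, its columns, and the sample rows are assumptions made for this example, and it requires a SQLite build with FTS5 enabled. It does not show how the site queries D1 in production.

```python
import sqlite3

# In-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")

# Hypothetical FTS5 virtual table; the column names are illustrative.
conn.execute("CREATE VIRTUAL TABLE entries_fts USING fts5(headword, definition_en)")
conn.executemany(
    "INSERT INTO entries_fts (headword, definition_en) VALUES (?, ?)",
    [
        ("jugaad", "a makeshift fix or workaround"),
        ("chai", "tea, usually brewed with milk and spices"),
    ],
)


def search(conn, query, limit=10):
    """Return headwords whose indexed text matches the FTS5 query, best first."""
    rows = conn.execute(
        "SELECT headword FROM entries_fts "
        "WHERE entries_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
    return [row[0] for row in rows]


print(search(conn, "workaround"))  # ['jugaad']
```

Ordering by FTS5's built-in `rank` column returns the most relevant matches first without any extra scoring code.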

Project Credits

DATA SOURCES

Hindi WordNet (CFILT, IIT Bombay)

Wiktionary

Indo-WordNet hypernymy data

LICENSE

GNU GPL v3

Free to use, modify, and share

Attribution required

STACK

Cloudflare Pages · D1 · FTS5

Tailwind CSS · Python

SQLite · JavaScript

CONTRIBUTE

GitHub Repository

Report an Issue

Discussions