Democratizing the Hinglish Lexicon.
HinglishKosh is an open-source linguistic project dedicated to documenting the vibrant, evolving landscape of Hinglish. We bridge the gap between formal Hindi and English by providing a structured, academically rigorous database for the colloquialisms that define modern South Asian communication.
Our Methodology
Every entry in HinglishKosh is drawn from two authoritative datasets. The core comes from Hindi WordNet (153K entries), a comprehensive lexical database for Hindi with synset-based semantic relations, developed by CFILT at IIT Bombay. Wiktionary (56K entries) supplies English definitions, usage examples, and etymological data.
Accessibility First
Every entry is available in three definition formats: English, for non-Hindi speakers; Hinglish (Roman-script Hindi), for diaspora readers and learners; and Hindi in Devanagari, the original scholarly definition. This trilingual approach lets the same entry serve readers regardless of which script or language they are comfortable with.
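As a rough illustration of the three-format idea, an entry can be modelled as one record carrying all three definitions. This is a minimal sketch, not the project's actual schema: the class name `Entry`, its field names, the format keys, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One dictionary entry with three definitions (hypothetical field names)."""
    headword: str             # headword in Devanagari
    definition_en: str        # English definition
    definition_hinglish: str  # Roman-script Hindi definition
    definition_hi: str        # Devanagari definition

    def definition(self, fmt: str) -> str:
        """Return the definition for 'en', 'hinglish', or 'hi'."""
        formats = {
            "en": self.definition_en,
            "hinglish": self.definition_hinglish,
            "hi": self.definition_hi,
        }
        if fmt not in formats:
            raise ValueError(f"unknown definition format: {fmt!r}")
        return formats[fmt]


# Illustrative values only; not taken from the HinglishKosh database.
SAMPLE = Entry(
    headword="पानी",
    definition_en="water",
    definition_hinglish="paani",
    definition_hi="जल",
)

if __name__ == "__main__":
    for fmt in ("en", "hinglish", "hi"):
        print(fmt, "->", SAMPLE.definition(fmt))
```

Keeping the three definitions on a single record means a reader can switch formats without a second lookup.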
Tech Stack
HinglishKosh is built with Cloudflare Pages Functions for server-side rendering, D1 as the serverless SQL database with FTS5 full-text search, and Tailwind CSS via CDN for zero-build-step styling. The data enrichment and seeding pipeline is written entirely in Python.
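Because D1 is built on SQLite, the FTS5 search can be approximated locally with Python's standard `sqlite3` module. The sketch below is an assumption-laden stand-in, not the production schema or query: the table name `entries_fts`, its columns, the helper names, and the sample rows are hypothetical, and it requires a Python build whose SQLite includes FTS5.

```python
import sqlite3

# Illustrative rows only; not taken from the HinglishKosh database.
SAMPLE_ROWS = [
    ("paani", "water"),
    ("kitab", "a book"),
    ("dost", "a friend"),
]


def build_index(rows):
    """Create an in-memory SQLite database with an FTS5 table over the rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one FTS5 virtual table indexing both columns.
    conn.execute(
        "CREATE VIRTUAL TABLE entries_fts USING fts5(headword, definition_en)"
    )
    conn.executemany(
        "INSERT INTO entries_fts (headword, definition_en) VALUES (?, ?)",
        rows,
    )
    return conn


def search(conn, query, limit=10):
    """Return headwords of entries matching the FTS5 query, best match first."""
    cur = conn.execute(
        "SELECT headword FROM entries_fts "
        "WHERE entries_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = build_index(SAMPLE_ROWS)
    print(search(conn, "water"))
```

The query is passed as a bound parameter rather than interpolated into the SQL string, which matters for any search box that accepts user input.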
Project Credits
DATA SOURCES
Hindi WordNet (CFILT, IIT Bombay)
Wiktionary
Indo-WordNet hypernymy data
LICENSE
GNU GPL v3
Free to use, modify, and share
Attribution required
STACK
Cloudflare Pages · D1 · FTS5
Tailwind CSS · Python
SQLite · JavaScript