Projects

Every project solves a real problem. Research-backed, deployed where possible.

RealityCheck

Improved answer-level accuracy by up to 30 percentage points and hallucination recall from 37–45% to 78–83% across three LLMs, without modifying a single model weight, without adding another LLM call, and with overcorrection rates kept below 7%.

Python, NLP, Natural Language Inference, Sentence Transformers, XGBoost, Wikipedia MediaWiki API, IBM WatsonX, Meta Llama, MistralAI, HuggingFace, TruthfulQA

Every LLM hallucinates. ChatGPT, Llama, and Mistral all generate factually wrong content with complete confidence, and most users have no way to know. RealityCheck is a modular, model-agnostic six-phase hallucination correction pipeline that sits as an external verification layer over any LLM. It takes the LLM's response, breaks it into atomic factual claims, retrieves Wikipedia-grounded evidence, verifies each claim through NLI, semantic alignment, and rule-based reasoning, and delivers a corrected response, all before the answer reaches the user. Evaluated on TruthfulQA across IBM WatsonX Granite, Meta Llama, and MistralAI Mistral, it improved answer accuracy by up to 30 percentage points and hallucination recall from 37–45% to 78–83%.
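The flow described above (split into claims, retrieve evidence, verify, correct) can be sketched in a few lines. This is a toy illustration only, not the RealityCheck implementation: the `EVIDENCE` dictionary stands in for Wikipedia retrieval, the negation-mismatch rule stands in for the NLI and semantic-alignment checks, and every function name here is hypothetical. It also compresses the six phases into four steps.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical in-memory evidence store standing in for Wikipedia retrieval.
EVIDENCE = {
    "great wall": "The Great Wall of China is not visible to the naked eye from space.",
}


@dataclass
class Verdict:
    claim: str
    supported: bool
    evidence: Optional[str]


def split_claims(response: str) -> List[str]:
    """Naive claim splitter: treat each sentence as one claim."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]


def retrieve(claim: str) -> Optional[str]:
    """Keyword lookup; a real system would query a retrieval API instead."""
    lowered = claim.lower()
    for key, passage in EVIDENCE.items():
        if key in lowered:
            return passage
    return None


def has_negation(text: str) -> bool:
    return bool(re.search(r"\b(not|no|never)\b", text.lower()))


def verify(claim: str, evidence: Optional[str]) -> Verdict:
    """Toy rule: flag a claim when its polarity disagrees with the evidence."""
    if evidence is None:
        # No evidence found: leave the claim alone rather than overcorrect.
        return Verdict(claim, True, None)
    return Verdict(claim, has_negation(claim) == has_negation(evidence), evidence)


def correct(response: str) -> str:
    """Replace each unsupported claim with its evidence sentence."""
    verdicts = [verify(c, retrieve(c)) for c in split_claims(response)]
    return " ".join(v.claim if v.supported else v.evidence for v in verdicts)
```

Claims with no retrieved evidence pass through unchanged, which is one simple way to keep overcorrection low in a sketch like this.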

GitHub | Case study →

VitaAI

Won the Deloitte Capstone Project Award. The prediction model was validated, boosted, and alignment-tested by 16 real-world doctors during end-to-end testing.

Python, XGBoost, SHAP, IBM WatsonX, PyTorch, Sentence Transformers, React, Node.js, Express.js, MongoDB, JWT

Doctors make life-or-death decisions under time pressure with incomplete information. VitaAI is an AI-assisted medical diagnosis platform built for doctors, not patients. Input a patient's symptoms, and the system predicts the top 10 most likely diseases ranked by severity, generates SHAP-based clinical reasoning for each prediction, and translates that reasoning into a plain-language medical report via IBM WatsonX. Doctors can generate prescriptions directly. Patients can schedule appointments and access their reports. Validated and tested end-to-end by 16 real-world doctors. Won the Deloitte Capstone Project Award.
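To give a feel for what per-prediction attributions look like, here is a minimal sketch for a linear scoring model, where (assuming independent features) the SHAP value of a feature reduces to its weight times the feature's deviation from the average. This is a simplified stand-in, not VitaAI's XGBoost-based explanations; the weights, baseline, and symptom names are all made up for illustration.

```python
# Hypothetical linear disease score; all numbers are illustrative only.
WEIGHTS = {"fever": 0.8, "cough": 0.5, "fatigue": 0.2}
BASELINE = {"fever": 0.3, "cough": 0.4, "fatigue": 0.5}  # assumed average patient
BIAS = -0.1


def score(x: dict) -> float:
    """Linear model output for one patient."""
    return BIAS + sum(WEIGHTS[f] * x[f] for f in WEIGHTS)


def linear_attributions(x: dict) -> dict:
    """Per-feature contribution: weight * (value - average value)."""
    return {f: WEIGHTS[f] * (x[f] - BASELINE[f]) for f in WEIGHTS}


patient = {"fever": 1.0, "cough": 1.0, "fatigue": 0.0}
attributions = linear_attributions(patient)

# Rank symptoms by how strongly they moved the score, in either direction.
ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
```

The attributions add up to the gap between this patient's score and the average patient's score, which is the property that makes such values readable as "why this prediction".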

GitHub | Case study →

SATORI

A production-grade RAG system that handles scanned PDFs, embedded equations, multi-turn conversation, and session isolation, with a dual-mode architecture that gives users document accuracy and LLM depth in the same interface.

Python, ChromaDB, BGE-large Embeddings, Cross-Encoder Reranking, IBM WatsonX, Meta Llama 3.3 70B, PyMuPDF, Tesseract OCR, React, TypeScript, Node.js, Express.js, Tailwind CSS, Vite

Most RAG systems force a choice: answer only from your documents (safe but limited) or answer from an LLM (powerful but hallucination-prone). SATORI refuses the tradeoff. Upload up to 20 PDFs, including scanned documents and equations, and SATORI builds a personal, session-isolated knowledge bank using BGE-large embeddings and ChromaDB. In Strict mode, every answer comes only from your PDFs with source citations and page numbers. In LLM Tutor mode, your PDF excerpts are sent as grounding context to Llama 3.3 70B on IBM WatsonX, which expands and elaborates without losing the document anchor. Context-based recall and follow-up detection make it feel like a conversation, not a search engine.
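The split between the two modes can be sketched as a single routing function. This is an assumption-heavy toy, not SATORI's code: token overlap stands in for BGE-large embeddings and ChromaDB, the LLM call is omitted (Tutor mode only builds the grounding prompt), and `Chunk`, `retrieve`, and `answer` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    source: str
    page: int
    text: str


def tokenize(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: List[Chunk], k: int = 2) -> List[Chunk]:
    """Rank chunks by shared tokens; a stand-in for embedding search."""
    q = tokenize(question)
    scored = [(len(q & tokenize(c.text)), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda sc: sc[0], reverse=True)
    return [c for _, c in scored[:k]]


def answer(question: str, chunks: List[Chunk], mode: str = "strict") -> str:
    hits = retrieve(question, chunks)
    if not hits:
        return "No supporting passage found in the uploaded documents."
    cited = "\n".join(f"[{c.source}, p. {c.page}] {c.text}" for c in hits)
    if mode == "strict":
        # Strict mode: return only document text, with source and page.
        return cited
    # Tutor mode: the excerpts become grounding context for an LLM prompt.
    return (
        "Answer using the excerpts below as grounding.\n"
        f"{cited}\n\nQuestion: {question}"
    )
```

Both modes share the same retrieval step, so the document anchor is identical; only what happens to the retrieved excerpts differs.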

GitHub | Case study →

AgriVerse

Deployed production system serving real crop recommendations for Indian farmers, with 99.80% accuracy across 22 crops and explainability that tells you not just what to grow but why, and how to adjust your soil to grow something better.

Python, XGBoost, SHAP, React, Node.js, Express.js, scikit-learn, Chart.js, Open-Meteo API

Indian farmers lose billions every year growing the wrong crop in the wrong soil, not because they're careless, but because traditional knowledge can't account for real-time soil chemistry and live weather patterns. AgriVerse is a deployed, full-stack explainable AI system that takes a farmer's soil parameters (N, P, K, pH) and live location, fetches real-time weather data, and recommends the top 3 crops with probability scores, SHAP-based explanations, counterfactual alternatives, and a trust score. Built on XGBoost with 99.80% classification accuracy across 22 crop types.
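A counterfactual alternative answers the question "what is the smallest soil change that would make a different crop the recommendation?". The sketch below searches one parameter at a time against a toy rule-based classifier. The rules, thresholds, and crop names in `predict` are invented for illustration and have no agronomic meaning; AgriVerse's own search over its XGBoost model may work quite differently.

```python
from typing import Optional


def predict(soil: dict) -> str:
    """Hypothetical stand-in classifier; thresholds are illustrative only."""
    if soil["ph"] < 6.0:
        return "tea"
    return "wheat" if soil["n"] >= 50 else "lentil"


def counterfactual(soil: dict, target: str, feature: str,
                   step: float, max_steps: int = 50) -> Optional[dict]:
    """Find the smallest change to one feature that yields the target crop.

    Tries offsets of increasing size in both directions and returns the
    first modified soil profile the classifier maps to `target`.
    """
    for i in range(1, max_steps + 1):
        for sign in (1, -1):
            candidate = dict(soil)
            candidate[feature] = soil[feature] + sign * i * step
            if predict(candidate) == target:
                return candidate
    return None  # no single-feature change within range reaches the target
```

For example, `counterfactual({"n": 40, "ph": 6.5}, "wheat", "n", step=5)` reports that raising nitrogen to 50 would flip the toy recommendation from lentil to wheat.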

GitHub | Live Demo | Case study →

SkillAI

Deployed career recommendation engine predicting top 10 job matches from 2025 O*NET data using a two-layer XGBoost + DNN pipeline with full Captum explainability.

Python, XGBoost, PyTorch, Deep Neural Networks, Captum, IBM WatsonX, React, Node.js, Express.js, O*NET Dataset

Most career recommendation tools match keywords to job titles. SkillAI goes deeper: it uses a two-layer AI architecture trained on the 2025 U.S. Department of Labor O*NET dataset to predict the top 10 most suitable careers for a user's specific skill set. An XGBoost model first clusters the occupation space, then a deep neural network searches within the identified cluster for the most precise career matches. Captum-based explainability maps exactly which skills drove each recommendation, and IBM WatsonX translates that into a plain-language career report.
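The coarse-then-fine idea can be illustrated with a small two-stage search. In this sketch, nearest-centroid selection stands in for the XGBoost stage and cosine-similarity ranking stands in for the neural network stage; neither is SkillAI's actual model. The `CLUSTERS` data, the three skill dimensions, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

# Hypothetical skill vectors: [programming, statistics, visual design].
CLUSTERS = {
    "data": {
        "Data Scientist": [0.9, 0.8, 0.1],
        "Statistician": [0.7, 0.9, 0.0],
    },
    "design": {
        "UX Designer": [0.2, 0.1, 0.9],
        "Graphic Designer": [0.1, 0.0, 0.8],
    },
}


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors: List[List[float]]) -> List[float]:
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def pick_cluster(user: List[float]) -> str:
    """Stage 1: choose the cluster whose centroid is closest to the user."""
    return max(
        CLUSTERS,
        key=lambda name: cosine(user, centroid(list(CLUSTERS[name].values()))),
    )


def recommend(user: List[float], k: int = 10) -> Tuple[str, List[str]]:
    """Stage 2: rank occupations inside the chosen cluster only."""
    cluster = pick_cluster(user)
    ranked = sorted(
        CLUSTERS[cluster].items(),
        key=lambda kv: cosine(user, kv[1]),
        reverse=True,
    )
    return cluster, [title for title, _ in ranked[:k]]
```

Restricting the second stage to one cluster keeps the fine-grained search small, at the cost of missing good matches that sit in a neighboring cluster.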

GitHub | Case study →