Mukur Gupta
Be yourself, everyone else is already taken.
Denoising & Diffusing Intelligence @ Granica
I am a researcher at Granica AI, where we are building foundation models for structured data by challenging existing notions of discrete and continuous diffusion models. I completed my master's in Computer Science at Columbia University on the Advanced Master's in Research (AMR) track, advised by Prof. Kathleen McKeown. Before that, I completed my Bachelor of Technology at IIT Kharagpur, graduating as the highest GPA holder in my discipline.
As a Graduate Research Assistant in the NLP Lab at Columbia University, I contributed to a DARPA grant building multi-modal approaches for cross-cultural dialogue assistance in low-resource multilingual videos. Last year, I spent the summer at Apple working on multi-modal complex-diagram reasoning models. Before grad school, I worked as an Associate Data Scientist at Gartner, where I designed and engineered solutions for question answering, high-precision product recommendation, user-interest ranking, and multi-modal information retrieval. My work at Gartner was recognized with nominations for multiple excellence awards.
I am interested in challenging problems in multi-modal video understanding, code generation, adversarial robustness, and diffusion-based generative models. My research has been presented at NAACL, CaLM @ NeurIPS'24, CL4Health @ LREC-COLING'24, and IEEE PerCom'24. Most recently, I served as a Technical Program Committee member for the SPT-IoT workshop at IEEE PerCom'26.
Education
M.S. in Computer Science (2023 - May 2025)
Columbia University, New York, NY
Advanced Master's in Research (AMR) Track advised by Prof. Kathleen McKeown
Fully funded under MS-GRA (awarded to < 1% of students in the Engineering School)
B.Tech (Hons.) in Mechanical Engineering (2017 - 2021)
Indian Institute of Technology (IIT) Kharagpur
Graduated as Department Rank 1 (highest GPA holder in the stream)
Bachelor thesis advised by Dr. Pawan Goyal
Research
Sense and Sensitivity: Examining the Influence of Semantic Recall on Long Context Code Reasoning
Under Review, 2025
Proposed SemTrace, a novel technique for measuring semantic code recall in LLMs, revealing significant drops in code reasoning accuracy as snippets approach the middle of the input context. Identified a disconnect between lexical and semantic recall mechanisms, suggesting that current code reasoning benchmarks may underestimate LLM challenges in leveraging in-context information.
Links: paper
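To illustrate the positional-evaluation idea behind this work, here is a minimal sketch (not SemTrace itself; all names are hypothetical): place a target snippet at a chosen relative depth inside a long synthetic context, then measure reasoning accuracy as a function of that depth.

```python
def build_positional_probe(distractors, target, depth):
    """Embed `target` at a relative depth (0.0 = start, 1.0 = end) of a
    long context assembled from distractor snippets; evaluating a model
    on probes at varying depths exposes lost-in-the-middle effects."""
    assert 0.0 <= depth <= 1.0
    k = round(depth * len(distractors))
    return "\n\n".join(distractors[:k] + [target] + distractors[k:])

# Example: sweep the same target snippet across a synthetic context.
snippets = [f"def util_{i}(x):\n    return x + {i}" for i in range(50)]
target = "def mystery(x):\n    return x * 7"
probes = {d: build_positional_probe(snippets, target, d) for d in (0.0, 0.5, 1.0)}
```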
XOXO: Stealthy Cross-Origin Context Poisoning Attacks against AI Coding Assistants
Under Review, 2025
Led to a security fix in a major AI coding assistant. Introduced Cross-Origin Context Poisoning (XOXO), a novel attack using semantically equivalent adversarial code modifications to compromise AI coding assistants. Developed the GCGS algorithm, achieving an 83.09% attack success rate across 11 models, including GPT-4o and Claude 3.5 Sonnet, and demonstrating the ineffectiveness of existing defenses.
Links: paper
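As a toy illustration of the kind of semantically equivalent edit such an attack can search over (this is not the GCGS algorithm; identifier renaming is just one plausible edit class):

```python
import ast

def rename_identifier(source: str, old: str, new: str) -> str:
    """Apply one semantics-preserving edit (variable renaming); a
    GCGS-style search would score many such candidate edits by how
    strongly they steer the assistant's completion."""
    class Renamer(ast.NodeTransformer):
        def visit_Name(self, node: ast.Name) -> ast.Name:
            if node.id == old:
                node.id = new
            return node

    return ast.unparse(Renamer().visit(ast.parse(source)))

print(rename_identifier("total = price * qty\nprint(total)", "total", "grand_total"))
```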
AdvSumm: Adversarial Training for Bias Mitigation in Text Summarization
NewSumm @ EMNLP, 2025
Introduced AdvSumm, a domain-agnostic adversarial training framework that mitigates bias in text summarization through gradient-guided perturbations at the embedding level. Demonstrated effective reduction of name-nationality and political framing biases without compromising summarization quality, outperforming standard transformers and data augmentation techniques.
Links: paper
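A minimal sketch of gradient-guided perturbation at the embedding level, in the FGSM style, assuming a Hugging Face-style seq2seq summarizer; this illustrates the general technique, not AdvSumm's actual implementation.

```python
import torch

def adversarial_step(model, batch, optimizer, eps=1e-3):
    """One FGSM-style adversarial training step at the embedding level.
    `model` is assumed to be a Hugging Face-style seq2seq summarizer
    that accepts `inputs_embeds`; a sketch, not the paper's code."""
    emb = model.get_input_embeddings()

    # Clean pass with a differentiable zero perturbation on the embeddings.
    delta = torch.zeros_like(emb(batch["input_ids"]), requires_grad=True)
    loss = model(inputs_embeds=emb(batch["input_ids"]) + delta,
                 attention_mask=batch["attention_mask"],
                 labels=batch["labels"]).loss

    # Gradient w.r.t. the perturbation only (parameter grads untouched).
    (grad,) = torch.autograd.grad(loss, delta)

    # Adversarial pass: nudge embeddings in the loss-increasing direction.
    adv_embeds = emb(batch["input_ids"]) + eps * grad.sign()
    adv_loss = model(inputs_embeds=adv_embeds,
                     attention_mask=batch["attention_mask"],
                     labels=batch["labels"]).loss

    optimizer.zero_grad()
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```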
CodeSCM: Causal Analysis for Multi-Modal Code Generation
NAACL, 2025 (Oral); CaLM @ NeurIPS, 2024 (Poster)
Proposed CodeSCM, a Structural Causal Model for analyzing multi-modal code generation in LLMs using causal interventions and mediation analysis. Discovered that input-output examples significantly influence code generation alongside natural language instructions, quantifying spurious model leanings through latent mediator variables.
Links: paper
Intent Detection and Entity Extraction from Biomedical Literature
CL4Health @ LREC-COLING, 2024 (Oral)
Conducted comprehensive empirical evaluation showing supervised fine-tuned approaches remain more effective than general-purpose LLMs for biomedical NLP tasks. Demonstrated that PubMedBERT can surpass ChatGPT on NER tasks with only 5 supervised examples, challenging the efficacy of LLMs in domain-specific understanding.
Avenues in IoT with advances in Artificial Intelligence
SPT-IoT @ IEEE PerCom, 2024 (Oral)
Explored current challenges in Internet of Things and the transformative impact of AI, NLP, and machine learning for domain-specific solutions. Discussed future prospects of deeper fusion between NLP, machine learning, and AI to elevate IoT capabilities and reshape digital interactions.
Links: paper
Curriculum generation using Autoencoder based continuous optimization
arXiv Preprint, 2021
Presented Training Sequence Optimization (TSO), a novel curriculum learning approach using autoencoders and continuous optimization to learn an optimal ordering of training data. Achieved a 2 accuracy-point empirical gain over the random strategy on the CIFAR-100 and CIFAR-10 datasets, outperforming existing state-of-the-art curriculum learning algorithms.
Links: paper
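For intuition, here is a simplified proxy for the curriculum idea (TSO itself optimizes the ordering in the autoencoder's continuous latent space; `autoencoder.predict` below is an assumed Keras-style interface): score examples by reconstruction error and present them easy-to-hard.

```python
import numpy as np

def easy_to_hard_order(autoencoder, X):
    """Order training examples by autoencoder reconstruction error,
    low error ("easy") first; a simplified proxy, not TSO itself."""
    recon = autoencoder.predict(X)  # assumed Keras-style interface
    errors = ((X - recon) ** 2).reshape(len(X), -1).mean(axis=1)
    return np.argsort(errors)

# Usage: X_train[easy_to_hard_order(ae, X_train)] yields a curriculum.
```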
Experience
Research Scientist — Granica AI, CA
June 2025 - Present
Advised by Prof. Andrea Montanari, Stanford University. Developing foundation models for structured data by challenging existing notions of discrete and continuous diffusion models. Applied Granica's ICLR award-winning data selection research (https://www.granica.ai/research) to the recommendation systems of a social media giant with 400 million monthly active users, driving significant gains in click-through rate.
Graduate Research Assistant — NLP Lab, Columbia University, NY
December 2023 - May 2025
Contributed to a DARPA grant: developed multi-modal large language models (LLMs) for communication changepoint detection in cross-cultural conversation videos. Designed a video-reasoning model by pre-training LLaVA on a next-scene prediction objective, and generated large-scale synthetic video-language data to enhance pre-training. Showed improvements on abductive and counterfactual video reasoning tasks.
Applied Scientist Intern — Apple, CA
June 2024 - August 2024
Designed a novel diagram question-answering module to power a RAG pipeline over complex flowchart images by fine-tuning multimodal LLMs on a synthetic dataset.
Associate Data Scientist — Gartner, India
August 2021 - June 2023
- Patent under review: co-invented a peer-based product timeline recommendation and value-tracking system
- Designed and engineered an LLM-based (ChatGPT, Falcon, LLaMA) multi-agent Gartner research question-answering bot with toxicity and vendor-bias filters
- Improved the Pearson correlation of a Sentence Transformer by 2% on Gartner data, used in Semantic Textual Similarity and multi-modal search
- Engineered a multi-stage product recommendation engine using a Deep Factorization Machine (recall: 30%) and XGBoost (NDCG@10: 0.78); the two-stage design is sketched after this list
- Improved NDCG@5 of user-interest ranking by 12% with state-of-the-art variational autoencoder-based topic modeling on unstructured client inquiry texts
- Designed a list-wise loss-based learning-to-rank (LTR) model for user-interest ranking using user interaction features (MAP@5: 0.48)
- Led rapid prototyping and model monitoring by deploying experimental setups to AWS SageMaker using MLOps integrations
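A hedged sketch of that two-stage recommendation design: a high-recall DeepFM retriever narrows the catalog, then an XGBoost ranker reorders the shortlist. `deepfm.score` and `featurize` are assumed interfaces for illustration, not Gartner's code.

```python
import numpy as np
import xgboost as xgb

def recommend(user, deepfm, ranker, featurize, catalog, k_retrieve=200, k=10):
    """Two-stage recommendation: retrieve broadly, then re-rank precisely.
    `deepfm.score` and `featurize` are assumed interfaces."""
    # Stage 1: high-recall candidate retrieval via DeepFM scores.
    scores = np.asarray(deepfm.score(user, catalog))
    shortlist = np.argsort(-scores)[:k_retrieve]

    # Stage 2: precision-oriented re-ranking with a trained XGBoost model
    # over richer user-item features.
    feats = featurize(user, [catalog[i] for i in shortlist])
    rerank = ranker.predict(xgb.DMatrix(feats))
    top = np.argsort(-rerank)[:k]
    return [catalog[shortlist[i]] for i in top]
```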
Machine Learning Intern — Hike, India
May 2020 - August 2020
Designed a novel autoencoder-based curriculum learning algorithm with LSTMs, leading to faster model convergence.
Data Science Intern — Manthan (Algonomy), India
May 2019 - July 2019
Built an intent-prediction model using machine learning on clickstream data from a global pizza chain's mobile app, predicting the next best time a customer is likely to open the app to order a pizza, then aligned all communications (push notifications, SMS, emails) to that window, boosting open rates and reducing the risk of spamming.