Kislay Aditya Oj
Computer Science and Engineering | MS · IIT Bombay | BTech · IIT Kanpur
About
I'm Kislay, currently pursuing an MS by Research in Computer Science & Engineering at IIT Bombay, where I'm part of the CFILT Lab. I completed my B.Tech at IIT Kanpur in 2025. I have a strong interest in reinforcement learning, machine learning, and language technologies, and enjoy working on problems that involve learning theory, decision making, and statistics.
Outside of academics, I really enjoy chess: I follow top games closely and play competitively (currently around 1900, though I have touched 2000 before… we don't talk about the rating drop. Also, that's a Chess.com rating, not FIDE; I'm not a genius). I also follow Formula 1, sketch in my free time, and enjoy reading books. Lately, I've been learning to play the piano as well. You can find me on X, check out my occasional blog posts in the notes section, or browse through my projects to see what I've been working on.
Research
Theoretical Reinforcement Learning
I work on theoretical reinforcement learning with a focus on bandit problems involving hidden structure, such as latent user states or clustered populations. My research studies how offline models and shared structure can be combined with online exploration to design sample-efficient algorithms with provable regret guarantees.
Mechanistic Interpretability
I investigate how neural networks, especially LLMs, implement internal computation mechanisms, rather than only observing their input-output behavior. This includes studying representation geometry, transformer circuit structure, and the causal mechanisms behind reasoning and figurative language. I am also exploring SVD-based decomposition of internal representations as a promising direction for understanding how such computations arise.
Reinforcement Learning for Adaptive LLM Agents
I study how reinforcement learning ideas can help large language model (LLM) agents improve from feedback during real use. This work focuses on test-time adaptation using reflection, memory, and personalization rather than model fine-tuning. As part of the Flipkart–IIT Bombay collaboration, my goal is to build LLM agents that become more reliable and user-aware over time.
Publications
Working on it.
Contact
Email: kislay@cse.iitb.ac.in
Lab: CFILT Lab, Room 401, Computer Centre
Department of Computer Science and Engineering
Indian Institute of Technology, Bombay
Google Scholar · GitHub · Twitter · LinkedIn
Notes
Scattered thoughts and reflections.
- Attending my first Conference (December 2025): Volunteering at AACL 2025 gave me a behind-the-scenes view of how conferences actually run: the chaos, coordination, people, and conversations that never show up in papers or schedules.
- Book this week #1 - The Alchemist (Paulo Coelho · December 2025): A short and simple story about chasing dreams and learning from the journey. Predictable at times, but still a solid and meaningful read, especially for beginners.
- Tier List #2 - Movies (September 2025): Personal thoughts on movies and TV shows, based on impact rather than objectivity. Some unforgettable, some enjoyable, some overhyped, all filtered through mood, timing, and questionable taste.
- Tier List #1 - Anime (September 2025): A very subjective take on anime I've watched over the years, ranked less by technical quality and more by how much they stuck with me. Strongly biased by characters, long-term impact, and pure vibes.
Projects
Software · Datasets · Tools
- Analysis of EBMs (Boltzmann Machines): Experimental study of energy-based models (Boltzmann machines) investigating the effect of Contrastive Divergence (CD) steps on sample quality. Includes analysis scripts, configurations, and result outputs for varying CD schedules and sampling strategies.
- Know.Study.Help: A lightweight web platform for academic management and study workflows. The repo contains a Flask backend (app.py / server.py), HTML templates and static assets, and utilities to manage study-related content and pages. Built as a small full-stack project for course/organisational use.
- Transliteration (Roman → Devanagari): Code and experiments for Hindi transliteration (Roman to Devanagari). The repo includes LSTM and transformer checkpoints, sampling and evaluation scripts, and a short paper/report describing the approach. A demo script is provided for quick sampling; some LLM-based inference paths require an NVIDIA API key (noted in the README).
- Prompt Tuning: Research-and-development code exploring prompt / prefix tuning approaches for LLMs. Contains experimental code, notebooks, and utilities used to run prompt tuning experiments and compare prompt-based interventions across models.
- LLM-Based Scraper for Amazon Order Information: Built an automation pipeline to extract structured order history data from Amazon using Selenium. Integrated an open-source LLM (GPT-Neo) to process and structure raw HTML into JSON/CSV formats, with support for swapping in stronger models. Focused on robustness, security, and scalability.
- Curiosity-Driven Exploration via Self-Supervised Prediction: Implemented the Intrinsic Curiosity Module (ICM) across DQN, A3C, and PPO to study exploration in sparse-reward environments. Demonstrated how intrinsic rewards improve learning efficiency and policy performance across different RL algorithms.
- Exploration and Analysis of Deep Reinforcement Learning Methods: A comprehensive study and implementation of classical and deep RL algorithms including bandits, Monte Carlo, TD methods, SARSA, Q-learning, DQN variants, PPO, TD3, and DDPG. Includes experiments, comparisons, and visual analysis.
- Computer Vision using C: Implemented core image processing and computer vision algorithms in C with Python bindings. Covered filtering, convolutions, edge detection, hybrid images, and color space transformations, with visual validation through experiments.
- Tour De OAAR – Astronomy Club, IIT Kanpur: Mentored a team of students on Python programming and automation projects for the campus observatory. Guided development of tools such as weather monitoring systems using APIs and JSON, alongside teaching astronomy fundamentals and telescope operations.