Kislay Aditya Oj

Computer Science and Engineering  |  MS · IIT Bombay  |  BTech · IIT Kanpur

About

I'm Kislay, currently pursuing an MS by Research in Computer Science & Engineering at IIT Bombay, where I'm part of the CFILT Lab. I completed my B.Tech at IIT Kanpur in 2025. I have a strong interest in reinforcement learning, machine learning and language technologies, and enjoy working on problems that involve learning theory, decision making and statistics.

Outside of academics, I really enjoy chess: I follow top games closely and play competitively (currently around 1900, though I have touched 2000 before… we don't talk about the rating drop. Also, it's a Chess.com rating, not FIDE; I'm not a genius). I also follow Formula 1, sketch in my free time, and enjoy reading books. Lately, I've been learning to play the piano as well. You can find me on X, check out my occasional blogs in the notes section, or browse through my projects to see what I've been working on.

Research

Theoretical Reinforcement Learning

I work on theoretical reinforcement learning with a focus on bandit problems involving hidden structure, such as latent user states or clustered populations. My research studies how offline models and shared structure can be combined with online exploration to design sample-efficient algorithms with provable regret guarantees.

Mechanistic Interpretability

I investigate how neural networks, especially LLMs, implement internal computation mechanisms, rather than just observing input-output behavior. This includes studying representation geometry, transformer circuit structure, and the causal mechanisms behind reasoning and figurative language. I am also exploring SVD-based decomposition of internal representations as a promising direction for understanding how such computations arise.

Reinforcement Learning for Adaptive LLM Agents

I study how reinforcement learning ideas can be used to help large language model (LLM) agents improve from feedback during real use. This work focuses on test-time adaptation using reflection, memory, and personalization rather than model fine-tuning. As part of the Flipkart–IIT Bombay collaboration, my goal is to build LLM agents that become more reliable and user-aware over time.

Publications

Working on it.

Contact

Email: kislay@cse.iitb.ac.in

Lab: CFILT Lab, Room 401, Computer Centre
Department of Computer Science
Indian Institute of Technology, Bombay

Google Scholar · GitHub · Twitter · LinkedIn

Last Updated - December, 2025

Notes

Scattered thoughts and reflections.

  • Attending my first Conference
    December 2025
    Volunteering at AACL 2025 gave me a behind-the-scenes view of how conferences actually run: the chaos, coordination, people and conversations that never show up in papers or schedules.
  • Book this week #1 - The Alchemist
    Paulo Coelho · December 2025
    A short and simple story about chasing dreams and learning from the journey. Predictable at times, but still a solid and meaningful read, especially for beginners.
  • Tier List #2 - Movies
    September 2025
    Personal thoughts on movies and TV shows based on impact rather than objectivity. Some unforgettable, some enjoyable, some overhyped, all filtered through mood, timing and questionable taste.
  • Tier List #1 - Anime
    September 2025
    A very subjective take on anime I've watched over the years, ranked less by technical quality and more by how much they stuck with me. Strongly biased by characters, long-term impact and pure vibes.


Projects

Software · Datasets · Tools

  • Analysis of EBMs — Boltzmann Machines
    Energy-Based Models · Python · 2025
    Experimental study of energy-based models (Boltzmann machines) investigating the effect of Contrastive Divergence (CD) steps on sample quality. Includes analysis scripts, configurations, and result outputs for varying CD schedules and sampling strategies.
  • Know.Study.Help
    Python · Flask · Web · 2025
    A lightweight web platform for academic management and study workflows. The repo contains a Flask backend (app.py / server.py), HTML templates and static assets, and utilities to manage study-related content and pages. Built as a small full-stack project for course/organisational use.
  • Transliteration (Roman → Devanagari)
    Transliteration · LSTM / Transformer · Python · 2025
    Code and experiments for Hindi transliteration (Roman to Devanagari). The repo includes LSTM and transformer checkpoints, sampling and evaluation scripts, and a short paper/report describing the approach. A demo script is provided for quick sampling; some LLM-based inference paths require an NVIDIA API key (noted in the README).
  • Prompt Tuning
    Prompt tuning · Experimental · Python · 2025
    Research-and-development code exploring prompt / prefix tuning approaches for LLMs. Contains experimental code, notebooks and utilities used to run prompt tuning experiments and compare prompt-based interventions across models.
  • LLM-Based Scraper for Amazon Order Information
    Python · Selenium · LLMs · 2024
    Built an automation pipeline to extract structured order history data from Amazon using Selenium. Integrated an open-source LLM (GPT-Neo) to process and structure raw HTML into JSON/CSV formats, with support for swapping in stronger models. Focused on robustness, security, and scalability.
  • Curiosity-Driven Exploration via Self-Supervised Prediction
    Reinforcement Learning · PyTorch · 2024
    Implemented the Intrinsic Curiosity Module (ICM) across DQN, A3C, and PPO to study exploration in sparse-reward environments. Demonstrated how intrinsic rewards improve learning efficiency and policy performance across different RL algorithms.
  • Exploration and Analysis of Deep Reinforcement Learning Methods
    Deep RL · PyTorch · 2024
    A comprehensive study and implementation of classical and deep RL algorithms including bandits, Monte Carlo, TD methods, SARSA, Q-learning, DQN variants, PPO, TD3, and DDPG. Includes experiments, comparisons, and visual analysis.
  • Computer Vision using C
    C · Python · Computer Vision · 2023
    Implemented core image processing and computer vision algorithms in C with Python bindings. Covered filtering, convolutions, edge detection, hybrid images, and color space transformations, with visual validation through experiments.
  • Tour De OAAR – Astronomy Club, IIT Kanpur
    Mentorship · Python · Web APIs · 2023
    Mentored a team of students on Python programming and automation projects for the campus observatory. Guided development of tools such as weather monitoring systems using APIs and JSON, alongside teaching astronomy fundamentals and telescope operations.