Bowen Fang
Ph.D. Student at Columbia University.
I am a Ph.D. student at Columbia University, where I work with the Data Science Institute (Smart Cities Center). I develop scalable systems and RL-based agentic reasoning frameworks for stochastic, continuous environments with complex topological constraints. My research interests span reinforcement learning, agentic AI, multimodal reasoning (LLM/VLM), systems resilience, and optimization.
Previously, I interned at AWS AI Labs as an Applied Scientist.
I hold an M.S. in Operations Research from Columbia University and a bachelor's degree in Big Data Management and Applications, with a minor in Economics, from Peking University.
This website serves as a central hub for my publications, projects, and professional activities.
news
| Date | News |
|---|---|
| May 27, 2025 | I will be starting my Applied Scientist Internship at AWS AI Labs this summer! |
| May 20, 2025 | I am honored to be a winner of the CS3 VALIDATE Accelerator program, which will provide continued funding for our work, SINA. |
| Mar 06, 2025 | Our paper, Efficient Consistency Model Training for Policy Distillation in Reinforcement Learning, was accepted to the ICLR 2025 DeLTa Workshop as a poster presentation. |
| Aug 01, 2024 | I am excited to begin my Ph.D. studies at Columbia University. |
| Feb 12, 2024 | Our paper, SLAMuZero: Plan and learn to Map for Joint SLAM and Navigation, was accepted to ICAPS 2024. |
latest posts
| Date | Post |
|---|---|
| Oct 29, 2023 | Some Recent Advancement Around MuZero |
| Jun 20, 2023 | MPC with a Differentiable Forward Model: An Implementation with Jax |
| May 12, 2023 | Adding MuZero into RL Toolkits at Ease |
selected publications
- Decaying Budget Forcing: A Simple and Effective Reinforcement Learning Approach for Balancing Accuracy and Capacity in Mathematical Reasoning. In submission, 2026.
- Do Math Reasoning LLMs Help Predict the Impact of Public Transit Events? Under review at Transportation Research Part C (Special Issue: Foundation Models and Large Language Models in Urban Mobility), 2025. arXiv preprint.
- Efficient Consistency Model Training for Policy Distillation in Reinforcement Learning. In ICLR 2025 Workshop on Deep Generative Model in Machine Learning: Theory, Principle and Efficacy, 2025.
- Learn to Tour: Operator Design for Solution Feasibility Mapping in Pickup-and-delivery Traveling Salesman Problem. In Proceedings of the IEEE Intelligent Transportation Systems Conference (ITSC), 2025.
- TraveLLM: Could you plan my new public transit route in face of a network disruption? In Proceedings of the IEEE Intelligent Transportation Systems Conference (ITSC), 2025.