Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Page Not Found

Page not found. Your pixels are in another canvas.

Jupyter notebook markdown generator

Posts

Some Recent Advancement Around MuZero

8 minute read

Published: October 29, 2023

Limitation of MuZero

MPC with a Differentiable Forward Model: An Implementation with Jax

11 minute read

Published: June 20, 2023

mpc control

Intro

In a recent project for MECS6616 Robot Learning, I got hands-on experience with Model Predictive Control (MPC). The recommended way to solve the problem is to use constant actions with a pseudo-gradient, and it does provide simple yet good enough solutions. However, the project instructions also hinted at another prospect: a differentiable forward model could help, since you can always compute numerical gradients. This piqued my curiosity - could we directly compute the gradient of the evaluation metric with respect to the actions? And if so, how could we implement this in practice?
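To make the question concrete, here is a minimal JAX sketch of the idea, not the implementation from the post. It assumes a toy 2D point-mass as the forward model and squared distance to a goal as the evaluation metric; `forward_model`, `rollout_cost`, `plan`, and all constants are hypothetical names and values chosen for illustration. Because the whole rollout is differentiable, `jax.value_and_grad` gives the gradient of the cost with respect to every action in the horizon, and plain gradient descent then refines the action sequence.

```python
import jax
import jax.numpy as jnp

DT = 0.1  # integration step of the toy model (assumed value)


def forward_model(state, action):
    """Toy differentiable dynamics (an assumption, not the course's arm model).

    state = [x, y, vx, vy], action = [ax, ay] (acceleration).
    """
    pos, vel = state[:2], state[2:]
    new_vel = vel + DT * action
    new_pos = pos + DT * new_vel
    return jnp.concatenate([new_pos, new_vel])


def rollout_cost(actions, init_state, goal):
    """Roll the model forward over the horizon and score the final state."""

    def step(state, action):
        return forward_model(state, action), None

    final_state, _ = jax.lax.scan(step, init_state, actions)
    position_error = jnp.sum((final_state[:2] - goal) ** 2)
    effort = 1e-3 * jnp.sum(actions ** 2)  # small penalty on large actions
    return position_error + effort


# Differentiate the cost with respect to its first argument: the action sequence.
cost_and_grad = jax.jit(jax.value_and_grad(rollout_cost))


def plan(init_state, goal, horizon=20, steps=200, lr=0.5):
    """Refine an action sequence by gradient descent on the rollout cost."""
    actions = jnp.zeros((horizon, 2))
    for _ in range(steps):
        _, grads = cost_and_grad(actions, init_state, goal)
        actions = actions - lr * grads
    return actions


init_state = jnp.zeros(4)
goal = jnp.array([1.0, -0.5])

initial_cost = float(rollout_cost(jnp.zeros((20, 2)), init_state, goal))
planned = plan(init_state, goal)
final_cost = float(rollout_cost(planned, init_state, goal))
```

In an MPC loop one would typically apply only the first planned action, observe the new state, and replan. The step size and horizon here are tuned for this toy model only and would need adjusting for a different forward model.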

Adding MuZero into RL Toolkits at Ease

3 minute read

Published: May 12, 2023

Gym environment examples: Cart Pole, Lunar Lander

MUAX 😘

Muax provides help for using DeepMind’s mctx on gym-style environments.

“Hindsight” – An easy yet effective RL Technique HER with Pytorch implementation

22 minute read

Published: May 18, 2022

This week, I will share a paper published by OpenAI at NeurIPS 2017. The ideas presented in this paper are quite insightful, and it tackles a complex problem using only simple algorithmic improvements. I gained significant inspiration from this paper. At the end, I will also provide a brief implementation of HER (Hindsight Experience Replay).

What are the Effective Deep Learning Models for Tabular Data?

27 minute read

Published: March 13, 2022

This week, I would like to share a paper published at NeurIPS 2021. When dealing with tabular data, I often find myself perplexed. On one hand, I am unsure which deep learning frameworks are better suited for this task, and on the other hand, I am uncertain whether the time-consuming process of training a model can outperform the easily accessible GBDT family of models such as XGBoost and LightGBM. However, this paper provides a detailed and comprehensive comparison of deep learning algorithms and GBDT models on tabular data. It introduces new baselines and presents a novel architecture that outperforms other deep learning models. I have gained a lot from this paper and would like to share it with you.

Will DRL Make Profit in High-Frequency Trading?

10 minute read

Published: October 21, 2021

Can deep reinforcement learning algorithms be used to train a trading agent that can achieve long-term profitability using Limit Order Book (LOB) data? To answer this question, this article proposes a deep reinforcement learning framework for high-frequency trading and conducts experiments using limit order data from LOBSTER with the PPO algorithm. The results show that the agent is able to identify short-term patterns in the data and propose profitable trading strategies.

portfolio

MUAX

Autonomous Learning of Physical Environment through Neural Tree Search

Proposed an MCTS-based reinforcement learning algorithm to perform active SLAM.

Temporal Graph Attention Network Prediction on Ethereum Transaction Cost and Analysis on ‘The Merge’

Proposed a GNN model based on a temporal transaction network to predict Ethereum transaction costs.

Light Attention Vision Modules for Atari

Proposed an attention-based vision policy that can play Atari games from pixel input.

Reinforcement Learning for Goal-based Wealth Management: A study of behavior improvement through approximation and reward engineering

Proposed a reward engineering method and leveraged function approximation for the value function.

publications

Learn to Tour: Operator Design For Feasible Solution Mapping

Published in , 2023

RL for operator sequential execution

Download here

SLAMuZero: Plan and Learn to Map for Joint SLAM and Navigation

Published in ICAPS, 2024

SLAM + MuZero for navigation

Download here

TraveLLM: Could you plan my new public transit route in face of a network disruption?

Published in , 2024

LLM for Route Recommendation

Download here

talks

Learn to Tour: Operator Design for Solution Feasibility Mapping

Published: October 15, 2023

session: OR via Reinforcement Learning and Beyond, INFORMS 2023

SLAMuZero: Plan and Learn to Map for Joint SLAM and Navigation

Published: June 05, 2024

session: Robots and Space, ICAPS 2024

teaching

Graduate Optimization Models and Methods, Teaching Assistant

Graduate course, Columbia University, IEOR, 2023

Served as Teaching Assistant for Graduate Optimization Models and Methods. Topics included linear programming, the simplex method, duality, and nonlinear, integer, and dynamic programming. Duties included:

Graded homework and the course project, and provided detailed feedback
Revised solutions
Improved the coding part of the final project, Moving Object Detection

Bowen Fang
