Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem

https://arxiv.org/pdf/2402.02868v3


Lifelong Reinforcement Learning with Modulating Masks

https://arxiv.org/pdf/2212.11110

This work is related to using an LLM with an adapter as the policy model.
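
To make the analogy concrete, here is a minimal sketch of a modulating mask over a frozen layer. It assumes the mask is binary and is obtained by thresholding per-task learnable scores, which is one common formulation and may differ in detail from the paper. The names `W`, `binarize`, and `masked_forward` are hypothetical and chosen for illustration only. As with an adapter on a frozen LLM, the backbone weights stay fixed and only the small task-specific part (here, the scores) would be trained.

```python
# Frozen backbone weights (4 inputs -> 3 outputs), shared by every task.
# Illustrative values; in practice these would be fixed network weights.
W = [
    [0.5, -1.0, 0.2],
    [1.5, 0.3, -0.7],
    [-0.4, 0.8, 1.1],
    [0.9, -0.6, 0.0],
]


def binarize(scores):
    """Turn real-valued scores into a 0/1 mask (assumed threshold at 0)."""
    return [[1.0 if s > 0 else 0.0 for s in row] for row in scores]


def masked_forward(x, scores):
    """Linear layer whose frozen weights are gated by a per-task mask."""
    mask = binarize(scores)
    n_in, n_out = len(W), len(W[0])
    return [
        sum(x[i] * W[i][j] * mask[i][j] for i in range(n_in))
        for j in range(n_out)
    ]


# One score matrix per task; only these would receive gradient updates.
task_scores = {
    "task_a": [[1, -1, 1], [1, 1, -1], [-1, 1, 1], [1, -1, -1]],
    "task_b": [[-1, 1, 1], [1, -1, 1], [1, 1, -1], [-1, 1, 1]],
}

x = [1.0, 2.0, 3.0, 4.0]
out_a = masked_forward(x, task_scores["task_a"])
out_b = masked_forward(x, task_scores["task_b"])
```

Switching tasks only swaps the score matrix, so `W` is never overwritten and learning a new task cannot erase an earlier one.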
