Representative paper:
Yuankun Jiang, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong, “Monotonic Robust Policy Optimization with Model Discrepancy,” International Conference on Machine Learning (ICML), 2021.