Details, Fiction and Bill Zou Garner
The theoretical analysis demonstrates that EDIS exhibits reduced suboptimality compared to solely using online data or directly reusing offline data. EDIS is a plug-in method and can be combined with existing approaches in the offline-to-online RL setting. By applying EDIS to the off-the-shelf methods Cal-QL and IQL, we observe a