Qisong Yang, Thiago D. Simão, Nils Jansen, Simon H. Tindemans, Matthijs T.J. Spaan (2023), Reinforcement Learning by Guided Safe Exploration, Kobi Gal, Ann Nowe, Grzegorz J. Nalepa, Roy Fairstein, Roxana Radulescu (Eds.), In ECAI 2023 - 26th European Conference on Artificial Intelligence, including 12th Conference on Prestigious Applications of Intelligent Systems, PAIS 2023 - Proceedings p.2858-2865.
T. D. Simão (2023), Safe Online and Offline Reinforcement Learning, PhD Thesis Delft University of Technology.
Danial Kamran, Thiago D. Simão, Qisong Yang, Canmanie T. Ponnambalam, Johannes Fischer, Matthijs T.J. Spaan, Martin Lauer (2022), A Modern Perspective on Safe Automated Driving for Different Traffic Dynamics using Constrained Reinforcement Learning, In Proceedings of the IEEE International Conference on Intelligent Transportation Systems p.4017-4023, IEEE.
Canmanie Ponnambalam, Danial Kamran, Thiago D. Simão, Frans A. Oliehoek, Matthijs T.J. Spaan (2022), Back to the Future: Solving Hidden Parameter MDPs with Hindsight.
Q. Yang, T. D. Simão, Simon H. Tindemans, M.T.J. Spaan (2022), Refined Risk Management in Safe Reinforcement Learning with a Distributional Safety Critic, David Bossens, Stephen Giguere, Roderick Bloem, Bettina Koenighofer (Eds.), In Safe RL Workshop at IJCAI 2022.
Qisong Yang, Thiago D. Simão, Simon H. Tindemans, Matthijs T.J. Spaan (2022), Safety-constrained reinforcement learning with a distributional safety critic, In Machine Learning Volume 112 p.859-887.
Q. Yang, T. D. Simão, Nils Jansen, Simon H. Tindemans, M.T.J. Spaan (2022), Training and Transferring Safe Policies in Reinforcement Learning, Hayes Cruz, Santos da Silva (Eds.), In Proceedings of the Adaptive and Learning Agents Workshop.
T. D. Simão, Nils Jansen, M.T.J. Spaan (2021), AlwaysSafe: Reinforcement Learning without Safety Constraint Violations during Training, In Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems p.1226-1235, International Foundation for Autonomous Agents and Multiagent Systems.
Q. Yang, T. D. Simão, S.H. Tindemans, M.T.J. Spaan (2021), WCSAC: Worst-Case Soft Actor Critic for Safety-Constrained Reinforcement Learning, In Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI-21) p.10639-10646.
Thiago D. Simão, Romain Laroche, Rémi Tachet des Combes (2020), Safe Policy Improvement with an Estimated Baseline Policy, In Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems p.1269-1277.