Incorporating causal factors into reinforcement learning for dynamic treatment regimes in HIV | BMC Medical Informatics and Decision Making