Policy | Weights of the reward function |
---|---|
\(\pi _{BL}\) | [1/7,1/7,1/7,1/7,1/7,1/7,1/7] |
\(\pi _{{BL}_{1}}\) | [0.14,0.24,0.15,0.19,0.07,0.07,0.14] |
\(\pi _{{BL}_{2}}\) | [0.08,0.17,0.16,0.18,0.29,0.10,0.02] |
\(\pi _{{BL}_{3}}\) | [0.07,0.19,0.12,0.21,0.26,0.04,0.11] |
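
As a quick sanity check on the table, the following minimal sketch verifies that each policy's weight vector has seven non-negative entries summing to one. The dictionary keys and the helper name `is_normalized` are illustrative labels, not names from the source, and the check assumes the weights are meant to be normalized, which the uniform 1/7 row suggests.

```python
import math

# Weight vectors copied from the table; keys are illustrative labels only.
weights = {
    "pi_BL": [1 / 7] * 7,
    "pi_BL_1": [0.14, 0.24, 0.15, 0.19, 0.07, 0.07, 0.14],
    "pi_BL_2": [0.08, 0.17, 0.16, 0.18, 0.29, 0.10, 0.02],
    "pi_BL_3": [0.07, 0.19, 0.12, 0.21, 0.26, 0.04, 0.11],
}


def is_normalized(w, n_terms=7, tol=1e-9):
    """Return True if w has n_terms non-negative entries that sum to 1 (within tol)."""
    return (
        len(w) == n_terms
        and all(x >= 0 for x in w)
        and math.isclose(sum(w), 1.0, abs_tol=tol)
    )


# Assumption: the weights are intended to form a convex combination.
checks = {name: is_normalized(w) for name, w in weights.items()}

if __name__ == "__main__":
    for name, ok in checks.items():
        print(f"{name}: normalized={ok}")
```

The tolerance `tol` absorbs floating-point rounding when the two-decimal entries are summed.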