| State | Action | Influence |
|---|---|---|
| \(\varvec{s}\) | \(\varvec{a}\) | \(Q^{\pi }(\varvec{s},\varvec{a})\) |
| \(\left( \,\begin{matrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{matrix}\,\right)\) | \(\left( \,\begin{matrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{matrix}\,\right)\) | 3.27869 |
| | \(\left( \,\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{matrix}\,\right)\) | 3.09836 |
| | \(\left( \,\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{matrix}\,\right)\) | 3.22414 |
| | \(\vdots\) | \(\vdots\) |
| | \(\left( \,\begin{matrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{matrix}\,\right)\) | 3.90909 |