Code covered by the BSD License  

Highlights from
Markov Decision Processes (MDP) Toolbox

from Markov Decision Processes (MDP) Toolbox by Marie-Josee Cros
Functions related to the resolution of discrete-time Markov Decision Processes.

MDP Toolbox for MATLAB

mdp_eval_policy_matrix

Evaluates a policy using matrix operations.

Syntax

Vpolicy = mdp_eval_policy_matrix (P, R, discount, policy)

Description

mdp_eval_policy_matrix evaluates the value function associated with a policy.
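
The page does not state the formula behind this evaluation. Under the standard definition of the discounted value function, and assuming the function follows it, the value of a fixed policy is the solution of a linear system. The symbols below are notation introduced here, not toolbox names: \gamma stands for discount, P_\pi is the (SxS) matrix whose row s is the row of P for state s under action policy(s), and R_\pi is the (Sx1) vector of expected immediate rewards under the policy.

```latex
V_\pi = R_\pi + \gamma P_\pi V_\pi
\quad\Longrightarrow\quad
V_\pi = (I - \gamma P_\pi)^{-1} R_\pi
```

For \gamma < 1 the matrix I - \gamma P_\pi is invertible, so the solution is unique.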

Arguments

  • P : transition probability array.
P can be a 3-dimensional array (SxSxA) or a cell array (1xA) in which each cell contains a sparse matrix (SxS).
  • R : reward array.
R can be a 3-dimensional array (SxSxA), a cell array (1xA) in which each cell contains a sparse matrix (SxS), or a 2D array (SxA), possibly sparse.
  • discount : discount factor.
discount is a real number in [0; 1[; with discount = 1 the linear system that defines the value function is singular.
  • policy : a policy.
policy is an (Sx1) vector. Each element is an integer between 1 and A that identifies the action taken in the corresponding state.

Evaluation

  • Vpolicy : value function.
Vpolicy is an (Sx1) vector.

Example

>> P(:,:,1) = [ 0.5 0.5;   0.8 0.2 ];
>> P(:,:,2) = [ 0 1;   0.1 0.9 ];
>> R = [ 5 10;   -1 2 ];

>> Vpolicy = mdp_eval_policy_matrix(P, R, 0.9, [1; 2])
Vpolicy =
   28.9063
   24.2188

In the above example, P can be a cell array containing sparse matrices:
>> P = cell(1, 2);   % replace the 3-dimensional array P defined above
>> P{1} = sparse([ 0.5 0.5;  0.8 0.2 ]);
>> P{2} = sparse([ 0 1;  0.1 0.9 ]);
The function call is unchanged.
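
The result in the example can be checked by hand. The sketch below assumes that the function solves the standard policy evaluation system (I - discount*Ppolicy) * V = Rpolicy and that R(s,a) is the reward for taking action a in state s. Ppolicy, Rpolicy and Vcheck are names chosen for this illustration, not toolbox variables, and the sketch covers only the dense (SxSxA) P and (SxA) R forms used in the example.

```matlab
% Hand check of the example (assumption: standard policy evaluation,
% i.e. (I - discount*Ppolicy) * V = Rpolicy, with R given as SxA).
P = zeros(2, 2, 2);
P(:,:,1) = [ 0.5 0.5;   0.8 0.2 ];
P(:,:,2) = [ 0 1;   0.1 0.9 ];
R = [ 5 10;   -1 2 ];
discount = 0.9;
policy = [1; 2];

S = size(P, 1);
Ppolicy = zeros(S, S);   % row s is the transition row of action policy(s)
Rpolicy = zeros(S, 1);   % entry s is the reward of action policy(s)
for s = 1:S
    Ppolicy(s, :) = P(s, :, policy(s));
    Rpolicy(s)    = R(s, policy(s));
end

% Solve the linear system rather than forming an explicit inverse.
Vcheck = (eye(S) - discount * Ppolicy) \ Rpolicy;
disp(Vcheck)
```

With the data of the example, Vcheck agrees with the Vpolicy shown above (28.9063 and 24.2188 to four decimals).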



MDPtoolbox/documentation/mdp_eval_policy_matrix.html
Page created on July 31, 2001. Last update on August 31, 2009.
