Code covered by the BSD License
-
[V,mean_discrepancy]=mdp_eval...
mdp_eval_policy_TD_0 Evaluation of the value function, using the TD(0) algorithm
-
mdp_LP(P, R, discount)
mdp_LP Resolution of discounted MDP with linear programming
-
mdp_Q_learning(P, R, discount...
mdp_Q_learning Evaluation of the matrix Q, using the Q learning algorithm
-
mdp_bellman_operator(P, PR, d...
mdp_bellman_operator Applies the Bellman operator on the value function Vprev
-
mdp_check(P , R)
mdp_check Check if the matrices P and R define a Markov Decision Process
-
mdp_check_square_stochastic( ...
mdp_check_square_stochastic Check if Z is a square stochastic matrix
-
mdp_computePR(P,R)
mdp_computePR Computes the reward for the system in one state
-
mdp_computePpolicyPRpolicy(P,...
mdp_computePpolicyPRpolicy Computes the transition matrix and the reward matrix for a policy
-
mdp_eval_policy_iterative(P, ...
mdp_eval_policy_iterative Policy evaluation using iteration.
-
mdp_eval_policy_matrix(P, R, ...
mdp_eval_policy_matrix Evaluation of the value function of a policy
-
mdp_eval_policy_optimality(P,...
mdp_eval_policy_optimality Evaluates whether near-optimal actions exist for each state
-
mdp_example_forest (S, r1, r2...
mdp_example_forest Generate a Markov Decision Process example based on
-
mdp_example_rand (S, A, is_sp...
mdp_example_rand Generate a random Markov Decision Process
-
mdp_finite_horizon(P, R, disc...
mdp_finite_horizon Resolution of finite-horizon MDP with backwards induction
-
mdp_policy_iteration(P, R, di...
mdp_policy_iteration Resolution of discounted MDP
-
mdp_policy_iteration_modified...
mdp_policy_iteration_modified Resolution of discounted MDP
-
mdp_relative_value_iteration(...
mdp_relative_value_iteration Resolution of MDP with average reward
-
mdp_silent()
mdp_silent Ask for running resolution functions of the MDP Toolbox
-
mdp_span(W)
mdp_span Returns the span of W
-
mdp_value_iteration(P, R, dis...
mdp_value_iteration Resolution of discounted MDP with value iteration algorithm
-
mdp_value_iterationGS(P, R, d...
mdp_value_iterationGS Resolution of discounted MDP with value iteration Gauss-Seidel algorithm
-
mdp_value_iteration_bound_ite...
mdp_value_iteration_bound_iter Computes a bound for the number of iterations
-
mdp_verbose()
mdp_verbose Ask for running resolution functions of the MDP Toolbox
Markov Decision Processes (MDP) Toolbox
by Marie-Josee Cros
Functions related to the resolution of discrete-time Markov Decision Processes.
Presentation of MDP toolbox documentation
Markov Decision Processes (MDP) Toolbox v3.0 for MATLAB
CONTENTS
The MDP toolbox provides functions for solving discrete-time Markov Decision Processes: backwards induction, value iteration, policy iteration, and linear programming algorithms, with some variants.
The functions (m-functions) were developed with MATLAB (one of the functions requires the MathWorks Optimization Toolbox) by the Biometry and Artificial Intelligence Unit of INRA Toulouse (France).
Version 2.0 (February 2005) handles sparse matrices and adds an example.
Version 3.0 (September 2009) adds several functions related to Reinforcement Learning and improves the handling of sparse matrices. For more detail, see the README file.
FUNCTIONS DESCRIPTION
NOTATION
- states: set {1, 2, ..., S}
- actions: set {1, 2, ..., A}
- transitions: P(s,s',a) is the probability to reach state s' when the system is in state s and action a is performed by the decision maker
- rewards: R(s,s',a) is the reward obtained when the system is in state s at decision epoch t and in state s' at decision epoch t+1, with action a performed by the decision maker
- rewards (alternative form): R(s,a) is the reward obtained when the system is in state s at decision epoch t and action a is performed by the decision maker
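The notation above can be illustrated with a small sketch in base MATLAB. The 2-state, 2-action MDP below is hypothetical and its numbers are made up for illustration. The variable name PR and the relation PR(s,a) = sum over s' of P(s,s',a) * R(s,s',a) are assumptions: this is the usual way to reduce an R(s,s',a) reward to the R(s,a) form, but the page does not state that the toolbox computes it this way. No toolbox function is called.

```matlab
% Hypothetical MDP with S = 2 states and A = 2 actions (illustrative values).
S = 2;
A = 2;

% Transitions stored as P(s,s',a): row s, column s', page a.
P = zeros(S, S, A);
P(:,:,1) = [0.5 0.5; 0.8 0.2];
P(:,:,2) = [0.0 1.0; 0.1 0.9];

% Rewards stored as R(s,s',a), using the same indexing as P.
R = zeros(S, S, A);
R(:,:,1) = [5 10; -1 2];
R(:,:,2) = [0 0; 0 1];

% Each P(:,:,a) should be a square stochastic matrix:
% nonnegative entries and every row summing to 1.
for a = 1:A
    Pa = P(:,:,a);
    assert(all(Pa(:) >= 0), 'negative transition probability');
    assert(all(abs(sum(Pa, 2) - 1) < 1e-10), 'rows must sum to 1');
end

% Assumed reduction to the R(s,a) form: expected one-step reward,
% PR(s,a) = sum over s' of P(s,s',a) * R(s,s',a).
PR = zeros(S, A);
for a = 1:A
    PR(:,a) = sum(P(:,:,a) .* R(:,:,a), 2);
end

disp(PR);
```

Storing the action as the third index keeps each P(:,:,a) an ordinary S-by-S matrix, so the stochasticity check and the expected-reward sum work one action at a time.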
USE OF DOCUMENTATION
The documentation pages can be displayed with the MATLAB navigator (the one used for the MATLAB help). In the File menu, choose the Open item and open the sub-directory documentation. Select 'All Files (*.*)' for the attribute Files of type, then select the file DOCUMENTATION.html and open it.
The sub-directory documentation contains all the HTML pages describing the m-functions.
mdp_toolbox/documentation/DOCUMENTATION.html
Page created on July 31, 2009. Last update on August 31, 2009.