Value Function Iteration Code Not Converging
Hello all, I am currently writing value/policy function iteration code to solve an economic problem. This is the Bellman equation I am trying to solve:

The variables are defined as follows:
V = the value function
S, R = the state variables stating the number of safe and risky assets owned in a given round, respectively
S', R' = the control variables stating the number of safe and risky assets held in the next round
Pi_S and Pi_R = the returns of the safe and risky assets, respectively
Delta = the continuation probability of the round
The expectation is taken with respect to the probability distribution of the risky asset's returns
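For reference, the recursion that the code further down actually computes can be read off its loops. The following is a sketch inferred from that code, not necessarily the exact equation as originally posted. It assumes log utility with weight alpha (a parameter that appears only in the code), that all income I is reinvested, and that the split between S' and R' is chosen after the risky return is observed:

```latex
V(S,R) \;=\; \mathbb{E}_{\Pi_R}\!\left[\,\max_{1 \le i \le I}\;
\Big\{(1-\delta)\,\alpha\log(S+R) \;+\; \delta\,V\big(S + I - i,\; R + i\big)\Big\}\right],
\qquad I = \Pi_S\,S + \Pi_R\,R .
```

Here S' = S + I - i and R' = R + i. In the code, Pi_S = 2 and Pi_R takes the values 1, 2, 3 with equal probability, and V is replaced by a linear extrapolation whenever S' or R' exceeds the grid.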
We solve this equation by starting with a guess of V at every state (S, R). For every state, the code then chooses the S' and R' that maximize the right-hand side of the equation. Once a maximum has been found for every state, V is updated, and the process repeats until the old and new V agree to within a convergence tolerance, measured as the largest absolute log difference between the new and old value functions. Once the tolerance is met, the program finishes successfully. To keep the problem computationally manageable, we cap S and R at 100 and linearly extrapolate V whenever S' or R' would exceed 100.
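To illustrate the iteration described above, here is a minimal sketch of value function iteration on a toy problem in which every choice stays on the grid, so no extrapolation is needed. It is not the model from this question: the discount factor beta, the capital grid k, and the resource function k.^0.3 + 0.5*k are all hypothetical choices made only for the example, and the stopping rule uses the absolute sup-norm difference rather than the log difference.

```matlab
% Toy value function iteration (hypothetical model, for illustration only).
beta    = 0.9;                    % hypothetical discount factor
n       = 50;                     % number of grid points
k       = linspace(0.1, 5, n)';   % hypothetical state grid (column vector)
tol     = 1e-6;                   % sup-norm convergence tolerance
maxIter = 1000;                   % iteration cap

% c(i,j): consumption when the state is k(i) and the next state is k(j).
c = k.^0.3 + 0.5*k - k';          % n-by-n via implicit expansion

% Infeasible choices (c <= 0) get reward -Inf so they are never selected.
reward = -Inf(n);
reward(c > 0) = log(c(c > 0));

V = zeros(n, 1);                  % initial guess
for it = 1:maxIter
    % Bellman update: for each state (row), maximize over next states (columns).
    [Vnew, pol] = max(reward + beta*V', [], 2);
    err = max(abs(Vnew - V));     % sup-norm distance between iterates
    V = Vnew;
    if err < tol
        break;
    end
end
fprintf('Stopped after %d iterations, error %g\n', it, err);
```

Because the next state is always a grid point and rewards are bounded on the feasible set, this update is a contraction with modulus beta, so the error shrinks geometrically. That property is not guaranteed once values off the grid are filled in by extrapolation.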
The issue I am running into is that V converges only for very specific values of delta. For any other delta, V does not converge, and the difference between V and the new V gets stuck at one value. I cannot figure out why. My code is below. Any help on what could be preventing convergence would be greatly appreciated. Thank you all in advance!
clc;
clear;
% Model parameters and initial matrices
numRisky = 100;
numSafe = 100;
v = (1:numSafe)' + (1:numRisky);    % initial guess: V(S,R) = S + R
vnew = ones(numSafe, numRisky);
policy = zeros(numSafe, numRisky);
delta = 0.99;                       % continuation probability
ctol = 0.001;                       % convergence tolerance
ntol = 200;                         % maximum number of iterations
count = 0;
norm = 1;                           % error term (note: shadows the built-in norm function)
vx = ones(1, numRisky);             % slope of V in S at S = numSafe, one entry per R
vy = ones(1, numSafe);              % slope of V in R at R = numRisky, one entry per S
alpha = 1;
while norm>ctol && count<ntol
    for R=1:numRisky
        riskyReturns = 1:1:3;
        numReturns = numel(riskyReturns);
        pReturns = (1/numReturns)*ones(1, numReturns);
        u = zeros(1, numReturns);
        p = zeros(1, numReturns);
        for S = 1:numSafe
            safeReturn = 2;
            for r = 1:numReturns
                safeIncome = S*safeReturn;
                riskyIncome = R*riskyReturns(r);
                posI = safeIncome + riskyIncome;
                U = alpha*log(S+R);
                W = zeros(1, posI);
                % i units of income go to the risky asset, posI-i to the safe asset
                for i = 1:posI
                    if S + (posI-i) > numSafe && R + i <= numRisky
                        vBar = v(numSafe,R+i) + vx(R+i)*(S+posI-i-numSafe); %conduct linear approximation
                        W(i) = (1-delta)*U + delta*vBar;
                    elseif R + i > numRisky && S + (posI-i) <= numSafe
                        vBar = v(S+posI-i,numRisky) + vy(S+posI-i)*(R+i-numRisky); %conduct linear approximation
                        W(i) = (1-delta)*U + delta*vBar;
                    elseif R + i > numRisky && S + (posI-i) > numSafe
                        vBar = v(numSafe,numRisky) + vx(numRisky)*(S+posI-i-numSafe) + vy(numSafe)*(R+i-numRisky); %conduct linear approximation
                        W(i) = (1-delta)*U + delta*vBar;
                    else
                        W(i) = (1-delta)*U + delta*v(S + (posI - i), R + i); %no linear approximation needed here
                    end
                end
                [w, ind] = max(W);
                u(r) = w;
                p(r) = ind/posI;
            end
            vnew(S, R) = dot(u, pReturns);
            policy(S, R) = dot(p, pReturns);
        end
    end
    %calculate convergence criteria and update value function
    count = count + 1;
    norm = max(abs(log(vnew(:)) - log(v(:))));
    v = vnew;
    %calculate slopes for linear approximation if S' and R' are greater than numSafe and numRisky, respectively
    for n = 1:numRisky              % n indexes R (columns of v)
        vx(n) = v(numSafe,n)-v(numSafe-1,n);
    end
    for m = 1:numSafe               % m indexes S (rows of v)
        vy(m) = v(m,numRisky)-v(m,numRisky-1);
    end
    disp('Iteration:')
    disp(count)
    disp('Error Term:')
    disp(norm)
end
7 Comments
Torsten
on 1 Feb 2025
It would help if you could give a mathematical description using formulae rather than a description in words.
Benjamin
on 1 Feb 2025
Benjamin
on 1 Feb 2025
Is this really the way stochastic dynamic programming problems are solved? Maybe these codes can help to improve your solution algorithm:
Benjamin
on 2 Feb 2025
I must admit that I don't understand what you are trying to do in your code.
I guess that in round 0, you start with a number S0 and R0 of safe and risky assets and value V(S0,R0). Then you fix a certain number of rounds (say N), and the aim is to maximize V(SN,RN) by buying and/or selling safe and risky assets over time, thus getting a sequence (S1,R1),(S2,R2),...,(S_N-1,R_N-1),(SN,RN). But I can't find where your code takes care of this recursive dependence of V(S_i,R_i) on V(S_i-1,R_i-1) for 1<=i<=N.
But maybe I completely misunderstand the economic problem in the background.
Answers (0)