Create a dynamic programming function for solving a bandit problem
Hello, I want to evaluate (not maximize) the function inside the brackets in the image, for the simplest case of N=1. Apparently this requires dynamic programming: first evaluate the last term, which is fixed at (s + alpha)/(s + f + alpha + beta), then the previous one, and so on, as shown in the function.
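For reference, the recursion described above can be written out as follows. The image is not reproduced here, so this is read off the description and the posted code; the name V_k(s, f) for the value at trial k after s successes and f failures is an assumption introduced for illustration.

```latex
V_l(s,f) = \frac{s+\alpha}{s+f+\alpha+\beta},
\qquad
V_k(s,f) = \frac{s+\alpha}{s+f+\alpha+\beta}\,V_{k+1}(s+1,f)
         + \frac{f+\beta}{s+f+\alpha+\beta}\,V_{k+1}(s,f+1),
\quad k < l .
```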
I wrote this code, but it is not working. I do not know how to define functions in this way; this is what I was able to do:
function out = future_expected_reward(s, f, alpha, beta, k, l)
% Value at trial k of l, given s successes and f failures so far.
    n = s + f + alpha + beta;   % common denominator of both weights
    if k == l                   % "l" is the game length
        out = (s + alpha) / n;
    else
        % Weighted sum of the two possible next states (s+1 or f+1).
        out = ((s + alpha) / n) * future_expected_reward(s+1, f, alpha, beta, k+1, l) + ...
              ((f + beta) / n) * future_expected_reward(s, f+1, alpha, beta, k+1, l);
    end
end
I want to evaluate the function at trial "k" of a total of "l" trials, with "alpha" and "beta" fixed. Since N=1 in my case, you can ignore the i's.
I really need your help! Thanks!!
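The recursive function in the post calls itself twice per trial, so the number of calls roughly doubles with each remaining trial. A minimal bottom-up sketch of the same recursion is given here, assuming k <= l and that the only states reachable from (s, f) are s+i successes after d further trials. The name `future_expected_reward_dp` and the variables `m`, `n`, `d`, `i`, `p`, `q`, `V` are hypothetical, not from the post, and the formula in the image may contain terms this sketch does not.

```matlab
function out = future_expected_reward_dp(s, f, alpha, beta, k, l)
% Bottom-up evaluation of the same recursion as future_expected_reward.
% Hypothetical helper; assumes k <= l.
    m = l - k;                        % trials remaining after trial k
    n = s + f + alpha + beta;         % denominator at trial k
    i = 0:m;                          % extra successes by the final trial
    V = (s + i + alpha) ./ (n + m);   % terminal values at trial l
    for d = m-1:-1:0                  % step back one trial at a time
        i = 0:d;                      % extra successes after d further trials
        p = (s + i + alpha) ./ (n + d);   % weight on the success branch
        q = (f + d - i + beta) ./ (n + d); % weight on the failure branch
        V = p .* V(2:d+2) + q .* V(1:d+1);
    end
    out = V;                          % V is a scalar once d reaches 0
end
```

For example, `future_expected_reward_dp(2, 1, 1, 1, 3, 6)` should agree with the recursive version called with the same arguments. One thing worth checking against the image: because this recursion has no per-trial reward term, the weighted average of the two next-state values always collapses back to (s + alpha)/(s + f + alpha + beta), so both versions return that value for any k <= l. If the formula in the image adds a reward at each trial, that term would need to go inside the recursion.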