Create a dynamic-programming function for solving a bandit problem.

Hello, I want to evaluate (not maximize) the function inside the brackets in the image, for the simplest case of N=1. Apparently this requires dynamic programming: first evaluate the last term (which is fixed at (s + alpha)/(s + f + alpha + beta)), then the previous one, and so on, as shown in the function.
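Written out, the recursion described above takes roughly the following form. This is a sketch reconstructed from the description and the code in this post, not from the image itself; the symbol V_k is an assumed label for the value at trial k with s successes and f failures.

```latex
V_l(s,f) = \frac{s+\alpha}{s+f+\alpha+\beta},
\qquad
V_k(s,f) = \frac{s+\alpha}{s+f+\alpha+\beta}\, V_{k+1}(s+1,f)
         + \frac{f+\beta}{s+f+\alpha+\beta}\, V_{k+1}(s,f+1),
\quad k < l .
```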
I wrote this code, but it is not working. I do not know how to define functions in this way; this is what I was able to do:
% code
function [ out ] = future_expected_reward(s,f,alpha,beta,k,l)
    if k==l % "l" is the game length
        % Last trial: the value is fixed
        out = (s + alpha) /(s + f + alpha + beta);
    else
        % A success moves to (s+1,f), a failure to (s,f+1); both advance to trial k+1
        out = ( (s + alpha) /(s + f + alpha + beta) ) * future_expected_reward(s+1,f,alpha,beta,k+1,l) + ...
              ( (f + beta) /(s + f + alpha + beta) ) * future_expected_reward(s,f+1,alpha,beta,k+1,l);
    end
end
I want to evaluate the function at trial "k", out of a total of "l" trials, with "alpha" and "beta" fixed (since N=1 in my case, you can ignore the i's).
I really need your help! Thanks!!
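For reference, the same recursion can also be evaluated bottom-up, starting from the last trial and walking back to trial k, which is the order described in the question. The following is only a minimal sketch under stated assumptions: it assumes k <= l, the function name future_expected_reward_dp is hypothetical, and it mirrors the recursion in the code above rather than the formula in the image.

```matlab
function out = future_expected_reward_dp(s, f, alpha, beta, k, l)
% Bottom-up evaluation of the recursion (hypothetical helper name).
% Assumes k <= l; d is the number of trials remaining after trial k.
d = l - k;
j = 0:d;                                  % extra successes by the last trial
v = (s + j + alpha) ./ (s + f + d + alpha + beta);   % values at trial l
for m = d-1:-1:0                          % walk back one trial at a time
    j = 0:m;                              % extra successes after m more trials
    p = (s + j + alpha) ./ (s + f + m + alpha + beta);
    % success moves to j+1 extra successes, failure keeps j
    v = p .* v(2:m+2) + (1 - p) .* v(1:m+1);
end
out = v;                                  % scalar value at the starting state
end
```

A call such as future_expected_reward_dp(2, 1, 1, 1, 1, 10) can be compared against the recursive version with the same arguments. The recursive version makes on the order of 2^(l-k) calls, while this sketch does on the order of (l-k)^2 work. One sanity check: for this particular recursion, the weighted average of the two successor values equals the current value (s + alpha)/(s + f + alpha + beta), so both versions should return that quantity at the starting state, up to rounding.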

Answers (0)
