A Simple Loop For Bandit Problem

Zahra kamkar on 13 May 2014
Answered: Manikanta Aditya on 20 Jun 2022
Hello. I want to write code for the bandit problem. My problem: we have 5 machines. We want to play each of these machines 1000 times (we call each set of 1000 plays a TASK). Each time, the machine gives us a random reward. We want to find which of these TASKs has the highest mean reward. I wrote this code, but it doesn't work well. Why?
for j=1:5
    h=0;
    for i=1:1000
        rew=randn(1);
        sum=rew+h;
    end
    mean(j)=sum./1000;
    [value,index]=max(mean(j))
end
end
Thanks in advance

Answers (1)

Manikanta Aditya on 20 Jun 2022
Hi Zahra,
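The main problem is that the rewards are never accumulated. In your code h stays 0, so sum = rew + h only ever holds the last reward, and dividing it by 1000 does not give the mean of the 1000 plays. The running total has to be reset once per machine (before the inner loop) and updated on every play. In addition, max is called inside the outer loop on the single element mean(j), so it cannot compare the five tasks; it should be called once, after the outer loop, on all five means. Finally, naming variables sum and mean hides the built-in functions of the same name, so it is safer to use other names. Please refer to the code below:
avgReward = zeros(1, 5);      % mean reward of each machine (task)
for i = 1 : 5
    s = 0;                    % running total, reset once per machine
    for j = 1 : 1000
        r = randn(1);         % random reward for this play
        s = s + r;            % accumulate the reward
    end
    avgReward(i) = s / 1000;  % mean reward over the 1000 plays
end
[value, index] = max(avgReward);  % best mean reward and its machine
disp(avgReward);
disp(value);
disp(index);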
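If the loops are not required, the same experiment can probably be written without them. The following is only a minimal sketch under the assumptions in the question (5 machines, 1000 plays each, rewards drawn with randn); the names numMachines, numPlays, rewards and avgReward are illustrative choices, not anything from the original code.

```matlab
% Sketch: one column per machine, one row per play.
% Assumes the built-in mean is not hidden by a variable of the same
% name; run "clear mean" first if an earlier script created one.
numMachines = 5;                         % illustrative name
numPlays    = 1000;                      % illustrative name

rewards   = randn(numPlays, numMachines);   % random reward for every play
avgReward = mean(rewards, 1);               % mean over plays, per machine

[value, index] = max(avgReward);            % best mean reward and its machine

disp(avgReward);
disp(value);
disp(index);
```

Because the rewards are random, the winning machine changes from run to run. With standard normal rewards all five means should be close to 0, so the maximum here reflects sampling noise rather than a genuinely better machine.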
