A Simple Loop For Bandit Problem
Hello. I want to write code for a bandit problem. My problem: we have 5 machines, and we want to play each of them 1000 times (we call each set of 1000 plays a TASK). On each play, the machine gives us a random reward. We want to find which of these TASKs has the highest mean reward. I wrote this code, but it doesn't work well. Why?
for j=1:5
h=0;
for i=1:1000
rew=randn(1);
sum=rew+h;
end
mean(j)=sum./1000; [value,index]=max(mean(j))
end end
Thanks in advance
Answers (1)
Manikanta Aditya
on 20 Jun 2022
Hi Zahra,
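The problem is in how the sum is accumulated. In your code h is never updated, so sum=rew+h only ever holds the last reward instead of the running total. In addition, max(mean(j)) is called inside the loop on a single element, so it always returns that element with index 1. Naming variables sum and mean also shadows the built-in functions, and there is one end too many. Reset the running sum once per machine, add to it inside the inner loop, divide by 1000 after the inner loop finishes, and call max once after all five means are available. Please refer to the code below:
avgReward = zeros(1, 5);          % one mean reward per machine (avoids shadowing the built-in mean)
for i = 1 : 5
s = 0;                            % running sum, reset once per machine
for j = 1 : 1000
r = randn(1);                     % random reward for this play
s = s + r;                        % accumulate the reward
end
avgReward(i) = s / 1000;          % mean reward of this TASK
end
[value, index] = max(avgReward);  % best mean and the machine that produced it
disp(avgReward);
disp(value);
disp(index);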
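The same computation can also be written without explicit loops. The following is only a minimal sketch, not part of the original answer: the names numMachines, numPlays, rewards, and avgReward are made up for illustration, the rng(0) call is an optional assumption added for repeatable output, and it assumes no workspace variable named mean or sum is shadowing the built-in functions (run clear mean sum first if needed).

```matlab
% Sketch: mean reward of each machine, computed without loops.
numMachines = 5;      % hypothetical name: number of machines
numPlays    = 1000;   % hypothetical name: plays per TASK

rng(0);               % optional: fix the seed so repeated runs match

% One column per machine, one row per play.
rewards = randn(numPlays, numMachines);

% Average down each column: 1-by-numMachines vector of mean rewards.
avgReward = mean(rewards, 1);

% Highest mean reward and the machine that produced it.
[value, index] = max(avgReward);

fprintf('Best machine: %d (mean reward %.4f)\n', index, value);
```

Drawing all rewards into one matrix keeps the sum, the division by 1000, and the comparison across machines in three separate statements, which makes the accumulation bug from the question hard to reintroduce.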