Thread Subject:
Spring 2010 MATLAB Contest, April 28 - May 5

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 28 Apr, 2010 14:47:04

Message: 1 of 159

Hello Community Members!

This is your reminder that the Spring MATLAB Programming Contest will start today at noon Eastern time. We will announce the contest start on this thread and also on the contest blog.

Please use this thread if you have questions or comments during the contest.

Talk to you soon!
Helen and the MATLAB Central Contest Team

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 28 Apr, 2010 16:15:21

Message: 2 of 159

The contest is now live. Go to the Sensor Contest Rules to start. http://www.mathworks.com/matlabcentral/contest/contests/2/rules

The contest website has been redesigned. You now need to log into your MATLAB Central account in order to submit or to comment on an existing submission.

Good luck to everyone! Have a great week!

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Will Dampier

Date: 28 Apr, 2010 16:30:26

Message: 3 of 159

"Helen Chen" <helen.chen@mathworks.com> wrote in message <hr9hp8$sth$1@fred.mathworks.com>...
> Hello Community Members!
>
> This is your reminder that the Spring MATLAB Programming Contest will start today at noon Eastern time. We will announce the contest start on this thread and also on the contest blog.
>
> Please use this thread if you have questions or comments during the contest.
>
> Talk to you soon!
> Helen and the MATLAB Central Contest Team

I've downloaded the example files and it seems like the testsuite_sample.mat file is corrupt and won't unzip properly. Is anyone else having this problem or just me?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 28 Apr, 2010 16:42:04

Message: 4 of 159

"Will Dampier" <walldo2@gmail.com> wrote in message <hr9nr2$ia2$1@fred.mathworks.com>...
> "Helen Chen" <helen.chen@mathworks.com> wrote in message <hr9hp8$sth$1@fred.mathworks.com>...
> > Hello Community Members!
> >
> > This is your reminder that the Spring MATLAB Programming Contest will start today at noon Eastern time. We will announce the contest start on this thread and also on the contest blog.
> >
> > Please use this thread if you have questions or comments during the contest.
> >
> > Talk to you soon!
> > Helen and the MATLAB Central Contest Team
>
> I've downloaded the example files and it seems like the testsuite_sample.mat file is corrupt and won't unzip properly. Is anyone else having this problem or just me?

I was able to open it just fine.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 28 Apr, 2010 16:44:05

Message: 5 of 159

"Helen Chen" <helen.chen@mathworks.com> wrote in message <hr9mup$j84$1@fred.mathworks.com>...
> The contest is now live. Go to the Sensor Contest Rules to start. http://www.mathworks.com/matlabcentral/contest/contests/2/rules
>
> The contest website has been redesigned. You now need to log into your MATLAB Central account in order to submit or to comment on an existing submission.
>
> Good luck to everyone! Have a great week!

I'm not sure if you intended this or not, but I can see the actual code of an entry I submitted, even though we are in darkness.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 28 Apr, 2010 16:46:05

Message: 6 of 159

"Helen Chen" <helen.chen@mathworks.com> wrote in message <hr9mup$j84$1@fred.mathworks.com>...
> The contest is now live. Go to the Sensor Contest Rules to start. http://www.mathworks.com/matlabcentral/contest/contests/2/rules
>
> The contest website has been redesigned. You now need to log into your MATLAB Central account in order to submit or to comment on an existing submission.
>
> Good luck to everyone! Have a great week!

I also just noticed that Jan's entry is showing up as a 'failed' result, instead of that being hidden. I hope this is intentional, since that was one of the major suggestions in the fall.

Can you somehow summarize the major changes to the contest? There were a lot of ideas bounced around last year and I'm curious as to what you ended up implementing.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 28 Apr, 2010 17:07:04

Message: 7 of 159

"Will Dampier" <walldo2@gmail.com> wrote in message
> I've downloaded the example files and it seems like the testsuite_sample.mat file is corrupt and won't unzip properly. Is anyone else having this problem or just me?

Will - Can you download a new file? As Alan noted, the file seems to open ok for us, but Ned resaved the file and reposted just in case...

Thanks,
Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 28 Apr, 2010 17:11:05

Message: 8 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hr9okl$bim$1@fred.mathworks.com>...
> I'm not sure if you intended this or not, but I can see the actual code of an entry I submitted, even though we are in darkness.

Hi Alan - As long as the code you are looking at is yours, it is ok that you see it. It is also ok that you see someone else's failed entry in the queue, since you can't see the score.

Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Dario Ringach

Date: 28 Apr, 2010 17:45:21

Message: 9 of 159


Helen,

Are these supposed to be actual natural images (pictures) or just any arbitrary random matrix?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Lucio Cetto

Date: 28 Apr, 2010 18:36:04

Message: 10 of 159

Dario:
We are using a sample of real images, at different resolutions and different themes.
Lucio

"Dario Ringach" <dario@ucla.edu> wrote in message <hr9s7h$9b7$1@fred.mathworks.com>...
>
> Helen,
>
> Are these supposed to be actual natural images (pictures) or just any arbitrary random matrix?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: John

Date: 28 Apr, 2010 21:19:06

Message: 11 of 159

Would there be some type of GUI created for this contest? It looks to be just a compression matrix overlay on sensors.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 28 Apr, 2010 22:26:04

Message: 12 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
> Can you somehow summarize the major changes to the contest? There were a lot of ideas bounced around last year and I'm curious as to what you ended up implementing.

Hi Alan -

I posted a response to this question on the contest blog so that we could make this a mini-discussion if this is a popular topic.

http://blogs.mathworks.com/contest/2010/04/28/new-contest-website-features/

There have been 74 submissions made by 40 players so far. Any feedback on the updated look and feel?

Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 28 Apr, 2010 22:34:04

Message: 13 of 159


>
> Hi Alan -
>
> I posted a response to this question on the contest blog so that we could make this a mini-discussion if this is a popular topic.
>
> http://blogs.mathworks.com/contest/2010/04/28/new-contest-website-features/
>
> There have been 74 submissions made by 40 players so far. Any feedback on the updated look and feel?
>
> Helen

Thanks Helen. I really like the new look and feel. It'll be interesting to see how it looks once we are in daylight and all the features are visible.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 29 Apr, 2010 00:54:05

Message: 14 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hrad4s$ksc$1@fred.mathworks.com>...
> Thanks Helen. I really like the new look and feel. It'll be interesting to see how it looks once we are in daylight and all the features are visible.

Thanks Alan! :-)

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Seth Popinchalk

Date: 29 Apr, 2010 03:23:05

Message: 15 of 159

"John" <someonesemail2004@yahoo.com> wrote in message <hra8oa$5ru$1@fred.mathworks.com>...
> Would there be some type of GUI created for this contest? It looks to be just a compression matrix overlay on sensors.

Hi John,
I don't know of any GUI created for this contest. If you make one based on the test suite, be sure to submit it to the file exchange so others can try it out!

-Seth

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: mavs favs

Date: 29 Apr, 2010 04:56:04

Message: 16 of 159

can we use interp2 function for the solver?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Daniel Armyr

Date: 29 Apr, 2010 07:27:03

Message: 17 of 159

Hi.
Unfortunately, I can't squeeze in the time to participate in this year's contest.

I would just like to share my thought that this seems to essentially be a deconvolution problem. Maybe you guys already know this, but if not, I would seriously look into how deconvolution can be applied.

My possibly worthless 0.02SEK.

//DA

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: the cyclist

Date: 29 Apr, 2010 15:44:04

Message: 18 of 159

"John" <someonesemail2004@yahoo.com> wrote in message <hra8oa$5ru$1@fred.mathworks.com>...
> Would there be some type of GUI created for this contest? It looks to be just a compression matrix overlay on sensors.

If you run "runcontest" with the option argument

>> runcontest(true)

you will see graphical output of your solver.

the cyclist

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Seth Popinchalk

Date: 29 Apr, 2010 15:45:21

Message: 19 of 159

"mavs favs" <devroymato@gmail.com> wrote in message <hrb3h4$9u5$1@fred.mathworks.com>...
> can we use interp2 function for the solver?

Interp2 is part of core MATLAB under $matlab/toolbox/matlab

>> which interp2
C:\Program Files\MATLAB\R2010a\toolbox\matlab\polyfun\interp2.m

What is allowed and not allowed is explained in the fine print here:

http://www.mathworks.com/matlabcentral/contest/contests/2/rules#fineprint

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Daniel

Date: 29 Apr, 2010 16:03:04

Message: 20 of 159

"Helen Chen" <helen.chen@mathworks.com> wrote in message <hr9hp8$sth$1@fred.mathworks.com>...
> Hello Community Members!
>
> This is your reminder that the Spring MATLAB Programming Contest will start today at noon Eastern time. We will announce the contest start on this thread and also on the contest blog.
>
> Please use this thread if you have questions or comments during the contest.
>
> Talk to you soon!
> Helen and the MATLAB Central Contest Team

Hi,
At what times do the visibility periods start and end?

Daniel

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Nicholas Howe

Date: 29 Apr, 2010 16:23:03

Message: 21 of 159

Is it just me, or are the timestamps on the submissions (and newsgroup posts) off by four hours? Maybe that's why the contest doesn't seem to be in twilight yet?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 29 Apr, 2010 16:25:22

Message: 22 of 159

"Nicholas Howe" <NikHow@hotmail.com> wrote in message <hrcbp7$mcl$1@fred.mathworks.com>...
> Is it just me, or are the timestamps on the submissions (and newsgroup posts) off by four hours? Maybe that's why the contest doesn't seem to be in twilight yet?

I think they are being displayed in GMT, not Eastern.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 29 Apr, 2010 18:28:04

Message: 23 of 159

"Nicholas Howe" <NikHow@hotmail.com> wrote in message <hrcbp7$mcl$1@fred.mathworks.com>...
> Is it just me, or are the timestamps on the submissions (and newsgroup posts) off by four hours? Maybe that's why the contest doesn't seem to be in twilight yet?

Sorry, I was late on switching that over. We are now in twilight!

About the hours, MATLAB Central is on UTC time. I will try to be better about announcing contest dates in UTC.

Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Ravi

Date: 30 Apr, 2010 00:50:04

Message: 24 of 159

I'm a newbie to the MATLAB Contest
In the rules, when you say the code should not run for more than 180 seconds - do you mean each call to our solver, or the overall runcontest code, which calls the entire testsuite?

Ravi

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey

Date: 30 Apr, 2010 01:43:04

Message: 25 of 159

"Ravi " <whereisravi@gmail.com> wrote in message <hrd9fs$92n$1@fred.mathworks.com>...
> I'm a newbie to the MATLAB Contest
> In the rules, when you say the code should not run for more than 180 seconds - do you mean each call to our solver, or the overall runcontest code, which calls the entire testsuite?
>
> Ravi


entire testsuite

Sergey (SY)

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: John

Date: 30 Apr, 2010 04:27:04

Message: 26 of 159

Thank you. Now I can see what type of data set I'm working with and the type of algorithm I should create to make my code run at optimal efficiency.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Oliver Woodford

Date: 30 Apr, 2010 16:12:04

Message: 27 of 159

I'm dropping out now - time to get back to work! Thanks to the contest team for another fun problem. I look forward to seeing the quality of images at the end of the week. Best of luck to everyone.

Oliver

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Nicholas Howe

Date: 30 Apr, 2010 21:31:19

Message: 28 of 159

"Oliver Woodford" <o.j.woodford.98@cantab.net> wrote in message <hrevgk$flv$1@fred.mathworks.com>...
> I'm dropping out now - time to get back to work! Thanks to the contest team for another fun problem. I look forward to seeing the quality of images at the end of the week. Best of luck to everyone.
>
> Oliver

Oliver,
If you're dropping out, would you mind sharing the key ideas behind your twilight winner?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 30 Apr, 2010 22:35:06

Message: 29 of 159

Is there a stats page with the new website like there used to be with the old one?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Oliver Woodford

Date: 30 Apr, 2010 22:42:06

Message: 30 of 159

"Nicholas Howe" wrote
> Oliver,
> If you're dropping out, would you mind sharing the key ideas behind your twilight winner?

Hi Nick

Sure. There are 3 key concepts:

1. I made sure my queries gave a unique and exact solution for the sum of each region, the regions being the intersections of all query masks. Given n queries that means I can only solve this uniquely for n (or fewer) regions - any more regions and you need regularization. To keep things tractable at every iteration I make each new region a subset of a current region, i.e. each query divides a current region in two. The method I ended up selecting was to divide a given region in half across the middle, either vertically or horizontally. I also made the constraint that no region could be more than twice as long as it was wide. This constraint (and not dividing a region of length 1) is enforced using the value 300 (see code), so remove it to see why it's necessary.

2. The question this poses is which region to split next. You want to split regions that are textured, so you get more information out; splitting a textureless region gains you nothing. At every iteration (query) I kept a record of the maximum (average) colour difference between each region and its neighbours, in both the horizontal and vertical directions. I then split the region with the largest difference, in the direction appropriate to the largest difference, subject to the constraints already mentioned. This was a simple heuristic, but seemed to work well! However, I believe that improving this decision process is where the real improvements will come in this contest.

3. Finally, rather than assign each region the average value for the region at the end, I smooth the image (as real images tend to be smooth, not blocky), still making sure that each region's average is correct. This gives a significant improvement in quality, more so with the edge preserving bilateral filter. Interestingly, the regular sampling grid helped here. Having a non-regular grid messed up the smoothing (try adding the smoothing to my "Prime I" method to see what I mean).

Hope that's useful.
Oliver
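
The first concept above (each query splits an existing region in two, so every region sum stays exactly solvable) can be sketched roughly as follows. This is a minimal illustration and not Oliver's entry: `queryImage` is a hypothetical stand-in that returns the pixel sum under a logical mask, the image is synthetic, and the region chosen for splitting is simply the largest one rather than the most textured one. The final smoothing step is left out.

```matlab
% Synthetic stand-in for the hidden image (assumption for illustration only).
img = magic(16);
queryImage = @(mask) sum(img(mask));      % hypothetical query: sum of pixels under a logical mask

queryLimit = 20;
% Each row is one region: [rowStart rowEnd colStart colEnd pixelSum].
regions = [1 size(img,1) 1 size(img,2) queryImage(true(size(img)))];
nQueries = 1;

while nQueries < queryLimit
    % Stand-in selection rule: split the largest region
    % (the post uses a texture heuristic instead).
    area = (regions(:,2)-regions(:,1)+1) .* (regions(:,4)-regions(:,3)+1);
    [maxArea, idx] = max(area);
    if maxArea < 2, break; end            % nothing left to split
    r = regions(idx,:);
    h = r(2)-r(1)+1;  w = r(4)-r(3)+1;
    if h >= w                             % halve across the longer side
        mid = r(1) + floor(h/2) - 1;
        a = [r(1) mid r(3) r(4)];  b = [mid+1 r(2) r(3) r(4)];
    else
        mid = r(3) + floor(w/2) - 1;
        a = [r(1) r(2) r(3) mid];  b = [r(1) r(2) mid+1 r(4)];
    end
    mask = false(size(img));
    mask(a(1):a(2), a(3):a(4)) = true;
    sumA = queryImage(mask);              % one query for the first half
    nQueries = nQueries + 1;
    sumB = r(5) - sumA;                   % the second half follows by subtraction
    regions(idx,:) = [a sumA];
    regions(end+1,:) = [b sumB];
end

% Fill every region with its exact mean value.
recon = zeros(size(img));
for i = 1:size(regions,1)
    r = regions(i,:);
    n = (r(2)-r(1)+1) * (r(4)-r(3)+1);
    recon(r(1):r(2), r(3):r(4)) = r(5) / n;
end
```

With n queries this produces exactly n regions, each with a known sum, which is the "unique and exact solution" property described in point 1. Always halving across the longer side also keeps regions from becoming more than twice as long as they are wide.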

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Oliver Woodford

Date: 30 Apr, 2010 22:58:03

Message: 31 of 159

"Oliver Woodford" wrote:
> The question this poses is which region to split next. You want to split regions that are textured, so you get more information out; splitting a textureless region gains you nothing. I believe that improving this decision process is where the real improvements will come in this contest.

Incidentally, I started work on a method that tries to focus on regions where the gradient is not spatially linear. Since regions with linear gradient are well reconstructed by the post-processing smoothing step, splitting them up is wasted effort. However, I couldn't get it to work well. Perhaps someone else will...

Oliver

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Oliver Woodford

Date: 30 Apr, 2010 23:22:04

Message: 32 of 159

"Oliver Woodford" wrote:
> I made sure my queries gave a unique and exact solution for the sum of each region, the regions being the intersections of all query masks. Given n queries that means I can only solve this uniquely for n (or fewer) regions - any more regions and you need regularization.

Lastly (really), num regions >> queryLimit (by having queries overlapping many other queries) combined with regularization/Bayesian inference could well prove to be another rich vein for improvement. It's what all the cutting edge compressive sensing algorithms do, after all. The only problem is squeezing it all into 180 seconds!

Oliver
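
A small 1-D sketch of what "more regions than queries, plus regularization" could look like. Everything here is an assumption for illustration: the signal, the overlapping query windows, and the choice of a Tikhonov-style smoothness penalty (the post does not say which regularizer or inference method to use).

```matlab
n = 32;                                   % unknown pixel values
m = 12;                                   % number of queries (fewer than unknowns)
x = sin(linspace(0, pi, n))';             % smooth stand-in signal

% Overlapping query windows: query i sums pixels starts(i) .. starts(i)+9.
starts = 1:2:23;
A = zeros(m, n);
for i = 1:m
    A(i, starts(i):starts(i)+9) = 1;
end
b = A * x;                                % query results (sums under each mask)

% The system A*xHat = b is underdetermined, so add a smoothness penalty:
%   minimize ||A*xHat - b||^2 + lambda * ||D*xHat||^2
D = diff(eye(n));                         % first-difference operator, (n-1)-by-n
lambda = 1e-3;
xHat = (A'*A + lambda*(D'*D)) \ (A'*b);

fprintf('query residual: %.4g, reconstruction error: %.4g\n', ...
        norm(A*xHat - b), norm(xHat - x));
```

The penalty makes the normal equations solvable even though there are 32 unknowns and only 12 measurements. For full-size images the linear system becomes large, which is where the 180-second limit would start to bite.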

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Amitabh Verma

Date: 1 May, 2010 15:54:04

Message: 33 of 159

Why is the node count affecting the score? I thought it didn't matter in computing the final score.

#1 less nodes? srach 14184133 57.361 28378.3 (cyc: 14, node: 2959)
http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/1472

#2 *linear cycler the cyclist 14184133 57.216 28378.6 (cyc: 14, node: 3263)
http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/1464

Even though the computing time is higher for "less nodes?", it ranks above just because of its lower node count.

Also, I have seen some entries where an entry with a complexity of 10 scored no differently from the same code with a complexity higher than 10.

Any thoughts ?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: srach

Date: 1 May, 2010 18:39:05

Message: 34 of 159

"Amitabh Verma" <amtukv@gmail.com> wrote in message <hrhiqs$l63$1@fred.mathworks.com>...
> Why is the node count affecting the score? I thought it didn't matter in computing the final score.
>
> #1 less nodes? srach 14184133 57.361 28378.3 (cyc: 14, node: 2959)
> http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/1472
>
> #2 *linear cycler the cyclist 14184133 57.216 28378.6 (cyc: 14, node: 3263)
> http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/1464
>
> Even though the computing time is higher for "less nodes?", it ranks above just because of its lower node count.
>
> Also, I have seen some entries where an entry with a complexity of 10 scored no differently from the same code with a complexity higher than 10.
>
> Any thoughts ?

It is stated in the rules that the node count contributes to the score: http://www.mathworks.com/matlabcentral/contest/contests/2/rules#notes

Regards
srach

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 2 May, 2010 15:18:04

Message: 35 of 159

I have been able to figure out the scoring formula and am posting it here as I traditionally do. As usual, it’s very similar to the recent contests:

score = k1*result + k2*exp(k3*runtime) + k4*max(complexity-10,0) + k5*nodes

Where:

k1 = 0.002
k2 = 0.01
k3 = 0.1
k4 = 1
k5 = 0.001

The current leading entry has a time of 77s, result of 14064009, cyc of 23, and nodes of 3537. Here’s a breakdown of the current tradeoffs:

-cyc and score are a 1:1 ratio (i.e. each point shaved off cyc is a point shaved off the score)
-time and score are a 1:2.2 ratio
-result and score are a 1:0.002 ratio
-node and score are a 1:0.001 ratio

As is common at this point in the contest, Abhisek Ukil’s entries have already settled in just below the ‘knee’ of the time exponential curve, which is rather flat until about 85s. However, because results are so high right now and change quite a bit with small tweaks, I think we are going to find more payoff in trying to reduce the results by processing the images for a bit longer, at least until the times get up around the 95s range. Unfortunately that also means that during the various contest end times the queue is going to get very backlogged, since each entry will take several minutes to execute.
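
As a rough check, the formula can be evaluated against the two leaderboard rows quoted earlier in this thread. The constants are the reverse-engineered values from this post, not official ones, and `contestScore` is just a hypothetical helper name.

```matlab
% Constants as reported in the post (reverse-engineered, not official).
k = [0.002 0.01 0.1 1 0.001];

% Hypothetical helper: score from result, runtime (s), complexity and node count.
contestScore = @(result, runtime, cyc, nodes) ...
    k(1)*result + k(2)*exp(k(3)*runtime) + k(4)*max(cyc - 10, 0) + k(5)*nodes;

% Two leaderboard rows quoted earlier in the thread.
s1 = contestScore(14184133, 57.361, 14, 2959);   % listed score: 28378.3
s2 = contestScore(14184133, 57.216, 14, 3263);   % listed score: 28378.6

% Marginal cost of one extra second at the leader's 77 s runtime
% (derivative of the exponential term), roughly 2.2 points per second.
dScorePerSecond = k(2)*k(3)*exp(k(3)*77);

fprintf('%.1f  %.1f  %.2f\n', s1, s2, dScorePerSecond);
```

Both rows reproduce the listed scores to one decimal place, and the derivative at 77s matches the 1:2.2 time-to-score ratio given above. It also shows why the entry with fewer nodes ranks first despite its slightly longer runtime.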

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 3 May, 2010 23:25:21

Message: 36 of 159

Hi.
To avoid confusion with the time zone change, could you please clarify what the time of the final Wednesday deadline will be?

(And do we need to expect one more sudden challenge on Tuesday?)

Sergey

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Matthew Simoneau

Date: 4 May, 2010 01:48:04

Message: 37 of 159

Sergey, the contest ends Wednesday at 16:00 UTC (or 12:00 EDT):

http://www.timeanddate.com/worldclock/fixedtime.html?month=5&day=5&year=2010&hour=16&min=0&sec=0&p1=0

(Remember the 10-entry-per-hour limit as we approach the deadline.)

With respect to additional mid-contest challenges, Ned announced a longevity prize for Tuesday on the contest blog.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen (gmail)

Date: 4 May, 2010 14:20:21

Message: 38 of 159

Dimitri had a suggestion about the contest in a different thread:
"It would be useful for players to be able to cancel a submission while it is still in the queue, if they realise that it has bugs or errors. With this capability the queue would shrink, giving the MathWorks contest computer a breather and reducing the mean waiting time until a submission is evaluated. I hope the MathWorks contest team takes this as a useful piece of advice. Please let me know whether or not you agree with me."

I'm interested in feedback from other players. Is this something that you would think would enhance your contest experience?

Thanks for your feedback!
Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Amitabh Verma

Date: 4 May, 2010 14:40:06

Message: 39 of 159

"Helen Chen (gmail)" <hhchenma@gmail.com> wrote in message <hrpaf5$1nd$1@fred.mathworks.com>...
> Dimitri had a suggestion about the contest in a different thread:
> "It would be useful for players to be able to cancel a submission while it is still in the queue, if they realise that it has bugs or errors. With this capability the queue would shrink, giving the MathWorks contest computer a breather and reducing the mean waiting time until a submission is evaluated. I hope the MathWorks contest team takes this as a useful piece of advice. Please let me know whether or not you agree with me."
>
> I'm interested in feedback from other players. Is this something that you would think would enhance your contest experience?
>
> Thanks for your feedback!
> Helen

I totally agree with Dimitri. It doesn't make much sense to wait for an entry which is not going to yield the expected result. I am definitely in favor of such an option because it will only make the contest more fast-paced.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 4 May, 2010 14:49:04

Message: 40 of 159

I do not expect anybody to use it in real life if it does not affect the submission counter.
If you remove deleted submissions from the submission counter, then life will be even more hectic – a possible rush of submission deleting when a new leader appears.

I am sorry for beating a dead horse, but I have only one dream – CAPTCHA.
Sergey

"Helen Chen (gmail)" <hhchenma@gmail.com> wrote in message <hrpaf5$1nd$1@fred.mathworks.com>...
> Dimitri had a suggestion about the contest in a different thread:
> "It would be useful for players to be able to cancel a submission while it is still in the queue, if they realise that it has bugs or errors. With this capability the queue would shrink, giving the MathWorks contest computer a breather and reducing the mean waiting time until a submission is evaluated. I hope the MathWorks contest team takes this as a useful piece of advice. Please let me know whether or not you agree with me."
>
> I'm interested in feedback from other players. Is this something that you would think would enhance your contest experience?
>
> Thanks for your feedback!
> Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: the cyclist

Date: 4 May, 2010 15:03:04

Message: 41 of 159

"Helen Chen (gmail)" <hhchenma@gmail.com> wrote in message <hrpaf5$1nd$1@fred.mathworks.com>...
> Dimitri had a suggestion about the contest in a different thread:
> "It would be useful for players to be able to cancel a submission while it is still in the queue, if they realise that it has bugs or errors. With this capability the queue would shrink, giving the MathWorks contest computer a breather and reducing the mean waiting time until a submission is evaluated. I hope the MathWorks contest team takes this as a useful piece of advice. Please let me know whether or not you agree with me."
>
> I'm interested in feedback from other players. Is this something that you would think would enhance your contest experience?
>
> Thanks for your feedback!
> Helen

I would personally almost never use such a utility. Unless lots of people are submitting buggy code at crunch times (doubtful), and they realize that between time of submission and time of processing (doubtful), it will have minimal impact on the contest.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Amitabh Verma

Date: 4 May, 2010 15:11:04

Message: 42 of 159

"Sergey Y." <ivssnn@yahoo.com> wrote in message <hrpc50$p6o$1@fred.mathworks.com>...
> I do not expect anybody to use it in real life if it does not affect the submission counter.
> If you remove deleted submissions from the submission counter, then life will be even more hectic – a possible rush of submission deleting when a new leader appears.
>
> I am sorry for beating a dead horse, but I have only one dream – CAPTCHA.
> Sergey

Now that everyone has to log in before submitting their code, is it possible to process only 1 entry per user at a time, so that others in the queue do not have to wait? Of course CAPTCHA is there, but will it deter the determined ones? Processing 1 entry at a time was one of the suggestions from last year.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 4 May, 2010 15:32:04

Message: 43 of 159

Yes, submission limit helps partially with queue lag. I am not saying I do not like it. My main problem is with automatic tweaking machines. If one wants to do tweaking then one has to do it manually.
Sergey

"Amitabh Verma" <amtukv@gmail.com> wrote in message <hrpde8$mvs$1@fred.mathworks.com>...
> Now that everyone has to log in before submitting their code, is it possible to process only 1 entry per user at a time, so that others in the queue do not have to wait? Of course CAPTCHA is there, but will it deter the determined ones? Processing 1 entry at a time was one of the suggestions from last year.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 4 May, 2010 15:45:22

Message: 44 of 159

"Helen Chen (gmail)" <hhchenma@gmail.com> wrote in message <hrpaf5$1nd$1@fred.mathworks.com>...
> Dimitri had a suggestion about the contest in a different thread:
> "It would be useful for players to be able to cancel a submission while it is still in the queue, if they realise that it has bugs or errors. With this capability the queue would shrink, giving the MathWorks contest computer a breather and reducing the mean waiting time until a submission is evaluated. I hope the MathWorks contest team takes this as a useful piece of advice. Please let me know whether or not you agree with me."
>
> I'm interested in feedback from other players. Is this something that you would think would enhance your contest experience?
>
> Thanks for your feedback!
> Helen

Yes, I can think of several situations where this would be nice to have. I've several times submitted some code only to realize I made some sort of infinite loop mistake. Instead of waiting for the 3 min limit to time the code out (and backing up the queue) it'd be nice to be able to dequeue it.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 4 May, 2010 15:55:20

Message: 45 of 159

"Sergey Y." <ivssnn@yahoo.com> wrote in message <hrpelk$f8o$1@fred.mathworks.com>...
> Yes, submission limit helps partially with queue lag. I am not saying I do not like it. My main problem is with automatic tweaking machines. If one wants to do tweaking then one has to do it manually.
> Sergey
>

I'd like to point out that the contest has 3 distinct phases for a reason. Darkness and twilight mostly prevent tweaking and are almost always won by those contestants with an expert-level understanding of the underlying algorithms. Many of those contestants deliberately don't participate during the daylight phase because the nature of the contest completely shifts to tweaking and optimizing.

With the 10 entries / 10 mins limit now, there is no speed advantage of manual tweaking over auto tweaking - it's just a time commitment issue. If we are going to suggest arbitrary rules, I have a major issue with non-descriptive variable names and poorly documented code and think all leading entries need to be screened for that. But most competitors don't want to do that because again it's a time commitment issue.

One of the things I personally enjoy about the contest is developing code that deals with the contest mechanics (i.e. the auto submission code). I found it particularly challenging to create new code to handle the new site. In fact I spent most of the weekend doing that (which is unfortunately why I didn't get a chance to submit a documented leading solver like I usually do).

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 4 May, 2010 16:31:07

Message: 46 of 159

I just noticed something interesting as part of my various tests: it doesn't appear that the code is being run twice during the testing. In the past each entry was always run twice to help with timing variations. However, if you look at my 'time test' series of entries you'll see this:

#1: Scored at: 2010-05-04 15:09:25 UTC , CPU Time: 174.418
#2: Scored at: 2010-05-04 15:12:33 UTC , CPU Time: 169.166
#3: Scored at: 2010-05-04 15:15:55 UTC , CPU Time: 173.046
#4, 5,6 etc etc etc.

The scored at time of these should be 6 mins apart not 3 mins apart if the code was being run twice. Was this an intentional change or an oversight in the new contest machinery?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Matthew Simoneau

Date: 4 May, 2010 17:55:22

Message: 47 of 159

For the last few years, we haven't been running the entry through the entire test suite twice. We run it through a portion of the test suite to warm up MATLAB, but only go through the whole thing once.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Nicholas Howe

Date: 4 May, 2010 18:25:22

Message: 48 of 159

I'd like to encourage more competitors to add a photo to their profile. For those who have won in a past contest there are already photos in the hall of fame, but it would be nice to be able to put a face to all the new names that keep showing up.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 4 May, 2010 19:16:04

Message: 49 of 159

"Nicholas Howe" <NikHow@hotmail.com> wrote in message <hrpoqi$1f1$1@fred.mathworks.com>...
> I'd like to encourage more competitors to add a photo to their profile. For those who have won in a past contest there are already photos in the hall of fame, but it would be nice to be able to put a face to all the new names that keep showing up.

That is a really great suggestion, Nick! I second that request. :-)

Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: the cyclist

Date: 4 May, 2010 19:25:05

Message: 50 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hrpg18$ffj$1@fred.mathworks.com>...
> "Sergey Y." <ivssnn@yahoo.com> wrote in message <hrpelk$f8o$1@fred.mathworks.com>...
> > Yes, submission limit helps partially with queue lag. I am not saying I do not like it. My main problem is with automatic tweaking machines. If one wants to do tweaking then one has to do it manually.
> > Sergey
> >
>
> I'd like to point out that the contest has 3 distinct phases for a reason. Darkness and twilight mostly prevent tweaking and are almost always won by those contestants with an expert level understanding of the underlying algorithms. Many of those contestants deliberately don't participate during the daylight phase because of the nature of the contest completely shifts to tweaking and optimizing.
>
> With the 10 entries / 10 mins limit now, there is no speed advantage of manual tweaking over auto tweaking - it's just a time commitment issue. If we are going to suggest arbitrary rules, I have a major issue with non-descriptive variable names and poorly documented code and think all leading entries need to be screened for that. But most competitors don't want to do that because again it's a time commitment issue.
>
> One of the things I personally enjoy about the contest is developing code the deals with the contest mechanics (i.e. the auto submission code). I found it particularly challenging to create new code to handle the new site. In fact I spent most of the weekend doing that (which is unfortunately why I didn't get a chance to submit a documented leading solver like I usually do).

Out of curiosity, Alan, will you just run your auto-generator continuously until the end of the contest? If so, it looks like we can safely assume that the queue will continue to grow until the end, since you can feed the queue at one entry per minute, and each entry requires more than that in processing time. (Note that this is not a complaint, just an observation.)
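
To put rough numbers on that queue observation, here is a minimal sketch in Python. It assumes a single scoring machine that handles one entry at a time; the function name and the 120-second processing time are hypothetical values chosen for illustration, not figures from the contest.

```python
def queue_backlog(minutes, arrivals_per_min, seconds_per_entry, start=0.0):
    """Entries waiting after `minutes`, for steady arrival and service rates."""
    served_per_min = 60.0 / seconds_per_entry
    backlog = start + minutes * (arrivals_per_min - served_per_min)
    return max(backlog, 0.0)   # a queue cannot go below empty


# One entry per minute arriving, each taking a hypothetical 120 s to score.
print(queue_backlog(minutes=60, arrivals_per_min=1.0, seconds_per_entry=120))
```

Whenever the processing time per entry exceeds the interval between submissions, the backlog grows linearly; under these assumed numbers it grows by 30 entries per hour.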

Also, out of curiosity (and only if you don't mind sharing, of course!): Does your code automatically grab the leader, and tweak any and all numerical parameters? Or do you manually seek out the more likely parameters for tweaking?

the cyclist

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Amitabh Verma

Date: 4 May, 2010 19:42:20

Message: 51 of 159

"The Tuesday Longevity Prize. Whoever stays in the lead with a single entry for the longest amount of time on Tuesday (midnight to midnight, UTC) wins the prize. The winner will need to take the lead any time on Tuesday, and will receive credit for all the time spent in the lead, including Wednesday. Any ties will be broken by whoever appeared most recently."


I had a few questions on the above from the contest blog. Does the Longevity contest end at midnight Tuesday UTC, or does it run into Wednesday?

Is the score the cumulative time spent in the lead, or the time of the single entry that stayed in the lead longest?

Thanks.

Best,
Amitabh

PS: I agree pics would be nice to have :)

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 4 May, 2010 20:21:04

Message: 52 of 159

"the cyclist" <thecyclist@gmail.com> wrote in message
> Also, out of curiosity (and only if you don't mind sharing, of course!): Does your code automatically grab the leader, and tweak any and all numerical parameters? Or do manually seek out more likely parameter for tweaking?
>
> the cyclist

Since we are discussing the technical side of automatic tweaking, I am curious too (if it is not a secret, of course): does your software read the results and look for the minimum in a multidimensional space, or does it just try all parameter values?

Sergey

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 4 May, 2010 22:12:04

Message: 53 of 159

"the cyclist" <thecyclist@gmail.com> wrote in message <hrpsah$t7e$1@fred.mathworks.com>...

>
> Out of curiosity, Alan, will you just run your auto-generator continuously until the end of the contest? If so, it looks like we can safely assume that the queue will continue to grow until the end, since you can feed the queue at one entry per minute, and each entry requires more than that in processing time. (Note that this is not a complaint, just an observation.)
>
> Also, out of curiosity (and only if you don't mind sharing, of course!): Does your code automatically grab the leader, and tweak any and all numerical parameters? Or do manually seek out more likely parameter for tweaking?
>
> the cyclist

No, I won't run it overnight. I'll take a break after the end of the current mini-contest until about an hour before the ultimate contest end tomorrow. I have a special super-duper strategy to implement then ;)

I'm happy to share any and all details of my auto-submission code. Yes, I continuously scan to see who the current leader is. If it's my own entry, I don't do anything at all. If it's someone else's new entry, I grab the code and then work through an ordered set of 'likely numeric parameters' to tweak. A lot of this is just gut feel from earlier in the contest. For example, say a key parameter to tweak is currently at 0.1234. Then my code would first submit an entry with that changed to 0.1235, the next one would be 0.1233, next 0.1236, then 0.1232, etc. I repeat this about 10 times on average, trying to tweak slightly on both sides of the parameter. Then I go to the next likely parameter (unless, of course, the leader changes during this process, in which case it starts over again with the new code).
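
The alternating order described above (0.1235, 0.1233, 0.1236, 0.1232, ...) can be sketched as follows. This is an illustrative Python reconstruction of the pattern only, not the actual auto-submission code; the function name, the step size and the rounding are assumptions.

```python
def alternating_sweep(base, step, count):
    """Return `count` candidate values on alternating sides of `base`."""
    candidates = []
    for i in range(1, count + 1):
        distance = (i + 1) // 2            # 1, 1, 2, 2, 3, 3, ...
        sign = 1 if i % 2 == 1 else -1     # odd tries go up, even tries go down
        # Rounding hides floating-point noise in the last digits.
        candidates.append(round(base + sign * distance * step, 10))
    return candidates


for value in alternating_sweep(0.1234, 0.0001, 10):
    print(value)
```

Each candidate would be substituted into the leader's code and submitted as a separate entry; the sweep restarts whenever the leader changes.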

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 4 May, 2010 22:15:08

Message: 54 of 159

"Sergey Y." <ivssnn@yahoo.com> wrote in message <hrpvjg$6hm$1@fred.mathworks.com>...
> "the cyclist" <thecyclist@gmail.com> wrote in message
> > Also, out of curiosity (and only if you don't mind sharing, of course!): Does your code automatically grab the leader, and tweak any and all numerical parameters? Or do manually seek out more likely parameter for tweaking?
> >
> > the cyclist
>
> As we discussing technical part of automatic tweaking I am curious too (If it is not a secret of course) : does your software reads result and looks for the minimum in multidimensional space or just tries all parameter values?
>
> Sergey

It's essentially just doing an educated parameter sweep. With the level of tweaking we are doing it's overfitting to the test suite, so there's no real way to truly 'algorithmically optimize'.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 4 May, 2010 22:27:04

Message: 55 of 159

Robert Macrea accidentally posted this to a separate thread. He asked me to repost it for him.

<snip>
A Third Way

It seems a shame that many of the competitive entries are still dominated by just two basic solvers that query non-overlapping regions. Why, when there are so many other approaches available?

To try to make the mix more interesting here is a third approach. In itself it is not competitive, but I think that if tweaked to the level of existing entries it would outperform them on some images, particularly where contrast is reasonably even across the whole image rather than being focussed around a few features.

A Third Way carries out an initial scan using two rectangular overlapping scans, with the second a grid offset by half a grid width. Between them these scans take up about 80% of the available queries, and they each give a different, coarse version of the image. Every pixel has two estimated values, and if you form a combined image from their means you see an image with pixel blocks half the size of the initial scans. In effect you have faked the resolution you could get with 1.6x the permitted number of scans, which sounds good. In particular it should make smoothing more effective.
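
As a rough illustration of the two offset scans, here is a minimal Python sketch. It assumes each query yields the mean brightness of its rectangle, simulates the queries directly on a known image, and uses hypothetical names (`block_mean_image`, `combine`); it is not the code of the actual entry.

```python
def block_mean_image(img, t, offset):
    """Coarse image: each pixel is replaced by the mean of its grid cell.

    Cells are t pixels wide; `offset` shifts the grid up and left, so cells
    on the border of a shifted grid are smaller than t by t.
    """
    height, width = len(img), len(img[0])

    def cell(r, c):
        return ((r + offset) // t, (c + offset) // t)

    totals, counts = {}, {}
    for r in range(height):
        for c in range(width):
            key = cell(r, c)
            totals[key] = totals.get(key, 0.0) + img[r][c]
            counts[key] = counts.get(key, 0) + 1
    return [[totals[cell(r, c)] / counts[cell(r, c)] for c in range(width)]
            for r in range(height)]


def combine(scan_a, scan_b):
    """Pixel-wise mean of two coarse scans."""
    return [[(a + b) / 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(scan_a, scan_b)]


# Hypothetical 8x8 test image: a smooth ramp.
image = [[10 * r + c for c in range(8)] for r in range(8)]
scan_a = block_mean_image(image, t=4, offset=0)   # first grid
scan_b = block_mean_image(image, t=4, offset=2)   # second grid, shifted by t/2
combined = combine(scan_a, scan_b)
```

In `combined`, pixels share a value only within blocks of half the original grid width, which is the finer resolution the post refers to.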

The catch is that, in place of the exact answers you would have with a real 1.6x scan, you only have two estimates for the result of each query. Clearly you have lost accuracy near any sharp edges, which is where the remaining 20% of queries come in. Because we have two estimates for each subarea, we have a good idea where the uncertainties are highest, so we can query these subareas in order to improve the overall quality of the estimate. Not only do these queries pin down exact answers for the queried subarea, they also reduce uncertainties for all 8 adjacent subareas. In low-contrast areas, of course, the two initial estimates will be similar, so we won't waste queries there; queries cluster near high-contrast areas where the two estimates differ most, and this is why we only need to query about 1/8 of the subareas to get a reasonable-quality image.
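
One simple way to spend the remaining queries is to rank subareas by how much the two estimates disagree. The Python sketch below does only that ranking; the function name, the tiny example data and the use of the absolute difference as the uncertainty measure are assumptions made for illustration.

```python
def rank_subareas(est_a, est_b, half, budget):
    """Return up to `budget` subarea indices, most disputed first.

    est_a, est_b: the two coarse estimates as full-size images (lists of rows).
    half: subarea edge length (half the width of the original grid cells).
    """
    height, width = len(est_a), len(est_a[0])
    scored = []
    for top in range(0, height, half):
        for left in range(0, width, half):
            # Both estimates are constant inside a subarea, so one pixel is enough.
            gap = abs(est_a[top][left] - est_b[top][left])
            scored.append((gap, (top // half, left // half)))
    scored.sort(key=lambda item: -item[0])   # stable sort: ties keep scan order
    return [index for _, index in scored[:budget]]


# Hypothetical 4x4 estimates with 2x2 subareas.
est_a = [[10, 10, 20, 20], [10, 10, 20, 20], [30, 30, 40, 40], [30, 30, 40, 40]]
est_b = [[10, 10, 26, 26], [10, 10, 26, 26], [31, 31, 40, 40], [31, 31, 40, 40]]
print(rank_subareas(est_a, est_b, half=2, budget=2))
```

Subareas where the two scans agree never reach the front of the list, so no queries are spent in flat regions.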

The result should be an image with reasonably accurate subarea estimates, and subareas smaller on average than can be produced by the two main approaches. The problem seems to be that this improvement is spread rather evenly across the whole image instead of being as strongly focussed on sharp edges as it is with subdivision.

I don't have time to take development any further, but if you want to have a play I'd suggest a number of areas for attention (on top of the sloppy coding).
  -- Ideally you don't want to query too many subareas that are very close together, as this is redundant. It may be worth recalculating uncertainties as subareas are queried?
  -- Subarea boundaries have been allocated evenly along both x and y axes, but if t is the usual edge of a subarea this results in some subareas that are (t+1)^2 pixels while most subareas are t^2 or t.(t+1). Might it be better to put all the large gaps along just one axis so the largest area is t.(t+1)?
  -- Smaller areas give better estimates; this should slightly modify the dispersion and mean estimates?
  -- The complexity of the code gives great scope for optimisation and bug hunts, so many hours of tweaking are available 8-)
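
The point about subarea sizes in the second item above can be made concrete with a short Python sketch. It assumes boundaries are spread evenly with integer division, which may differ from the allocation used in the entry itself; the function names are hypothetical.

```python
from collections import Counter


def edge_lengths(n_pixels, n_subareas):
    """Edge lengths when n_pixels are split evenly into n_subareas pieces."""
    bounds = [i * n_pixels // n_subareas for i in range(n_subareas + 1)]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


def area_histogram(height, width, rows, cols):
    """How many subareas of each pixel count an even split produces."""
    return Counter(h * w
                   for h in edge_lengths(height, rows)
                   for w in edge_lengths(width, cols))


print(edge_lengths(10, 3))
print(area_histogram(10, 10, 3, 3))
```

For a 10x10 image split 3x3 (so t = 3), this gives four subareas of t^2 = 9 pixels, four of t.(t+1) = 12 pixels and one of (t+1)^2 = 16 pixels.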
 
Good luck to anyone who can get this into competitive form!


Incidentally, if anyone fancies a collaborative effort I am out of coding time but have a number of other ideas that might be interesting, a sketch for a fourth solver based on edge detection and a few others for improvements in late scans and post-processing. Email me at ff95 a t dial dot pipex dot com if you are interested.

</snip>

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 5 May, 2010 02:56:05

Message: 56 of 159

"Amitabh Verma" <amtukv@gmail.com> wrote in message <hrptas$6q5$1@fred.mathworks.com>...
>
> I had a few questions on the above from the contest blog. Does the Longevity contest end at midnight Tues, UTC or run into Wed. ?
>
> Is the score the cumulative time spent in lead or the longest entry that was in lead ?
>
> Thanks.
>
> Best,
> Amitabh
>

Another question is whether it's based upon submission time or scoring time. Also, please note both the stats page and the twitter feed seem to be missing some of the leaders. For example, right now leader 259 is an entry from Yi Cao, and 260 is one of my entries. However, if you look at entry # 4012 submitted by Amitabh Verma, it was in the lead for a while and should be listed as 260.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 5 May, 2010 04:50:04

Message: 57 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hrqmo5$rac$1@fred.mathworks.com>...

> Another question is whether it's based upon submission time or scoring time. Also, please note both the stats page and the twitter feed seem to be missing some of the leaders. For example, right now leader 259 is an entry from Yi Cao, and 260 is one of my entries. However, if you look at entry # 4012 submitted by Amitabh Verma, it was in the lead for a while and should be listed as 260.

Just to clarify: how you interpret the mini-contest rules, and what's going on with the stats page, is likely to determine who the mini-contest winner is. As of right now, the stats page shows the 2 longest-leading entries as mine and then Yi Cao's:

259 Yi Cao tomorrow is another day. Tue 20:47 0.02% 2.96
260 Alan Chalker Auto tweaker Tue 23:45 0.00% 4.14

However I'm not sure these are correctly indicating the actual durations.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Oliver Woodford

Date: 5 May, 2010 09:14:06

Message: 58 of 159

"Robert Macrea" wrote
> A Third Way carries out an initial scan using two rectangular overlapping scans, with the second a grid offset by half a grid width. Between them these scans take up about 80% of the available queries, and they each give a different, coarse version of the image. Every pixel has two estimated values, and if you form a combined image from their means you see an image with pixel blocks half the size of the initial scans. In effect you have faked the resolution you could get with 1.6x the permitted number of scans, which sounds good. In particular it should make smoothing more effective.

Robert, sounds like we're on the same wavelength here. You are suggesting, in more concrete form, an approach I have suggested a couple of times already. Indeed, my Super Slim entry implements the first stage of just such a method, but using two overlapping scans to get 9 times the resolution rather than your 4. I think it's definitely a winning approach, but like yourself I haven't the time to implement it fully.

Oliver

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 5 May, 2010 12:03:04

Message: 59 of 159

"Matthew Simoneau" <matthew@mathworks.com> wrote in message <hrnuck$jsr$1@fred.mathworks.com>...
> Sergey, the contest ends Wednesday at 16:00 UTC (or 12:00 EDT):
>
> http://www.timeanddate.com/worldclock/fixedtime.html?month=5&day=5&year=2010&hour=16&min=0&sec=0&p1=0
>
> (Remember the 10-entry-per-hour limit as we approach the deadline.)
>
> With respect to additional mid-contest challenges, Ned announced a longevity prize for Tuesday on the contest blog.

Sorry for being so slow. What about the 10-entry-per-hour limit?
I cannot find it.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 5 May, 2010 12:59:07

Message: 60 of 159

"Sergey Y." <ivssnn@yahoo.com> wrote in message <hrrmpo$k7f$1@fred.mathworks.com>...
>
> Sorry for being so slow. What about 10-entry-per-hour limit?
> I can not find it.

Sergey - It is 10 entries per 10 minutes, not per hour. If you submit 10 entries, you will see a message that tells you this.

Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 5 May, 2010 13:14:04

Message: 61 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
>
> Also, please note both the stats page and the twitter feed seem to be missing some of the leaders. For example, right now leader 259 is an entry from Yi Cao, and 260 is one of my entries. However, if you look at entry # 4012 submitted by Amitabh Verma, it was in the lead for a while and should be listed as 260.

Thanks Alan. I've forwarded this to Matt to look into.

Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Amitabh Verma

Date: 5 May, 2010 13:38:06

Message: 62 of 159

"Helen Chen" <helen.chen@mathworks.com> wrote in message <hrrqus$pad$1@fred.mathworks.com>...
> "Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
> >
> Also, please note both the stats page and the twitter feed seem to be missing some of the leaders. For example, right now leader 259 is an entry from Yi Cao, and 260 is one of my entries. However, if you look at entry # 4012 submitted by Amitabh Verma, it was in the lead for a while and should be listed as 260. <
>
> Thanks Alan. I've forwarded this to Matt to look into.
>
> Helen

Thanks, Alan, for bringing it to notice.

Cheers !

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Nicholas Howe

Date: 5 May, 2010 14:38:04

Message: 63 of 159

"Oliver Woodford" <o.j.woodford.98@cantab.net> wrote in message <hrrcsu$a3j$1@fred.mathworks.com>...
> "Robert Macrea" wrote
> > A Third Way carries out an initial scan using two rectangular overlapping scans, with the second a grid offset by half a grid width. Between them these scans take up about 80% of the available queries, and they each give a different, coarse version of the image. Every pixel has two estimated values, and if you form a combined image from their means you see an image with pixel blocks half the size of the initial scans. In effect you have faked the resolution you could get with 1.6x the permitted number of scans, which sounds good. In particular it should make smoothing more effective.
>
> Robert, sounds like we're on the same wavelength here. You are suggesting, in more concrete form, an approach I have suggested a couple of times already. Indeed, my Super Slim entry implements the first stage of just such a method, but using two overlapping scans to get 9 times the resolution rather than your 4. I think it's definitely a winning approach, but like yourself I haven't the time to implement it fully.
>
> Oliver


I've often wished that there was some way to keep separate development threads alive at the same time, so that people could be rewarded for the best block-style solver, the best overlapping solver, etc. There were other contests where I've felt like a really interesting approach never saw the light of day because it was impossible for one person to compete with the combined optimizations of the group. With development on multiple fronts, you might see totally different solvers competing for the lead, and possibly cross-pollination of ideas. But I don't know how you would structure such a contest without relying on people's subjective judgments that 'this is an X-type of solver'.

In this contest, I have long wondered if we would see an entry based upon the recent research in compressive sensing. (I suspect that reading about this may have inspired the current competition.) In theory these allow excellent reconstruction of images with many fewer bits than the Nyquist limit would suggest. All the current solvers are essentially limited by Nyquist, so I think a CS solver would win.

The queries of such a solver look very different than either the block solver or the periodic solver outlined above. They are essentially random pixel sets from the image. Each one gives you a little information about all the pixels in the image, and you can recover the original by performing a convex optimization via linear programming. Personally I found the research results a little too diverse to condense into a specific contest entry in the amount of time I had, and I was not sure from my reading whether the required computation would even fit into the 3-minute window. But perhaps some other brave soul has succeeded in doing so, and will surprise us all in the final minutes of the contest.
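
For reference, the recovery step mentioned above is usually written as the basis-pursuit problem shown below. This is the standard textbook formulation from the compressive sensing literature, not something taken from a contest entry, and it assumes each query returns a sum of the masked pixels so that every query is one linear measurement. The symbols are introduced here for illustration.

```latex
% x    : unknown image, flattened to a vector of n pixels
% \Phi : m-by-n matrix whose rows are the (random) query masks, m < n
% y    : the m query results, y = \Phi x
% \Psi : basis in which the image is assumed sparse, x = \Psi s
\begin{aligned}
\hat{s} &= \arg\min_{s} \; \|s\|_1
          \quad \text{subject to} \quad \Phi \Psi s = y, \\
\hat{x} &= \Psi \hat{s}.
\end{aligned}

% Equivalent linear program, with auxiliary variables u_i \ge |s_i|:
\begin{aligned}
\min_{s,\,u} \; \sum_{i=1}^{n} u_i
\quad \text{subject to} \quad -u_i \le s_i \le u_i, \quad \Phi \Psi s = y.
\end{aligned}
```

The second form is what makes the "convex optimization via linear programming" possible: the L1 objective becomes linear once the auxiliary bounds are added.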

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 5 May, 2010 15:08:05

Message: 64 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
> Just to clarify, depending on what your actual meaning of the mini-contest rules is and whats going on with the stats page is likely to determine who the mini-contest winner is. As of right now, the stats page shows the 2 longest entries as mine and then Yi Cao's:
>

I just heard from Ned and posted his decision on the blog.

Helen
ps. Time is always the time of submission.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 5 May, 2010 16:06:04

Message: 65 of 159

The Contest queue is now officially closed. There are 380 entries in the queue at this point, so after those entries are processed, we will have an answer to our big question - Who is the Grand Prize Winner for this contest!

ttys,
Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 5 May, 2010 16:06:04

Message: 66 of 159

Just to pre-emptively respond to any comments about me flooding the queue at the end, I'd like to point to the following blog post from a week ago from Seth:

http://blogs.mathworks.com/contest/2010/04/28/new-contest-website-features/#comment-6569

"Regarding the login requirements, there is no rule that a given player use only one login ID, but if we start giving awards for most prolific, or greatest improved, multiple IDs would hurt your chances. It might be interesting if your own submissions were competing for the top spot."

And I'll also point to the newsgroup conversations from past contests, in particular: http://www.mathworks.com/matlabcentral/newsreader/view_thread/238684

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 5 May, 2010 16:13:04

Message: 67 of 159

"Helen Chen" <helen.chen@mathworks.com> wrote in message <hrs1kl$ovd$1@fred.mathworks.com>...

>
> Helen
> ps. Time is always the time of submission.

Helen: I'm glad to hear that. One of the tactics I tried yesterday morning was to wait until an entry of mine was in the lead, and the queue was empty, and then to fill the queue with lots of entries that take up almost the full 180 second limit (this was my Queue delay series of entries). The thought was that if it was scoring time that counts, those queue delay entries would significantly increase the time it took before other entries could be scored, artificially inflating my longevity time.

Interestingly, Sergey seems to have been wondering what I was doing and resubmitted several of those entries, even further increasing the queue delay. I thought it important to publicly disclose this tactic now, and the fact that it doesn't actually help, in order to keep anyone else from trying it in future contests and unnecessarily lengthening the queue processing time.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Oliver Woodford

Date: 5 May, 2010 16:25:21

Message: 68 of 159

"Alan Chalker" wrote:
> Just to pre-emptively respond to any comments about me flooding the queue at the end, I'd like to point to the following blog post from a week ago from Seth:
>
> http://blogs.mathworks.com/contest/2010/04/28/new-contest-website-features/#comment-6569

True, but I'd like to post-emptively quote Ned's post from yesterday (http://blogs.mathworks.com/contest/2010/05/04/smaller-than-one-thousand/) which states "Sock puppet accounts are frowned upon". Consider this a stern frowning. :)

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 5 May, 2010 16:32:04

Message: 69 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hrs5eg$cgt$1@fred.mathworks.com>...
> Interestingly, Sergey seems to have been wondering what I was doing and resubmitted several of those entries, even further increasing the queue delay. I thought it important to publicly disclose this tactic now, and the fact that it doesn't actually help, in order to avoid anyone else from trying it in future contests and unnecessarily lengthening the queue processing time.

Actually, I was using a similar tactic, but for a different purpose.
A couple of times I strongly suspected that my code was better than the current best. In that case I would submit several other entries and then my code. That way the code is scored at an earlier time but becomes visible as the best significantly later, delaying the start of your “tweaking machine gun”.

Then I saw your slow code and tested it for the same purpose.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Robert Macrae

Date: 5 May, 2010 16:38:04

Message: 70 of 159

"Oliver Woodford" <o.j.woodford.98@cantab.net> wrote in message <hrrcsu$a3j$1@fred.mathworks.com>...
>
> Robert, sounds like we're on the same wavelength here. You are suggesting, in more concrete form, an approach I have suggested a couple of times already. Indeed, my Super Slim entry implements the first stage of just such a method, but using two overlapping scans to get 9 times the resolution rather than your 4. I think it's definitely a winning approach, but like yourself I haven't the time to implement it fully.

Hah! I liked the idea of using 3 or 4, but decided that I might hit problems with instability, as sharp features would cause ripples 2 or 3 subareas away. Then I got bogged down in coding the edges and ran out of time; with hindsight I should not have hand-coded rectangular areas, as the gain was nothing like worth the hassle.

I think overlapping regions like this are potentially a step forward, but as Nick comments below there are others that may be better...

Robert Macrae

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 5 May, 2010 16:40:22

Message: 71 of 159

"Sergey Y." <ivssnn@yahoo.com> wrote in message <hrs6i4$qhg$1@fred.mathworks.com>...

>
> Actually I was using similar tactics, however differently.
> Couple times I strongly suspected that my code is better then current best. In that case I was trying to submit several submissions and then my code. That way code will be scored at earlier times, but becomes visible as best significantly later delaying start of your “tweaking machine gun”
>
> Then I saw your slow code and tested it for the same purpose.

That's an excellent counter-tactic! I'm glad to see someone trying to 'beat me at my own game';)

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 5 May, 2010 16:49:05

Message: 72 of 159

"Helen Chen" <helen.chen@mathworks.com> wrote in message <hrs51b$eeu$1@fred.mathworks.com>...
> The Contest queue is now officially closed. There are 380 entries in the queue at this point, so after those entries are processed, we will have an answer to our big question - Who is the Grand Prize Winner for this contest!
>
> ttys,
> Helen

Helen:

I'd like to extend my thanks again to you and the rest of your team for organizing and running this event. I always look forward to the contests and enjoy the various activities and the community that has been built up around them. The new website and machinery are great (even if they did cause me a lot of extra work at the start of the contest to adapt ;).

One minor suggestion: would it be possible to add an explicit hidden field to the submission form that indicates the id number of the 'based on' code? Right now you have this encoded somehow in the auth-token field. With auto-tweaking code it's not really possible to create that reference back to the original code. However, I think there is a lot of value in being able to track the relationships between entries, particularly with the new abilities of the website to show both 'based on' and 'is the basis for' data.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Robert Macrae

Date: 5 May, 2010 17:01:07

Message: 73 of 159

"Nicholas Howe" <NikHow@hotmail.com> wrote in message

> I've often wished that there was some way to keep separate development threads alive at the same time, so that people could be rewarded for the best block-style solver, the best overlapping solver, etc.

Pretty much why I posted in the newsgroup; it makes it possible to find if anyone is interested, rather than being just one more noncompetitive entry.

> In this contest, I have long wondered if we would see an entry based upon the recent research in compressive sensing.

The problem with overlapping regions is that they have to be
   1) reasonably localised (so that they capture smooth features well) and
   2) reasonably orthogonal (so that the information from them can be combined without too much iteration).

It is quite hard to achieve both, but here are two sketches:
   1) Create square regions 8 pixels across. It's then easy to construct local orthogonal subregions 8x4 and 2x(8x2), with 1s and 0s arranged so that the 1s are all on the side expected to be lighter. A substantial subset of the subregions (or all of them) can then be queried together, so we learn the average difference between the two sides and can update our estimates.
   2) Keep a list of the masks that have been used so far. To select a new query, construct a score (perhaps by smoothing the current image) of pixels expected to be brighter, and orthogonalise this score against previous masks. Query the brightest half. Again this allows us to update our estimates, though the orthogonality will not be perfect.
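
The second sketch above can be illustrated in a few lines of Python. The image is treated as a flat list of pixels, masks are lists of 0s and 1s, and the function names are hypothetical. Projections are removed one mask at a time, so, as the post says, the orthogonality is only approximate once the masks overlap and the result is thresholded to 0/1.

```python
def orthogonalise(score, previous_masks):
    """Remove from `score` its projection onto each earlier mask in turn."""
    residual = [float(v) for v in score]
    for mask in previous_masks:
        norm_sq = sum(m * m for m in mask)
        if norm_sq == 0:
            continue                      # an empty mask carries no information
        coeff = sum(r * m for r, m in zip(residual, mask)) / norm_sq
        residual = [r - coeff * m for r, m in zip(residual, mask)]
    return residual


def next_mask(score, previous_masks):
    """Binary mask selecting the brightest half of the orthogonalised score."""
    residual = orthogonalise(score, previous_masks)
    order = sorted(range(len(residual)), key=lambda i: -residual[i])
    chosen = set(order[:len(residual) // 2])
    return [1 if i in chosen else 0 for i in range(len(residual))]


# Hypothetical 4-pixel example: the whole image has already been queried once.
score = [1, 2, 3, 4]                      # e.g. a smoothed current estimate
used = [[1, 1, 1, 1]]
print(orthogonalise(score, used))
print(next_mask(score, used))
```

After removing the all-ones mask the score has zero mean, so the next query picks out the pixels expected to be brighter than average.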

> ... But perhaps some other brave soul has succeeded in doing so, and will surprise us all in the final minutes of the contest.

Here is hoping 8-)

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Jan

Date: 5 May, 2010 18:46:05

Message: 74 of 159

"Robert Macrae" <robertREMOVEME@arcusinvest.com> wrote in message
> It is quite hard to achieve both, but here are two sketches:
> 1) Create square regions 8 pixels across. Its then easy to construct local orthogonal subregions 8x4 and 2x(8x2), with 1s and 0s arranged so that the 1s are all on the side expected to be lighter. A substantial subset of the subregions (or all of them) can then queried together, so we learn the average difference between the two sides and can update our estimates.
> 2) Keep a list of the masks that have been used so far. To select a new query, construct a score (perhaps by smoothing the current image) of pixels expected to be brighter, and orthogonalise this score against previous masks. Query the brightest half. Again this allows us to update our estimates, though the orthogonality will not be perfect.

First, thanks for the great contest! I joined the contest for the first time and enjoyed it a lot.

I think the problem was particularly prone to overfitting. Maybe it would have helped a bit if the players could only train their algorithms on the test data set, with the official data set applied only now. Anyway, seeing the auto-tweaking approaches (although kind of foolish indeed) was just as much fun.

Also I liked the twilight phase most and would love to have it a bit elongated next time, definitely more than the daylight phase which could have been easily one day shorter in this case.

Anyway I wonder if there are solutions for this problem available from the Mathworks people? I wonder where they would stand:) Or if there is some working algorithm with good results along the lines of what Robert writes?

Regards, Jan

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Nicholas Howe

Date: 5 May, 2010 19:03:05

Message: 75 of 159

"Robert Macrae" <robertREMOVEME@arcusinvest.com> wrote in message
> The problem with overlapping regions is that they have to be
> 1) reasonably localised (so that they capture smooth features well) and
> 2) reasonably orthogonal (so that the information from them can be combined without too much iteration.

Robert,
I'm not an expert in this field, but the reading that I have done suggests that you actually want queries that are as random as possible (and hence entirely unlocalized). This runs counter to our usual intuition, but so does beating the Nyquist limit. If people are interested in reading more about this, there's a lot out there, but here is one article that describes the basic idea:

http://dsp.rice.edu/sites/dsp.rice.edu/files/cs/CSintro.pdf

Somebody actually tried something along these lines early in the contest (see the entries with comments concerning cosamp code), but I don't think they got it working.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 5 May, 2010 19:57:07

Message: 76 of 159

"Nicholas Howe" <NikHow@hotmail.com> wrote in message <hrsfd9$e8f$1@fred.mathworks.com>...
> "Robert Macrae" <robertREMOVEME@arcusinvest.com> wrote in message
> > The problem with overlapping regions is that they have to be
> > 1) reasonably localised (so that they capture smooth features well) and
> > 2) reasonably orthogonal (so that the information from them can be combined without too much iteration.
>
> Robert,
> I'm not an expert on this field, but the reading that I have done suggests that you actually want queries that are as random as possible (and hence entirely unlocalized). This runs counter to our usual intuition, but so does beating the Nyquist limit...

That's exactly right. Typical compressive sampling queries are just random masks. I experimented with some compressive sampling code in the darkness phase of the competition (from the L1 magic website as well as from David Wipf's homepage), but it wasn't competitive.

The reason why our simple block based solvers appear to outperform the cutting edge stuff from academia is the fact that the problem as stated here does not actually correspond exactly to the compressive sampling problem typically studied in the real world. In that case, one is not able to adjust your query pattern dynamically based on the results from earlier queries. Rather, the entire set of query masks need to be specified at the outset.

Being able to adjust our queries to zoom in on areas with detail is a significant (although not realistic) advantage and allows the custom developed code to outperform the off the shelf CS code by orders of magnitude.

Regards
Hannes

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Nicholas Howe

Date: 5 May, 2010 20:45:21

Message: 77 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message
> The reason why our simple block based solvers appear to outperform the cutting edge stuff from academia is the fact that the problem as stated here does not actually correspond exactly to the compressive sampling problem typically studied in the real world. In that case, one is not able to adjust your query pattern dynamically based on the results from earlier queries. Rather, the entire set of query masks need to be specified at the outset.
>
> Being able to adjust our queries to zoom in on areas with detail is a significant (allthough not realistic) advantage and allows the custom developed code to outperform the off the shelf CS code by orders of magnitude.


Thanks for clarifying this, Hannes. I feel better now about not putting the time into delving through that CS material! (As I write this your entry stands at the top of the heap, so it looks like you managed to do quite well with other techniques also.)

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Amitabh Verma

Date: 5 May, 2010 20:45:21

Message: 78 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
> That's an excellent counter-tactic! I'm glad to see someone trying to 'beat me at my own game';)


Just an afterthought... about the possibilities of using a tool.

In Daylight, using a tool/utility, one can also query for an actual pixel value by checking against Result. If one runs it for 2 days with 10 ids, 48*60*10 = 28800 attempts can yield 28800/255, or about 113, true pixels.

255 is the maximum number of attempts needed when trying values sequentially. Using a simple binary search (2^n), one can find the true pixel value in at most 8 attempts, raising the yield to at least 28800/8 = 3600 pixels.

The queue will also not get backed up, since such an entry will run in ~2-3 sec.

Now, the smallest image in the test suite had ~2500 pixels.

Now I have got one image accurate pixel for pixel. I wonder how much score difference that can make?

My stand on this is that as long as it is within the rules defined by the Matlab team, it's fine. However, if a number of people start doing this, things can go astray. Just my 2 cents.

PS: I used a modest total of 10 ids only who were named 'Uncle_Sam' 8-)

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 5 May, 2010 20:56:04

Message: 79 of 159

As we can see, some people were trying to use a random-mask approach.
I personally spent the first half of darkness trying to implement that method
(it is widely used in neuroscience for receptive field mapping in the visual cortex).
Unfortunately it did not look promising. Maybe using a Gabor function as the mask would be better, but I did not try it.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 5 May, 2010 21:02:05

Message: 80 of 159

"Amitabh Verma" <amtukv@gmail.com> wrote in message <hrsld1$k15$1@fred.mathworks.com>...
> "Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
> > That's an excellent counter-tactic! I'm glad to see someone trying to 'beat me at my own game';)
>
>
> Just an afterthought... about the possibilities of using a tool.
>
> In Daylight using a tool/utility one can also query for actual pixel value checking against Result. If one runs it for 2 days with 10 ids. 48*60*10 attempts can yield (48*60*10)/255 true pixels.
>
> 255 is the max attempts required sequentially. Using a simple algorithm (2^n) one can find the true pixel value in <8 attempts. Highly increasing the remainder value >3600.
>
> The queue will also not get backed up since this entry will run in ~2-3 sec.
>
> Now the smallest image in the test suit had ~2500 pixels.
>
> Now I have got one image accurate pixel to pixel. Wonder how much score difference that can make ?
>
> My stand on this is as long as it is within the rules defined by the Matlab team, its fine. However, if a number of people start doing this, thing can go astray. Just my 2 cents.
>
> PS: I used a modest total of 10 ids only who were named 'Uncle_Sam' 8-)



However, direct probing is against the rules.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 5 May, 2010 21:08:04

Message: 81 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hrs7i1$3dt$1@fred.mathworks.com>...
> One minor suggestion: would it be possible to add an explicit hidden field to the submission form that indicates the id number of the 'based on code'?


If we go that way I would respectfully request “official Web API” :)


A lot of thanks to the Matlab contest team for a wonderful problem and the new site.
Good luck to all participants. Hopefully I will see you in half a year.

Sergey

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Yi Cao

Date: 5 May, 2010 21:13:04

Message: 82 of 159

"Nicholas Howe" <NikHow@hotmail.com> wrote in message <hrsld1$k01$1@fred.mathworks.com>...
> "Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message
> > The reason why our simple block based solvers appear to outperform the cutting edge stuff from academia is the fact that the problem as stated here does not actually correspond exactly to the compressive sampling problem typically studied in the real world. In that case, one is not able to adjust your query pattern dynamically based on the results from earlier queries. Rather, the entire set of query masks need to be specified at the outset.
> >
> > Being able to adjust our queries to zoom in on areas with detail is a significant (allthough not realistic) advantage and allows the custom developed code to outperform the off the shelf CS code by orders of magnitude.
>
>
> Thanks for clarifying this, Hannes. I feel better now about not putting the time into delving through that CS material! (As I write this your entry stands at the top of the heap, so it looks like you managed to do quite well with other techniques also.)

I was thinking of a zooming approach, even with diagonal masks, but it was not able to compete with the current approaches. My view is that the contest is a game. We cannot and should not treat it too academically.

It sounds like Hannes' random entry will finally beat my 'final effort' series. Well-done!

Yi

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 5 May, 2010 21:22:04

Message: 83 of 159

"Sergey Y." <ivssnn@yahoo.com> wrote in message <hrsmcd$ol5$1@fred.mathworks.com>...
>
> However, direct probing is against the rules.

To reinforce that, here's the relevant section of the rules:

"Extraction of puzzles in the test suite by manipulating the score, runtime, or error conditions is also forbidden. In the small scale, this has been an element of many past contests, but in the Blockbuster Contest, Alan Chalker turned this into a science."

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 5 May, 2010 21:26:05

Message: 84 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
> "Extraction of puzzles in the test suite by manipulating the score, runtime, or error conditions is also forbidden. In the small scale, this has been an element of many past contests, but in the Blockbuster Contest, Alan Chalker turned this into a science."

Helen (and all): Just a minor error I never noticed before in the rules. It was the Blackbox contest, not the Blockbuster contest.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Robert Macrae

Date: 5 May, 2010 21:35:05

Message: 85 of 159

"Nicholas Howe" <NikHow@hotmail.com> wrote in message

> I'm not an expert on this field, but the reading that I have done suggests that you actually want queries that are as random as possible (and hence entirely unlocalized).

"Reading" -- dubious concept, it will never catch on

I had a look at random queries. They handle the "nearly orthogonal" bit well, but I don't think they can be best. The value of a query is (at least roughly) how different the answer is from your prior estimate. That is the argument for large overlapping queries, but it works best if you can pick pixels that for some reason seem likely all to be higher or lower than their current means suggest (e.g. because they are all on the dark side edges). Pure random queries can't do this, so the gain per query seems rather low... but I'll take a look at your reference, I'm probably missing something.

Robert Macrae

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Robert Macrae

Date: 5 May, 2010 21:39:04

Message: 86 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message

> The reason why our simple block based solvers appear to outperform the cutting edge stuff from academia is the fact that the problem as stated here does not actually correspond exactly to the compressive sampling problem typically studied in the real world. In that case, one is not able to adjust your query pattern dynamically based on the results from earlier queries. Rather, the entire set of query masks need to be specified at the outset.

That is exactly why I rejected random (and Hadamard). However I think the concept of large overlapping queries can still be made to work.

> Being able to adjust our queries to zoom in on areas with detail is a significant (allthough not realistic) advantage

It's a different problem, but I don't think it's unrealistic.

Robert

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Amitabh Verma

Date: 5 May, 2010 21:47:03

Message: 87 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hrsnpd$rfa$1@fred.mathworks.com>...
> "Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
> > "Extraction of puzzles in the test suite by manipulating the score, runtime, or error conditions is also forbidden. In the small scale, this has been an element of many past contests, but in the Blockbuster Contest, Alan Chalker turned this into a science."

Guess I have lots to learn :)

Thanks to all the contestants and the Matlab team, it was a real fun and educational trip !

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 5 May, 2010 22:20:19

Message: 88 of 159

> It sounds like Hannes' random entry will finally beat my 'final effort' series. Well-done!
>
> Yi

Take a closer look my friend, it's not as random as you might think. ;-)

Cheers
H

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 5 May, 2010 22:54:03

Message: 89 of 159

Robert:
> That is exactly why I rejected random (and Hadamard). However I think the concept of large overlapping queries can still be made to work.

Agree with you there. The algorithms you and Oliver discussed here sounded very promising. However, I've learnt the hard way not to attempt any global changes after about Sunday. Past a certain point in the contest the only edits that stand a chance are nested inside lots of conditions and fire only very rarely, so as not to disturb the solver from its resting place in the deep local minima. :-(

Another unimplemented idea is excluding the outer edge from the estimation process until the very end and then just assigning each pixel in the outer edge the value of its nearest neighbour on the inner. This gives us more resolution towards the center (where the detail is more likely to be in any case) in exchange for a minimal loss on the outer edge.
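The border fill described above is short in MATLAB. A minimal sketch, assuming `inner` holds the estimate for everything except a one-pixel outer edge (the variable names and the tiny 2x3 example are made up for illustration):

```matlab
% Hypothetical estimate for the image interior, size (h-2)-by-(w-2).
inner = [1 2 3;
         4 5 6];

% Repeat the first and last row and column once each, so every pixel on the
% outer edge takes the value of its nearest neighbour in the interior.
full = inner([1 1:end end], [1 1:end end]);
```

Here `full` is 4x5, its interior equals `inner`, and each corner copies the diagonally adjacent interior pixel.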

In a similar way one can leave open regions BETWEEN queries and then use standard image inpainting algorithms to fill them in.

> > Being able to adjust our queries to zoom in on areas with detail is a significant (allthough not realistic) advantage

Well, by unrealistic I meant unrealistic in typical compressive sensing applications, where minimal processing power/battery life is available on the sensor end. I'm sure real-world applications which match the problem as studied here exist, I just don't know what they are.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 5 May, 2010 23:49:04

Message: 90 of 159

"Nicholas Howe" <NikHow@hotmail.com> wrote in message <hrsld1$k01$1@fred.mathworks.com>...
> Thanks for clarifying this, Hannes. I feel better now about not putting the time into delving through that CS material! (As I write this your entry stands at the top of the heap, so it looks like you managed to do quite well with other techniques also.)

Thanks. Unfortunately, in this case, "other techniques" refers to intentionally overfitting the dataset, something for which I have a deep seated dislike. But hey if you can't beat them, join them.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 6 May, 2010 00:12:06

Message: 91 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hrsqv3$m4a$1@fred.mathworks.com>...

>
> Take a closer look my friend, it's not as random as you might think. ;-)
>
> Cheers
> H

Hannes: Congrats on being the grand winner! It's nice to see that you were able to sneak in some fundamental solver changes at the last minute that leaped far ahead of everyone else (particularly since far too often last minute tweaking is what wins). Care to share any details of what you did?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Nicholas Howe

Date: 6 May, 2010 01:03:21

Message: 92 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message
> Thanks. Unfortunately, in this case, "other techniques" refers to intentionally overfitting the dataset, something for which I have a deep seated dislike. But hey if you can't beat them, join them.

Well, whatever you did, you did it better than everyone else. Well done!

In the past they have sometimes awarded a generality prize by running the submissions on a brand new test set after the contest ends. I wonder if they'll do that this time around?

I want to add my thanks to the Mathworks contest team. I think this was a great contest topic, very accessible and lots of fun to work on. The new contest machinery is very nice too. My one suggestion: in the past I think it was easier to diff two submissions; now I find it harder to find the link to do that. (Once I have the page up I can just replace the submission numbers in the URL.) Anyway, kudos to everyone involved in setting up this spring's contest!

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Darren Rowland

Date: 6 May, 2010 01:59:05

Message: 93 of 159

Congratulations to Hannes and the other prize winners.

And thanks to the contest team for a well organised contest.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 6 May, 2010 04:54:03

Message: 94 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hrt1gm$me7$1@fred.mathworks.com>...
> "Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hrsqv3$m4a$1@fred.mathworks.com>...
>
> >
> > Take a closer look my friend, it's not as random as you might think. ;-)
> >
> > Cheers
> > H
>
> Hannes: Congrats on being the grand winner! It's nice to see that you were able to sneak in some fundamental solver changes at the last minute that leaped far ahead of everyone else (particularly since far too often last minute tweaking is what wins). Care to share any details of what you did?

I don't mind at all. But it'll take me a while. So expect a post in the next few hours, but in the meantime I'd imagine you can have a lot of fun seeing who's the first to figure it out ;-). I'm afraid there were no fundamental solver changes. Last minute tweaking did win, just not random tweaking.

My overfitting series of entries were designed to look like random variation, with the only parameter changing between them being a 20-bit binary key stored in a lookup table. One of them got a very good result and people are quick to accept that I got a lucky key. In fact there are only 8 keys in the entire keyspace of 2^20 that would have defeated Sergey's top entry. That's a 1 in 131 072 chance.

A clue that something was afoot can be found in the comment on the first line of the code which contains the result that it eventually achieved.

A last few clues for anyone taking up the challenge to reverse engineer: Overfitting was based on information collected using the "Random Randomness" probe series which was in turn based on "Random Change" by Magnus. Sorry for not crediting correctly. Heat of battle and all that.

Cheers
H

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: srach

Date: 6 May, 2010 07:21:05

Message: 95 of 159

Congratulations to Hannes for an impressive win and also to Sergey for a very good second place. After all the automatic parameter tweaking, I would not have thought that such improvements of the score were still possible. And also congratulations to all the mid-contest prize winners.

It was again a great contest; I liked the problem very much as well as the new contest machinery. Especially the 10 entries per minute barrier is a nice improvement, although it did not seem to work for all of us. ;)

Many thanks to the people at mathworks who organized this great event and, of course, to all the participants who made it such an enjoyable time full of new things to learn.

On a side note: will there be high resolution versions of the contest badges suitable for tattooing? :D

Hope to see you all in Fall.

Best regards
Stefan

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Gerbert Myburgh

Date: 6 May, 2010 08:07:07

Message: 96 of 159

Congrats Hannes.

After seeing you being tweaked out in the last minutes before, I'm very glad you made it to the top this time.

(Whoops, accidentally posted as new message thread)

Thank you very much to the contest team for a great contest. Like always it was fun to compete.

Just a comment.
It should be fairly obvious from the statistics page (number of participants per day) that there are in fact many many more people interested in the algorithmic solving than the daylight tweaking.
Yet darkness is 1 day, twilight is 1 day, and Daylight stretches out for a week.

Not only that, both Darkness and Twilight fall in the middle of the week, when most of us have to work.

I don't suggest you change your unique competition format entirely, but I do think we will see a stronger start to Daylight if the algorithm guys can have a weekend day to work on the problem.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Oliver Woodford

Date: 6 May, 2010 08:58:05

Message: 97 of 159

"Hannes Naudé" wrote:
> The reason why our simple block based solvers appear to outperform the cutting edge stuff from academia is the fact that the problem as stated here does not actually correspond exactly to the compressive sampling problem typically studied in the real world. In that case, one is not able to adjust your query pattern dynamically based on the results from earlier queries. Rather, the entire set of query masks need to be specified at the outset.

Firstly, congrats on winning the grand prize, Hannes. Secondly, I disagree with this analysis of why "block based solvers" or "fully constrained solvers" (that's my term) performed best in this competition.

I spent about a day skim reading the literature (hence why I wasn't competitive during darkness). On the point of published compressive sensing algorithms providing the next query based on the current data, "Bayesian Compressive Sensing" (Ji, Xue, Carin, 2008) does exactly this. There are almost certainly more examples too.

During the course of the competition I implemented random, DCT and overlapping block queries (what I'll refer to as "under constrained solvers"). The problem with this type of approach, as I've said before, is that the value of each pixel, or rather each region (regions being the intersection of all the queries), is not fully constrained. This contrasts with the "fully constrained solvers", which give n regions from n queries, allowing the average value of each region to be computed in closed form.

Because of the ambiguity in the "under constrained solvers" you need to add extra information to solve the problem, in the form of prior information on the nature of images. My PhD happened to be on the form of such priors and the techniques required to optimize them. Images are generally smooth, but the distribution of gradients is "heavy tailed", meaning there are more large gradient discontinuities than a smooth (Gaussian) prior would predict. This makes the priors non-convex and difficult to optimize efficiently - certainly not possible in 180s. I tried to use a naive smooth prior in a very ad hoc way in my Super Slim entry, but you can see from Ned's mid-contest analysis that it over-smoothed the results. I also tried this approach on the random and DCT queries, but the smoothing generated worse results for both.

In short, "under constrained solvers" require prior information. Fast priors over-smooth and produce poor results. Good priors are slow to optimize, hence not feasible. Some of the test-suite images were distinctly unnatural anyway, also biasing against "under-constrained solvers" using natural image priors. This is my 2 cents on why the "fully constrained solvers" prevailed.
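To make the "fully constrained" case concrete, here is a minimal sketch. It assumes each query returns the sum of the pixel values under a binary mask, which is an assumption about the contest's query model; `img`, `query` and `labels` are hypothetical stand-ins, not contest code.

```matlab
% Stand-in for the hidden image (made-up values).
img = [10 12 80 82;
       11 13 81 83;
       40 42  0  2;
       41 43  1  3];

% Assumed query model: the sum of the pixels selected by a logical mask.
query = @(mask) sum(img(mask));

% Four disjoint regions, one query each: n queries give n regions.
labels = [1 1 2 2;
          1 1 2 2;
          3 3 4 4;
          3 3 4 4];

est = zeros(size(img));
for k = 1:max(labels(:))
    mask = (labels == k);
    % Closed-form estimate: region mean = query answer / region size.
    est(mask) = query(mask) / nnz(mask);
end
```

Every region is pinned down by its own query, so no prior on image smoothness is needed. The price is that detail inside a region is lost.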

Oliver

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 6 May, 2010 09:08:04

Message: 98 of 159

Alan : Care to share any details of what you did?

Okay, here goes. If you are currently reverse engineering my probing entries, this is a SPOILER ALERT. Stop reading now.

Parameter tweakers typically operate by trial and error. That is, a parameter is adjusted and if the score improves the adjustment is kept, if the score gets worse, the entry is consigned to the nether regions of the scoreboard and forgotten. I wondered whether these "failed" attempts at tweaking could not somehow inform future tweaking endeavours. In most cases the answer appears to be no, since knowing that changing x from 0.37 to 0.38 worsens the score does not in general imply anything about the effect changing it to 0.36 (or any other value) will have, since score changes near the optimal settings of parameters are typically not linear (or even continuous) in any of the parameters.

Also, the parameters are not usually independent, that is, if keeping x constant and increasing y yields a score improvement and keeping y constant and increasing x yields a score improvement, it does not follow that increasing both x and y will yield the combined improvement.

However, the popularity of lines like:
if (e>=8 && e<=25) || (e>=100) || (e>=36 && e<=46) || (e>=60 && e<=66)
yields an important clue. This line is essentially a manually coded hash table with 0 or 1 entries and the parameter e functioning as the hash key for indexing into the table. In this case the entries determine which solver will be run for a specific problem. The core observation here is that the implicit ones and zeros in this hash table are INDEPENDENT parameters affecting the result.

So, if adding an (e==7) clause above improves the result by 10 points and adding an (e==9) clause improves the result by 7.5 points, then adding both is guaranteed to improve the result by 17.5 points.
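The additivity holds because the total result is a sum over test problems, and each problem is affected by exactly one table entry. A minimal sketch with made-up numbers (the key values, the per-problem results and the names `resultA`, `resultB` and `total` are all hypothetical, chosen only so that the gains come out as 10 and 7.5):

```matlab
% Hash key e of each test problem, and that problem's result (lower is
% better) under the default solver A and the alternative solver B.
e       = [ 7  9  7 12  9    30];
resultA = [50 40 60 30 45    20];
resultB = [44 36 56 35 41.5  25];

% Total result when the problems whose key is listed in 'keys' use solver B.
useB  = @(keys) ismember(e, keys);
total = @(keys) sum(resultA(~useB(keys))) + sum(resultB(useB(keys)));

base   = total([]);               % no clause added
gain7  = base - total(7);         % add an (e==7) clause
gain9  = base - total(9);         % add an (e==9) clause
gain79 = base - total([7 9]);     % add both clauses
```

Here `gain79` equals `gain7 + gain9`, because the two clauses touch disjoint sets of problems.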

Now note that we can transform any parameter in the code (in my case I chose blockSize) to a set of independent parameters with code similar to the following:

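rand('seed', SomeParameterThatIdentifiesTheProblem);  % same problem always gives the same draw
lookup = [0.5 1; 1 0];                                % row 1: bucket upper edges, row 2: key bits
i = find(rand(1) < lookup(1,:), 1, 'first');          % index of the bucket the draw falls into
blockSize = ExistingFormula + lookup(2,i)*DELTA;      % apply DELTA only if that bucket's bit is 1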

The pseudocode above implements a hashtable with 2 equally spaced buckets (In practice I used 20). The hashkey is a random value, which provides the benefit of being uniformly distributed over a known range. It should be simple to see now that one could probe out the effect of each individual bucket and then, at a time of your choosing activate all beneficial buckets at the same time to get the aggregate score improvement.

This is the basic concept, although I did not follow this route exactly for social engineering reasons. Firstly, if I probed out the leader (at that time) in this way, the first beneficial bucket that I hit would have placed me in top spot and attracted undue attention to what I was doing. Secondly, once someone cottoned on, it would be very simple for them to arrive at the same beneficial set of buckets that I had deduced by simply observing the scores of my entries.

For these reasons I deduced the impact of each of my 20 buckets by submitting a set of 20 solvers each with a different random selection of buckets enabled. The individual bucket contributions to each score change can then be solved for by pre-multiplying the vector of score differentials by the inverse of the sampling matrix (where the sampling matrix is a 20x20 matrix with rows corresponding to the random 20 bit keys used in each of the solvers).
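The solve described above can be sketched as follows. This is an illustration only: it uses 4 buckets instead of 20, the sampling matrix `S` and the contribution vector `cTrue` are invented, and it assumes the score differentials are measured against a common baseline entry and are free of noise.

```matlab
% Each row of S is the key of one probe entry (1 = bucket enabled).
% Hypothetical 4-bucket example; S must be invertible.
S = [1 0 1 1;
     0 1 1 0;
     1 1 0 1;
     1 1 1 0];

% Invented per-bucket improvements, unknown to the prober in practice.
cTrue = [10; -3; 7.5; 0.5];

% Observed improvement of each probe over the baseline:
% the sum of the contributions of its enabled buckets.
d = S * cTrue;

% Recover the contributions. S \ d solves S*c = d, which is equivalent to
% inv(S)*d but avoids forming the inverse explicitly.
cHat = S \ d;

% Enable exactly the beneficial buckets.
bestKey  = (cHat > 0)';
bestGain = sum(cHat(cHat > 0));
```

With these numbers `bestKey` is `[1 0 1 1]`, a key that never appeared among the probes, and `bestGain` is the predicted improvement for it.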

This meant that anyone trying to piggyback would have to first reconstruct my sampling matrix by opening each of my 20 entries and copy-pasting appropriately.
I was quite chuffed with this aspect until I shut down thoughtlessly, losing my precious sampling matrix and having to recover it in the way just described. #@!%$. You can't get pills for stupid.

As a sidenote, since runtime is additive, just like result is, it is possible to predict the runtime (and therefore the final score) in the same way. However, the problem of finding the entry with the best score (given all bucket contributions to both result and runtime) is not a nice linear one like the problem of finding best result. Luckily Matlab can still handle this by brute-force for a 20 bit key but it might get messy at 30 bits. As it happens, this was irrelevant, since changing the blocksize has minimal impact on runtime and because my runtime estimates were noisy in any case since large values in the inverse of my sampling matrix amplified random timing variations in my probe set.
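Once runtime enters, the score is no longer linear in the key bits, so the best key can be found by enumerating every key, as suggested above. A rough sketch with 4 buckets follows. The per-bucket numbers, the baseline and in particular `scoreFcn` are invented for illustration and are not the contest's actual scoring formula.

```matlab
n = 4;                                 % number of buckets (20 in the text)
cResult  = [10; -3; 7.5; 0.5];         % hypothetical result gain per bucket
cRuntime = [0.2; -0.1; 0.9; 0.1];      % hypothetical runtime cost per bucket (s)
baseResult  = 1000;                    % hypothetical baseline result
baseRuntime = 60;                      % hypothetical baseline runtime (s)

% Hypothetical nonlinear score, lower is better.
scoreFcn = @(result, runtime) result .* (1 + runtime/100);

% Row k of 'bits' is the binary expansion of key k-1, least significant bit first.
keys = (0:2^n-1)';
bits = rem(floor(keys * 2.^(-(0:n-1))), 2);

% Result and runtime are additive in the enabled buckets; the score is not.
result  = baseResult  - bits*cResult;
runtime = baseRuntime + bits*cRuntime;

[bestScore, idx] = min(scoreFcn(result, runtime));
bestKey = bits(idx,:);
```

With these made-up numbers the best key is `[1 0 1 0]`. It leaves bucket 4 off even though that bucket improves the result, because its runtime cost outweighs the gain. For 20 buckets the same enumeration has 2^20 (about a million) rows.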

I initially came up with this scheme a few years ago after a contest (not sure which one, might have been blackbox). I have not used it until now, partly due to limited opportunity (would not have worked in Army Ants, I was unable to participate in Color Bridge) and partly due to concern for the impact widespread use of this and related strategies could have on the game. This will exacerbate the overfitting problem significantly. Twenty probes gave me the ability to predict exactly the results that would have been achieved by over a million hypothetical submissions. 30 probes would push that to over a billion. A person with an auto-submitter and the inclination to use it in this way could stall essentially all further algorithmic progress within hours of the start of daylight. Banning the approach, by any means other than manual inspection is not viable, since it does not depend on any single function that could be banned. Having a bot win a mid-contest prize provided the push I needed to overcome my hesitation. After all, if bots are beating us, then how much more mindless can it get (no disrespect intended to Alan’s bot, I’m sure I’d like him if I got to know him ;-) ).

For this reason, I think we should seriously discuss options for promoting better generality. Simply running all entries through a validation set afterwards will not work, since the usefulness of a validation set stems from the fact that it is seen far fewer times than the test set. In this context the validation set will see just as many entries as the test set did and the validation winner is therefore also likely to be an overfitted solution. For validation to work one must restrict the number of solvers to be run against it. Just running the top X entries against the validation suite doesn't work either, because the truly general solutions end up WAAAY down in the rankings. One way would be if each contestant is allowed to nominate one or two entries to be evaluated against the validation set, perhaps once-off or perhaps daily. Unfortunately this might require non-trivial changes to the interface. This also requires that sock puppet accounts be banned, not just frowned upon.

The only foolproof way I can think of in the current framework is to restrict the problems so that it is impossible to split the test suite into independent pieces. Examples of such problems are the Ants and Army Ants contests. Even though these were my favourite contests of all time, I consider this quite restrictive. Another way would be to have a king of the hill style contest where competitors' codes are pitted directly against one another ala Army Ants. A third option is to have daily test suite switches. These have the downside of introducing discontinuities in score, which messes with the stats page. That may or may not be an acceptable price to pay.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Jan

Date: 6 May, 2010 09:11:06

Message: 99 of 159

Just some more thoughts/feedback:

1. Generalization: why not using a different data set each day (or couple of days) and then in the end letting all algorithms run against all data sets and take the average score? So e.g. if there are three data sets and an algorithm is learning against one data sets it surely goes away from the other two and since they are not available at the same time one can never learn for them all. Should drive the contest more towards generality of concepts.

2. Submission restriction: instead of 10 per 10 minutes, maybe have a limit that increases with time, like: 2 per minute, 4 per 10 minutes, 8 per hour, so the queue is loaded more evenly and not flooded by the tweaking gurus. One gets results faster and, well, any new addition to an algorithm that does not consist of pure tweaking will certainly take 5 minutes.

3. Categorization: Since the leading 1000 or 2000 submissions are all the same algorithm with small changes, it's really difficult to get an overview of what really different approaches there are. Here in the newsgroup people use categories in their texts (block solver, ...) - so it's only natural to have them. I would suggest that participants categorize other submissions (although this could be difficult e.g. for hybrid algorithms) and maybe get a reward for it, like a place in a priority queue for each categorization of another submission. Then have winners in each category. Actually what I want is some kind of clustering of algorithms, not by similar performance but by similar usage of algorithms, maybe some kind of genealogically grouped tree.

Example here: categories: random blocks, equal blocks, blocks + importance sampling, blocks + importance sampling + problem overfitting, ...

Then I would suggest that the submission table is initially displayed grouped for categories. So spectators have a better overview.

4. Maybe it would make the contest more popular if there were a small prize, like the winner getting a free science book from amazon (which also has its problems since amazon doesn't ship everywhere in the world). Actually just taking part for the fun of it is absolutely sufficient, but the popularity and visibility might go up.
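
The tiered limit in point 2 could be checked along these lines. This is only a sketch: the window sizes and limits are the ones suggested above, while the names `windows`, `limits`, `countIn` and `canSubmit` are made up for illustration and have nothing to do with the actual contest machinery.

```matlab
% Sketch of a tiered submission limit (values taken from point 2 above).
windows = [60 600 3600];   % window lengths in seconds: 1 min, 10 min, 1 hour
limits  = [2 4 8];         % maximum submissions allowed inside each window

% history: vector of one contestant's past submission times, in seconds.
% A new submission at tNow is allowed only if every window is under its limit.
countIn   = @(history, tNow, w) sum(history > tNow - w & history <= tNow);
canSubmit = @(history, tNow) all(arrayfun( ...
    @(w, lim) countIn(history, tNow, w) < lim, windows, limits));

history = [0 30];          % hypothetical contestant who submitted at t = 0 s and t = 30 s
okAt45 = canSubmit(history, 45);   % false: two submissions already in the last minute
okAt61 = canSubmit(history, 61);   % true: only the t = 30 submission is in the 1-minute window
```

A burst of quick tweaks runs into the 1-minute limit first, while the 10-minute and 1-hour limits cap sustained flooding.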

Regards,
Jan

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 6 May, 2010 10:57:04

Message: 100 of 159

Oliver: I spent about a day skim reading the literature (hence why I wasn't competitive during darkness). On the point of published compressive sensing algorithms providing the next query based on the current data, "Bayesian Compressive Sensing" (Ji, Xue, Carin, 2008) does exactly this. There are almost certainly more examples too.

Interesting, I see they refer to this alternate construction as Adaptive Compressive sensing. I had not come across this before. I'd be very curious to see what the best is that can be achieved if the time consideration is neglected. If anyone investigates this, please publish your results on the file exchange.

Hannes

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: igorcarron@gmail.com

Date: 6 May, 2010 13:49:14

Message: 101 of 159

On May 6, 5:57 am, "Hannes Naudé" <naude.jj+mat...@gmail.com> wrote:
> Oliver: I spent about a day skim reading the literature (hence why I wasn't competitive during darkness). On the point of published compressive sensing algorithms providing the next query based on the current data, "Bayesian Compressive Sensing" (Ji, Xue, Carin, 2008) does exactly this. There are almost certainly more examples too.
>
> Interesting, I see they refer to this alternate construction as Adaptive Compressive sensing. I had not come across this before. I'd be very curious to see what the best is that can be achieved if the time consideration is neglected. If anyone investigates this, please publish your results on the file exchange.
>
> Hannes

Hannes,

First of all Congratulations on your win. As I said in my blog
( http://nuit-blanche.blogspot.com/2010/05/cs-matlab-programming-contest-results.html
) and to answer Jan's comment, I believe it would be interesting to see
how the rankings fare if these algorithms were given a new set of images/
problems. That way we could easily see where overfitting took place.
Some people in the compressive sensing community did go for the
contest but it looks like most were a little late to the game. However
a good effort yielded a 53000-ish result (
http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/4452
). Some of us are a little bummed about this especially since it is
not as good as very simple schemes (that did not overfit).

To come back to your initial statement, there is a subset of
compressive sensing that looks at adaptive measurements. However, the
idea of devising detectors based on compressed sensing is for these
detectors to be as dumb as possible (thereby reducing the time for
sampling data and their energy needs as well as their computational
requirements) and so while adaptive is possible and is shown to do
well, it does not do extraordinarily well compared to methods like
JPEG. Non-adaptivity is really a virtue in this case.

Again Congratulations on your win.

Igor.
http://nuit-blanche.blogspot.com/search/label/CS

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: the cyclist

Date: 6 May, 2010 14:35:05

Message: 102 of 159

Congratulations on a great victory, Hannes, and on your long-overdue grand prize win. Maybe Cobus was dragging you down before? ;-)

Thanks again to the contest team. Great new machinery, and another really interesting and intuitive puzzle.

Has the contest suite been made available? I am dying to see it. I had a few of my usual small time improvements that I was adding to the queue at the end, and one that I expected to be huge. The function "kippen" can be vectorized extensively (see the code below), and when I did so, the test suite ran consistently 6 seconds faster. I thought I had a real contender to leap over the parameter-tweaking noise in the end.

However, when applied to the contest suite, it did absolutely nothing. It led me to believe that somehow "kippen" was not even being called any more, but that was not the case. I'm eager to see what's actually happening.

the cyclist

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
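function aA = kippen(aA,nnr,Areas,kx,ky)
% Adds a linear gradient correction to each of the nnr regions of aA.
% Inferred from usage below: Areas(:,1) is the region sum, Areas(:,4:5)
% the row range, and Areas(:,6:7) the column range of each region.
n = size(aA,1);

AcorrX = zeros(n);
AcorrY = AcorrX ;
% Zero-pad by one pixel so that neighbouring strips exist at the border.
AB = zeros(n+2);
AB(2:end-1,2:end-1) = aA;
% Running sums in column-major order; differences within a column give strip sums.
ab = reshape(cumsum(AB(:)),n+2,n+2);

% Same running sums on the transpose, for strips along the other direction.
AC = AB';
ac = reshape(cumsum(AC(:)),n+2,n+2);

% Region extents, sizes and mean values.
dx = Areas(:,5)-Areas(:,4)+1;
dy = Areas(:,7)-Areas(:,6)+1;

sz = dx.*dy;
m = Areas(:,1) ./ sz;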

ac_x11_index = sub2ind([n+2,n+2],Areas(:,7)+1,Areas(:,4) );
ac_x21_index = sub2ind([n+2,n+2],Areas(:,6), Areas(:,4) );
ac_x12_index = sub2ind([n+2,n+2],Areas(:,7)+1,Areas(:,5)+2);
ac_x22_index = sub2ind([n+2,n+2],Areas(:,6), Areas(:,5)+2);

ab_y11_index = sub2ind([n+2,n+2],Areas(:,5)+1,Areas(:,6) );
ab_y21_index = sub2ind([n+2,n+2],Areas(:,4), Areas(:,6) );
ab_y12_index = sub2ind([n+2,n+2],Areas(:,5)+1,Areas(:,7)+2);
ab_y22_index = sub2ind([n+2,n+2],Areas(:,4), Areas(:,7)+2);

dm1x = m - (ac(ac_x11_index) - ac(ac_x21_index))./dy;
dm2x = (ac(ac_x12_index) - ac(ac_x22_index))./dy - m;
dmx = (dm1x < 0 & dm2x < 0).*max(dm1x,dm2x) + (dm1x > 0 & dm2x > 0).*min(dm1x,dm2x);

dm1y = m - (ab(ab_y11_index) - ab(ab_y21_index))./dx;
dm2y = (ab(ab_y12_index) - ab(ab_y22_index))./dx - m;
dmy = (dm1y < 0 & dm2y < 0).*max(dm1y,dm2y) + (dm1y > 0 & dm2y > 0).*min(dm1y,dm2y);

for ir = 1:nnr
    
    if dx(ir) > 1 && dmx(ir) ~= 0
        x = (0:dx(ir)-1)'/(dx(ir)-1);
        AcorrX(Areas(ir,4):Areas(ir,5),Areas(ir,6):Areas(ir,7)) = dmx(ir)*(x - kx)*ones(1,dy(ir));
    end
    
    if dy(ir) > 1 && dmy(ir) ~= 0
        y = (0:dy(ir)-1)/(dy(ir)-1);
        AcorrY(Areas(ir,4):Areas(ir,5),Areas(ir,6):Areas(ir,7)) = ones(dx(ir),1)*dmy(ir)*(y - ky);
    end
    
end

aA = aA + AcorrX + AcorrY;

end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 6 May, 2010 15:17:04

Message: 103 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hru0tk$75i$1@fred.mathworks.com>...
> Alan : Care to share any details of what you did?
>.......
> single function that could be banned. Having a bot win a mid-contest prize provided the push I needed to overcome my hesitation. After all, if bots are beating us, then how much more mindless can it get (no disrespect intended to Alan’s bot, I’m sure I’d like him if I got to know him ;-) ).
>

Hannes: Absolutely brilliant! Congrats again on your well deserved victory. While it might not have been a fundamental algorithm change, what you did clearly involved deeper thinking about what was going on and an impressive amount of gamesmanship.

And no disrespect taken... I know people have varying views on my approaches to the contest.. but as I've said before I like to be involved in a certain way since I can't compete with the 'algorithm experts'. But I do always abide by the explicit rules the contest team puts in place. The fact they've explicitly added a rule about test suite extraction but haven't added one about queue bombing with tweaks speaks volumes.

As a general comment to everyone asking for various changes to put the focus more on the algorithm development side of things, I'd like to point out that that would have the effect of reducing the overall number of participants in the contest. Most of the user community who could potentially be involved doesn't have the time or skill set to dive into the literature and try to develop new algorithms. There is a need for that obviously, but if the overall purpose of the contest is to increase awareness and excitement about using MATLAB, then making the contest as accessible as possible to anybody is vital. Allowing tweaking is one way of doing that.

And since Hannes requested it, here's a general breakdown of what I did:

In previous contests I had developed all MATLAB based code (primarily using the urlread function) to auto-submit entries, grab entries based upon their ID number, and make very crude parameter adjustments. With the changes to the website, the majority of that code was useless.

The biggest issue I had to deal with obviously involved the logging in to your account in order to submit an entry. I spent hours trying to get urlread to work with the new system. However in the end I couldn't because some of the login pages utilize SSL, which urlread doesn't support. (NOTE to MathWorks.. could you please consider adding that support as a feature request? I'd love to continue to do this all in MATLAB)

In the end, I ended up having to go with a command line application called CURL, which is available for a variety of platforms and is capable of supporting SSL. Thus I had to develop code to use the MATLAB system command to make calls to CURL to read and submit code. One small hurdle I struggled with was the fact that the code is submitted as a parameter, but can be thousands of characters long. Windows truncates normal command line entries well before that, so I had to write the code out to text files and then funnel the file as a parameter to CURL. Another minor issue was dealing with bad data coming back from CURL. Sometimes the webserver or network 'hiccups' and returns a malformed page. Our normal web browsers intelligently deal with that, but for this case I needed to scan for it and re-execute the CURL command.

For most of the contest I simply had a MATLAB function that would check what the leading entry was, and if it was a new entry that wasn't my own, copy it to a file. It would then use the strrep function to replace certain parameters that seemed ripe for tweaking and resubmit the code. All this would happen once per minute in order to deal with the submission limit rule. Early on I manually selected the parameters by looking closely at the code, but later on I just had a set of target parameters that the code would be scanned for and then automatically adjusted. As I mentioned in another post, it would do something like this for a parameter of 123: 124, 122, 125, 121, 126, 120 etc etc. Thus it would slowly adjust both up and down from the parameter.
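
For readers who want to see the alternating up/down sequence concretely, here is a minimal sketch. It is not the actual bot code: `tweakSequence`, the sample `code` string and the parameter value are all made up for illustration; only the use of strrep and the 124, 122, 125, 121, ... ordering come from the description above.

```matlab
% Hypothetical helper: candidate values stepping alternately above and
% below a base parameter (base+1, base-1, base+2, base-2, ...).
tweakSequence = @(base, nSteps) ...
    reshape([base + (1:nSteps); base - (1:nSteps)], 1, []);

cands = tweakSequence(123, 3);      % [124 122 125 121 126 120]

% Made-up snippet of an entry; strrep swaps the literal for each candidate.
code = 'blockSize = 123;';
variants = arrayfun(@(v) strrep(code, '123', num2str(v)), cands, ...
    'UniformOutput', false);
disp(variants{1})                   % blockSize = 124;
```

Note that strrep replaces every occurrence of the literal, so a real tool would have to make sure the target string is unique in the entry.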

For the last day of the competition, I had planned on developing code to auto create new accounts and automatically switch to the new accounts when the 10 minute limit was reached, in order to be able to 'tweak bomb' during the final rush. I spent hours and hours trying to get the autocreation code to work, but never could. There is something funky going on with the Javascript and forms for the account creation page that I couldn't figure out how to get to work with CURL. When it got to be only an hour from the contest end I gave up on that approach and just spent 15 minutes hand creating about 30 accounts. Thus at 17 minutes to go I started my code that started grabbing code from the head of the queue and resubmitting 2-3 copies of each with small parameter changes. When necessary it would switch to a new account. I ended up submitting about 185 entries in that 17 minute window, which averages to one every 5.5 seconds. I've seen other people do that fast by hand, but only for a few entries in a row.

As an aside, Hannes somewhat 'lucked out' in that he submitted his winning codes at ~22 minutes to go, and thus my resubmitter function didn't grab a copy of them (not that I'm saying a small tweak would have definitely beaten his code.. but you never know).

Finally, to everyone who dismisses 'auto-code' as trivial and not within the spirit of the competition, I respectfully disagree. I've ended up with hundreds of lines of code spread out across dozens of functions that took me hours and hours to develop over the past week. Dealing with all the data handling and error conditions and whatnot requires a lot of effort. Multiple times I'd start it running and come back later to find the program errored out due to some weird occurrence I hadn't taken into account. It's a different sort of challenge from the underlying algorithm of the contest problem, but it still requires a good understanding of MATLAB and thinking through different approaches to a problem.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Nicholas Howe

Date: 6 May, 2010 15:47:05

Message: 104 of 159

"the cyclist" <thecyclist@gmail.com> wrote in message
> Has the contest suite been made available? I am dying to see it. I had a few of my usual small time improvements that I was adding to the queue at the end, and one that I expected to be huge. The function "kippen" can be vectorized extensively (see the code below), and when I did so, the test suite ran consistently 6 seconds faster. I thought I had a real contender to leap over the parameter-tweaking noise in the end.


I'm curious about this as well, for several reasons. My one disappointment with this contest is that I had detected a small but (on the sample test suite) significant algorithmic improvement in solver2 relating to handling the pixels on the boundary of the image. I had hoped that the effect was small enough to improve results on the true test suite without knocking it out of the optimized minimum. Unfortunately that proved not to be the case, as the solvers I applied my modification to generally did worse than the ones they were based on. (On the sample suite I could get about .2% improvement in score.)

My second reason for curiosity is to see whether any images were shared between the sample and true test suites, as has sometimes been the case in the past. A strategy that had occurred to me was to detect a known image (using the sum of some set of pixels as a hash) and return that image exactly. With 65536 characters allowed, you could encode a few of the smaller images in your program. Technically this is not probing the test set; it is rather just an extreme form of overfitting via a lucky guess. I decided not to try it but am curious if it would have worked.

And a comment to Alan, since you brought it up: I'm glad you enjoy your game, but it is different from the one most of the rest of us are playing, and sometimes your game disrupts ours. It's also (except when the Mathworks changes everything) the same game every year: you can reuse all your machinery in contest after contest, regardless of the problem everyone else is trying to solve. I guess from my point of view having one player pushing the system in this way keeps things interesting, but I sure hope the practice does not spread or I will cease to enjoy it!

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Oliver Woodford

Date: 6 May, 2010 16:32:04

Message: 105 of 159

Clearly some contestants enjoy the tweaking game, while others enjoy making general algorithmic improvements. Obviously there is a balance to be had to please everyone, and an interesting question is "What is that balance?". The current format, with many different prizes, means there's definitely something for everyone. However, I feel that at the moment there are still not enough incentives for original algorithmic improvements during daylight. This could definitely be improved without removing all the thrill for tweakers. I do feel a good aim for future competitions is that the Grand Prize is won by an algorithm that works well in general, and not just on the test set. I hope the User Voice comes up with some positive suggestions that are adopted.

Oliver

PS Hannes and Alan, your techniques are impressive and merit kudos.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 6 May, 2010 17:39:05

Message: 106 of 159

"the cyclist" <thecyclist@gmail.com> wrote in message
> Congratulations on a great victory, Hannes, and on your long-overdue grand prize win. Maybe Cobus was dragging you down before? ;-)

Errm, if you'd like to know who's been dragging me down I'd refer you to the Blockbuster contest archive ;-)

> Has the contest suite been made available? I am dying to see it. I had a few of my usual small time improvements that I was adding to the queue at the end, and one that I expected to be huge. The function "kippen" can be vectorized extensively (see the code below), and when I did so, the test suite ran consistently 6 seconds faster. I thought I had a real contender to leap over the parameter-tweaking noise in the end.

Yip, been there. In an early contest (can't remember which one) Cobus and myself vectorized a big chunk of code and got a massive speedup. But on the contest machine it was actually slower. Had nothing to do with the testsuite, just the machines, JVM etc. Unless your OS, architecture, MATLAB version and especially JVM version match, you shouldn't put too much faith in local timing results. Also, since the JIT, vectorization ain't what it used to be.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: the cyclist

Date: 6 May, 2010 17:56:04

Message: 107 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hruurp$e65$1@fred.mathworks.com>...
> "the cyclist" <thecyclist@gmail.com> wrote in message
> > Congratulations on a great victory, Hannes, and on your long-overdue grand prize win. Maybe Cobus was dragging you down before? ;-)
>
> Errm, if you'd like to know who's been dragging me down I'd refer you to the Blockbuster contest archive ;-)
>

Yes, I continue to watch my back since reading your commentary in that contest!

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 6 May, 2010 18:45:07

Message: 108 of 159

The problem with developing code which works in general is the definition of “general”. For the contest, “general” is the test set. One may say that the set is not general enough. Then I will ask: what will make it more general? Adding more pictures? Then what about orientation preferences in natural scene pictures? Adding pictures of absolutely random pixels? Then do we need to make sure that the statistics of the random pixels are uniform in all possible ways? Any finite set of test boards will have distortions in one way or another. And we will not know those distortions until the contest uncovers them.

I do not think that we can use a second test set to fix it. It will be just checking one biased set using another biased set.

However what we may try to enforce is “uniform” overfitting of the test set.
I suspect that currently tweaking results in overfitting of the most valuable examples at the expense of less valuable.
Maybe we have to normalize each example score (one way is to normalize to the current best result for each picture)
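
To illustrate the normalization idea, here is a small sketch with made-up numbers. It assumes lower scores are better and that the best score on every image is non-zero; `rawScores` and the other names are hypothetical.

```matlab
% Hypothetical scores: rows are entries, columns are test images (lower is better).
rawScores = [100 10;
              90 12;
             120  8];

% Current best result for each image, then each score relative to that best.
bestPerImage = min(rawScores, [], 1);
normalized   = bsxfun(@rdivide, rawScores, bestPerImage);

[~, rawWinner]  = min(sum(rawScores, 2));    % entry 2 wins on raw totals
[~, normWinner] = min(sum(normalized, 2));   % entry 3 wins after normalization
```

With raw totals the first image dominates the ranking; after normalization every image carries the same weight, so an entry can no longer win just by squeezing the most valuable example.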

Obviously the Army Ants contest practically resolves those issues. (Practically, because the result still depended on the field configuration. Theoretically it should be tested on all possible fields.) If we have that kind of contest in the future then my only recommendation would be to run new entries against all previous Kings of the Hill to make the best solution more “universal”. Probably we can even reduce the number of submissions to 1 per hour.

P.S.
In addition, an Army Ants type contest will easily allow changing boards every day. And the previous day's battles could be shown every day for general public entertainment.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 6 May, 2010 18:48:04

Message: 109 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
> As a general comment to everyone asking for various changes to put the focus more on the algorithm development side of things, I'd like to point out that that would have the effect of reducing the overall number of participants in the contest. Most of the user community who could potentially be involved doesn't have the time or skill set to dive into the literature and try to develop new algorithms.

> There is a need for that obviously, but if the overall purpose of the contest is to increase awareness and excitement about using MATLAB, then making the contest as accessible as possible to anybody is vital. Allowing tweaking is one way of doing that.


Yes Alan, we've heard this argument before, but the numbers aren't with you on this one. The stats page clearly indicates that darkness and twilight are more popular than daylight by a factor of ~3 (or more if sock puppet accounts are taken into account). While it's impossible to determine to what extent this trend is caused by excessive parameter tuning in daylight, it would be incredibly naive to imagine that the two are wholly unconnected.

On the other side of the equation, there is the "most of the user community" group that would supposedly abandon the contest if they were expected to do more than parameter tweaks. Looking at the same stats page we see a total of 5 participants with more than 200 entries. For each of these 5 participants I can point to a productive change they've applied to the code that improved (as opposed to just overfitted) it. So I don't know who these people who are unable or unwilling to do more than change 0.38 to 0.37 are. Maybe you'd like to give a few examples?

> The biggest issue I had to deal with obviously involved the logging in to your account in order to submit an entry. I spent hours trying to get urlread to work with the new system. However in the end I couldn't because some of the login pages utilize SSL, which urlread doesn't support. (NOTE to MathWorks.. could you please consider adding that support as a feature request? I'd love to continue to do this all in MATLAB)

Similar experience here. I also devoted a couple of days to developing an autosubmitter. In previous contests I developed a mechanism for copy protection that would harm performance (typically by dropping out of the local minimum) if the code was edited or even if it was resubmitted identically. As an unintended side effect it would seriously screw with the contest diff. See http://www.mathworks.com/contest/splicing.cgi/diff.html?id1=45516&id2=45323
for an example. However using this mechanism requires that my code contains non-printing characters that get corrupted in the process of a copy-paste. So I could not submit code manually but had to go via an autosubmitter. Anyway, back to the point.

I also realised that urlread wouldn't do the job, partly because of SSL, but also because of lack of support for cookies, which are used to keep track of you once you're logged in. I developed an autosubmitter using WaTiR in C# (which I don't actually know) only to realise that because WaTiR actually just automates IE, all the complexity of dealing with SSL, cookies, and malformed pages is handled for you, but on the other hand you are still unable to send hidden characters. :-((

So I started over using the System.Web assembly in C#. I did not do the login at all but just grabbed the cookies from IE's cache. This still did not work so I resorted to using a packet sniffer to see what I was doing differently from IE. Turns out IE had some more cookies that weren't in its cookie file as far as I could see. (If some network guru here could tell me where IE squirrelled these away I would be eternally grateful.) So I just copy-pasted these into my code and voilà, I could submit.

Only to learn that something in the contest machinery change had caused this to stop working. Even when sending the hidden characters without the urlencoding, they would somehow get corrupted/translated at the contest machine before the code got to execute. So that was the end of my autosubmitter venture.

> Finally, to everyone who dismisses 'auto-code' as trivial and not within the spirit of the competition, I respectfully disagree. I've ended up with hundreds of lines of code spread out across dozens of functions that took me hours and hours to develop over the past week. Dealing with all the data handling and error conditions and whatnot requires a lot of effort. Multiple times I'd start it running and come back later to find the program errored out due to some weird occurrence I hadn't taken into account. It's a different sort of challenge from the underlying algorithm of the contest problem, but it still requires a good understanding of MATLAB and thinking through different approaches to a problem.

Having coded an autosubmitter that is nowhere near as sophisticated as yours, I fully agree that what you did is not trivial. But it IS disruptive and it DOES change the nature of the game. And most of us agree that it changes the nature of the game for the worse.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 6 May, 2010 19:17:04

Message: 110 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hrv2t4$d42$1@fred.mathworks.com>...
> In previous contests I developed a mechanism for copy protection that would harm performance (typically by dropping out of the local minimum) if the code was edited or even if it was resubmitted identically. As an unintended side effect it would seriously screw with the contest diff. See http://www.mathworks.com/contest/splicing.cgi/diff.html?id1=45516&id2=45323
> for an example. However using this mechanism requires that my code contains non-printing characters that get corrupted in the process of a copy-paste. So I could not submit code manually but had to go via an autosubmitter. Anyway, back to the point.



Brilliant !!!!

Sergey

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Yi Cao

Date: 6 May, 2010 19:52:06

Message: 111 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hru0tk$75i$1@fred.mathworks.com>...
>
> rand('seed',SomeParameterThatIdentifiesTheProblem);
> lookup=[ 0.5 1;1 0];
> i=find(rand(1)<lookup(1,:),1,'first');
> blockSize = ExistingFormula+lookup(2,i)*DELTA;
>

Hannes,

Thanks for sharing this great idea. By revising your 20 'Random Randomness' entries, I got the same lookup table you implemented in your winning code, and the predicted result is exactly the same as the actual one shown on the website. However, one thing still puzzles me: in two solvers, you used two independent rand calls to determine the corresponding column of the lookup table. In theory, these two calls should give two different columns, shouldn't they? If so, the above calculation cannot be right, can it?

Yes, with this approach almost any parameter can be overfitted to get the minimum result. We have to consider a workable scheme to make future contests more general. Contestants against contestants is a good idea. For example, the test suite can contain a larger number of cases, say, 1000. Select the 100 cases for which the current leading entry has the best results and calculate the results on these 100 cases. A new leading entry has to beat the current leader on the score based on these 100 cases. Once a new leader is determined, another set of 100 cases is selected based on the new leading code. In this way we can ensure the leading score is monotonically decreasing. However, since the actual suite is dynamically changing, it will make overfitting difficult.
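
A toy version of this leader-driven subset selection might look as follows. It is only a sketch: 6 cases and a subset of 3 stand in for the 1000 and 100 above, the scores are made up, "best results" is taken to mean lowest score, and all variable names are hypothetical.

```matlab
% Hypothetical per-case scores on the full suite (lower is better).
leaderScores     = [5 1 9 3 7 2];
challengerScores = [4 2 8 2 9 1];
k = 3;                                   % stands in for the 100 cases above

% Cases on which the current leader does best.
[~, order] = sort(leaderScores);
subset = order(1:k);                     % cases 2, 6 and 4

% The challenger takes the lead only by beating the leader on that subset.
challengerWins = sum(challengerScores(subset)) < sum(leaderScores(subset));

% If so, the next subset is chosen from the new leader's best cases.
if challengerWins
    [~, order] = sort(challengerScores);
    newSubset = order(1:k);              % cases 6, 2 and 4
end
```

Because the scoring subset moves each time the lead changes, tuning constants against one fixed set of cases pays off less.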

Another idea regarding the submission limit: we may introduce an average-waiting-time scheme to penalize multiple simultaneous submissions. Each submission has a waiting time counted from when it is submitted. If the same contestant submits several entries, then the waiting time is the actual time divided by the number of entries in the queue. Then, if another contestant has an entry in the queue, even if that single entry was submitted later, its waiting time grows quicker than that of the first contestant, who has multiple entries in the queue. Hence, the entry of the second contestant may get evaluated earlier. This may encourage contestants to keep the queue as short as possible.
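
The waiting-time rule can be sketched with a made-up queue. This assumes the divisor is the number of entries the same contestant currently has in the queue, and that the entry with the largest effective wait is scored next; the names and numbers are hypothetical.

```matlab
% Hypothetical queue: contestant 1 has four entries submitted at t = 0 s,
% contestant 2 has a single entry submitted at t = 30 s.
submitTime = [0 0 0 0 30];
owner      = [1 1 1 1 2];

% Number of queued entries belonging to the owner of each entry.
nInQueue = arrayfun(@(o) sum(owner == o), owner);

% Effective wait = actual wait divided by the owner's queue share.
effectiveWait = @(tNow) (tNow - submitTime) ./ nInQueue;

[~, nextAt35] = max(effectiveWait(35));  % entry 1: contestant 1 is still ahead
[~, nextAt60] = max(effectiveWait(60));  % entry 5: the single entry has overtaken
```

In this example the single entry catches up at t = 40 s, since its wait grows four times as fast as that of each of the other contestant's entries.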

Yi
 

 

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Nathan

Date: 6 May, 2010 20:05:22

Message: 112 of 159

Hello all

Congratulations to the prize-winners and thank you again, Mathworkers. I missed most of this contest but I had fun in the last few days, spectating and thinking (unsuccessfully) about the problem. This problem has the nice feature that it doesn't provide feedback to the solver (like Ants) and is probably the closest we've had yet to a real-world application (?).

Alan: "But I do always abide by the explicit rules the contest team puts in place." The rules: "we ask that you not overwhelm the queue"
Alan: "One of the tactics I tried yesterday morning was to ... fill the queue with lots of entries that take up almost the full 180 second limit ... those queue delay entries would significantly increase the time it took before other entries could be scored, artificially inflating my longevity time. "
Was that not a deliberate effort (though misguided, on this occasion) to "overwhelm the queue" for advantage in a prize race?

I don't know how hard it is to write code that talks to a web server, but I don't think it's a skill that this contest should reward. The contest has always been about raw abstract problem-solving ability, even more than it is about matlab. Problem-setters have always skilfully avoided rewarding specialist knowledge (not even specialist knowledge of compressive sensing seems to have helped in this case). Specialist knowledge of web programming should not be a factor.

Daylight, contrary to some comments in this thread, is not all about tuning of numerical parameters. Many contests have featured a sprinkling of algorithmic advances in daylight, even in the closing minutes.

I doubt if there is any practicable way the organisers can forcibly eliminate the various methods that many competitors dislike, even if they were willing to. However there is a community ethos here that's stronger than our individual competitive streaks. There's lots of evidence for this, notably the respect for the ban on wholesale probing, and the fact that code scrambling has not had much impact. Could we embody this spirit in a kind of players' charter that states what the purpose of the contest is, and lays out (somewhat vague) principles of good conduct? Parts of the rules are already formulated like this - calls for fair play, rather than enforceable prescriptive rules.

I've been thinking for a few contests about the nature of the Grand Prize, which currently is based on a snapshot of the scoreboard at a particular instant. This has motivated some great competition in the last minutes, and to win one of these is a big achievement. But the guy who wins the last stage of the Tour De France doesn't win the race. I think the most prestigious prize should recognise success at all stages of the contest - maybe by tallying up mid-contest prizes (with weightings and/or tie-break criteria) or by a multi-day "Push". The details would take a bit of working out - it's much easier to push on day 4 than day 7, and so on...

Finally I'd repeat the request for a CAPTCHA, for the usual reasons, and echo Jan's request for a longer twilight. It seems a lot of people really enjoy that phase.

See you all next time

Nathan

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 6 May, 2010 21:43:05

Message: 113 of 159

I'm glad to see we are having a thoughtful conversation on the nature of the contest and potential improvements. While we might not all agree on those, it's refreshing to hear lots of voices. In response to a couple of the comments above:

Nathan: "Was that not a deliberate effort (though misguided, on this occasion) to "overwhelm the queue" for advantage in a prize race?"

My understanding of the origin of that rule is that in the past there were certain players who would in the middle of the contest repeatedly submit hundreds and hundreds of the same code just for the sake of wreaking havoc on the overall contest. This would cause hours-long backups in the middle of the contest and usually required the MathWorks staff to manually remove the entries from the queue. Note that a long-standing tradition is that the queue gets overwhelmed in the last 20 minutes of the contest, but that's ok because it doesn't prevent anyone from getting feedback from entries or anything.

What I did was essentially delay the queue by ~30 minutes (10 entries * 3 mins per entry). I see the viewpoint of some that this might be 'overwhelming the queue', but in my opinion it's not since it's no more of a delay than is often present at many other times in the contest. We might have to agree to disagree on this one. If the contest team wants to make a more formal definition of 'overwhelming the queue' I'm all for that, since I felt I was within both the letter and the spirit of the rule.

Nathan: "I don't know how hard it is to write code that talks to a web server, but I don't think it's a skill that this contest should reward. The contest has always been about raw abstract problem-solving ability, even more than it is about matlab."

I completely disagree with this. If the contest were just about raw abstract problem-solving ability then it would be set up much differently. The mini-contests are clearly designed to reward various types of things, ranging from abstract problem-solving (Darkness) to efficient m-code creation (1000-node) to gamesmanship (longevity).

The MathWorks doesn't expend staff time and resources just to let people see who's a better problem-solver. They run this to build excitement about using MATLAB and perhaps expose users to capabilities and techniques they normally wouldn't see in their everyday usage. Interfacing to a web server is one of those items in my viewpoint.

Nathan: "However there is a community ethos here that's stronger than our individual competitive streaks. There's lots of evidence for this, notably the respect for the ban on wholesale probing, and the fact that code scrambling has not had much impact."

I completely agree with this. I'm glad to see that for the past several contests code obfuscation hasn't reared its ugly head, nor has the 'hacking' that used to completely crash the contest machine.

Nathan: " I think the most prestigious prize should recognise success at all stages of the contest - maybe by tallying up mid-contest prizes (with weightings and/or tie-break criteria) or by a multi-day "Push"."

If I recall the Golf contest was like this, and it was indeed interesting.

Nathan: "Finally I'd repeat the request for a CAPTCHA, for the usual reasons, and echo Jan's request for a longer twilight. It seems a lot of people really enjoy that phase."

I think extending twilight into the weekend is a great idea. Regarding the CAPTCHA, I don't think it's necessary. They tried it once and couldn't get it to work. Regardless, there are programs out there now that can pretty easily crack most of the standard ones (or at least have enough of a success rate that you can get through with only a couple tries).

What it would do is just raise the bar on auto-code a little higher, but determined players would still get around it. What you are really asking for is a ban on auto-submitters. If that's what the team wants, they should just explicitly put it in the rules. The probing ban has been adhered to and I think the same would happen for this if put in place.

Yi: "For example, the test suite can contain a larger number of cases, say, 1000. Select 100 cases for which the current leading entry has the best results and calculate the results of these 100 cases. The new leading code has to beat the current leader on score based on these 100 cases. Once a new leader is determined, another set of 100 cases will be determined based on the new leading code."

I suggested a similar thing in a blog posting and agree it would help. See http://blogs.mathworks.com/contest/2010/05/06/hannes-wins/#comment-6606 for my list of contest suggestions.
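Yi's scheme can be made concrete with a minimal sketch, written in Python for brevity. Everything here is an assumption for illustration: the function names are hypothetical, per-case scores are random stand-ins, and "best results" is taken to mean the lowest per-case score, as contest scores are lower-is-better.

```python
import random


def pick_leader_best_cases(leader_scores, k):
    """Return the indices of the k cases on which the leader scores best (lowest)."""
    order = sorted(range(len(leader_scores)), key=lambda i: leader_scores[i])
    return order[:k]


def beats_leader(challenger_scores, leader_scores, subset):
    """A challenger takes the lead only if its total on the subset is strictly lower."""
    challenger_total = sum(challenger_scores[i] for i in subset)
    leader_total = sum(leader_scores[i] for i in subset)
    return challenger_total < leader_total


random.seed(0)
n_cases, k = 1000, 100                      # sizes taken from Yi's example
leader = [random.random() for _ in range(n_cases)]
challenger = [0.99 * s for s in leader]     # hypothetical entry, 1% better everywhere

subset = pick_leader_best_cases(leader, k)
print(len(subset), beats_leader(challenger, leader, subset))
```

Because the subset is re-drawn from whichever entry currently leads, a challenger is always judged on the cases the leader is already tuned for, so tweaks that only trade one case against another should find it harder to take the lead.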

Hannes: "I also realised that urlread wouldn't do the job, partly because of SSL, but also because of lack of support for cookies, which are used to keep track of you once you're logged in."

Good points. I did have to deal with the cookie issue too; however, CURL handles it just fine and I didn't have any of the problems with 'hidden' cookies like you described.

Hannes: "In previous contests I developed a mechanism for copy protection that would harm performance (typically by dropping out of the local minimum) if the code was edited or even if it was resubmitted identically. "

This is another brilliant idea Hannes, but is completely going against the spirit and letter of the rules: "any entry can be viewed, edited, and resubmitted as a new entry. You are free to view and copy any entry in the queue. If your modification of an existing entry improves its score, then you are the "author" for the purpose of determining the winners of this contest. We encourage you to examine and optimize existing entries. "

Hannes: "The stats page clearly indicates that darkness and twilight are more popular than daylight by a factor ~3 (or more if sock puppet accounts are taken into account) ... So I don't know who these people who are unable or unwilling to do more than change 0.38 to 0.37 are. Maybe you'd like to give a few examples?"

I think you are misunderstanding my argument, and what you are seeing is a bias due to the start of the contest and the advertising that takes place for it. I think a more telling stat is that out of 151 accounts that submitted entries, 111 of them submitted fewer than 10 entries (I removed my alternate accounts from these totals).

Why do more than 2/3rds of people just submit a couple times instead of being involved more? I propose it's because they quickly find they are overwhelmed or unable to come even close to the expert players. How many of those even come back to subsequent contests?

The CS-related blog postings Igor linked to above (http://nuit-blanche.blogspot.com/search/label/CS) are fascinating to read, because the true CS experts were all frustrated that their algorithms weren't performing well in the contest. The fundamental issue was that the contest involves more than just expertise in CS, as it should, because otherwise it'd cater to just a very small audience.

Hannes: "But it IS disruptive and it DOES change the nature of the game. And most of us agree that it changes the nature of the game for the worse. "

We are all very vocal about our opinions, but we represent only a small fraction of the playerbase. Most people are casual players and I suspect the contest team is looking for ways to better involve them and cater to them. They've repeatedly shown in the past that if they feel the goals of the contest are being compromised by a technique or player that they will change the rules or call them out on it. I've yet to hear a negative comment from any of them regarding my involvement.

 I think a good question to ask is why do we always see the same general set of names showing up as winners and what can be done to level the playing field for novices and experts alike so that the contest is more inclusive?

Oliver: "Clearly some contestants enjoy the tweaking game, while others enjoy making general algorithmic improvements. Obviously there is a balance to be had to please everyone, and an interesting question is "What is that balance?". The current format, with many different prizes, means there's definitely something for everyone. However, I feel that at the moment there are still not enough incentives for original algorithmic improvements during daylight. This could definitely be improved without removing all the thrill for tweakers.... PS Hannes and Alan, your techniques are impressive and merit kudos. "

I'll finish by saying I am in complete agreement with Oliver's posting above and I thank him for the compliment.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 6 May, 2010 23:44:04

Message: 114 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hrvd59$68n$1@fred.mathworks.com>...
> Why do more than 2/3rds of people just submit a couple times instead of being involved more? I propose it's because they quickly find they are overwhelmed or unable to come even close to the expert players. How many of those even come back to subsequent contests?
>
> I think a good question to ask is why do we always see the same general set of names showing up as winners and what can be done to level the playing field for novices and experts alike so that the contest is more inclusive?
>

I believe it is possible to run some small challenges only for novices. For example, the 1000-node challenge is very suitable for this. It can be run on a separate machine to avoid interference with the main stream.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Darren Rowland

Date: 7 May, 2010 02:42:05

Message: 115 of 159

> > Why do more than 2/3rds of people just submit a couple times instead of being involved more? I propose it's because they quickly find they are overwhelmed or unable to come even close to the expert players. How many of those even come back to subsequent contests?
> >
> > I think a good question to ask is why do we always see the same general set of names showing up as winners and what can be done to level the playing field for novices and experts alike so that the contest is more inclusive?
> >
>

@Alan. I am one of those 2/3rds of people who submit only a few entries early on. I then watch the results of the contest and see how the scores develop. This is partly due to time constraints and partly because the leading submissions by e.g. Jan and Oliver quickly progress beyond my understanding.
This is why I have appreciated your "commented leading entry" during the past few contests, and also the mid-contest analysis. I had hoped that Steve Eddins would write one up for this contest, as he seems ideally qualified.

> I believe it is possible to run some small challenges only for novices. For example, the 1000-node challenge is very suitable for this. It can be run on a separate machine to avoid interference with the main stream.


@ Sergey. This is a good idea. A small contest for novice players (perhaps on the weekend) where a novice can be defined as a player who has never won a prize or has only submitted a handful of entries for the contest so far. I can imagine this would generate some interest for players like myself.

Darren

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 7 May, 2010 04:24:04

Message: 116 of 159

"Darren Rowland" <darrenjremovethisrowland@hotmail.com> wrote in message

> @Alan. I am one of those 2/3rds of people who submit only a few entries early on. I then watch the results of the contest and see how the scores develop. This is partly due to time constraints and partly because the leading submissions by e.g. Jan and Oliver quickly progress beyond my understanding.
> This is why I have appreciated your "commented leading entry" during the past few contests, and also the mid-contest analysis. I had hoped that Steve Eddins would write one up for this contest, as he seems ideally qualified.
>

Darren: I'm glad to hear you've found that helpful in the past. I apologize again to the community for not being able to do it this time due to the amount of effort I expended on redoing my auto-submission code. However I will endeavor to do it again in future contests.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Amitabh Verma

Date: 7 May, 2010 05:14:05

Message: 117 of 159

 Hannes:
<snip>"On the other side of the equation, there is the "most of the user community" group that would supposedly abandon the contest if they were expected to do more than parameter tweaks. Looking at the same stats page we see a total of 5 participants with more than 200 entries. For each of these 5 participants I can point to a productive change they've applied to the code that improved (as opposed to just overfitted) it. So I don't know who these people who are unable or unwilling to do more than change 0.38 to 0.37 are. Maybe you'd like to give a few examples?"</snip>

As a newcomer, I started by studying the code and looking at how certain parameters affected the result, but could only do so to a limited extent due to time constraints. In that respect, if the MATLAB team advertised a short description of the upcoming problem statement the weekend before, newcomers would be a little better prepared when participating.

Based on Alan's scoring formula I tried to look at aspects which would have the most impact and started with getting the complexity down.
My 2 entries were aimed at getting the complexity down to 10.
Make it Faster Righter.. Simpler
http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/2498

10 Pointer
http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/2963


I personally am not happy when I look at the number of entries I submitted. A majority of them were actually due to my own fault of interpreting the Longevity contest the wrong way. I was trumping my own entries diminishing my own lead. And not to mention sleep deprivation was making me talk to my own entries as you can see from some of the titles ;)

Regarding the bots I'd like to point to the following lines from the contest page:
- Out of consideration for everyone participating in the contest, we ask that you not abuse the system.
- Tuning the entry to the contest test suite via tweak bombing or other techniques is permitted, but we ask that you not overwhelm the queue.
- Max 10 submits per player in 10 minutes: This is another request from the community to minimize the queue flooding especially towards the end of the contest. If anyone updates their queue flooding scripts for the new contest, after each 10 files, you will get an error message.
- On the “1 user = 1 login id”, this is tied to your MathWorks.com login. So one login = 1 contest user.
http://www.mathworks.com/matlabcentral/contest/contests/2/rules
http://blogs.mathworks.com/contest/2010/04/28/new-contest-website-features/#comments

Of course, these don't explicitly mention bots, but I think we still understand what they are trying to say.
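The "max 10 submits per player in 10 minutes" limit quoted above could be enforced in several ways. Here is a minimal sketch in Python of one of them, a sliding window per player. The class name, method names, and the sliding-window choice are assumptions for illustration; the contest server's actual mechanism is not described in the rules.

```python
from collections import deque


class SubmissionLimiter:
    """Allow at most max_submits per player within any window_seconds span (hypothetical helper)."""

    def __init__(self, max_submits=10, window_seconds=600):
        self.max_submits = max_submits
        self.window_seconds = window_seconds
        self._times = {}  # player -> deque of accepted submission timestamps

    def allow(self, player, now):
        """Return True and record the submission if the player is under the limit."""
        times = self._times.setdefault(player, deque())
        # Forget submissions that have fallen out of the window.
        while times and now - times[0] >= self.window_seconds:
            times.popleft()
        if len(times) >= self.max_submits:
            return False  # the server would report an error here
        times.append(now)
        return True


limiter = SubmissionLimiter()
burst = [limiter.allow("player_a", t) for t in range(11)]  # 11 submits, one per second
print(burst.count(True), burst[-1])
```

A limit like this caps the burst rate of any single login, which is why it only helps if the "1 user = 1 login id" rule is respected as well.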

Finally, regarding alternative algorithms that are not favored by the tweaking community during daylight and that suffer due to the combined optimization, as someone mentioned:
One example, I thought, is try05 by Alfonso Nieto-Castanon
http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/838
I had the following suggestion which could run parallel and still use the same scoreboard.

 At the end of Twilight the top 5-10 contestants could be given the chance to form a team, each acting as mentor to their team. If someone has time issues, he/she could delegate a willing successor. The team members' contributions can be documented in the code, which can only be submitted by the mentor of that team. The advantages would be endless. Newcomers will feel more involved and have a better understanding of the underlying mechanism. This they can then apply to their individual entry. In turn, the mentor gets the tweaking feedback from the team members and more time to focus on the actual code. The mentor is not obliged to share the final code with the team members in case there are any trust issues. For their guidance they get the feedback.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 7 May, 2010 07:02:05

Message: 118 of 159

Yi: "Thanks for sharing this great idea. By revising your 20 'Random Randomness' entries, I got the same lookup table you implemented in your winning code and the predicted result is exactly the same as the actual one shown on the website. However, one thing still puzzles me: in two solvers, you used two independent rand calls to determine the corresponding column of the lookup table. In theory, these two calls should give two different columns, shouldn't they? If so, the above calculation cannot be right, can it?"

Glad to hear someone took up the challenge to reverse engineer the code. I'm not 100% sure that I understand your question correctly, but I'll try to answer anyway. As I understand it, your problem is with the fact that there are two independent calls to rand(1). One is in solver1 and the other in solver2. But only one of these solvers will get called for each problem, so for any given problem the existence of the other lookup table will be irrelevant. The calls don't give different columns because only one of them gets called.

If you wanted to go all out, you could fix the solver to always call solver1 and use these results to optimize the lookup table there. Then fix the solver to always call solver2 and optimize the lookup table there independently. Then use a third lookup table to determine which solver gets called. I chose not to go this route in order to keep a low profile by using a small number of probes.

Cheers
H

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: srach

Date: 7 May, 2010 07:07:03

Message: 119 of 159

> Having coded an autosubmitter that is nowhere near as sophisticated as yours, I fully agree that what you did is not trivial. But it IS disruptive and it DOES change the nature of the game. And most of us agree that it changes the nature of the game for the worse.

I think that it is not the autosubmitter per se that is frustrating, but an autosubmitter that continuously scans the highscore and starts submitting when another entry takes the lead. Such kind of code does not need humans anymore once it is created.

Personally, I think the "big" prizes of the matlab contests are the darkness and twilight awards and I have the deepest respect for all those people who have won these before. And although these are the contest phases I am terribly bad at, I like them the most and I would be really happy if twilight lasted a bit longer.

However, I also like daylight, because that is the phase where I could learn the most, but often the tweaking drags my attention away from trying to understand the algorithms to playing with the profiler. Therefore I would love to have another phase after twilight, something like a "reverse twilight" where you can see the code of entries, but not the score (only the code and score of entries scored in darkness and twilight are visible, but nothing of entries submitted in "reverse twilight").

This would allow for studying the code of others, adopting and combining ideas, learning and probing, without tweaking. Especially I think that this would lead to an overall improvement of the algorithms and probably more diversity of algorithms, hence, more to learn.

Regards
Stefan

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 7 May, 2010 08:51:04

Message: 120 of 159

Alan:"My understanding of the origin of that rule is that in the past there were certain players who would in the middle of the contest repeatedly submit hundreds and hundreds of the same code just for the sake of wreaking havoc on the overall contest. This would cause hours-long backups in the middle of the contest and usually required the MathWorks staff to manually remove the entries from the queue. Note that a long-standing tradition is that the queue gets overwhelmed in the last 20 minutes of the contest, but that's ok because it doesn't prevent anyone from getting feedback from entries or anything. "

The intentions of previous queue spammers are not and cannot be known. Personally I think it much more likely that the incident you refer to was caused by a buggy autosubmitter than by malicious intent. In any case it is not relevant, since the rules do not refer to intentions. They state clearly and simply:
"...we ask that you not overwhelm the queue. "
and (although technically this is from the blog, not the rules):
"Max 10 submits per player in 10 minutes: This is another request from the community to minimize the queue flooding ESPECIALLY TOWARDS THE END OF THE CONTEST" (Caps are my own)

So, no. It is not "ok" to intentionally delay the queue nor is it "ok" to flood the queue at the end.

Alan:"What I did was essentially delay the queue by ~30 minutes (10 entries * 3 mins per entry). I see the viewpoint of some that this might be 'overwhelming the queue', but in my opinion it's not since it's no more of a delay than is often present at many other times in the contest."

Correction. You submitted 20 queue-delaying entries, causing a delay of an hour. As a percentage of the 24-hour Longevity period that's pretty significant. And the argument that it's acceptable since it is no more of a delay than is present at other times is circular and disingenuous. It also implies that delays of up to 8 hours are acceptable since that is how long we had to wait for the queue to finish at the end of daylight.
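For reference, the two delay figures under dispute can be worked through in a few lines of Python. The 3 minutes per entry is Alan's own estimate from earlier in the thread; the function names are hypothetical.

```python
def queue_delay_minutes(n_entries, minutes_per_entry=3):
    """Total time the queue is held up by n_entries slow entries."""
    return n_entries * minutes_per_entry


def share_of_period(delay_minutes, period_hours=24):
    """Delay as a fraction of a contest phase lasting period_hours."""
    return delay_minutes / (period_hours * 60)


for n_entries in (10, 20):
    delay = queue_delay_minutes(n_entries)
    print(f"{n_entries} entries -> {delay} min, {share_of_period(delay):.1%} of 24 h")
# 10 entries -> 30 min, 2.1% of 24 h
# 20 entries -> 60 min, 4.2% of 24 h
```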

Alan: "The MathWorks doesn't expend staff time and resources just to let people see who's a better problem-solver. They run this to build excitement about using MATLAB and perhaps expose users to capabilities and techniques they normally wouldn't see in their everyday usage. Interfacing to a web server is one of those items in my viewpoint."

Silly argument, given that we have agreed publicly that if you want to interface to a web server sporting SSL and cookies then Matlab is not the right tool for the job.

Alan:"I'm glad to see that for the past several contests code obfuscation hasn't reared its ugly head, nor has the 'hacking' that used to completely crash the contest machine."

Not so sure of that on two counts. Firstly the simplest form of hacker attack on a server is a denial-of-service attack. It's clear that we had several of these during the contest. Secondly, at some point someone defaced Oliver's solver giving us variable names like lllli to replace blockSize. I'm not sure what that is if not obfuscation. It obviously never bothered any of the leaders enough for them to remove it from the code since it survived all the way to the winning entry. But it does present a significant barrier to entry for novices.

> Hannes: "In previous contests I developed a mechanism for copy protection that would harm performance (typically by dropping out of the local minimum) if the code was edited or even if it was resubmitted identically. "
>
Alan:"This is another brilliant idea Hannes, but is completely going against the spirit and letter of the rules: "any entry can be viewed, edited, and resubmitted as a new entry. You are free to view and copy any entry in the queue. If your modification of an existing entry improves its score, then you are the "author" for the purpose of determining the winners of this contest. We encourage you to examine and optimize existing entries. "

Sorry, I don't understand your problem. Could a copy protected entry be viewed? Check. Edited? Check. Resubmitted? Check. Can people examine it and optimize it? Check. Any significant optimization would still improve the score. It's just the minuscule parameter tweaks that would most likely cause more harm than benefit. And the same can be said of any entry that is overfitted to a great enough extent. In any case this is a moot discussion as I never used this trick and will now most likely never get the opportunity. So I consider it acceptable and will not comment further on this topic.

Alan:"Why do more than 2/3rds of people just submit a couple times instead of being involved more? I propose it's because they quickly find they are overwhelmed or unable to come even close to the expert players. How many of those even come back to subsequent contests? "

Per Rutquist submitted a single entry. Mike Bindschadler submitted 2. Cobus Potgieter and Ander Skjal both submitted 3. These are just the names that leapt out at me. I'm sure I missed a few. I don't think these guys were overwhelmed by anything. I just think that many players, expert and novice alike, are so disillusioned with daylight that they no longer play beyond the end of twilight. I know that both Cobus Potgieter and Gerbert Myburgh fall in this category. I agree that we should try to make the contest more accessible to novices and a "best rookie" prize would be a great place to start. I don't see how massively overfitted solutions driven by bots do anything to make the contest more accessible to anyone.

Incidentally, don't get me wrong, I have nothing against daylight per se. When I started playing the contest EVERYTHING was in daylight. I absolutely loved being part of the hectic groupmind superorganism that spontaneously emerged and I learnt an incredible amount from guys like Stijn Helsen, Paulo Uribe, Christian Ylamaki etc. This got me hooked, but somehow the contest experience seems to have changed and today, you'd be hard pressed to really learn anything by investigating what the current leader changed to "improve" the code. In those days we used to moan about small time tweaks taking the lead. Oh, how I yearn for time tweaks.

Alan:"We are all very vocal about our opinions, but we represent only a small fraction of the playerbase. Most people are casual players and I suspect the contest team is looking for ways to better involve them and cater to them. They've repeatedly shown in the past that if they feel the goals of the contest are being compromised by a technique or player that they will change the rules or call them out on it. I've yet to hear a negative comment from any of them regarding my involvement. "

Please don't take anything I write as a negative comment regarding your involvement. You are probably the most involved player in the entire contest and make great contributions contest after contest. If you were no longer involved it would be a great loss to this community. I'm merely suggesting that some of your techniques are extremely disruptive and need to be toned down or abandoned. I know that you will respect a ban on any given technique imposed by TMW, but I've also seen from past contests that the contest team don't lightly make changes to the rules. Rather a rule is only changed if there is overwhelming community support.

The fact that you are quite vocal on this forum makes this seem like a balanced argument, when in fact it is not. I doubt that there are more than 2 or 3 players in this entire contest who believe that auto submitters and sock puppet accounts are beneficial, or even benign. The rest of us are pretty unanimously against them.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Jan

Date: 7 May, 2010 09:13:04

Message: 121 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
...
> As a general comment to everyone asking for various changes to put the focus more on the algorithm development side of things, I'd like to point out that that would have the effect of reducing the overall number of participants in the contest. Most of the user community who could potentially be involved doesn't have the time or skill set to dive into the literature and try to develop new algorithms. There is a need for that obviously, but if the overall purpose of the contest is to increase awareness and excitement about using MATLAB, then making the contest as accessible as possible to anybody is vital. Allowing tweaking is one way of doing that.
...

I don't think it decreases the number of participants. For me it was rather frustrating to see that a general approach cannot come below rank 2000 in the final submission list. Overfitting is not so interesting for me (although I did take a small part in the game because everybody seemed to do it - wanted actually to change the algorithm instead of tweaking the parameters which was not effective), so I am more interested in the twilight phase and it seemed like many other people were mainly active on twilight.

However, I really don't mind the tweaking attempts. Hannes actually showed how much thought can be behind such things. :)

But I would also love to see what was the best general approach.

So, inspired by Alan's description I will try to run all passed algorithms over the weekend on the testset (3000-4000 entries times ~1 minute makes 2-3 days running time) using urlopen and some hand-crafted matlab script. Then make a plot comparing old and new rank. The region where the data is diagonal will probably depict development of generalized algorithms (since people optimized for the validation set), the region where it will be scattered is the tweaking region. Or maybe there isn't such a region? With color coding for submission date, this will be nice, I am sure.
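The old-rank versus new-rank comparison can be sketched in Python with a few hypothetical helpers. The assumptions: each entry has one score per test set, lower is better, and there are no tied scores; the score lists at the bottom are made up purely to show the shape of the calculation.

```python
def ranks(scores):
    """Rank entries from 1 (best, lowest score) upward; assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(old_scores, new_scores):
    """Spearman rank correlation: near 1 means the two test sets agree on the ordering."""
    old_r, new_r = ranks(old_scores), ranks(new_scores)
    n = len(old_scores)
    d_squared = sum((a - b) ** 2 for a, b in zip(old_r, new_r))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def run_days(n_entries, minutes_per_entry=1):
    """Rough wall-clock estimate for re-running every entry once."""
    return n_entries * minutes_per_entry / (60 * 24)


contest_scores = [40.0, 35.0, 31.0, 30.5, 30.4]   # hypothetical scores on the contest suite
fresh_scores = [41.0, 36.0, 32.0, 33.5, 34.0]     # hypothetical scores on an unseen suite
print(ranks(contest_scores), ranks(fresh_scores), spearman(contest_scores, fresh_scores))
print(round(run_days(3000), 1), round(run_days(4000), 1))
```

Computing the correlation separately over early and late submissions would give a numeric counterpart to the "diagonal" and "scattered" regions of the plot: a high value suggests general improvements, a low one suggests tuning to the contest suite.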

Using other test-sets with relatively large/small number of queries per area will also be interesting, but is optional.

And for the flooding of the score list, why not restrict the number of entries to the 10 best entries per player? It's just demotivating to not come below 2000 without taking part in the tweaking game.
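The effect of a 10-best-entries-per-player cap is easy to see in a small Python sketch. The function name, the entry format, and the sample data are hypothetical; scores are assumed to be lower-is-better.

```python
from collections import defaultdict


def best_per_player(entries, limit=10):
    """Build a scoreboard keeping only each player's `limit` best entries.

    entries: iterable of (player, score) pairs, lower score is better.
    """
    kept = defaultdict(int)
    board = []
    for player, score in sorted(entries, key=lambda entry: entry[1]):
        if kept[player] < limit:
            kept[player] += 1
            board.append((player, score))
    return board


# One heavy tweaker with 2000 slightly different entries, one general approach behind them.
entries = [("tweaker", float(i)) for i in range(2000)] + [("general", 2500.0)]
board = best_per_player(entries)
print(len(board), board.index(("general", 2500.0)) + 1)  # board size and rank of the general entry
```

In this made-up case the general entry moves from rank 2001 on the full list to rank 11 on the capped one, without any entry being removed from the contest itself.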

Regards, Jan

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 7 May, 2010 09:22:02

Message: 122 of 159

Amitabh:"As a newcomer, I started by studying the code and looking at how certain parameters affected the result, but could only do so to a limited extent due to time constraints. In that respect, if the MATLAB team advertised a short description of the upcoming problem statement the weekend before, newcomers would be a little better prepared when participating."

I wouldn't worry too much about that. Despite statements to the contrary, specialist knowledge is rarely an advantage in this contest. In fact in this contest it appears to have been a hindrance ;-). Algorithm development in the context of this competition is not about delving into the literature and coming up with hypercomplex non-intuitive ways of computing an optimal answer. Rather it is mostly about common sense. After all what literature are you going to read to determine an optimal strategy for an ant to get food to its nest? You just have to sit quietly and come up with a strategy (a six year old could do this) and then put that strategy into code (most six year olds can't do this). Or in the context of daylight, you have to read the leading code (or a portion of the code), understand what it does, think of a way it could be improved and code it. And by "way it can be improved" I don't mean "replace Hakajekko[76] radially inbred complex conjugate countersearch which runs in O(n^3) with Smith[2010]'s counterrotating hyperspectral supersearch which only requires O(n^2.97)". I mean stuff like "Let's not split blocks that are close to 0 or 255, since the gain here is limited" or "Don't kill opposing ants that are actually helping me".

Amitabh:"I personally am not happy when I look at the number of entries I submitted. A majority of them were actually due to my own fault of interpreting the Longevity contest the wrong way. I was trumping my own entries diminishing my own lead. And not to mention sleep deprivation was making me talk to my own entries as you can see from some of the titles ;)"

HaHa. Been there. I remember pulling all-nighters in the trucking contest with code bouncing between me and Stijn at a rapid pace. Sometimes you later look at stuff you submitted and you think WTF??? But even when I do the responsible thing and go to bed at a reasonable time (I'm no longer a student after all) I still have problems switching my brain off and falling asleep. So I lie awake for hours thinking about the problem and possible new approaches. Sleep deprivation is an occupational hazard in this contest. One morning this week I sprayed deodorant in my own eye. Again. You can't get pills for stupid. Not even on prescription.

Welcome to the contest. Hope to see you again in 6 months. (I still have one good eye ;-) )

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 7 May, 2010 10:09:04

Message: 123 of 159

@Alan: I just thought of a great way in which your bot could be productively used in future contests. The fact that the new interface allows us to leave comments on other submissions implies that one could create a cleanup bot that would automatically post as a comment on a submission, an autocleaned version of that submission.

Autocleaning could include:
1) Tagging entries that differ significantly (as measured by matlab's codediff) from the current leader (comments are searchable so one could then easily isolate all submissions belonging to a specific family).
2) Crediting lines of code to their creators.
3) Keeping comments associated with specific code segments in place (if the associated segment has not changed). Obviously someone will have to manually define which comments are associated with which segments. Although if you define a protocol that your bot understands along the lines of:
%COMMENT identifier
%This segment wastes time
%STARTCODE identifier
for x=1:1e6
y=y+1;
end
%ENDCODE identifier
Then all willing participants could comment segments, and any time that that segment appears the bot would automatically comment it in the annotated code resulting in a segment that displays as:

%thecyclist : This segment wastes time
for x=1:1e6
y=y+1;
end

This would turn the commenting of the leading entry into more of a collaborative effort.

4) Deobfuscation. This will require some careful thought, but is also possible. The complexity required will depend on the level of determination of the obfuscator. Light obfuscation (like we had in this contest) can easily be removed (perhaps with a user-supported text translation protocol like the one for the comments), while heavy Dr. Seuss style obfuscation would require a bit more effort. One way would be to diff the code after all alphanumerical characters were removed. This uses characters like + [ ] ; etc. as tokens. If you could line up the tokens appropriately, you could deduce which string replacements were made to obfuscate the code and undo them automatically.
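
As a rough illustration of the token-alignment idea in item 4, here is a minimal sketch. It assumes the obfuscation is a pure renaming of identifiers that leaves punctuation and statement order untouched; both code snippets are made up for the example, and a real tool would need a proper diff to cope with inserted or deleted lines.

```matlab
% Sketch only: infer identifier renamings between a clean and an
% obfuscated version of the same code. Both snippets are made-up examples.
clean = 'total = total + step; count = count + 1;';
obfus = 'qx = qx + zz; w = w + 1;';

% Step 1: drop letters, digits, underscores and whitespace so that only
% the punctuation "skeleton" remains, then compare the skeletons.
skeleton  = @(s) regexprep(s, '[A-Za-z0-9_\s]', '');
sameShape = strcmp(skeleton(clean), skeleton(obfus));

% Step 2: if the skeletons match, pair identifiers in order of appearance.
renames = containers.Map();
if sameShape
    idsClean = regexp(clean, '[A-Za-z_]\w*', 'match');
    idsObfus = regexp(obfus, '[A-Za-z_]\w*', 'match');
    if numel(idsClean) == numel(idsObfus)
        for k = 1:numel(idsObfus)
            renames(idsObfus{k}) = idsClean{k};
        end
    end
end

% Step 3: undo the renaming, matching whole words only.
restored = obfus;
keysList = keys(renames);
for k = 1:numel(keysList)
    restored = regexprep(restored, ['\<' keysList{k} '\>'], renames(keysList{k}));
end
disp(restored)
```

Replacing names one at a time can go wrong if an obfuscated name happens to equal a different original name, so a more careful version would substitute all names in a single pass.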

I am sure there is a lot more that could be done.
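
To make the comment protocol from item 3 concrete, a bot module could parse one annotated block roughly as follows. This is a sketch under assumptions: the marker names are taken from the post, but the identifier `seg1`, the `author` variable and the rule that all three identifiers must agree are illustrative choices, not an agreed format.

```matlab
% Sketch only: turn one annotated block into the display form from the post.
author = 'thecyclist';   % assumed to come from the submission metadata
src = {'%COMMENT seg1'
       '%This segment wastes time'
       '%STARTCODE seg1'
       'for x=1:1e6'
       'y=y+1;'
       'end'
       '%ENDCODE seg1'};

% Locate the three marker lines.
iComment = find(strncmp(src, '%COMMENT ', 9), 1);
iStart   = find(strncmp(src, '%STARTCODE ', 11), 1);
iEnd     = find(strncmp(src, '%ENDCODE ', 9), 1);

% The identifier after each marker must agree, otherwise the block is ignored.
id = strtrim(src{iComment}(10:end));
ok = strcmp(id, strtrim(src{iStart}(12:end))) && ...
     strcmp(id, strtrim(src{iEnd}(10:end)));

annotated = {};
if ok
    notes = src(iComment+1:iStart-1);   % comment lines, each starting with '%'
    code  = src(iStart+1:iEnd-1);       % the code segment itself
    % Prefix every comment line with the name of whoever wrote it.
    notes = cellfun(@(s) ['%' author ' : ' s(2:end)], notes, ...
                    'UniformOutput', false);
    annotated = [notes; code];
end
fprintf('%s\n', annotated{:});
```

Matching the stored code segment against a later entry (so the comment follows the code around) is the harder part and is not shown here.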

I also realise that what I am describing here is a significant task. But luckily the tasks described above are largely independent and we could easily split this up over a number of volunteers. If your code could handle the scraping of entries from the queue (this you can already do), provide the data to our modules in a known format and then do the placing of the comment when we return results (this you would have to add, but it should be fairly simple?), that would be a fantastic addition to the contest, don't you think?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Nathan

Date: 7 May, 2010 11:56:04

Message: 124 of 159

Hannes,

great commentary on the contest in your last few posts. It's the "common sense" bits of the contest that I love. You have articulated much better what I was trying to describe as abstract problem-solving... the flashes of insight and logic, sometimes obvious in hindsight, that blow the problem open, only to be outmoded the next day by something equally elegant. Judging by the discussions, this is the stuff that draws most applause from across the community, and what most players aspire to.

One nice feature of the old contests (all-daylight) was that for the first few hours, code was just tens of lines long. It was easy for anybody to observe and take part in the genesis of prize-winning code. How about a short opening Daylight phase, somewhere between 3 and 24 hours long, with a prize at the end of it? Newbies could work on a blank canvas, and if this phase is short enough, ambitious players will be motivated to hide their best ideas until twilight.

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hs0os0$22r$1@fred.mathworks.com>...
> @Alan: I just thought of a great way in which your bot could be productively used in future contests. The fact that the new interface allows us to leave comments on other submissions implies that one could create a cleanup bot that would automatically post as a comment on a submission, an autocleaned version of that submission.

YES! I'd volunteer for this. Alan...?

> You can't get pills for stupid.

how true

Nathan

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 7 May, 2010 13:33:04

Message: 125 of 159

Nathan: "One nice feature of the old contests (all-daylight) was that for the first few hours, code was just tens of lines long. It was easy for anybody to observe and take part in the genesis of prize-winning code. How about a short opening Daylight phase, somewhere between 3 and 24 hours long, with a prize at the end of it? Newbies could work on a blank canvas, and if this phase is short enough, ambitious players will be motivated to hide their best ideas until twilight."

Love this idea. It could even be a full day. If darkness is still 1 day and twilight is still 1 day then this pushes twilight into the weekend, which is also something many people have been requesting. If the code at the end of 1st daylight is good enough it will probably be used as a base for a winning twilight submission, so those trying to get (back) into the game at the start of second daylight will at least not be dealing with something totally unfamiliar.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 7 May, 2010 18:09:04

Message: 126 of 159

Stefan: "Personally, I think the "big" prizes of the matlab contests are the darkness and twilight awards and I have the deepest respect for all those people who have won these before .... Therefore I would love to have another phase after twilight, something like a "reverse twilight" where you can see the code of entries, but not the score"

I wholeheartedly agree! This is an interesting idea I haven't seen proposed before.

Hannes: "Please don't take anything I write as a negative comment regarding your involvement. You are probably the most involved player in the entire contest and make great contributions contest after contest. If you were no longer involved it would be a great loss to this community. I'm merely suggesting that some of your techniques are extremely disruptive and need to be toned down or abandoned."

I really appreciate the fact that we can have a civil discourse yet very differing opinions on these issues (it happens far too rarely in modern society) and thanks for the compliments. I am very vocal and think everyone knows where I stand on things. I'm just trying to advocate for the position that if change is going to happen, it should be on the 'toned down' end of the spectrum, not the 'abandoned' end.

Jan: "So, inspired by Alans description I will try to run all passed algorithms over the weekend on the testset (3000-4000 entries times ~1 minute makes 2-3 days running time) using urlopen and some hand crafted matlab script."

This will be an interesting experiment. To help you, I've already downloaded all of the entries and data that goes with them and uploaded a compressed version along with a small 'helper' function to the file exchange. I'm not sure when it will get approved (Helen and team perhaps you can expedite it?), but you can check my profile to see if it has been. Alternatively you can directly email me and I'll send it to you.

Hannes: "But even when I do the responsible thing and go to bed at a reasonable time (I'm no longer a student after all) I still have problems switching my brain off and falling asleep. So I lie awake for hours thinking about the problem and possible new approaches. Sleep deprivation is an occupational hazard in this contest. "

I'm glad I'm not the only one obsessed enough to have this regularly happen;) I know my wife is always glad when the contest end rolls around!

Hannes: "I just thought of a great way in which your bot could be productively used in future contests. The fact that the new interface allows us to leave comments on other submissions implies that one could create a cleanup bot that would automatically post as a comment on a submission, an autocleaned version of that submission. "

This is an EXCELLENT idea and I'm all in for it. I particularly like the fact that it can involve the community between now and the next contest as a way to improve the overall experience. A couple things:

1. The comment field is currently limited to 1000 characters. Is there any way the Contest team can increase that? If not, we won't be able to post the full code but could post other valuable info. However the system also allows for multiple comments, so I could break longer pieces into smaller chunks and post them.

2. Since starting small is going to be best, it'd help if people could help id simple things for me to do and example entries for me to look at. Heck, even some pseudocode to accomplish the task would be nice.

3. Which would be better: posting everything under one comment (and perhaps posting things just as comments embedded in the code), or breaking items up into multiple comments (i.e. one comment is cleaned code, one comment is ids of major code blocks and originators, etc etc)?
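
For point 1, splitting a long piece of text across several comments could look something like the sketch below. The 1000-character limit is the one quoted above; the sample entry text is made up, and the code assumes no single line is longer than the limit.

```matlab
% Sketch only: split text into comments of at most maxLen characters,
% breaking at line boundaries so no line of code is cut in half.
maxLen = 1000;                                     % limit quoted in the post
lines  = repmat({'y = y + 1;  % some line of an entry'}, 1, 60);  % made-up entry

chunks  = {};
current = '';
for k = 1:numel(lines)
    piece = [lines{k} char(10)];                   % line plus newline
    % Start a new chunk if adding this line would exceed the limit.
    if ~isempty(current) && numel(current) + numel(piece) > maxLen
        chunks{end+1} = current;
        current = '';
    end
    current = [current piece];
end
if ~isempty(current)
    chunks{end+1} = current;                       % flush the last chunk
end
fprintf('%d comments needed\n', numel(chunks));
```

Each chunk could then be posted as its own comment, ideally with a "part k of n" prefix (which would need to be counted against the limit as well).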

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 7 May, 2010 18:27:39

Message: 127 of 159

Alan, glad to see you're testing the placing of comments on the site. I take this to mean you are keen on the proposed idea. Are you going to go it alone or are you keen on volunteers to write modules for the system?

P.S. Thanks for introducing me to cURL. Am playing with it now (which is how I noticed your test comments). Got going a lot quicker than I did with either WaTiN or System.Web. I'm guessing that's because it does not require me to learn a whole new language. ;-)

Cheers
H

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 7 May, 2010 18:35:20

Message: 128 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
news:hs1l00$mtn$1@fred.mathworks.com...
> This will be an interesting experiment. To help you, I've already
> downloaded all of the entries and data that goes with them and uploaded a
> compressed version along with a small 'helper' function to the file
> exchange. I'm not sure when it will get approved (Helen and team perhaps
> you can expedite it?), but you can check my profile to see if it has
> been. Alternatively you can directly email me and I'll send it to you.
>

Shari beat me to this one. Your file has been published.


> This is an EXCELLENT idea and I'm all in for it. I particularly like the
> fact that it can involve the community between now and the next contest as
> a way to improve the overall experience. A couple things:
>
> 1. The comment field is currently limited to 1000 characters. Is there
> any way the Contest team can increase that? If not, we won't be able to
> post the full code but could post other valuable info. However the system
> also allows for multiple comments, so I could break longer pieces into
> smaller chunks and post them.
>

I'll share this with the contest development team.

Thanks!
Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 7 May, 2010 18:47:05

Message: 129 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hs1m2r$4dl$1@fred.mathworks.com>...
> Alan, glad to see you're testing the placing of comments on the site. I take this to mean you are keen on the proposed idea. Are you going to go it alone or are you keen on volunteers to write modules for the system?
>
> P.S. Thanks for introducing me to cURL. Am playing with it now (which is how I noticed your test comments). Got going a lot quicker than I did with either WaTiN or System.Web. I'm guessing that's because it does not require me to learn a whole new language. ;-)
>
> Cheers
> H

Hannes: Yes I'm very keen and yes I'd like as much involvement from others as possible, both with regards to modules and suggestions.

Regarding cURL.. here are some tips:
- Make sure you get the SSL enabled version and have OpenSSL installed.
- The main command-line parameters I use are: -s -L -b cookies.txt -c cookies.txt -k
- Everything on the contest page is a 'post' style form, not a 'get' style form.
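
For anyone driving cURL from MATLAB, the tips above might translate into something like the following sketch. The URL and the form field name are placeholders (the real contest endpoints are not given in this thread), and the actual call is left commented out so that nothing is sent by accident.

```matlab
% Sketch only: assemble a cURL call using the flags listed above.
% The URL and form field are placeholders, not the real contest endpoints.
url       = 'https://example.com/contest/comments';   % hypothetical
fieldName = 'body';                                   % hypothetical
fieldText = 'autocleaned version follows';

% -s  silent, -L  follow redirects, -b / -c  read and write the cookie file,
% -k  skip certificate verification
flags = '-s -L -b cookies.txt -c cookies.txt -k';

% --data-urlencode makes cURL send the field as a POST form value.
form = sprintf('--data-urlencode "%s=%s"', fieldName, fieldText);
cmd  = sprintf('curl %s %s "%s"', flags, form, url);
disp(cmd)

% [status, out] = system(cmd);   % uncomment to actually send the request
```

Reusing the same cookies.txt for both -b and -c is what keeps the login session alive between calls.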

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 7 May, 2010 18:49:04

Message: 130 of 159

"Helen Chen" <helenc@mathworks.com> wrote in message <hs1mhc$4e2$1@fred.mathworks.com>...

> Shari beat me to this one. Your file has been published.

Thanks Helen. Here's the direct link to it for everyone:
http://www.mathworks.com/matlabcentral/fileexchange/27530

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 7 May, 2010 20:14:21

Message: 131 of 159

Jan: "So, inspired by Alans description I will try to run all passed algorithms over the weekend on the testset (3000-4000 entries times ~1 minute makes 2-3 days running time) using urlopen and some hand crafted matlab script."

Alan: "This will be an interesting experiment. To help you, I've already downloaded all of the entries and data that goes with them and uploaded a compressed version along with a small 'helper' function to the file exchange. I'm not sure when it will get approved (Helen and team perhaps you can expedite it?), but you can check my profile to see if it has been. Alternatively you can directly email me and I'll send it to you."

Hannes: Also grabbing it now, thanks. I see there are already six downloads in the few minutes since it was released, so there's a fair bit of interest in this.

Alan:"1. The comment field is currently limited to 1000 characters. Is there any way the Contest team can increase that? If not, we won't be able to post the full code but could post other valuable info. However the system also allows for multiple comments, so I could break longer pieces into smaller chunks and post them."

Ooh bummer. I wouldn't be keen on breaking the annotated code into pieces, since that would make copy-pasting tricky. But maybe this is a blessing in disguise. Maybe the best solution is for the bot to just post a link to a site where all the annotated code appears. This opens up three possibilities.

Firstly, the annotated code could be dynamic. I've wondered before about how to handle the case where new comments are received after an annotated entry has already been posted. You could just place another comment but we don't want to completely flood the comments either. With a link to an externally hosted page that problem goes away since the link remains valid but the page changes.

Secondly, we now have the opportunity to have graphics associated with each entry. One example would be some of the scatterplots from the stats page, but with this specific entry shown in another colour, giving an immediate impression of where it sits on the result/runtime tradeoff for example. I believe the code used to generate the stats page is on the file exchange, so we won't be starting from scratch. Another example is contest specific diagrams. For example in this contest, it might have been useful to see the solver's reconstruction of one of the smaller images from the sample testsuite. Personally I also modified my runcontest function to show me the blocks used to get to this reconstruction and a diff image showing where the largest residuals are. While the former is a little specific to the block based approach, one could easily roll such code from competitors into a module during the contest and deploy it. This also makes the contest friendlier for spectators.

Lastly, it could allow users to choose which annotations they want/don't want and download a "customized" code version. As an example I mentioned that the annotated code could credit each line to its creator. While I'd like to have the option to see this, I don't want the code I paste into my editor to be littered with comments in this way.

Lastly (I'm serious this time) the site could have a facility for users to mark a specific submission as interesting, a hacking attempt, an obfuscated entry or anything else one might want to be alerted about.

Personally I'm interested in tracing the genealogy of functions in a little more detail than is currently possible and representing the submission cloud graphically with similar entries clustering together, hopefully allowing us to identify families of solvers. I'll be using the data you posted, so if you keep your internal data format the same this should be fairly simple to roll into a module.

I'd like to encourage others who are looking at this to declare what they intend working on so we don't duplicate too much effort. Also, when you've got something working, please announce it here so we can all admire your handiwork.

H

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Robert Macrae

Date: 7 May, 2010 23:20:39

Message: 132 of 159

Alan

> I know people have varying views on my approaches to the contest.. but as I've said before I like to be involved in a certain way since I can't compete with the 'algorithm experts'. But I do always abide by the explicit rules the contest team puts in place.
...
> For the last day of the competition, I had planned on developing code to auto create new accounts and automatically switch to the new accounts when the 10 minute limit was reached, in order to be able to 'tweak bomb' during the final rush
...
> I ended up submitting about 185 entries in that 17 minute window

8-)

I've enjoyed watching (and reading about) your optimisation against the constraints, but I do think those constraints should be designed so the resultant delays for other competitors are not excessive. I agree with your comment on the importance of providing a challenge interesting to a wide range of Matlab users, and long delays don't help.

I would suggest a more explicit "Not sockpuppets" rule; also 10 submissions in 10 minutes / 20 in 60 minutes / 30 in 180 minutes, but even then only 2 people would be able to tie up 100% of the BW so I think we also need to modify the queue concept. I'd suggest that time priority is used, but that after executing each code all of that user's remaining queued programs are dropped to the back of the queue. The result would be to moderate the delays for occasional users even when there are several autosubmitters in play.
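
The queue rule suggested above can be sketched in a few lines. This is only an illustration of the idea, with made-up submitters A, B and C and no new entries arriving while the queue drains; it is not how the contest machinery actually works.

```matlab
% Sketch only: time-priority queue where, after one of a user's entries
% runs, all of that user's remaining entries drop to the back.
queue = {'A','A','A','B','A','C'};   % made-up submitters, in arrival order
order = {};                          % order in which entries get executed
while ~isempty(queue)
    user = queue{1};
    order{end+1} = user;             % run the entry at the front
    queue(1) = [];
    mine  = strcmp(queue, user);     % this user's other queued entries
    queue = [queue(~mine), queue(mine)];   % move them behind everyone else
end
disp(strjoin(order, ' '))
```

Here the occasional submitters B and C run second and third instead of waiting behind every one of A's entries.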

Maybe also cut maximum runtime to 1 minute and reduce the test set size; there is nothing magic about 3 minutes. On which...


Sergey

> I suspect that currently tweaking results in overfitting of the most valuable examples at the expense of less valuable.

I also suspect this. The scoring is such that high-contrast images with large numbers of pixels dominate, making small and smooth images largely irrelevant (and I suspect the same applies to previous competitions). I'd suggest that the competition would be just as interesting with a smaller set of problems but more equal weighting on their importance to the final score.


Hannes

> In previous contests I developed a mechanism for copy protection that would harm performance (typically by dropping out of the local minimum) if the code was edited or even if it was resubmitted identically.
  
I stand in awe and amazement 8-)

And echo Alan's comment that it would be against the rules >-(


Yi Cao

> We have to consider a workable scheme to make future contest more general. Contesters against contesters is a good idea.

I agree.


Nathan

> I doubt if there is any practicable way the organisers can forcibly eliminate the various methods that many competitors dislike, even if they were willing to. However there is a community ethos here that's stronger than our individual competitive streaks. There's lots of evidence for this, notably the respect for the ban on wholesale probing, and the fact that code scrambling has not had much impact.

Hear, hear! I don't think the problem is in enforcement, but in making clear rules... as illustrated by


Alan

> What I did was essentially delay the queue by ~30 minutes (10 entries * 3 mins per entry). I see the viewpoint of some that this might be 'overwhelming the queue', but in my opinion it's not since it's no more of a delay

So a good area for a clearer rule?


Alan

> I think a more telling stat is that out of 151 accounts that submitted entries, 111 of them submitted less than 10 entries (I removed my alternate accounts from these totals).
...
> Why do more than 2/3rds of people just submit a couple times instead of being involved more?

I am included in that <10 entries category, but with many 10s of hours and around 100 hand-coded solutions I do feel quite involved! You cannot measure involvement by number of submissions.

> I think a good question to ask is why do we always see the same general set of names showing up as winners and what can be done to level the playing field for novices and experts alike so that the contest is more inclusive?

That is an easy one. To win you have to be extremely smart, and extremely motivated to do just fractionally better than the other 10s of extremely smart people who are looking at exactly the same problem. I remember The Cyclist's account from a past problem, hunting through the code for a function call he could tweak to shave a vital few seconds. This time we have Hannes with a lovely scheme for both orthogonalising the tweaking *and* masking his results, and of course both his and Alan's web wizardry. It takes a very particular set of skills and outlook to win. Most Matlab users (certainly I) don't have them.

Fortunately to make the contest more inclusive, we don't need new winners -- just more interested participants. Set an interesting challenge and make it reasonably easy to understand how it is being solved and many people will continue to look in and have a go.


Amitabh

> At the end of Twilight the top 5-10 contestant be given the chance to form a team with each one as a mentor to their team. If someone has time issues, he/she could delegate a willing successor.

What a great idea. I do aim to comment my code, if only because I am easily confused, and am delighted to discuss it if anyone is interested -- I'll try to remember to add a header to that effect if I write anything next session.


Robert Macrae

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 8 May, 2010 05:17:05

Message: 133 of 159

One thing I didn't do is capture all the existing submission comments in the entry dataset I uploaded. If anyone thinks those would be of value I'd be happy to rerun the collection.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Robert Macrae

Date: 8 May, 2010 08:02:05

Message: 134 of 159

Yi Cao

> We have to consider a workable scheme to make future contest more general. Contesters against contesters is a good idea.

A highly successful model is the Corewars "King of the hill", a continuous contest that has been running since the 1990s:

  http://www.koth.org/lcgi-bin/current.pl?hill94nop

Whenever the mood takes a competitor they submit a warrior to KOTH and fight the current hill of 20 warriors; if they score more highly than the weakest then they replace it on the hill. The result is continuous evolution rather than the day-by-day approach taken in the ant competition. A lot of kudos attaches to age, which represents the ability to survive later challenges.
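
A very simplified sketch of that hill update is given below. It assumes each warrior has a single fixed score, whereas on a real hill the scores come from battles between the challenger and every warrior on the hill; the names and numbers are made up and the hill is shortened to 5 for readability.

```matlab
% Sketch only: a challenger joins the hill if it outscores the weakest member.
% Scores are fixed numbers here; on a real hill they come from the battles.
hillNames  = {'w1','w2','w3','w4','w5'};   % made-up warriors (real hill: 20)
hillScores = [130 120 110 105 90];
hillAge    = [12 3 7 1 20];                % challenges survived so far

challenger = struct('name', 'newbie', 'score', 100);

[weakest, idx] = min(hillScores);
hillAge = hillAge + 1;                     % everyone on the hill faced a challenge
if challenger.score > weakest
    hillNames{idx}  = challenger.name;     % the weakest warrior is pushed off
    hillScores(idx) = challenger.score;
    hillAge(idx)    = 0;                   % the newcomer starts with age zero
end
disp(hillNames)
```

The age vector is what would carry the "kudos" mentioned above: a warrior's age only grows while it keeps surviving new challengers.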

We would need rules limiting each entrant to one warrior on the hill at a time, and rules on collaboration between entrants because there are incentives for a cabal to write warriors that co-operate when they meet known friends.

The hill approach encourages diversity because if there are a large number of similar warriors then specialists that exploit them will appear; the reason Corewars is still around is the richness of the paper-scissors-stone structure the hill promotes even when the underlying problem is relatively simple.

Robert Macrae

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 8 May, 2010 10:53:05

Message: 135 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hs2s4h$k5k$1@fred.mathworks.com>...
> One thing I didn't do is capture all the existing submission comments in the entry dataset I uploaded. If anyone thinks those would be of value I'd be happy to rerun the collection.

I doubt that this is important, but entries 1 through 49 appear to be blank in the file you distributed. Not critical, just thought you might want to know.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 8 May, 2010 11:03:05

Message: 136 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hs3fqh$7t4$1@fred.mathworks.com>...
> "Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hs2s4h$k5k$1@fred.mathworks.com>...
> > One thing I didn't do is capture all the existing submission comments in the entry dataset I uploaded. If anyone thinks those would be of value I'd be happy to rerun the collection.
>
> I doubt that this is important, but entries 1 through 49 appear to be blank in the file you distributed. Not critical, just thought you might want to know.

OK, I can now see why. You do actually have all the entries, but Jan's Ilgaz entry (the first submitted in this contest) is numbered 50. Interestingly if you create the links for earlier entries manually eg.
http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/1
Then you get submissions from before the contest began. Looks like this was the internal Mathworks contest. Care to tell us who won that and what the best score was? :-)

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Nathan

Date: 8 May, 2010 12:48:05

Message: 137 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hs1sat$pk8$1@fred.mathworks.com>...

> Personally I'm interested in tracing the genealogy of functions in a little more detail than is currently possible and representing the submission cloud graphically with similar entries clustering together, hopefully allowing us to identify families of solvers.

I did something like this, crudely, for the visualisation contest.

http://www.mathworks.com/matlabcentral/fileexchange/23557-meet-the-family

Nathan

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Jan

Date: 8 May, 2010 13:53:21

Message: 138 of 159

Okay, so all 4262 submitted and passed algorithms are currently being tested against the test set. I had written my own code because my number cruncher was already working at the time of Alan's posting. However, within one day 1334 algorithms (from the beginning and end of the submission list) have finished and the best so far is DeepCat9 by Sergey Y. (http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/4820) with result 19359469, which actually also scored a final 32nd rank in the contest. However it's not yet over.. :)

I had a runtime problem (can't restrict execution time within Matlab) with function lsqr() in submission pushy2 (http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/777). It runs forever on my machine.

More news can probably be expected on Monday...

Cheers
Jan Keller
(actually not Jan Langer) :)

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 8 May, 2010 16:05:21

Message: 139 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hs3fqh$7t4$1@fred.mathworks.com>...
> "Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
> I doubt that this is important, but entries 1 through 49 appear to be blank in the file you distributed. Not critical, just thought you might want to know.

Sorry I forgot to mention that. I wanted the record numbers to align with the entry id numbers, so I started at 50.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 8 May, 2010 17:20:23

Message: 140 of 159

"Jan" <thisis@notanemail.com> wrote in message <hs3qch$j13$1@fred.mathworks.com>...
> Okay, so all 4262 submitted and passed algorithms are currently being tested against the test set. I had written my own code because my number cruncher was already working at the time of Alan's posting. However, within one day 1334 algorithms (from the beginning and end of the submission list) have finished and the best so far is DeepCat9 by Sergey Y. (http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/4820) with result 19359469, which actually also scored a final 32nd rank in the contest. However it's not yet over.. :)
>
> Cheers
> Jan Keller
> (actually not Jan Langer) :)


In the last 10 DeepCat9 submissions I modified the Contrast calculation and was hunting for the best parameter. Unfortunately, I simultaneously introduced another parameter, which moved the solver significantly away from the optimal point. (Lesson to myself: read code more carefully ;) ).
By my estimation, with a correct implementation, the modified Contrast calculation should reduce the result of the base code by 25K.

Sergey

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 8 May, 2010 17:29:04

Message: 141 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message
> OK, I can now see why. You do actually have all the entries, but Jan's Ilgaz entry (the first submitted in this contest) is numbered 50. Interestingly if you create the links for earlier entries manually eg.
> http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/1
> Then you get submissions from before the contest began. Looks like this was the internal Mathworks contest. Care to tell us who won that and what the best score was? :-)

That is interesting. I've grabbed all 49 entries from the internal contest and there are some interesting tidbits:

-There is no 1st place final entry according to the ranks. There is a 2nd place entry: #48, with a score of 166203. The 3rd entry of the actual contest (id 53) beat that score by a lot (119888 vs 166203), but ran in less than a second versus the 97 seconds it took for #48 to run. Thus I assume they were using a different test suite. I ran the entry against the sample suite and the results didn't match up either.

-29 of the entries passed. Most of the failed entries were deliberate attempts to test the error checking mechanisms of the contest parser.

-There were 12 unique authors, including our wonderful contest organizers Helen and Ned. The 'winner' was Teja Muppirala.

-It looks like they ran the contest on 4/22 and 4/23 (the Thurs and Fri before the real one started). However, there were entries submitted as early as 4/14 (that aren't showing scored until 4/22), so I suspect there was some sort of queue testing / rerunning of entries going on.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 8 May, 2010 20:19:07

Message: 142 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hs4710$4gk$1@fred.mathworks.com>...
> -There is no 1st place final entry according to the ranks. There is a 2nd place entry: #48, with a score of 166203. The 3rd entry of the actual contest (id 53) beat that score by a lot (119888 vs 166203), but ran in less than a second versus the 97 seconds it took for #48 to run. Thus I assume they were using a different test suite. I ran the entry against the sample suite and the results didn't match up either.
>

I tried to run #48 on the sample suite, and it crashes on image #7 inside griddata.
(I had the same problem with my darkness version, which forced me to abandon the griddata function.)
I am using version 2007b. Can anybody comment on this?

Error message is long, starting with:
??? Error using ==> qhullmx
qhull precision error: initial facet 1 is coplanar with the interior
point


Sergey

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Sergey Y.

Date: 8 May, 2010 20:50:24

Message: 143 of 159

"Sergey Y." <ivssnn@yahoo.com> wrote in message <hs4gvr$4tv$1@fred.mathworks.com>...
> "Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hs4710$4gk$1@fred.mathworks.com>...
> > -There is no 1st place final entry according to the ranks. There is a 2nd place entry: #48, with a score of 166203. The 3rd entry of the actual contest (id 53) beat that score by a lot (119888 vs 166203), but ran in less than a second versus the 97 seconds it took for #48 to run. Thus I assume they were using a different test suite. I ran the entry against the sample suite and the results didn't match up either.
> >
>
> I tried to run #48 on the sample suite, and it crashes on image #7 inside griddata.
> (I had the same problem with my darkness version, which forced me to abandon the griddata function.)
> I am using version 2007b. Can anybody comment on this?
>
> Error message is long, starting with:
> ??? Error using ==> qhullmx
> qhull precision error: initial facet 1 is coplanar with the interior
> point
>
>
> Sergey

Option {'QJ'} helped.
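
For anyone hitting the same Qhull error, here is a minimal sketch of where such an option goes. It assumes made-up sample points. In releases of that era (such as R2007b) griddata accepted a cell array of Qhull options as its last argument; later releases dropped that argument, so the runnable part below uses delaunayn, which still takes an options cell array.

```matlab
% Hypothetical sample points: a regular grid, where many points are
% cocircular, a typical trigger for Qhull precision problems.
[gx, gy] = meshgrid(0:3, 0:3);
pts = [gx(:) gy(:)];

% 'QJ' asks Qhull to joggle (slightly perturb) the input instead of
% merging facets. Passing options replaces delaunayn's default options.
tri = delaunayn(pts, {'QJ'});

% In R2007b-era releases the same cell array could be handed to griddata:
%   zi = griddata(x, y, z, xi, yi, 'linear', {'QJ'});
```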

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 10 May, 2010 04:05:21

Message: 144 of 159

"Jan" <thisis@notanemail.com> wrote in message <hs3qch$j13$1@fred.mathworks.com>...
>
> I had a runtime problem (can't restrict execution time within Matlab) with function lsqr() in submission pushy2 (http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/777). It runs forever on my machine.

This brings up an interesting question I've always wondered about. How does the contest machinery do this? Would anyone from the MATLAB team be willing to share the details?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 10 May, 2010 04:23:04

Message: 145 of 159

I spent some time this weekend developing several robust 'helper' functions to help with the commenting tasks we are discussing. As a test, I wrote a small 'module' that inserted a comment into an entry describing what phase of the contest the entry was submitted in. I'm running it through the entire test suite right now as a test (the comment submission process is surprisingly slow), and I thought I'd present the details here so if anyone else wants to create any 'modules' for me to run they can. The function code is as follows:

-------------------------------------------------------------------------------------
function comment=phasecomment(id)
% Given an input entry id, this function determines what contest phase it was in
% and provides an appropriate comment

% Index cell array of phase start and end times
phases={[2010 04 28 16 00 00] [2010 04 29 16 00 00] 'Darkness'; ...
    [2010 04 29 16 00 01] [2010 04 30 16 00 00] 'Twilight'; ...
    [2010 04 30 16 00 01] [2010 05 05 16 00 00] 'Daylight'; ...
    [2010 04 30 16 00 01] [2010 04 30 21 00 00] 'Friday Early Bird'; ...
    [2010 05 01 00 00 00] [2010 05 01 23 59 59] 'Saturday Leap'; ...
    [2010 05 02 00 00 00] [2010 05 02 23 59 59] 'Sunday Push'; ...
    [2010 05 03 18 00 01] [2010 05 03 22 00 00] 'Monday K-Note'; ...
    [2010 05 04 00 00 00] [2010 05 04 23 59 59] 'Tuesday Longevity'};

% Returns all the data for an entry
entry=grabentry(id);
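if strcmp(entry.status,'Passed')
    entrytime=datenum(entry.submitted);
    % Keep the names (column 3) of every phase whose window contains entrytime
    entryphases=phases(cellfun(@(x) entrytime>=datenum(x),phases(:,1)) ...
        & cellfun(@(x) entrytime<=datenum(x),phases(:,2)),3);
    if isempty(entryphases)
        % Submitted outside every listed window, so there is nothing to say
        comment='';
        return
    end
    comment=['This entry was submitted during the ' entryphases{1} ' phase'];
    if numel(entryphases) > 1
        % A second match means the entry also fell inside a prize window
        comment=[comment ' and was eligible for the ' entryphases{2} ' prize'];
    end
else
    % Failed entries get no comment
    comment='';
end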
-------------------------------------------------------------------------------------

The key item is the 'grabentry(id)' function, which does exactly what it sounds like. It returns a structure with all the data for a specific entry as follows:
All fields are numeric unless otherwise specified
    id - the same as the entry id
    author - the entry author (string)
    name - the title of the entry (string)
    status - 'Passed' or 'Failed' (string)
    results - the numerical results
    cyc - the Cyclomatic Complexity
    node - the node length
    cpu - the CPU time
    score - the score
    submitted - the submitted time as a date vector
    scored - the scored time as a date vector
    currank - the current rank
    highrank - the highest rank
    basedon - the id of the entry it is based on
    basisfor - a vector of ids of entries based on it
    error - the error message if status == Failed (string)
    code - the code (string)
    comments - all comments as a 3 col cell array {author, date, comment}

Note that the function outputs a single string, which is the comment to insert. The comment field is currently limited to 1000 characters; to insert newlines, place them into your comment string as follows:

comment=['1st line of text goes here' sprintf('\n') '2nd line of text goes here. etc etc.'];

I have another function called submitcomment that does just that. It has some built in error checking in that it will first check to make sure the text of that comment isn't present in the entry and if it is, won't actually submit the comment since it would just be a duplicate.
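
The duplicate check described above could be sketched roughly as follows. This is not the actual submitcomment code (which was not posted); isduplicatecomment is a hypothetical name, and the only assumption is the {author, date, comment} layout of the comments field listed earlier.

```matlab
function tf = isduplicatecomment(entry, comment)
% Hypothetical helper: true if the text of COMMENT already appears in
% any existing comment of ENTRY (a struct shaped like grabentry's output).
tf = false;
if isempty(comment) || isempty(entry.comments)
    return
end
existing = entry.comments(:,3);            % third column holds the comment text
hits = cellfun(@(c) ~isempty(strfind(c, comment)), existing);
tf = any(hits);
end
```

A caller would then skip the submission whenever isduplicatecomment(entry, comment) returns true.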

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 10 May, 2010 04:32:04

Message: 146 of 159

I thought I'd start a list of 'comment modules' to create. If anyone has any suggestions or would like to tackle one of these please let me know.

1. Comment all the entries that show up on the stats page with their corresponding rank. For example, if an entry finished 10th in the K node challenge, the comment would say so. I do know there were some issues with the leader stats, so I might actually write this to calculate the stats directly from the entries, instead of copying what's on the stats page.

2. Indicate the results of running the entry through the test suite, and the corresponding rank (based upon the results Jan ends up with this weekend).

3. Indicate the length of the entry (a stat that disappeared when the node concept appeared)

4. Give a breakdown of what percentage of the entry was 'new code' versus copied from previous entries (and which entries / authors originated the copied code). This would obviously be a bit of a challenge since it would require significant 'fingerprinting' algorithms.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 10 May, 2010 04:35:04

Message: 147 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hs1sat$pk8$1@fred.mathworks.com>...

>
> Hannes: Also grabbing it now, thanks. I see there are already six downloads in the few minutes since it was released, so there's a fair bit of interest in this.
>

And now it's up to 41 downloads. It seems like this idea has resonated with a lot of people!

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Jan

Date: 10 May, 2010 12:06:35

Message: 148 of 159

On 5/10/2010 6:05 AM, Alan Chalker wrote:
> This brings up an interesting question I've always wondered about. How
> does the contest machinery do this? Would anyone from the MATLAB team be
> willing to share the details?

Most probably not within Matlab itself. It annoyed me a lot that I could
not be sure the thousands of algorithms would terminate, so I wanted an
execution time limit as well, but I failed to build one. Matlab is
strictly single-threaded. Even a timer with BusyMode set to 'error' will
not execute its error function until the running timer function has
finished, which can potentially take very long. And if you are inside
some DLL, breaking out immediately is dangerous: who will do the
cleanup? For automatic testing, however, something like a scheduled
Ctrl+C would be nice (i.e. if execution is still within my function
after X seconds, do what is normally done upon Ctrl+C, but print nothing
and instead execute a function that I specify).

I guess that the Matlab people had two Matlab sessions running and
remotely controlled one from the other, including a tool that kills a
process if a time limit is exceeded. Only killing a process reliably
frees all allocated memory at any time. But maybe there is an easier way?
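
The "kill the process on timeout" idea can be sketched from the Matlab side as below. This is only an assumption about how it could be done, not a description of the contest team's setup: it relies on the GNU coreutils timeout command (so a Linux shell), which exits with status 124 when it had to kill the command. runwithlimit is a hypothetical name.

```matlab
function [timedout, status, output] = runwithlimit(cmd, limitsecs)
% Hypothetical helper: run shell command CMD and kill it after LIMITSECS
% seconds. Assumes GNU coreutils 'timeout' is on the path (Linux).
wrapped = sprintf('timeout %d %s', round(limitsecs), cmd);
[status, output] = system(wrapped);
% GNU timeout reports 124 when the limit was hit and the command was killed.
timedout = (status == 124);
end
```

The command passed in would be whatever launches the second Matlab session on one entry; because the worker is a separate process, killing it also releases everything it had allocated.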

Jan

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 10 May, 2010 20:57:05

Message: 149 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message <hs4710$4gk$1@fred.mathworks.com>...

>
> -There is no 1st place final entry according to the ranks. There is a 2nd place entry: #48, with a score of 166203. The 3rd entry of the actual contest (id 53) beat that score by a lot (119888 vs 166203), but ran in less than a second versus the 97 seconds it took for #48 to run. Thus I assume they were using a different test suite. I ran the entry against the sample suite and the results didn't match up either.
>

I just discovered that soon after the main contest started, Ned resubmitted an entry from the internal contest (#12) to the queue, which became entry #58. They manually removed it from view and the stats, but it is in the data set I uploaded and has a score (33479.0) that put it temporarily in the lead during darkness (at least for the first 40 or so entries).

IF YOU ARE DOING AN ANALYSIS OF THE DATASET, PLEASE ZERO OUT ENTRY #58! Otherwise it'll mess things up.

Ned: I did a search and couldn't find any other entries that you or the team snuck in during the main contest which might mess up stuff for us. Is that a correct assessment?

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Helen Chen

Date: 10 May, 2010 21:09:09

Message: 150 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hs3gd9$f7c$1@fred.mathworks.com>...
> Interestingly if you create the links for earlier entries manually eg.
> http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/1
> Then you get submissions from before the contest began. Looks like this was the internal Mathworks contest. Care to tell us who won that and what the best score was? :-)

Nice catch, Hannes! Yes, we run an internal version of each contest so that MathWorkers can test both our contest infrastructure and the game. That gives us the chance to tweak if necessary. This year, since the contest application is new, we ran both a regular contest and a contest for the most creative failures to check the error trapping. As Alan notes, Teja (http://www.mathworks.com/matlabcentral/fileexchange/authors/66822) won the internal contest. He is an application engineer from Japan. :-)

Helen

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Jan

Date: 11 May, 2010 08:03:03

Message: 151 of 159

So, I was able to upload the results from running the submissions against the test suite to the File Exchange as well: http://www.mathworks.com/matlabcentral/fileexchange/27554-matlab-sensor-contest-data-set-run-on-test-data

It uses a database similar to the one Alan uploaded, and it includes that database so that the results can be checked directly.

As already posted, Sergey Y. submitted some pretty good algorithms with his DeepCat9 series. My personal opinion is that generalization took place in darkness and twilight, a mixture in daylight, and almost pure tweaking on the last day.

Cheers
Jan

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Yi Cao

Date: 11 May, 2010 10:32:04

Message: 152 of 159

"Jan" <thisis@notanemail.com> wrote in message <hsb2vn$hjs$1@fred.mathworks.com>...
> As already posted, Sergey Y. submitted some pretty good algorithms with his DeepCat9 series. My personal opinion is that generalization took place in darkness and twilight, a mixture in daylight, and almost pure tweaking on the last day.

No analytic solution exists for a contest problem. Therefore, however good an algorithm is, tweaking is necessary. However, we want the tweaking to be as general as possible. In computer modelling, we normally use three different data sets to ensure the genericity of the model being developed, i.e. training, validation and testing sets. We could design a similar scheme to encourage code genericity:

1. The sample data set is transparent to all contestants, for them to develop algorithms and tune parameters.
2. The test data set is grey to contestants. They do not know what the actual data set is, but they know the results of submitted code.
3. The generic data set is dark to contestants. All submissions will be tested on this data set. The results will be weighted with a coefficient and added to the final score for the ranking. However, the actual results on this set and the coefficient will never be revealed. This 'dark' part of the total score makes probing the score coefficient and overfitting to the generic data set impossible.

The final score should also include a term for weighted results on the sample data set. This will encourage any tweak to be tested on the sample data set before submission.

This scheme should encourage genericity in code development, and may also discourage massive submission.
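
As a rough illustration of the proposed scoring, the sketch below combines the three results into one total. All numbers and both weights are made up, and it assumes that lower results are better, as in this contest.

```matlab
% Hypothetical results for three entries (rows) on each data set;
% lower is better.
sampleres = [120; 100; 110];   % sample set, visible to everyone
testres   = [ 95; 100;  90];   % test set, only the results are visible
darkres   = [105;  98; 130];   % generic set, never revealed

% Hypothetical weights; under the proposal wdark would be kept secret.
wsample = 0.2;
wdark   = 0.5;

total = testres + wsample*sampleres + wdark*darkres;
[~, order] = sort(total);      % ascending: best (lowest) total first
```

With these numbers the third entry has the best test-set result but the worst total, because its result on the dark set is poor, which is the overfitting penalty the scheme is after.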

Yi

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: srach

Date: 11 May, 2010 11:20:06

Message: 153 of 159

"Yi Cao" <y.cao@cranfield.ac.uk> wrote in message <hsbbn4$n4a$1@fred.mathworks.com>...
> "Jan" <thisis@notanemail.com> wrote in message <hsb2vn$hjs$1@fred.mathworks.com>...
> > As already posted, Sergey Y. submitted some pretty good algorithms with his DeepCat9 series. My personal opinion is that generalization took place in darkness and twilight, a mixture in daylight, and almost pure tweaking on the last day.
>
> No analytic solution exists for a contest problem. Therefore, however good an algorithm is, tweaking is necessary. However, we want the tweaking to be as general as possible. In computer modelling, we normally use three different data sets to ensure the genericity of the model being developed, i.e. training, validation and testing sets. We could design a similar scheme to encourage code genericity:
>
> 1. The sample data set is transparent to all contestants, for them to develop algorithms and tune parameters.
> 2. The test data set is grey to contestants. They do not know what the actual data set is, but they know the results of submitted code.
> 3. The generic data set is dark to contestants. All submissions will be tested on this data set. The results will be weighted with a coefficient and added to the final score for the ranking. However, the actual results on this set and the coefficient will never be revealed. This 'dark' part of the total score makes probing the score coefficient and overfitting to the generic data set impossible.
>
> The final score should also include a term for weighted results on the sample data set. This will encourage any tweak to be tested on the sample data set before submission.

Nice idea. How about leaving the "dark" data set entirely dark until the contest ends (i.e., no influence on the contest score via the coefficient) and then basing the grand prize entirely on the ranking with regard to the "dark" data set? (In other words: computing the final ranking based on an entirely new test suite.)

This would remove any tweaking results from the grand prize, while leaving enough fun for tweakers during the mid contest prizes.

If technically feasible, the contest machine could compute the results from the "gray" data set with priority, and whenever the queue is empty, it could compute the results for the "dark" data set.

Regards
srach

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 12 May, 2010 03:09:05

Message: 154 of 159

"Jan" <thisis@notanemail.com> wrote in message <hsb2vn$hjs$1@fred.mathworks.com>...
> So, I was able to upload the results from running the submissions against the test suite to the File Exchange as well: http://www.mathworks.com/matlabcentral/fileexchange/27554-matlab-sensor-contest-data-set-run-on-test-data
>
>
> Cheers
> Jan

I used Jan's code as a 'module' for my comment submission code and inserted all the result data he calculated into the individual entry comments. You can now just browse to a specific entry on the website and see what the test suite results were. For example, the comment for Hannes' winning entry is:

"This entry gets a result of 19556985 on the test suite (5678335 more than the contest suite). It has a ranking of 1698 compared to all other entries run against the test suite according to the data set provided by Jan Keller. "

Likewise, I also ran a module that commented all the leaders. Here's an example comment from one of the cyclist's final entries (#4515):

"This was the 278th entry to take the overall contest scoring lead. It improved upon entry #4512 by 0.015% (4.3 points). It stayed in the lead for 2 mins, 35 secs before being replaced by entry #4526. "

I'm interested in hearing feedback on this. Is this of value? Potentially I could be adding these types of comments in near real-time during future contests. Obviously the ultimate goal is to handle comments and code legacy, but unless the comment length issue is resolved that's not really feasible right now.
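
For reference, the leader comment quoted above for entry #4515 can be rebuilt with a few lines like these. This is a guess at the arithmetic, not the actual module: it assumes the percentage is taken relative to the previous leader's score, and every input value is hard-coded here for illustration.

```matlab
% Hypothetical inputs, chosen to match the quoted example comment.
prevId = 4512;  prevScore = 27799.7;   % entry that held the lead before
improvement = 4.3;                     % points gained (lower is better)
leadSecs = 155;                        % time spent in the lead, in seconds
nextId = 4526;                         % entry that took the lead afterwards

pct = 100 * improvement / prevScore;   % relative improvement in percent
comment = sprintf(['It improved upon entry #%d by %.3f%% (%.1f points). ' ...
    'It stayed in the lead for %d mins, %d secs before being replaced ' ...
    'by entry #%d.'], prevId, pct, improvement, ...
    floor(leadSecs/60), mod(leadSecs,60), nextId);
```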

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Hannes Naudé

Date: 12 May, 2010 06:35:21

Message: 155 of 159

Alan:"I'm interested in hearing feedback on this. Is this of value? Potentially I could be adding these types of comments in near real-time during future contests. Obviously the ultimate goal is to handle comments and code legacy, but unless the comment length issue is resolved that's not really feasible right now."

I definitely think this is of value. If you're doing this in near real-time, then you'll run into problems with statements like "It has a ranking of 1698 compared to all other entries..." since these rankings will be changing all the time and you don't want to repost all the comments each time this happens.

I do wish that it was possible to do things like "sort by test suite result". Since people don't commonly overfit to the test suite (although public visibility of test suite scores may change this), sorting by test suite score should give a better indication of where algorithmic improvements took place. These improvements (that would otherwise have been lost) can then be picked up and tweaked until they are competitive.

Regarding the comment length issue. What do you think of the "URL comment" solution that I suggested? Obviously this requires access to hosting resources, but I'm sure this can be resolved (TMW might even be willing to host it for us). I don't think the bandwidth required will be excessive.

Hannes

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Yi Cao

Date: 12 May, 2010 08:05:21

Message: 156 of 159

"Alan Chalker" <alancNOSPAM@osc.edu> wrote in message
> I used Jan's code as a 'module' for my comment submission code and inserted all the result data he calculated into the individual entry comments. You can now just browse to a specific entry on the website and see what the test suite results were. For example, the comment for Hannes' winning entry is:
>

Thanks, Alan, for doing this. It is valuable. From these comments, I noticed an error in the statistics page, and hence in the comments:

My entry (ID 4475) was marked with a highest rank of 1st, but it was not included in the All the Leaders list.

This was the entry cloned by 27 entries in the final hour. So the highest rank record is correct. This indicates that the code to produce the statistics page has some bugs. The contest team may wish to have a look.

Yi

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 12 May, 2010 15:39:06

Message: 157 of 159

"Hannes Naudé" <naude.jj+matlab@gmail.com> wrote in message <hsdi79$5bo$1@fred.mathworks.com>...
> Regarding the comment length issue. What do you think of the "URL comment" solution that I suggested? Obviously this requires access to hosting resources, but I'm sure this can be resolved (TMW might even be willing to host it for us). I don't think the bandwidth required will be excessive.
>

Hannes:

Regarding the URL comments, it's an interesting idea. I did some testing and unfortunately the comments field currently does NOT accept html tags. Thus while you can put a URL in the comment, it won't be clickable, somewhat diminishing the value of that.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Alan Chalker

Date: 12 May, 2010 15:49:05

Message: 158 of 159

"Yi Cao" <y.cao@cranfield.ac.uk> wrote in message <hsdng1$992$1@fred.mathworks.com>...
>
> My entry (ID 4475) was marked with a highest rank of 1st, but it was not included in the All the Leaders list.
>
> This was the entry cloned by 27 entries in the final hour. So the highest rank record is correct. This indicates that the code to produce the statistics page has some bugs. The contest team may wish to have a look.
>

Yi:

Actually the stats page is correct, the highest rank is wrong. I encountered this previously during the longevity phase with entry 4012 from Amitabh and mentioned it in a posting above. Here's what the details of the issue are:

#4369 got a score of 27799.8 and took the lead
#4475 also got a score of 27799.8, but since it didn't BEAT #4369 it didn't take the lead. However the submission list page for some reason showed it in first place
#4512 later got a score of 27799.7, overtaking both.

I suspect there is a minor coding error in the submission listing / highest rank code that uses a '>=' comparison instead of the '>' comparison it should be using. Alternatively, it's using more decimal places in the calculation, which aren't available to the statistics code. Regardless, an entry needs to overtake, not equal, another in order to take the lead. This situation happened several times during the contest.

Subject: Spring 2010 MATLAB Contest, April 28 - May 5

From: Randy Souza

Date: 12 May, 2010 17:47:04

Message: 159 of 159

> Alternatively, it's using more decimal places in the calculation,
> which aren't available to the statistics code.

This is correct: the stats code was not receiving the score with sufficient precision.
So while entry #4369 and entry #4475 each have a displayed score of 27799.8, #4475 actually scored about 0.05 points better than #4369.
We plan to fix this for the next contest. Apologies for the confusion it caused.
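
A small sketch of the effect, using made-up full-precision scores (the real values were not posted; only the displayed scores and the 0.05-point gap come from this thread). It assumes that lower is better and that an entry takes the lead only by strictly beating the best score so far. cummin needs a reasonably recent MATLAB release.

```matlab
% Hypothetical full-precision scores in submission order; lower is better.
% #4475 beats #4369 by about 0.05 points, yet both display as 27799.8.
ids    = [4369 4475 4512];
scores = [27799.84 27799.79 27799.70];

shown = round(scores*10)/10;           % one decimal place, as displayed

% An entry is a leader if it strictly beats the best earlier score.
leadersof = @(s) ids(s < [Inf cummin(s(1:end-1))]);

leadersfull  = leadersof(scores);      % what the ranking code would see
leadersshown = leadersof(shown);       % what the stats code would see
```

With full precision all three entries take the lead in turn; with the rounded scores #4475 only ties #4369 and drops out of the leader list, which is the mismatch reported above.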

Best,
Randy
