Contest Evolution
There were 1138 entries in this contest, with more than a hundred participants
(the exact number is hard to determine since people don't always report their names
consistently).
This area plot shows how the entries added up as the contest wore on.
The green area represents the number of files that passed the test suite, and the red area
shows those that failed. Overall 507 of 1138 entries passed the test suite, for a
pass rate of 44.5%.
(Note: This contest analysis was calculated and published entirely from
MATLAB)
|
|
 |
|
|
|
This histogram shows when contest activity was occurring.
Notice the obvious spike around midnight on the second day of the contest. This
is when E. Brian Welch was submitting a flurry of files designed to exploit the
PERSISTENT function loophole (he describes this approach in another part of this analysis).
The histogram at the bottom shows when E. Brian Welch was submitting his files.
|
|
 |
|
|
|
This is probably the single most useful diagram for visualizing the contest: here we see the score
of the entries logged against the time at which they were submitted. Since
a lower score is better, the lowest part of this data set decreases steadily
over time as the lead entry gets better. All leaders are shown connected by
a red line. Therefore the winning entry, "fcol_a_05" by Stijn Helsen is
found in the lower right corner of the plot: the lowest score at the end of the
contest.
This plot shows some of the dramatic improvements in the algorithm that occurred
over time.
|
|
 |
|
|
|
Here we are looking at the same basic presentation as above, only now line
segments are shown connecting entries that are related. You can see in a few places
multiple lines that radiate from a few points, indicating places where the same
entry spawned a second generation of multiple offspring.
We have highlighted with red connecting lines the descendants of Nedialko Krouchev's
"t7c", which has the distinction of being in first place for the longest period of time.
|
|
 |
|
|
|
The scoring for the contest was based on several factors: the correctness of the
answer in terms of black pegs and white pegs, how many guesses it took to reach the final answer,
and how many seconds it took to reach the final answer.
From a practical point of view the contest was driven by the ability to
to guess as many black pegs as possible before timing out at 180 seconds of CPU time.
Soon after the contest started, the algorithms were cycling through all possible colors in
the first few guesses, which means that they were assured at least all white pegs. So
maximum black peg count ended up being the distinguishing characteristic of the leader.
In the following plots, we see on the left score vs. CPU time.
On the right we are showing the percentage of black pegs vs. score. This plot makes obvious
the correlation between black pegs and getting a better (i.e. lower) score.
By the end of the contest, the correlation is almost perfect.
|
|
 |
|
|
|
In order to get a feel for the trends in the contest, let's focus on the score vs.
CPU time plot. Here we are looking at each of the five days of the contest.
All entries for all days are shown in light blue for context, and the entries for
the day in question appear as red dots. This lets you see how large-scale trends play
out over time.
|
|
 |
|
|
|
Here is a plot in which we pay particular attention to the breakthroughs
in the algorithm as the contest progressed. These "corner points" represent
significant improvements to the algorithm which were then tweaked by following entries.
|
|
 |
|
|
|
Now we'll do the same plot on a score vs. CPU time plot for the
sake of comparison.
|
|
 |
|
|
|
Here are some pictures that illustrate how the algorithms work. We'll examine how
some of the critical "corner point" breakthroughs improved on earlier approaches.
Note on the visualization: the guesses are shown
with colored pixels on the left, and the score is shown with black and
white pixels on the right. The correct answer is at the very top on the
left, and the final guess appears just below it.
Our first example entry (on the left here) simply made random guesses and kept the best
for its final answer. Soon after the contest started, E. Brian Welch
started the guessing by cycling through all the colors first, so that at
least he could be assured of doing no worse than all white pegs. His
entry "ebw_sub03" became the first big corner point improvement. Even with this
improvement, though, a random guess is unlikely to ever guess the correct answer
in all but the simplest problems. The next innovation has to do with an algorithm that
is guaranteed to converge.
|
|
 |
|
|
|
Our next example is the first from Benno Weber's "alcohol series" (we here on the
MATLAB Contest Team would like to nominate Benno for best entry names).
Notice that he also cycles through all the colors initially.
He picks a color and tries to find the column where it belongs.
On the right we have John Stroebels "submission 1".
Stroebel rocketed into the lead at the end of the first day.
|
|
 |
|
|
|
Here are three of the later entries, starting with Stijn Helsen's breakthrough
code based on a binary tree algorithm.
He's able to nail down the position for a given color in far less time.
Also show here are Paul Valiant's "new ugly" and Stijn's winning entry.
|
|
 |
|
|
|
Several people sent us email observing that the contest ended a little too
soon. They wanted to spend weekend time attacking the problem, time that
simply wasn't available during the week. For the most part, the story of the
last three days of the contest was the
steady improvement of a common theme of attack. But several other approaches were
being developed and might have succeeded in taking the lead, given a little more time.
One example is Alex Backer's Papi code which appeared in the last few hours of the
contest. Look at the dramatic difference in solving technique between Stijn's
winning entry on the left and Alex's contender Papi5.
For those of you who feel you didn't have enough time to get involved in this
contest, get ready for the next one. And we plan on taking your advice and
lengthening the next contest so it stretches over the weekend. Thanks for playing,
and we'll see you next time.
|
|
 |
|
|
We close here with a composite picture of the contest that shows the score vs. CPU time
picture with algorithm visualization pictures superimposed.
|
|
 |
|