|
MATLAB Programming Contest Analysis
Our analysis of the Surveyor contest is in two parts:
-
"Strategies, Randomizations, and Optimizations"
-
First, we take
you line by line through the winning entry, in an analysis of the
algorithms, strategies, and optimizations in the final winner,
NoSoup4U !1.
- "In Praise of Tweaking"
-
Next we take a quantitative look at how these innovations developed
over the course of the contest. In
"In Praise of Tweaking", we look the
interplay between radical innovations and tweaking battles in the
development of the winning entry.
In Praise of Tweaking
In Praise of Tweaking:
The MATLAB Programming Contest, Take Two
This report is in direct contrast to the painstaking line-by-line analysis provided by
the companion report entitled "Strategies, Randomizations, and Optimizations" All the images
in this report come directly from the raw data in our contest database rather than from the careful inspection of individual entries.
Problem complexity and tweak-bombing
Despite the fact that there were more entries in the second contest than the first
(1647 vs. 1455), there were actually fewer participants. This probably resulted from the
fact that the second contest was significantly more complex. Once contestants were inside the door, they were
likelier to submit more entries per person. Another hallmark of this second contest
was a conviction that submitting multiple entries differing by small and random amounts
might lead to success. For example, on June 17th between midnight and 2:30 AM Natick time,
Roger Stuckey submitted 161 entries. We affectionately named this practice
"tweak-bombing," and it shows up dramatically in some of the plots below.
Here is an area plot of all 1647 entries as they came in. Interestingly, the number of
failures per 100 entries is significantly down from the first contest.

Score vs. Time
Now we move on to a more interesting plot: the score of each entry plotted against when
it was submitted. The leader at any given time is shown in red, and since lower scores are
better, the red line naturally marks the lowest extremity of the set of all entries.
During the contest, the lead changed hands a total of 85 times. Murray Simpson has the
distinction of leading for the longest period of time with his entry
"RandomRandv1.16" which ruled the roost from 2:30 on the afternoon of June 15th
until Martin Leach's breakthrough entry "SoupDragon31" bettered it at 10:52 on
the morning of June 17th. Martin's "SoupDragon34" then shattered the 200 point
barrier only 15 minutes later.

Notice the tweak bombing runs tend to show up as thick vertical stripes. In general,
each of these stripes is attributable to a single person.
Broadly speaking, based on the contest data we can divide the contest into two parts:
an initial tweaking battle based on code initially appearing in "Lolomove" by
Anders Holtsberg, and a second climactic tweaking battle set off by Martin Leach's soupy
innovations of the 17th and finally ending with Paulo Uribe's winning entry "NoSoup4U
!1." Keep in mind this is a transparent analysis based only on how code is attributed
in the database. We don't want to overemphasize some people's contributions at the expense
of others. But if you observe the inheritance patterns as reported by the various authors,
some definite trends appear. For example, here is a plot similar to the one shown above,
except for the fact that it connects related entries with light blue lines. Furthermore,
the lineage of Anders Holtsberg's "Lolomov4" is called out in red: it consumes
almost the entire center of the contest!

There are
other ways to spot the influence of an algorithm. Below are shown all the entries that
have the word "Soup" in their title. Notice that the original SoupDragon is a
middling performer lost amid the last great tweaking binge of the Lolomov4 era. But after
a day of tuning, Lolomov4 has waned and the SoupDragon is in the ascendant, leading to
such names as "SoupedUpDragon," Colin Sinclair's "SneakyGreenSoup,"
Roger Stuckey's SoupMix series, and of course Peter J. Acklam's inimitable "Can't
think of a name." (The original Soup Dragon, by the way, is from a British children's
television program of the 1960s.)

To close out the section on legacy charts, here is a zoomed-in view of what a tweak
cluster-bomb looks like. This is from Murray Simpson's raid on the afternoon of June 15th.
Visually, it resembles exploding fireworks; any entry may be a launching point for still more entries.
The root of this small inheritance tree is Murray's "RandomRandv1.10," and in
fairness to the bombing technique, one of the offspring of that entry,
"RandomRandv1.16" (the low entry submitted at 14:38 below) went on to a
spectacular and unequalled run as contest leader.

Percent unexplored vs. CPU time
Another way to look at the progress of the contest is to view the entries on a plane
where CPU time is plotted against % unexplored as seen below. The segmented red line again
shows the progress of the leading entry from the beginning of the contest in the top
center to the winner in the bottom left. Light gray curves show lines of constant score
(the log scale on the y axis gently curves these lines).

In this light, the significant changes made to the algorithm by Anders Holtsberg and
Martin Leach become more apparent. Tweaking skirmishes tend to work left along horizontal
lines by speeding up code that still ultimately explores the same area. A significant
algorithmic departure is signified by a shift in both CPU time and % unexplored. Notice
that SoupDragon31 moves almost along a line of constant score. It's not very much
better than the previous leader, but because the algorithm is new, it has a lot of
breathing room for subsequent tweak exploitation, leading to a much better SoupDragon34 in
only a few minutes. This image is nothing less than a quantitative snapshot of innovation
at work.
If we take a time-lapse camera approach to the above plot, we can see how the entries
moved over time. The leader in each of the subplots below is circled in black. The light
blue backdrop displays all entries for context.

Like bees in a hive, the red dots show the entries crawling toward a reward in the
lower left.
Tweaking, innovation, and open sources
The programming contest achieves something remarkable: it turns MATLAB coding into a highly entertaining full-contact sport. The contest also serves as a model case for how (and why) open source coding works. It would seem the the contest is successful because it is
- competitive
- real-time
- open-sourced
- personal, and
- tweak-friendly
This last feature is a crucial one. Tweaking, or the
unrestrained (and sometimes capricious) modification of a predecessor's code, is in large part what sustains the contest. To the
hardworking contestant who has just seen his code speed-tweaked by a half percent to take
someone else into first place, tweaking can seem rude and unfair. On the other hand,
tweak-battles make for a fabulous spectator sport, and assuming that they don't cause all
the contestants to pack up and go home, they certainly generate streamlined code. All evidence indicates that tweaking doesn't drive people away, but rather it draws them in closer and energizes the contest. It has an appealingly egalitarian effect: no optimization is too small to be worthy of consideration. Tweaking, cluster bombs and all, fuels the engines of innovation.
What do you think about tweaking? About the contest in general? Let us know.
|
|