MATLAB Examples

Who has the most children?

load contest_data
parentList = [d.parent];
numChildren = zeros(size(d));

% Count how many entries name entry i as their parent
for i = 1:length(d)
    numChildren(i) = sum(parentList == d(i).id);
end

plot(numChildren,'.')
xlabel('Entry Index')
ylabel('Number of Children')
[val, ix] = max(numChildren);
fprintf('"%s" by %s had %d children\n', d(ix).title, d(ix).author, val);
"Fledgling lucky 13" by nathan q had 51 children
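
The loop above compares every entry against the full parent list. As a rough sketch, the same tally can be computed without a loop. The toy ids and parents below are made up for illustration (they are not taken from contest_data), and the sketch assumes that ids are unique and that a parent value matching no id means "no parent".

```matlab
% Toy stand-in for the contest data (hypothetical values).
ids     = [101 102 103 104 105];   % entry ids, assumed unique
parents = [  0 101 101 102 101];   % parent id per entry; 0 = no parent (assumption)

% For each entry, find the index of the entry its parent id refers to.
[hasParent, parentIdx] = ismember(parents, ids);

% Tally how many times each entry index shows up as a parent.
numChildrenVec = accumarray(parentIdx(hasParent).', 1, [numel(ids) 1]).';

disp(numChildrenVec)
```

Here entry 101 is the parent of three entries and entry 102 of one, so the tally for this toy data is 3 1 0 0 0.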

Do leaders have the most children? Let's draw some circles around the leaders.

% Find the leaders (findleaders is a helper function that is not shown here)
bestIndexList = findleaders(d);

hold on
plot(bestIndexList,numChildren(bestIndexList),'ro')
hold off

How does the distribution of the number of children differ between non-leading and leading entries?

numChildrenNonLeaders = numChildren;
numChildrenNonLeaders(bestIndexList) = [];

subplot(2,1,1)
% Only entries with more than two children are included in the histogram
hist(numChildrenNonLeaders(numChildrenNonLeaders>2),1:51)
ylim([0 40])
title('Histogram of Number of Children of Non-Leading Entries')

numChildrenLeaders = numChildren(bestIndexList);

subplot(2,1,2)
% Only entries with more than two children are included in the histogram
hist(numChildrenLeaders(numChildrenLeaders>2),1:51)
ylim([0 40])
title('Histogram of Number of Children of Leading Entries')
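
Histograms can be hard to compare by eye, so a couple of summary numbers per group may help. The following is only a sketch on invented data: numChildrenToy and leaderIdxToy are hypothetical stand-ins for numChildren and bestIndexList, and the median and the fraction of entries with at least one child are just two possible summaries.

```matlab
% Hypothetical child counts and leader indices, for illustration only.
numChildrenToy = [0 5 1 0 12 3 0 7];
leaderIdxToy   = [2 5 8];

% A logical mask splits the entries without deleting elements.
isLeader = false(size(numChildrenToy));
isLeader(leaderIdxToy) = true;

leaders    = numChildrenToy(isLeader);
nonLeaders = numChildrenToy(~isLeader);

fprintf('Leaders:     median %g, fraction with children %.2f\n', ...
    median(leaders), mean(leaders > 0));
fprintf('Non-leaders: median %g, fraction with children %.2f\n', ...
    median(nonLeaders), mean(nonLeaders > 0));
```

Using a mask rather than assigning [] keeps both groups indexed against the same original vector.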

It seems surprising that being a leader doesn't make a bigger difference in whether or not your code is cloned. Perhaps a better question is: how often do people clone code written by someone other than themselves?

numChildrenNotSameAsParent = zeros(size(d));

for i = 1:length(d)
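    parentAuthor = d(i).author;
    % Indices of the entries whose parent is entry i
    childList = find(parentList==d(i).id);
    for j = 1:length(childList)
        childAuthor = d(childList(j)).author;
        % Count the child only if its author differs from the parent's author
        if ~strcmp(parentAuthor,childAuthor)
            numChildrenNotSameAsParent(i) = numChildrenNotSameAsParent(i) + 1;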
        end
    end
end

subplot(1,1,1)
plot(numChildrenNotSameAsParent,'.')
xlabel('Entry Index')
ylabel('Number of Children by a Different Author')
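
The nested loop above can also be written without loops. This is a minimal sketch on a hypothetical struct array, toy, whose field names (id, parent, author) mirror the ones used in this script. It assumes ids are unique and that a parent value matching no id means the entry has no parent.

```matlab
% Hypothetical toy entries, for illustration only.
toy = struct('id',     {1, 2, 3, 4}, ...
             'parent', {0, 1, 1, 2}, ...
             'author', {'ann', 'ann', 'bob', 'cy'});

% Index of each entry's parent within toy (0 where there is no parent).
[hasParent, parentIdx] = ismember([toy.parent], [toy.id]);

authors       = {toy.author};
childAuthors  = authors(hasParent);
parentAuthors = authors(parentIdx(hasParent));

% A child counts only if its author differs from its parent's author.
byOther = ~strcmp(childAuthors, parentAuthors);

% Tally the qualifying children per parent index.
parentOfOther = parentIdx(hasParent);
parentOfOther = parentOfOther(byOther);
numByOthers = accumarray(parentOfOther(:), 1, [numel(toy) 1]).';

disp(numByOthers)
```

In this toy data, entry 1 has two children but only one by a different author, and entry 2 has one child by a different author, so the result is 1 1 0 0.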

Let's circle all the leaders again.

hold on
plot(bestIndexList,numChildrenNotSameAsParent(bestIndexList),'ro')
hold off

This makes more sense. If other people are copying your code, it stands to reason that you're probably in the lead. There are some exceptions, but that is not surprising once you realize that many of them were part of the 1000-character challenge. In other words, the exceptions were simply leaders of a different (smaller) contest.

ix = 2710;
fprintf('"%s" by %s had %d children that were not by the original author\n', ...
    d(ix).title, d(ix).author, numChildrenNotSameAsParent(ix));
hold on
plot(ix,numChildrenNotSameAsParent(ix),'bsquare')
hold off
"Remains of the Day" by the cyclist had 33 children that were not by the original author