Who has the most children?
load contest_data
parentList = [d.parent];
numChildren = zeros(size(d));
for i = 1:length(d)
    numChildren(i) = length(find(parentList==d(i).id));
end
plot(numChildren,'.')
xlabel('Entry Index')
ylabel('Number of Children')
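The counting loop above can also be written without an explicit loop. The following is a minimal sketch on made-up toy data, not the contest data: the names `toyIds` and `toyParents`, their values, and the use of `NaN` to mark an entry with no parent are all assumptions for illustration.

```matlab
% Hypothetical toy data standing in for [d.id] and [d.parent].
toyIds     = [101 102 103 104 105];
toyParents = [NaN 101 101 102 101];   % assumption: NaN means "no parent"

% For each entry, find the index of its parent among the ids (0 if none).
[hasParent, parentIdx] = ismember(toyParents, toyIds);

% Tally how many times each entry index shows up as a parent.
toyNumChildren = accumarray(parentIdx(hasParent)', 1, [numel(toyIds) 1])';

disp(toyNumChildren)
```

On the real data, `toyIds` and `toyParents` would be replaced by `[d.id]` and `parentList`, assuming every entry carries a scalar `id` and `parent`.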
[val, ix] = max(numChildren);
fprintf('"%s" by %s had %d children\n', d(ix).title, d(ix).author, val);
"Fledgling lucky 13" by nathan q had 51 children
Do leaders have the most children? Let's draw some circles around the leaders.
% Find the leaders
bestIndexList = findleaders(d);
hold on
plot(bestIndexList,numChildren(bestIndexList),'ro')
hold off
How different does the child distribution look between non-leading entries and leading entries?
numChildrenNonLeaders = numChildren;
numChildrenNonLeaders(bestIndexList) = [];
subplot(2,1,1)
hist(numChildrenNonLeaders(numChildrenNonLeaders>2),1:51)
ylim([0 40])
title('Histogram of Number of Children of Non-Leading Entries')
numChildrenLeaders = numChildren(bestIndexList);
subplot(2,1,2)
hist(numChildrenLeaders(numChildrenLeaders>2),1:51)
ylim([0 40])
title('Histogram of Number of Children of Leading Entries')
It seems surprising that being a leader doesn't make a bigger difference in whether or not your code is cloned. Perhaps a better question is how often people clone code written by someone other than themselves.
numChildrenNotSameAsParent = zeros(size(d));
for i = 1:length(d)
    parentAuthor = d(i).author;
    childList = find(parentList==d(i).id);
    for j = 1:length(childList)
        childAuthor = d(childList(j)).author;
        if ~strcmp(parentAuthor,childAuthor)
            numChildrenNotSameAsParent(i) = numChildrenNotSameAsParent(i) + 1;
        end
    end
end
subplot(1,1,1)
plot(numChildrenNotSameAsParent,'.')
xlabel('Entry Index')
ylabel('Number of Children Not the Same as the Parent')
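The nested loop above can be expressed as a comparison of author names across parent/child pairs. This is a rough sketch on hypothetical toy data: `toyIds`, `toyParents`, `toyAuthors`, and the `NaN` convention for a missing parent are illustrative assumptions, not taken from the contest data.

```matlab
% Hypothetical toy data standing in for [d.id], [d.parent], and {d.author}.
toyIds     = [101 102 103 104];
toyParents = [NaN 101 101 102];          % assumption: NaN means "no parent"
toyAuthors = {'ann', 'ann', 'bob', 'ann'};

% Pair every entry that has a known parent with that parent's index.
[hasParent, loc] = ismember(toyParents, toyIds);
childIdx  = find(hasParent);
parentIdx = loc(hasParent);

% Keep only the pairs in which child and parent have different authors.
differs = ~strcmp(toyAuthors(childIdx), toyAuthors(parentIdx));

% Count the surviving pairs per parent.
toyNotSame = accumarray(parentIdx(differs)', 1, [numel(toyIds) 1])';

disp(toyNotSame)
```

Here entry 101 has two children, but only the one by `bob` counts, because the other child shares its parent's author.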
Let's circle all the leaders again.
hold on
plot(bestIndexList,numChildrenNotSameAsParent(bestIndexList),'ro')
hold off
This makes more sense. If someone else is going to copy your code, it stands to reason that you're probably in the lead. There are some exceptions here, but this is not surprising when you realize many of these were part of the 1000-character challenge. In other words, the exceptions were simply leaders of a different (smaller) contest.
ix = 2710;
fprintf('"%s" by %s had %d children that were not by the original author\n', ...
    d(ix).title, d(ix).author, numChildrenNotSameAsParent(ix));
hold on
plot(ix,numChildrenNotSameAsParent(ix),'bsquare')
hold off
"Remains of the Day" by the cyclist had 33 children that were not by the original author
