%% Fractal visualisation of genomic base counts
% Genomic researchers are often interested in counts of the nucleotide base
% in a genetic sequence. The Bioinformatic Toolbox contains functions for
% counting sequences of length one, and two, ie individual bases and dimers.
% We demonstrate these using a mitochondrial DNA sequence.
%% Visualising the counts
% These functions also have options for visualising the counts.
%% Longer sequences
% If we want to count longer sequences, we can use nmercount:
counts = nmercount(mitochondria,3);
for i = 1:10
% But there is currently no way of visualising these counts. We turn to
% some work by Carr  that advocates using a fractal visualisation, which
% is extensible (in theory) to arbitrarily long sequences. To visualise
% counts of length n, we construct a Sierpinski gasket of depth n.
%% What is a Sierpinski gasket?
% A Sierpinski gasket is a simple fractal. A gasket of depth 0 is simply a
% tetrahedron. To construct a gasket of depth 1, the tetrahedron is
% replaced by four smaller tetrahedra, each of half the side length of the
% original, placed at each corner of the original. This leaves a hole in the
% center. To construct deeper gaskets, keep replacing each tetrahedron with
% four smaller ones.
%% How does this relate to genetic sequences?
% There are four nucleotide bases: A, C, G and T - conveniently, the same
% number as the number of vertices of a tetrahedron. Base sequences of
% length n are mapped onto subgaskets as follows: The first base in the
% sequence maps onto one of the four tetrahedra of depth 1. This contains
% four sub-subgaskets. The second base in the sequence maps onto one of
% these sub-subgaskets. And so on. We color the gasket by the count of that
% sequence, and set all the gaskets to have a transparency of 0.1.
% For length 3:
% And for length 4:
% Sequences of length 5 and 6 are fine; 7 might be a task for DCT.
% This is nicely rotatable. The Sierpinski gasket has a nice property
% that when it is viewed from above, all the tetrahedra line up to form a
% square. You can achieve this by manually rotating the previous plot, or by
% calling as follows.
% For longer sequences, it is useful to also scale the transparency of the
% tetrahedra by the count, to focus sttention on the common sequences.
% Carr DB, 2002: 2D and 3D coordinates for m-mers and dynamic graphics.
% Computing Science and Statistics: Proceedings of the 34th Symposium on
% the Interface.