This code computes the same transform as MATLAB's dct, with a few improvements:
1) on the first call, it's a bit faster than the built-in dct
2) on subsequent calls, thanks to persistent variables, it's at least about 2x faster than the built-in dct, and only about 1.5x slower than an fft call
3) you can specify which version of the type II DCT you want: either MATLAB's orthogonal version, or the standard version (cf. the FFTW website, or Wikipedia)
4) you can sample the "rows" of the (implicit) DCT matrix, i.e. compute only a chosen subset of the rows of the output
This complements the code "idctt".
The extra "t" at the end of the filename has no meaning other than to distinguish it from the built-in dct.
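
To make points 3) and 4) concrete, here is a minimal sketch of how the two type II DCT scalings relate and what row sampling means. It builds the DCT matrix explicitly, which costs O(N^2) and is for illustration only; it does not call dctt and says nothing about dctt's actual interface. The variable names (C_std, C_orth, w, rows) are made up for this example, and "standard" is assumed here to mean the unnormalized Wikipedia form (FFTW's REDFT10 is twice that).

```matlab
% Illustration only: explicit DCT-II matrices, not the fast dctt code.
N = 8;
x = (1:N)';                  % any test column vector
n = 0:N-1;                   % input index (row vector)
k = (0:N-1)';                % output index (column vector)

% Unnormalized DCT-II (Wikipedia convention):
%   X(k) = sum_n x(n) * cos( pi/N * (n + 1/2) * k )
% FFTW's REDFT10 uses the same formula with an extra factor of 2.
C_std = cos(pi/N * k * (n + 0.5));     % N-by-N matrix

% Orthogonal scaling, as in MATLAB's dct:
%   first row scaled by sqrt(1/N), all other rows by sqrt(2/N)
w = [sqrt(1/N); sqrt(2/N) * ones(N-1, 1)];
C_orth = diag(w) * C_std;

y_std  = C_std  * x;         % "standard" coefficients
y_orth = C_orth * x;         % orthogonal coefficients, equal to w .* y_std

% Row sampling: apply only selected rows of the DCT matrix.
rows  = [1 3 6];             % hypothetical 1-based row indices
y_sub = C_orth(rows, :) * x; % same values as y_orth(rows)
```

With the orthogonal scaling, C_orth * C_orth' is the identity up to rounding, so the inverse transform is just the transpose; the unnormalized version lacks this property but skips the per-row scaling.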