On page 32 (page 49/225 in Adobe Reader) of Stylianou's HNM thesis, it states that after the pitch is estimated by minimizing E, the gross
pitch errors are removed (with a reference to the Multiband Excitation Vocoder paper).
I have read that paper and cannot understand where the tracks come from. Before this step I already have a pitch estimate for every frame of the speech signal,
so as far as I can tell there is only one track.
Can you please clarify?
On page 33 (page 50/225 in Adobe Reader), for the voiced/unvoiced decision, a synthetic signal s̃(t) is generated
using the amplitudes and phases determined by the DFT algorithm.
Can you please specify which algorithm this is? Are there any papers on it?