I am trying to train a GAN. While exploring MATLAB's official example, I noticed the following lines:
gradientsGenerator = dlgradient(lossGenerator, dlnetGenerator.Learnables,'RetainData',true);
gradientsDiscriminator = dlgradient(lossDiscriminator, dlnetDiscriminator.Learnables);
After reading the help for dlgradient, I have the following questions:
1. What is the derivative trace in the dlgradient function? Consider a two-layer dlnetwork, in which
output = sigmoid(z);
targetOutput = 1 * ones(size(z));
Cost = 0.5*mean((targetOutput - output).^2);
So my guess is that the derivative trace is del(Cost)/del(z) = -(targetOutput - output).*sigmoid(z).*(1 - sigmoid(z)), del(Cost)/del(input) = W'*del(Cost)/del(z), and so on. Is that correct, or does it indicate something else? Could anyone tell me?
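To make my guess concrete, here is a minimal sketch of how I would check the analytic derivative against dlgradient (the toy sizes and the local function name checkTrace are just placeholders I made up):
z = dlarray(randn(5,1));                       % pre-activation values
targetOutput = ones(size(z));
[Cost, dCostdz] = dlfeval(@checkTrace, z, targetOutput);
% my analytic guess; the mean introduces a 1/N factor
output = sigmoid(z);
analyticGrad = -(targetOutput - output).*output.*(1 - output)/numel(z);
disp(max(abs(extractdata(dCostdz - analyticGrad))))   % should be ~0 if my guess is right
function [Cost, dCostdz] = checkTrace(z, targetOutput)
    output = sigmoid(z);                       % operations recorded on the trace
    Cost = 0.5*mean((targetOutput - output).^2);
    dCostdz = dlgradient(Cost, z);             % gradient read back off the trace
end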
2. If my guess is correct: when I train a GAN and call dlgradient for the discriminator and the generator inside the same dlfeval, will the result be the same if I calculate the derivatives of the discriminator first? For example:
gradientsDiscriminator = dlgradient(lossDiscriminator, dlnetDiscriminator.Learnables,'RetainData',true);
gradientsGenerator = dlgradient(lossGenerator, dlnetGenerator.Learnables);
Because when calculating the gradients of the generator, the W's and B's in the discriminator remain unchanged.
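For reference, this is the shape of the single model-gradients function I have in mind (the function and variable names are my own placeholders, and the losses are the usual GAN cross-entropy losses, not necessarily identical to the ones in the official example):
function [gradientsGenerator, gradientsDiscriminator] = modelGradients( ...
        dlnetGenerator, dlnetDiscriminator, dlXReal, dlZ)
    dlXGenerated = forward(dlnetGenerator, dlZ);
    probReal = sigmoid(forward(dlnetDiscriminator, dlXReal));
    probGenerated = sigmoid(forward(dlnetDiscriminator, dlXGenerated));
    lossDiscriminator = -mean(log(probReal)) - mean(log(1 - probGenerated));
    lossGenerator = -mean(log(probGenerated));
    % discriminator first this time, keeping the trace alive for the second call
    gradientsDiscriminator = dlgradient(lossDiscriminator, ...
        dlnetDiscriminator.Learnables, 'RetainData', true);
    gradientsGenerator = dlgradient(lossGenerator, dlnetGenerator.Learnables);
end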
3. As I can see in many GAN papers, the key to successfully training a GAN is that the generator and the discriminator are trained separately. That is, a first batch of synthetic (fake) images goes through both the generator and the discriminator, and the discriminator is trained on its cost together with the cost from real images, so that the W's and B's in the discriminator get updated. Then a second batch of synthetic images goes through both networks, and the generator is trained on its cost, so that ONLY the W's and B's in the generator are updated. In Keras (Python), the parameters of a model can be explicitly set to be non-trainable. In MATLAB, how can I make sure that this is EXACTLY what happens?
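My current understanding is that, in a custom training loop, the learnable parameters only change when I explicitly pass them to an update function such as adamupdate, so leaving the discriminator out of that call should freeze it. A sketch of one generator-only iteration (the trailing-average variables and learnRate are placeholders I made up) would be:
[gradientsGenerator, ~] = dlfeval(@modelGradients, ...
    dlnetGenerator, dlnetDiscriminator, dlXReal, dlZ);
% update the generator only; dlnetDiscriminator.Learnables are never touched
[dlnetGenerator, trailingAvgG, trailingAvgSqG] = adamupdate( ...
    dlnetGenerator, gradientsGenerator, trailingAvgG, trailingAvgSqG, ...
    iteration, learnRate);
Is this reasoning correct, or do I still have to mark the discriminator's parameters as non-trainable somewhere, as in Keras?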
Thanks a lot.