The function trainNetwork is slow during startup - what is the cause?
I'm training a segmentation network (a modified unetLayers network) on ~30000 128x128 images, with a validation set of about 7000 128x128 images. When I start training with trainNetwork, it takes around 10 minutes after the training progress window appears before any data is plotted in it. According to a popup window, normalization is performed before this window appears, so what takes this time during the startup of trainNetwork? I'm aware that this is most likely due to the size of my dataset and hardware constraints (a single NVIDIA GPU), but I'm curious about which process is running here.
Edit: It should also be noted that when pressing the button for early stopping, it can take up to 15 minutes for it to finish.
Joss Knight on 21 Nov 2020
Your validation set is huge. I suspect that what's taking time is the computation of the first validation accuracy metric. If finalization takes 15 minutes then it means processing 30000 images takes at least 15 minutes, which means 7000 images is going to take at least 3.5 minutes, and most likely more. Validation sets should be the smallest you can get away with for statistically significant metrics. Use a larger test set on the final network to check the quality of the output, rather than relying on validation during training.
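As a rough sketch of what a smaller validation set could look like in practice: the snippet below draws a random subset of the validation images and validates less often. The variable names imdsVal and pxdsVal (an imageDatastore and a matching pixelLabelDatastore), the subset size of 500, and the frequency of 500 iterations are all assumptions for illustration, not values from the question, and it assumes a release in which both datastore types support subset and combine.

```matlab
% Hypothetical datastores: imdsVal (imageDatastore) and pxdsVal
% (pixelLabelDatastore) hold the ~7000 validation images and labels.
numValSmall = 500;                                  % assumed subset size
idx = randperm(numel(imdsVal.Files), numValSmall);  % random image indices

% Take the same indices from both datastores so images and labels stay paired.
imdsValSmall = subset(imdsVal, idx);
pxdsValSmall = subset(pxdsVal, idx);
dsValSmall   = combine(imdsValSmall, pxdsValSmall);

% Validate on the small subset, and only every 500 iterations (assumed value).
options = trainingOptions('adam', ...
    'ValidationData', dsValSmall, ...
    'ValidationFrequency', 500, ...
    'Plots', 'training-progress');
```

The remaining validation images can then be kept aside and used as a test set once training has finished.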
Why not run the MATLAB profiler during training and tell us what the report says? It could be other things, like loading and preprocessing data.
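A minimal way to do that, assuming the training data, layer graph, and options are already in the hypothetical variables dsTrain, lgraph, and options:

```matlab
% Hypothetical variables: dsTrain (training datastore), lgraph (the modified
% U-Net layer graph), options (output of trainingOptions).
profile on                                   % start collecting timing data
net = trainNetwork(dsTrain, lgraph, options);
profile off                                  % stop collecting
profile viewer                               % open the report in the Profiler
```

Stopping training shortly after the first points are plotted should be enough to capture the startup phase. Note that stopping still triggers finalization, which will also show up in the report.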
The 15 minutes at the end is the finalization - your network has batch normalization layers and the TrainedMean and TrainedVariance population statistics are being computed from your complete dataset.
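To confirm that this applies to your network, a quick check for batch normalization layers could look like the following sketch, where lgraph is a hypothetical variable holding your layer graph:

```matlab
% lgraph is assumed to be the layer graph passed to trainNetwork.
% Flag every layer whose class is the batch normalization layer class.
isBN = arrayfun(@(l) isa(l, 'nnet.cnn.layer.BatchNormalizationLayer'), ...
    lgraph.Layers);
fprintf('%d batch normalization layer(s) found\n', nnz(isBN));
```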