Training Loss is NaN, Deep Learning

Hey everyone!
I am building a neural network in MATLAB, and my dataset has a lot of zeros (originally NaN values that I had to convert to zeros to be able to run the program). The problem is that when I train the network I get the message "Training finished: Training loss is NaN" and I don't know why, or how to solve it. If anyone has any idea it would be appreciated.
Thank you very much!

Accepted Answer

Shubh Dhyani on 27 Feb 2023
Hi Fernando,
I understand that you are trying to train a neural network and your dataset has a lot of NaN values. So you are converting the NaN values to zero to train the model. As a result of this, your training loss is NaN too.
The issue you are encountering is likely due to the fact that you have many zero values in your dataset, which can cause numerical instability during training. When a network encounters a large number of zeros in the input, the gradients can become very small, leading to numerical precision issues such as overflow, underflow, or division by zero. This can result in the loss becoming NaN during training.
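Before changing anything, it may help to confirm how many problem values are actually present. A minimal diagnostic sketch, assuming your predictors sit in a numeric matrix X with one row per observation (X is a placeholder name, not something from your code):

% Quick sanity checks on the predictor matrix before training
fprintf('NaN entries:  %d\n', nnz(isnan(X)));
fprintf('Inf entries:  %d\n', nnz(isinf(X)));
fprintf('Zero entries: %d\n', nnz(X == 0));

% Per-column min/max often reveals unscaled features that destabilize training
disp([min(X); max(X)]);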
One possible solution is to preprocess your data to remove or modify the zero values. Here are a few strategies you could try (a short code sketch illustrating strategies 2-5 follows the list):
1. Remove the zero values: If the zero values in your dataset are not critical to the analysis, you could consider removing them from the dataset. However, this could lead to loss of important information, so it should be done with caution.
2. Add noise to the zero values: You could add a small amount of noise to the zero values in the dataset to help break the symmetry and improve the stability of the gradients during training. One way to add noise is to sample from a small Gaussian distribution centered at zero.
3. Replace the zero values with a small constant: Instead of removing the zero values, you could replace them with a small constant value, such as 1e-6. This can help prevent the gradients from becoming too small and improve the stability of the training.
4. Use a different activation function: If you are using an activation function such as ReLU, which outputs zero for all negative inputs, this can exacerbate the numerical instability. You could try a different activation function, such as Leaky ReLU, which passes a scaled version of negative inputs instead of zeroing them.
5. Use batch normalization: Batch normalization can help stabilize the gradients during training by normalizing the inputs to each layer. This can be especially helpful when dealing with large amounts of zero values.
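To make strategies 2-5 concrete, here is a minimal, illustrative sketch. It again assumes predictors in a numeric matrix X; the noise scale, constant, and layer sizes are arbitrary placeholder choices, not recommendations:

% Strategies 2 and 3: perturb or replace the zero entries
isZero = (X == 0);
Xnoisy = X;
Xnoisy(isZero) = 1e-3 * randn(nnz(isZero), 1);   % strategy 2: small Gaussian noise
Xconst = X;
Xconst(isZero) = 1e-6;                           % strategy 3: small positive constant

% Strategies 4 and 5: Leaky ReLU plus batch normalization in the layer array
numFeatures = size(X, 2);
layers = [
    featureInputLayer(numFeatures)
    fullyConnectedLayer(64)
    batchNormalizationLayer              % strategy 5: normalize layer inputs
    leakyReluLayer(0.01)                 % strategy 4: scaled negative outputs
    fullyConnectedLayer(1)
    regressionLayer];

Whether noise or a small constant is appropriate depends on whether a zero in your data genuinely means "additive absent"; if it does, the replacement changes the meaning of the feature, so treat these as experiments rather than fixes.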
It's worth noting that converting NaN values to zeros is not always a good strategy, as it can lead to confusion between true zero values and missing values. If possible, it's better to keep the NaN values and handle them explicitly during training.
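For that last point, MATLAB's fillmissing and rmmissing let you handle NaNs explicitly instead of silently zero-filling. A minimal sketch, again assuming a numeric predictor matrix X:

colMeans = mean(X, 1, 'omitnan');                 % per-column means, ignoring NaNs
Xmean    = fillmissing(X, 'constant', colMeans);  % impute each NaN with its column mean
Xdropped = rmmissing(X);                          % or drop any row containing a NaN

This keeps the distinction between "truly zero" and "missing" visible in your preprocessing code.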
  1 Comment
FERNANDO CALVO RODRIGUEZ on 2 Mar 2023
Thank you very much!
My problem is that each row (each observation) contains different additives, so to speak, and in some cases an additive is reduced or absent, so that value simply does not exist, i.e. it is zero. My understanding is that by setting the value to zero, MATLAB interprets it as that additive not being present.
I don't know whether it is a good idea to assign a small value greater than zero when an additive is absent from an observation, but I will try it anyway.
Maybe the best option is to keep the values as NaN.
In any case, my database is very large and depends on many variables, so I understand that the neural network has a hard time finding patterns.
I will implement the options you suggested and see how it turns out.
I wish there were more detailed information about deep learning in MATLAB, because it has a lot of potential.
Again, thank you very much!
