Different classification results for varying miniBatchSize?

Question

0 votes

Hello,

I am training an LSTM network and then using the classify function to predict classes. However, when I change the MiniBatchSize in the classify function, the output changes as well, which should not be the case as far as I understand. The documentation for the MiniBatchSize property of the classify function only states that predictions are computed faster with a larger MiniBatchSize. So is this a bug, or am I missing something?


Answer 1

Viren Gupta on 28 Sep 2018

0 votes

A larger miniBatchSize makes prediction faster, but it can also change the results. An LSTM requires all sequences within a single mini-batch to have the same length, so the sequences in each mini-batch are padded to a common length. How much padding a given test sequence receives depends on which other sequences share its mini-batch, and therefore on the mini-batch size. Varying the mini-batch size at prediction time can thus change the amount of padding applied to the test sequences and, with it, the classification results.

The same happens at training time, so as general advice it is good to keep the same mini-batch size for training and testing if possible. For more information on how padding works and how to minimize its effect, please see: padding in lstm.
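To make the padding effect concrete, here is a minimal sketch in plain Python. It is not the toolbox's implementation; it only counts padded time steps. It assumes each mini-batch is padded to the length of its longest sequence, and the function name `padding_per_batch` and the sequence lengths are made up for illustration.

```python
def padding_per_batch(lengths, mini_batch_size):
    """Count padded time steps in each mini-batch.

    Assumption: every sequence is padded to the longest
    sequence in its own mini-batch.
    """
    padding = []
    for start in range(0, len(lengths), mini_batch_size):
        batch = lengths[start:start + mini_batch_size]
        longest = max(batch)
        padding.append(sum(longest - n for n in batch))
    return padding


# Hypothetical lengths (in time steps) of six test sequences.
lengths = [10, 3, 7, 2, 9, 4]

# The same data needs a different amount of padding per mini-batch size.
for size in (1, 2, 3, 6):
    print(size, padding_per_batch(lengths, size))

# Sorting by length groups similar sequences and reduces the padding.
print(padding_per_batch(sorted(lengths), 2))
```

With a mini-batch size of 1, no sequence is padded at all, while larger mini-batches mix short and long sequences and pad more. If the network has learned to respond to padded time steps, that difference alone could explain predictions that vary with the mini-batch size.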

1 Comment

Abolfazl Nejatian on 10 Jun 2023

Dear Viren,

I hope this email finds you well. I am currently working on a complex neural network architecture that combines a hybrid GoogleNet with an LSTM layer. My goal is to train this model using a large dataset consisting of over 4 million images. During the training phase, I have found that utilizing a larger mini-batch size significantly improves the speed and coverage of the training process.

However, I have encountered an issue during the testing and real-time classification phase. In these stages, I aim to classify individual samples that represent the latest state of the FOREX markets. To achieve this, I need to classify each sample separately rather than using a mini-batch. Surprisingly, I have observed substantial differences in the classification results compared to the training phase.

Upon investigating this matter, I learned that varying the mini-batch size during prediction can lead to differences in classification outcomes. This is because an LSTM requires uniform sequence lengths within a mini-batch, so padding is applied to adjust the sequence lengths. Consequently, the amount of padding can differ depending on the mini-batch size, leading to discrepancies in the classification results.

While I understand that maintaining a consistent mini-batch size for both training and testing is generally recommended, my specific requirements necessitate the classification of individual samples in real-time. I would greatly appreciate your expert guidance on how to address this situation effectively, considering the unique characteristics of my network architecture and dataset.

Thank you for your time and support. I look forward to your valuable insights.
