How to resolve the error "Expected audioIn to be a column vector."
Hi,
I am using the "Speech Command Recognition Using Deep Learning" example to detect words from my own database. Each clip in my database is 4 seconds long. I am using the default parameters for creating the spectrograms, changing only segmentDuration to match the duration of my clips. What might I be doing wrong here?
segmentDuration = 3;
frameDuration = 0.025;
hopDuration = 0.010;
numBands = 40;
The problem is that whenever I try to run the code, it throws an error:
"Computing speech spectrograms...
Error using auditorySpectrogram
Expected audioIn to be a column vector.
Error in auditorySpectrogram>validateRequiredInputs (line 196)
validateattributes(x,{'single','double'},...
Error in auditorySpectrogram (line 66)
validateRequiredInputs(x,fs)
Error in speechSpectrograms (line 24)
spec = auditorySpectrogram(x,fs, ...
Error in Speech (line 30)
XTrain = speechSpectrograms(adsTrain,segmentDuration,frameDuration,hopDuration,numBands);"
What could be the reason for this, and what is a solution? Thanks.
Answers (2)
Walter Roberson
on 26 Oct 2018
0 votes
You might be passing in stereo sound instead of mono.
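One way to check for this is to look at the channel count of a clip before computing spectrograms. The following is only a sketch: it writes a synthetic stereo clip to a temporary file so that it runs on its own, and the variable names (fileName, numChannels) are placeholders, not part of the example. In practice you would point fileName at one of your own clips.

```matlab
% Create a short synthetic stereo clip so the sketch is self-contained.
% (Assumption: your real clips are audio files readable by audioread.)
fs = 16000;
fileName = [tempname '.wav'];
audiowrite(fileName, 0.1*randn(fs, 2), fs);

% audioinfo reports the channel count without loading the samples.
info = audioinfo(fileName);
numChannels = info.NumChannels;      % 1 = mono, 2 = stereo

% audioread returns a numSamples-by-numChannels matrix.
[x, fsRead] = audioread(fileName);
fprintf('%d channel(s), size(x) = [%d %d]\n', numChannels, size(x, 1), size(x, 2));

delete(fileName);                    % clean up the temporary file
```

If the second dimension of x is greater than 1, the clip is not mono and auditorySpectrogram will reject it with the error shown in the question.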
17 Comments
Mohammed Faridul Haque Siddiqui
on 28 Oct 2018
Walter Roberson
on 28 Oct 2018
Transpose the data to make it into a column vector instead of a row vector.
Mohammed Faridul Haque Siddiqui
on 28 Oct 2018
Walter Roberson
on 28 Oct 2018
XTrain = speechSpectrograms(adsTrain(:), segmentDuration, frameDuration, hopDuration, numBands);
Mohammed Faridul Haque Siddiqui
on 28 Oct 2018
Mohammed Faridul Haque Siddiqui
on 28 Oct 2018
Walter Roberson
on 12 Nov 2018
Okay, instead: in speechSpectrograms, edit the line
spec = auditorySpectrogram(x,fs, ...
to use x(:, 1) instead of x
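As a rough sketch of what that change does: x(:, 1) keeps only the first channel of a numSamples-by-numChannels matrix, which gives a column vector. Note that it does not help if x is a row vector, since x(:, 1) would then return a single sample. The snippet below handles both cases on synthetic data. Averaging the channels with mean is my own substitution for taking the first channel, not something the example does.

```matlab
% Synthetic 4-second stereo clip standing in for the data read from a file.
fs = 16000;
x = randn(4*fs, 2);

% Case 1: a row vector (1-by-N) only needs transposing.
if isrow(x)
    x = x.';
end

% Case 2: multichannel audio (N-by-C, C > 1) must be reduced to one channel.
if size(x, 2) > 1
    x = mean(x, 2);      % average the channels; x(:, 1) would keep only the first
end

% x is now N-by-1, the shape auditorySpectrogram expects for audioIn.
disp(size(x));
```

Either reduction produces a column vector; which one is appropriate depends on whether both channels carry useful signal.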
Mohammed Faridul Haque Siddiqui
on 12 Nov 2018
Edited: Mohammed Faridul Haque Siddiqui
on 12 Nov 2018
Mohammed Faridul Haque Siddiqui
on 13 Nov 2018
Edited: Mohammed Faridul Haque Siddiqui
on 13 Nov 2018
Mohammed Faridul Haque Siddiqui
on 18 Nov 2018
Mohammed Faridul Haque Siddiqui
on 12 Dec 2018
Walter Roberson
on 12 Dec 2018
Any file can be read as a sequence of bytes using fopen/fread/fclose and specifying '*uint8' as the "precision" parameter.
Once you have a sequence of bytes, use your existing code to transmit sequences of bytes.
To use it on the receiving end, you might need to write the bytes to a file.
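A minimal sketch of that round trip, assuming nothing about the transmission code itself: the file names and the four sample bytes are placeholders, and the transmit step is left as a comment.

```matlab
% Create a small placeholder file so the sketch runs on its own.
inName = [tempname '.bin'];
fid = fopen(inName, 'w');
fwrite(fid, uint8([1 2 3 255]), 'uint8');
fclose(fid);

% Sending side: read the whole file as a column of uint8 bytes.
fid = fopen(inName, 'r');
bytes = fread(fid, Inf, '*uint8');
fclose(fid);

% ... transmit "bytes" with your existing code ...

% Receiving side: write the received bytes back out to a file.
outName = [tempname '.bin'];
fid = fopen(outName, 'w');
fwrite(fid, bytes, 'uint8');
fclose(fid);

% Read the copy back to confirm it matches the original bytes.
fid = fopen(outName, 'r');
received = fread(fid, Inf, '*uint8');
fclose(fid);

delete(inName);
delete(outName);
```

The '*uint8' precision makes fread return the data as uint8 instead of converting it to double, so the bytes pass through unchanged.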
Mohammed Faridul Haque Siddiqui
on 13 Dec 2018
Walter Roberson
on 13 Dec 2018
Edited: Walter Roberson
on 13 Dec 2018
MathWorks does not supply code for that, and there does not appear to be anything suitable on the File Exchange. I did not attempt to Google for any code for that purpose.
Note: "class" is rather vague for this purpose. Do you want to classify by singing range? By gender identity? By ethnicity? Are you doing emotion recognition?
Mohammed Faridul Haque Siddiqui
on 13 Dec 2018
Walter Roberson
on 2 Mar 2019
How do you resolve the errors? Can you tell me what is wrong? Thanks.
Pablo Torres
on 27 Nov 2019
0 votes
The audio must be mono; you can use Audacity to convert it.