How to resolve the error "Expected audioIn to be a column vector."

Hi,
I am using the "Speech Command Recognition Using Deep Learning" example to detect words from my own database. Each clip in my database is 4 seconds long. I am using the default parameters for creating the spectrograms, changing only segmentDuration to match the duration of my clips. What might I be doing wrong here?
segmentDuration = 3;
frameDuration = 0.025;
hopDuration = 0.010;
numBands = 40;
The problem is that whenever I try to run the code, it throws this error:
"Computing speech spectrograms...
Error using auditorySpectrogram
Expected audioIn to be a column vector.
Error in auditorySpectrogram>validateRequiredInputs (line 196)
validateattributes(x,{'single','double'},...
Error in auditorySpectrogram (line 66)
validateRequiredInputs(x,fs)
Error in speechSpectrograms (line 24)
spec = auditorySpectrogram(x,fs, ...
Error in Speech (line 30)
XTrain = speechSpectrograms(adsTrain,segmentDuration,frameDuration,hopDuration,numBands);"
What could be the reason for this, and what is the solution? Thanks.

Answers (2)

You might be passing in stereo sound instead of mono.

17 Comments

Thanks for the answer, but I am passing mono sound. It's still showing these errors.
Transpose the data to make it into a column vector instead of a row vector.
Is it possible to do that on MATLAB's datastore object? I am using audioDatastore here.
XTrain = speechSpectrograms(adsTrain(:), segmentDuration, frameDuration, hopDuration, numBands);
I tried this, but another error popped up. It says:
"Array formation and parentheses-style indexing with objects of class 'audioDatastore' is not allowed. Use objects of class 'audioDatastore' only as scalars or use a cell array.
Error in Speech (line 30)
XTrain = speechSpectrograms(adsTrain(:), segmentDuration, frameDuration, hopDuration, numBands);"
This is one of the files from the database that I am using. Could you please tell me what the issue is with this type of file? The sample database provided with the example works perfectly with the program. Do I need to pre-process this file, or does something else need to be done to fix the error? Thanks.
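One quick way to see how a clip differs from the example data is to inspect its channel count, sample rate, and duration with audioinfo. The sketch below writes a short stereo file to a temporary location purely as a stand-in for one of your own clips (the file name, the noise signal, and the 16 kHz rate are all assumptions); replace fname with the path to a real file.

```matlab
% Stand-in clip: 4 s of low-level stereo noise at 16 kHz (assumed values).
fs = 16000;
y = 0.01*randn(4*fs, 2);
fname = [tempname '.wav'];      % hypothetical path; use your own file instead
audiowrite(fname, y, fs);

% Inspect the file header without loading the samples.
info = audioinfo(fname);
fprintf('Channels: %d, sample rate: %d Hz, duration: %.2f s\n', ...
    info.NumChannels, info.SampleRate, info.Duration);

% audioread returns samples-by-channels, so a stereo clip is N-by-2.
% A matrix like that is not a column vector, which is consistent with
% the "Expected audioIn to be a column vector" error.
x = audioread(fname);
disp(size(x))
```

If NumChannels is greater than 1, the clip needs to be reduced to a single channel before it reaches auditorySpectrogram.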
Okay. Instead, in speechSpectrograms, edit the line
spec = auditorySpectrogram(x,fs, ...
to use x(:, 1) instead of x
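A slightly more general variant of that fix, as a sketch only: it handles both a row vector and a multichannel matrix. Averaging the channels is a choice made here for illustration; x(:,1) simply keeps the first channel. The small matrix stands in for what read(ads) would return.

```matlab
% Stand-in for the output of read(ads): samples-by-channels (here 3-by-2).
x = [1 3; 2 4; 3 5];

if isrow(x)
    x = x(:);            % 1-by-N row vector -> N-by-1 column vector
elseif size(x, 2) > 1
    x = mean(x, 2);      % average the channels; x(:,1) would keep only the first
end

disp(size(x))
```

Placed just before the auditorySpectrogram call, this leaves clips that are already mono columns unchanged.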
Hi, thanks for the answer. That worked. But as it progressed, it started throwing another error during the spectrogram calculation for the background noise:
"Computing background spectrograms...
Unable to perform assignment because the size of the left side is 40-by-398 and the size of the right side is 40-by-98.
Error in backgroundSpectrograms (line 39)
Xbkg(:,:,:,ind) = auditorySpectrogram(x,fs, ...
Error in Speech (line 85)
XBkg = backgroundSpectrograms(adsBkg,numBkgClips,volumeRange,segmentDuration,frameDuration,hopDuration,numBands);"
For the sample data this was working fine, but with my data this error appeared. What could be wrong here?
Hi, I think the problem lies in these two variables
"numBkgClips = 4000;
volumeRange = [1e-4,1];"
The length of my sound clip files is 4 seconds. How should I change these values for this clip length? I tried changing numBkgClips to 1000, but the same error occurred. Am I thinking in the right direction?
Hi, can you please let me know how these parameters should be changed to accommodate my database? Thanks.
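The two widths in that error message can be reproduced from the framing parameters alone. The sketch below assumes a 16 kHz sample rate and the usual frame-count formula floor((N - frameLength)/hopLength) + 1; both are assumptions, not taken from the helper's source. Under those assumptions a 4-second segment gives 398 frames and a 1-second segment gives 98, which suggests that backgroundSpectrograms is still cutting 1-second segments while its output array is sized from segmentDuration. If so, numBkgClips and volumeRange would not affect the mismatch.

```matlab
fs = 16000;                 % assumed sample rate
frameDuration = 0.025;
hopDuration = 0.010;
frameLength = round(frameDuration*fs);   % 400 samples
hopLength = round(hopDuration*fs);       % 160 samples

% Hypothetical helper: number of full frames in a clip of the given duration.
numFrames = @(dur) floor((round(dur*fs) - frameLength)/hopLength) + 1;

fprintf('4 s clip: %d frames\n', numFrames(4));
fprintf('1 s clip: %d frames\n', numFrames(1));
```

It is worth checking the helper for a hard-coded segment length and making it follow segmentDuration.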
Hello, I was able to resolve the errors, and the network is now training. I have one question, though: if I want to pass an audio file through the network to test it, how can I do that? There is sample code for passing speech from a microphone, but how can I modify it to accept a file instead of a stream?
Thanks.
Any file can be read as a sequence of bytes using fopen/fread/fclose and specifying '*uint8' as the "precision" parameter.
Once you have a sequence of bytes, use your existing code to transmit sequences of bytes.
To use it on the receiving end, you might need to write the bytes to a file.
Yes, I get that. I was looking for code that takes an audio file as input, passes it through the trained CNN, and outputs which class it belongs to (with the probability of each class). Is there a pre-built function or code snippet that could do this for me? Thanks.
MathWorks does not supply code for that, and there does not appear to be anything suitable in the File Exchange. I did not attempt to Google for any code for that purpose.
Note: "class" is rather vague for this purpose. Do you want to classify by singing range? By gender identity? By ethnicity? Are you doing emotion recognition?
I am working on emotion recognition. I wish to know the probability of the identified class for a sample audio file. I tried googling for leads in this regard but couldn't find anything, which is why I am asking here.
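For what it's worth, here is a rough sketch of one way to do this, not a tested or official solution. It assumes trainedNet is the network returned by trainNetwork in the example, that training used log10 of the auditory spectrogram with a 1e-6 offset, and that spectrograms were centered in a fixed width of numHops columns. The function name classifyClip is hypothetical. The auditorySpectrogram options must match the ones in your speechSpectrograms helper exactly, so copy any additional name-value pairs from there.

```matlab
function [label, probs] = classifyClip(trainedNet, fname, segmentDuration)
% classifyClip  Hypothetical helper: classify one audio file with a network
% trained on log auditory spectrograms.
frameDuration = 0.025;
hopDuration = 0.010;
numBands = 40;
epsil = 1e-6;                            % assumed log offset used in training

[x, fs] = audioread(fname);              % x is samples-by-channels
x = mean(x, 2);                          % force a mono column vector

frameLength = round(frameDuration*fs);
hopLength = round(hopDuration*fs);

% Use the same name-value options as your speechSpectrograms helper;
% only three of them are shown here.
spec = auditorySpectrogram(x, fs, ...
    'WindowLength', frameLength, ...
    'OverlapLength', frameLength - hopLength, ...
    'NumBands', numBands);

% Center (or crop) the spectrogram to the width the network was trained on.
numHops = ceil((segmentDuration - frameDuration)/hopDuration);
X = zeros(numBands, numHops, 'single');
w = min(size(spec, 2), numHops);
left = floor((numHops - w)/2) + 1;
X(:, left:left+w-1) = spec(:, 1:w);

X = log10(X + epsil);
[label, probs] = classify(trainedNet, X);   % probs: one score per class
end
```

It could then be called as [label, probs] = classifyClip(trainedNet, 'myClip.wav', 4), where the file name is a placeholder. The scores in probs should follow the class order of the network's classification layer.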
How did you resolve the errors? Can you tell me what was wrong? Thanks.

The audio must be mono; you can use Audacity to convert it.
