release

Allow property values and input characteristics to change

Since R2021a

Syntax

release(ivs)

Description

release(ivs) allows property values and input characteristics of the i-vector system ivs to change.

Examples

Train Environmental Sound Classification System

Download and unzip the environmental sound classification data set (ESC-10). The data set consists of recordings, each labeled as one of 10 sound classes.

loc = matlab.internal.examples.downloadSupportFile("audio","ESC-10.zip");
unzip(loc,pwd)

Create an audioDatastore object to manage the data. Call countEachLabel to display the distribution of sound classes and the number of unique labels.

ads = audioDatastore(pwd,IncludeSubfolders=true,LabelSource="foldernames");
countEachLabel(ads)

ans=10×2 table
        Label         Count
    ______________    _____

    chainsaw           40  
    clock_tick         40  
    crackling_fire     40  
    crying_baby        40  
    dog                40  
    helicopter         40  
    rain               40  
    rooster            38  
    sea_waves          40  
    sneezing           40

Listen to one of the files.

[audioIn,audioInfo] = read(ads);
fs = audioInfo.SampleRate;
sound(audioIn,fs)
audioInfo.Label

ans = categorical
     chainsaw

Split the datastore into training and test sets, assigning 80% of the files for each label to the training set.

[adsTrain,adsTest] = splitEachLabel(ads,0.8);

Create an audioFeatureExtractor object and enable all of its available features.

afe = audioFeatureExtractor(SampleRate=fs, ...
    Window=hamming(round(0.03*fs),"periodic"), ...
    OverlapLength=round(0.02*fs));
params = info(afe,"all");
params = structfun(@(x)true,params,UniformOutput=false);
set(afe,params);
afe

afe = 
  audioFeatureExtractor with properties:

   Properties
                     Window: [1323×1 double]
              OverlapLength: 882
                 SampleRate: 44100
                  FFTLength: []
    SpectralDescriptorInput: 'linearSpectrum'
        FeatureVectorLength: 862

   Enabled Features
     linearSpectrum, melSpectrum, barkSpectrum, erbSpectrum, mfcc, mfccDelta
     mfccDeltaDelta, gtcc, gtccDelta, gtccDeltaDelta, spectralCentroid, spectralCrest
     spectralDecrease, spectralEntropy, spectralFlatness, spectralFlux, spectralKurtosis, spectralRolloffPoint
     spectralSkewness, spectralSlope, spectralSpread, pitch, harmonicRatio, zerocrossrate
     shortTimeEnergy

   Disabled Features
     none


   To extract a feature, set the corresponding property to true.
   For example, obj.mfcc = true, adds mfcc to the list of enabled features.
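
The frame settings and the enable-all idiom used above can be checked in base MATLAB without Audio Toolbox. The following is a minimal sketch: fs is taken from the displayed SampleRate, and demoParams with its three fields is a hypothetical stand-in for the struct returned by info(afe,"all"). Its field values do not matter, because structfun overwrites them.

```matlab
% Frame arithmetic for a 30 ms window with 20 ms overlap at 44.1 kHz.
fs = 44100;                              % sample rate shown in the afe display
winLength = round(0.03*fs);              % 1323 samples
overlapLength = round(0.02*fs);          % 882 samples
hopLength = winLength - overlapLength;   % 441 samples, or 10 ms between frames

% Hypothetical stand-in for the struct returned by info(afe,"all").
demoParams = struct("mfcc",1:13,"pitch",14,"spectralFlux",15);

% Map every field to true while keeping the field names.
% UniformOutput=false makes structfun return a struct instead of an array.
demoParams = structfun(@(x)true,demoParams,UniformOutput=false);
```

Passing a struct of this shape to set enables every feature whose field is true, which is why the display lists no disabled features.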

Create two directories in your current folder: train and test. Extract features from the training and the test data sets and write the features as MAT files to the respective directories. Pre-extracting features can save time when you want to evaluate different feature combinations or training configurations.

if ~isfolder("train")
    mkdir("train")
    mkdir("test")

    % Write one MAT file of features for each audio file.
    writeall(adsTrain,"train",WriteFcn=@(x,y,z)writeFeatures(x,y,z,afe))
    writeall(adsTest,"test",WriteFcn=@(x,y,z)writeFeatures(x,y,z,afe))
end

Create signal datastores to point to the audio features.

sdsTrain = signalDatastore("train",IncludeSubfolders=true);
sdsTest = signalDatastore("test",IncludeSubfolders=true);

Create label arrays that are in the same order as the signalDatastore files.

labelsTrain = categorical(extractBetween(sdsTrain.Files,"ESC-10"+filesep,filesep));
labelsTest = categorical(extractBetween(sdsTest.Files,"ESC-10"+filesep,filesep));
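
To see how the label extraction works, consider a minimal sketch that uses only base MATLAB. The file paths are hypothetical and assume the folder layout that the extractBetween call expects, with each class folder sitting directly under an ESC-10 folder.

```matlab
% Hypothetical feature-file paths with the class folder directly under ESC-10.
files = [fullfile("train","ESC-10","dog","clip1.mat"); ...
         fullfile("train","ESC-10","rain","clip2.mat"); ...
         fullfile("train","ESC-10","dog","clip3.mat")];

% Take the text between "ESC-10" plus a file separator and the next
% file separator. That text is the name of the class folder.
labels = categorical(extractBetween(files,"ESC-10"+filesep,filesep));

disp(categories(labels))
```

Because the labels are derived from the file list itself, they stay in the same order as the files in the datastore.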

Create a transform datastore from the signal datastores to isolate and use only the desired features. You can use the output from info on the audioFeatureExtractor to map your chosen features to the index in the features matrix. You can experiment with the example by choosing different features.

featureIndices = info(afe)

featureIndices = struct with fields:
          linearSpectrum: [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 … ]
             melSpectrum: [663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694]
            barkSpectrum: [695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726]
             erbSpectrum: [727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769]
                    mfcc: [770 771 772 773 774 775 776 777 778 779 780 781 782]
               mfccDelta: [783 784 785 786 787 788 789 790 791 792 793 794 795]
          mfccDeltaDelta: [796 797 798 799 800 801 802 803 804 805 806 807 808]
                    gtcc: [809 810 811 812 813 814 815 816 817 818 819 820 821]
               gtccDelta: [822 823 824 825 826 827 828 829 830 831 832 833 834]
          gtccDeltaDelta: [835 836 837 838 839 840 841 842 843 844 845 846 847]
        spectralCentroid: 848
           spectralCrest: 849
        spectralDecrease: 850
         spectralEntropy: 851
        spectralFlatness: 852
            spectralFlux: 853
        spectralKurtosis: 854
    spectralRolloffPoint: 855
        spectralSkewness: 856
           spectralSlope: 857
          spectralSpread: 858
                   pitch: 859
           harmonicRatio: 860
           zerocrossrate: 861
         shortTimeEnergy: 862

idxToUse = [...
    featureIndices.harmonicRatio ...
    ,featureIndices.spectralRolloffPoint ...
    ,featureIndices.spectralFlux ...
    ,featureIndices.spectralSlope ...
    ];
tdsTrain = transform(sdsTrain,@(x)x(:,idxToUse));
tdsTest = transform(sdsTest,@(x)x(:,idxToUse));
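
The column selection applied by the transform can be tried on its own. This sketch assumes that rows are frames and columns are features, as the x(:,idxToUse) indexing suggests. The feature matrix is random, hypothetical data, and the indices are copied from the featureIndices display.

```matlab
% Indices copied from the featureIndices display:
% harmonicRatio, spectralRolloffPoint, spectralFlux, spectralSlope.
idxToUse = [860 855 853 857];

% Hypothetical feature matrix: 10 frames by 862 features.
x = rand(10,862);

% Same anonymous function that the example passes to transform.
selectFeatures = @(x)x(:,idxToUse);
y = selectFeatures(x);

disp(size(y))
```

The selected columns appear in the order listed in idxToUse, not in ascending index order.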

Create an i-vector system that accepts feature input.

soundClassifier = ivectorSystem(InputType="features");

Train the extractor and classifier using the training set.

trainExtractor(soundClassifier,tdsTrain,UBMNumComponents=128,TVSRank=64);

Calculating standardization factors ....done.
Training universal background model .....done.
Training total variability space ......done.
i-vector extractor training complete.

trainClassifier(soundClassifier,tdsTrain,labelsTrain,NumEigenvectors=32,PLDANumIterations=0)

Extracting i-vectors ...done.
Training projection matrix .....done.
i-vector classifier training complete.

Enroll the labels from the training set to create i-vector templates for each of the environmental sounds.

enroll(soundClassifier,tdsTrain,labelsTrain)

Extracting i-vectors ...done.
Enrolling i-vectors .............done.
Enrollment complete.

Calibrate the i-vector system.

calibrate(soundClassifier,tdsTrain,labelsTrain)

Extracting i-vectors ...done.
Calibrating CSS scorer ...done.
Calibration complete.

Use the identify function on the test set to return the system's inferred label.

inferredLabels = labelsTest;
inferredLabels(:) = inferredLabels(1);
for ii = 1:numel(labelsTest)
    features = read(tdsTest);
    tableOut = identify(soundClassifier,features,"css",NumCandidates=1);
    inferredLabels(ii) = tableOut.Label(1);
end
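
The first two lines of the loop setup preallocate a categorical array that shares its categories with labelsTest, so that each inferred label can be assigned by index. A minimal sketch of that idiom, using hypothetical labels:

```matlab
% Hypothetical test labels.
demoLabels = categorical(["dog";"rain";"rooster"]);

% Copy the array to inherit its size and categories, then overwrite
% every element with a placeholder value.
demoInferred = demoLabels;
demoInferred(:) = demoInferred(1);

disp(demoInferred)
```

Every element now holds the placeholder dog, while the full category list is preserved for the later assignments.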

Create a confusion matrix to visualize performance on the test set.

uniqueLabels = unique(labelsTest);
cm = zeros(numel(uniqueLabels),numel(uniqueLabels));
for ii = 1:numel(uniqueLabels)
    for jj = 1:numel(uniqueLabels)
        cm(ii,jj) = sum((labelsTest==uniqueLabels(ii)) & (inferredLabels==uniqueLabels(jj)));
    end
end
labelStrings = replace(string(uniqueLabels),"_"," ");
heatmap(labelStrings,labelStrings,cm)
colorbar off
ylabel("True Labels")
xlabel("Predicted Labels")
accuracy = mean(inferredLabels==labelsTest);
title(sprintf("Accuracy = %0.2f %%",accuracy*100))
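
The confusion matrix and accuracy computations can be verified on a small case. The labels below are hypothetical and are not results from the ESC-10 data set.

```matlab
% Hypothetical true and predicted labels for five test clips.
trueLabels = categorical(["dog";"dog";"rain";"rain";"rain"]);
predLabels = categorical(["dog";"rain";"rain";"rain";"dog"]);

classes = unique(trueLabels);
numClasses = numel(classes);

% cm(ii,jj) counts clips whose true class is ii and predicted class is jj.
cm = zeros(numClasses);
for ii = 1:numClasses
    for jj = 1:numClasses
        cm(ii,jj) = sum((trueLabels==classes(ii)) & (predLabels==classes(jj)));
    end
end

% Accuracy is the fraction of matching labels, which equals the
% diagonal sum of cm divided by the total count.
accuracy = mean(predLabels==trueLabels);
accuracyFromCm = trace(cm)/sum(cm,"all");
```

Rows correspond to true labels and columns to predicted labels, matching the axis labels on the heatmap.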

Release the i-vector system.

release(soundClassifier)

Supporting Functions

function writeFeatures(audioIn,info,~,afe)
    % Convert to single-precision
    audioIn = single(audioIn);

    % Extract features
    features = extract(afe,audioIn);

    % Replace the file extension of the suggested output name with MAT.
    filename = strrep(info.SuggestedOutputName,".wav",".mat");

    % Save the extracted features to the MAT file.
    save(filename,"features")
end
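
The strrep call in writeFeatures assumes that every input file has a .wav extension. As a possible alternative, this sketch swaps whatever extension is present by using fileparts. The file name is hypothetical and stands in for info.SuggestedOutputName.

```matlab
% Hypothetical suggested output name, similar to info.SuggestedOutputName.
suggestedName = fullfile("train","ESC-10","dog","clip1.wav");

% Split off the extension and attach ".mat" instead.
[folder,name,~] = fileparts(suggestedName);
matName = fullfile(folder,name + ".mat");

disp(matName)
```

For .wav inputs, both approaches produce the same file name.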

Input Arguments

`ivs` — i-vector system
`ivectorSystem` object

i-vector system, specified as an object of type ivectorSystem.

Version History

Introduced in R2021a