Unable to load datasets

Hello everyone.
I have some code that I found on the Internet. To run, it needs to load datasets, but the author's code loads them from compressed files (.gz files, for Linux).
My problem is that my datasets are already decompressed (because I'm using Windows).
Can anyone help me modify the code so that it loads the decompressed datasets?
I have tried a few ways, but I still get errors.
I hope someone can help.
Thank you.

5 Comments

It is not easy to help without seeing the code that does the loading at the moment.
The code below does the data loading:
function [ labels, file_list ] = load_dtf_data( params, file_type )
%LOAD_DTF_DATA Load DTF data
% Assume that the dtf files have a hierarchy similar to UCF 101
% Inputs:
% dtf_dir - directory of gzipped DTF descriptors
% tt_list_dir - directory of train/test file list
% file_pattern - pattern of train/test list, such as 'train*' or 'test*'
% num_feats - number of features extracted from each dtf file
%
% Outputs:
% labels - labels for DTF features
switch file_type
case 'train'
tt_list_dir=params.train_list_dir;
reg_pattern='train*';
case 'test'
tt_list_dir=params.test_list_dir;
reg_pattern='test*';
otherwise
error('Unknown file pattern!');
end
% extract training/test list
tt_list=[]; % train/test files
labels=[]; % labels
tlists=dir(fullfile(tt_list_dir,reg_pattern));
for i=1:length(tlists)
fid=fopen(fullfile(tt_list_dir,tlists(i).name));
tmp=textscan(fid,'%s %d');
fclose(fid); % close the list file once it has been read
tt_list=[tt_list;tmp{1}];
labels=[labels;tmp{2}];
end
file_list=cell(length(tt_list),1);
for i=1:length(tt_list)
clip_name=regexprep(tt_list{i},'\.avi$',''); % get video clip name
clip_name=regexprep(clip_name,'.*/','');
file_list{i}=[clip_name,'.txt'];
end
% extract DTF data and set labels
%try
% matlabpool close;
%catch exception
%end
%matlabpool open 4
parfor i=1:length(tt_list)
action=regexprep(tt_list{i},'/v_(\w*)\.avi','');
act_dir=fullfile(params.dtf_dir,action);
clip_name=regexprep(tt_list{i},'\.avi$',''); % get video clip name
clip_name=regexprep(clip_name,'.*/','');
dtf_file=fullfile(act_dir,[clip_name,'.dtf.gz']);
saved_feat_file=sprintf('%s.txt',clip_name);
switch file_type
case 'train'
HOG_file=fullfile(params.HOG_train_data,saved_feat_file);
HOF_file=fullfile(params.HOF_train_data,saved_feat_file);
MBHx_file=fullfile(params.MBHx_train_data,saved_feat_file);
MBHy_file=fullfile(params.MBHy_train_data,saved_feat_file);
case 'test'
HOG_file=fullfile(params.HOG_test_data,saved_feat_file);
HOF_file=fullfile(params.HOF_test_data,saved_feat_file);
MBHx_file=fullfile(params.MBHx_test_data,saved_feat_file);
MBHy_file=fullfile(params.MBHy_test_data,saved_feat_file);
otherwise
error('Unknown file pattern!');
end
[HOG,HOF,MBHx,MBHy]=extract_dtf_feats(dtf_file,params);
% To save memory, write DTF features to txt files instead of keeping them in memory
dlmwrite(HOG_file,HOG,'delimiter',' ');
dlmwrite(HOF_file,HOF,'delimiter',' ');
dlmwrite(MBHx_file,MBHx,'delimiter',' ');
dlmwrite(MBHy_file,MBHy,'delimiter',' ');
end
%matlabpool close
end
The code below also has a part that seems to load the data:
function [ fvt ] = compute_fisher( params, pca_coeff, gmm, file_list )
%COMPUTE_FISHER Compute positive/negative Fisher Vectors for each class
fvt=[];
fisher_params.alpha = single(0.5); % power normalization, 1 to disable
fisher_params.grad_weights = false; % soft BOW
fisher_params.grad_means = true; % 1st order
fisher_params.grad_variances = true; % 2nd order
% Make sure all files are compressed
%check_gzip_cmd=sprintf('sh ./gzip_dtf_files -i %s > /dev/null 2>&1',params.dtf_dir);
%system(check_gzip_cmd);
%feat_idx=find(strcmp(params.feat_list,feat_type)); % find the index of feature in feat_list
% TODO:
% Should the following code be executed in parallel?
% The label of each FV is not an issue, because all of them could be
% assigned 1(positive files) or -1(negative files).
% What about the concatenation of Fisher vectors?
for j=1:length(file_list)
action=regexprep(file_list{j},'/v_(\w*)\.avi','');
act_dir=fullfile(params.dtf_dir,action);
clip_name=regexprep(file_list{j},'\.avi$',''); % get video clip name
clip_name=regexprep(clip_name,'.*/','');
file=fullfile(act_dir,[clip_name,'.dtf.gz']);
[HOG,HOF,MBHx,MBHy]=extract_dtf_feats(file, params, -1);
%fv_hog=fisher_encode(HOG,pca_coeff{1},gmm{1});
%fv_hof=fisher_encode(HOF,pca_coeff{2},gmm{2});
%fv_MBHx=fisher_encode(MBHx,pca_coeff{3},gmm{3});
%fv_MBHy=fisher_encode(MBHy,pca_coeff{4},gmm{4});
fv_hog=fisher_encode_vgg(HOG,pca_coeff{1},gmm{1},fisher_params);
fv_hof=fisher_encode_vgg(HOF,pca_coeff{2},gmm{2},fisher_params);
fv_MBHx=fisher_encode_vgg(MBHx,pca_coeff{3},gmm{3},fisher_params);
fv_MBHy=fisher_encode_vgg(MBHy,pca_coeff{4},gmm{4},fisher_params);
fv=[fv_hog;fv_hof;fv_MBHx;fv_MBHy];
fvt=[fvt fv]; % concatenate all features together
end
% power normalization
fvt = sign(fvt) .* sqrt(abs(fvt));
% L2 normalization
fvt = double(yael_fvecs_normalize(single(fvt)));
fvt(isnan(fvt)) = 123456;
end
The final piece of code has a part that decompresses the '.gz' file. In my case I already have the decompressed datasets, so I don't need that part.
function [ HOG,HOF,MBHx,MBHy ] = extract_dtf_feats( dtf_file, params, num_feats )
%EXTRACT_DTF_FEATS extract DTF features.
% The first 10 elements for each line in dtf_file are information about the trajectory.
% The trajectory info(default 30 dimensions) should also be discarded.
%
% Subsampling:
% randomly choose 100 descriptors from each video clip(dtf file)
% To use all the DTF features, set num_feats to a negative number.
%
HOG=zeros(params.feat_len_map('HOG'),1);
HOF=zeros(params.feat_len_map('HOF'),1);
MBHx=zeros(params.feat_len_map('MBHx'),1);
MBHy=zeros(params.feat_len_map('MBHy'),1);
if ~exist(dtf_file,'file')
warning('File %s does not exist! Skip now...',dtf_file);
return;
else
tmpfile=dir(dtf_file);
if tmpfile.bytes < 1024
warning('File %s is too small! Skip now...',dtf_file);
return;
end
end
unzip_cmd=sprintf('gunzip %s',dtf_file); % Suppose dtf files have suffix .gz
system(unzip_cmd);
unzip_dtf_file=regexprep(dtf_file,'\.gz$',''); % remove suffix .gz
x=load(unzip_dtf_file); % change unzip_dtf_file to dtf_file
zip_cmd=sprintf('gzip -f %s',unzip_dtf_file); % Suppose dtf files have suffix .gz
system(zip_cmd);
hog_range=params.feat_start:params.feat_start+params.feat_len_map('HOG')-1;
hof_range=hog_range(end)+1:hog_range(end)+params.feat_len_map('HOF');
mbhx_range=hof_range(end)+1:hof_range(end)+params.feat_len_map('MBHx');
mbhy_range=mbhx_range(end)+1:mbhx_range(end)+params.feat_len_map('MBHy');
if num_feats<0 % To use all the DTF features, set num_feats to a negative number
num_feats=size(x,1);
end
if size(x,1)<=num_feats
idx=1:size(x,1); % not enough rows to subsample: keep them all
else
idx=randperm(size(x,1),num_feats); % randomly subsampling
%idx=floor(linspace(1,size(x,1),num_feats)); % linearly subsampling
end
HOG=x(idx,hog_range)';
HOF=x(idx,hof_range)';
MBHx=x(idx,mbhx_range)';
MBHy=x(idx,mbhy_range)';
end
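One possible way to make the loading step work for both cases is sketched below. It is only a sketch: `load_dtf_matrix` is a hypothetical helper name that is not part of the posted code, and it assumes the decompressed file is plain whitespace-delimited numeric text that `load` can read (which the `load` call in `extract_dtf_feats` implies). It uses MATLAB's built-in `gunzip` instead of the external shell command, so it should not depend on Linux tools.

```matlab
function x = load_dtf_matrix(dtf_file)
%LOAD_DTF_MATRIX Hypothetical helper: read a DTF text file, gzipped or not.
%   dtf_file may be given with or without the .gz suffix.
plain_file = regexprep(dtf_file, '\.gz$', '');   % name without the .gz suffix
if exist(plain_file, 'file') == 2
    % Already decompressed (the Windows case in the question): read it directly.
    x = load(plain_file, '-ascii');
elseif exist(dtf_file, 'file') == 2
    % Only the .gz exists: decompress into a temporary folder with the
    % built-in gunzip, leaving the original archive untouched.
    tmp_dir = tempname;
    mkdir(tmp_dir);
    cleanup = onCleanup(@() rmdir(tmp_dir, 's'));  % remove the folder on exit
    names = gunzip(dtf_file, tmp_dir);             % cell array of extracted files
    x = load(names{1}, '-ascii');
else
    error('load_dtf_matrix:missingFile', ...
        'Cannot find %s or %s', plain_file, dtf_file);
end
end
```

With a helper like this, the `gunzip` / `load` / `gzip -f` lines in `extract_dtf_feats` could be replaced by `x=load_dtf_matrix(dtf_file);`. Note that the `exist(dtf_file,'file')` and file-size checks at the top of that function test the `.gz` name, so they would also need to look at the name without `.gz`; otherwise the function returns early with zeros when only the decompressed file is present.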
Some of the dataset file names and labels:
ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c02.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c03.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c04.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c05.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g09_c01.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g09_c02.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g09_c03.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g09_c04.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g09_c05.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g09_c06.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g09_c07.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g10_c01.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g10_c02.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g10_c03.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g10_c04.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g10_c05.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g11_c01.avi 1
ApplyEyeMakeup/v_ApplyEyeMakeup_g11_c02.avi 1
Is this information enough?
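For reference, here is a small sketch of how the expressions in the posted `load_dtf_data` would split one of these list entries into a label, an action folder and a clip name. The variable `dtf_root` is a hypothetical stand-in for `params.dtf_dir`; everything else is taken from the code above.

```matlab
% One line in the format of the list shown above.
entry_line = 'ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1';

% textscan also accepts a character vector, so no file is needed here.
tmp = textscan(entry_line, '%s %d');
entry = tmp{1}{1};   % 'ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi'
label = tmp{2};      % 1, stored as int32 because of the %d specifier

% Same regular expressions as in load_dtf_data.
action = regexprep(entry, '/v_(\w*)\.avi', '');   % 'ApplyEyeMakeup'
clip_name = regexprep(entry, '\.avi$', '');       % drop the .avi extension
clip_name = regexprep(clip_name, '.*/', '');      % 'v_ApplyEyeMakeup_g08_c01'

% dtf_root is a hypothetical stand-in for params.dtf_dir.
dtf_root = 'dtf';
dtf_file = fullfile(dtf_root, action, [clip_name, '.dtf.gz']);
disp(dtf_file);
```

So for each list entry the code expects a file named `<clip_name>.dtf.gz` inside a folder named after the action. If the files on disk are already decompressed, their names would presumably end in `.dtf` rather than `.dtf.gz`, which is why the existence check in `extract_dtf_feats` skips them.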
Hello Mohamad, the dtf_file above is a binary file. You first need to run low-level feature extraction (in this case, IDT) on the raw video. Unfortunately, you cannot use MATLAB for that step; however, I have already shared the whole process here (https://baitulaadiyat.blogspot.com/2020/07/dense-trajectory-and-improve-dense.html).
After you have successfully produced the binary files, you can proceed with the MATLAB processing.
Many Thanks.
I see no reason at the moment to expect that the original Question had anything to do with Dense Trajectory ??
@Walter Roberson, absolutely correct.
Change this part:
switch file_type
case 'train'
tt_list_dir=params.train_list_dir;
reg_pattern='train*';
case 'test'
tt_list_dir=params.test_list_dir;
reg_pattern='test*';
otherwise
error('Unknown file pattern!');
end
into
switch file_type
case 'train'
tt_list_dir=params.train_list_dir;
reg_pattern='train/*.txt'; % lists all the .txt files (not .avi) that you downloaded from the UCF101 Improve IDT train data.
case 'test'
tt_list_dir=params.test_list_dir;
reg_pattern='test/*.txt';
otherwise
error('Unknown file pattern!');
end
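One thing to check with a pattern that contains a subfolder: `dir` returns only the bare file name in the `name` field, so `fopen(fullfile(tt_list_dir,tlists(i).name))` in `load_dtf_data` would look in `tt_list_dir` itself rather than in `tt_list_dir/train`. A minimal sketch of one way around this, using the `folder` field that `dir` returns (MATLAB R2016b or later), is shown below. The temporary folder and the file name `trainlist01.txt` are made up here only so that the sketch is self-contained.

```matlab
% Build a throwaway folder with one list file (hypothetical names).
list_dir = tempname;
mkdir(list_dir);
mkdir(fullfile(list_dir, 'train'));
fid = fopen(fullfile(list_dir, 'train', 'trainlist01.txt'), 'w');
fprintf(fid, 'ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1\n');
fclose(fid);

% List the .txt files inside the train subfolder.
tlists = dir(fullfile(list_dir, 'train', '*.txt'));

tt_list = {};   % list entries (clip paths)
labels = [];    % numeric labels
for i = 1:numel(tlists)
    % Use the folder reported by dir, not the top-level list directory.
    list_path = fullfile(tlists(i).folder, tlists(i).name);
    fid = fopen(list_path, 'r');
    if fid < 0
        error('Could not open %s', list_path);
    end
    tmp = textscan(fid, '%s %d');
    fclose(fid);
    tt_list = [tt_list; tmp{1}];
    labels = [labels; tmp{2}];
end

rmdir(list_dir, 's');   % clean up the throwaway folder
```

An alternative that avoids the `folder` field is to point `params.train_list_dir` directly at the `train` folder and keep the pattern as `'*.txt'`.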

Answers (0)

Asked:

on 19 Mar 2019