Parallel Computing for video compression

Hi guys,
I need some help with parallel programming in MATLAB. To be clear, I have never implemented parallelization techniques in any of my code before.
I have a video compression engine, developed as part of my university project. It is a basic version of an H.264 video compression engine. I have to apply the parallel processing techniques available in MATLAB to this engine. Basically, I have a function which divides an image frame into a number of blocks (predetermined by the block size). I'm trying to partially or fully parallelize this part of the code. I used "parfor" where there was no dependency between the blocks, and this worked out well. I have uploaded this implementation. Now I'm trying to parallelize a case where there are dependencies between blocks.
function [reconstructed_frames, residual_blocks, encoded_data_cell, bit_count_coeff_per_frame, bit_count_mv_per_frame_cell, real_avg_bit_count_per_row_per_frame, total_bit_count_per_frame, QP_used_in_row, scene_change_frames, SAD_value_per_frame] = block_prediction_parallalized(Y, block_size, srch_rng, QP, I_period,pathToResiduals, no_ref_frames, VBS_enable, Fast_ME_enable,Frac_ME_enable,lambda, RC_flag, avg_bit_count_row_vary_QP, target_bits_per_frame)
%Function to predict frames based on inter prediction and intra prediction,
%with the given I-period
Y = int64(Y);
[no_rows, no_cols, no_frames] = size(Y);
no_blocks_in_row = (no_cols*block_size)/(block_size*block_size);
no_blocks_in_col = (no_rows*block_size)/(block_size*block_size);
total_blocks_per_frame = (no_rows*no_cols)/(block_size*block_size);
encoded_data_cell = cell(1,total_blocks_per_frame,no_frames);
encoded_data_per_frame = cell(1, total_blocks_per_frame);
ref_frame_inter = zeros(no_rows, no_cols, 1, 'int64') + 128;
bit_count_coeff_per_frame = 0;
bit_count_mv_per_frame_cell = 0;
real_avg_bit_count_per_row_per_frame = 0;
QP_used_in_row = zeros(1,no_blocks_in_col,no_frames);
QP_used_in_row(:,:,:) = QP;
scene_change_frames = [];
SAD_value_per_frame = 0;
ref_frame_index_count = 1;
for k = 1:no_frames
    if k>1
        ref_frame_inter(:,:,1) = Y(:,:,k-1);
    end
    block_segment = 0;
    bitCountMV = 0;
    for row = 1 : block_size : no_rows - block_size + 1
        for col = 1 : block_size : no_cols - block_size + 1
            block_segment = block_segment + 1;
            row_start = row;
            row_end = row_start + block_size - 1;
            col_start = col;
            col_end = col_start + block_size - 1;
            row_end = min(row_end, no_rows);
            col_end = min(col_end, no_cols);
            % Making an array of blocks of size block_size
            block_list_currframe(:,:,block_segment) = Y(row_start:row_end, col_start:col_end, k);
            location_pointers(block_segment,:) = [row_start row_end col_start col_end];
        end
    end
    %Parallelizing the block encoding process
    max_index = size(block_list_currframe,3);
    %Loop for processing blocks concurrently
    parfor block_index = 1:max_index
        % Function for inter-prediction
        [encoded_data, reconstructed_block, residual_block, bit_count_per_block] = paral_debug_funct(block_index, location_pointers, block_list_currframe, ref_frame_inter, block_size, srch_rng, QP, no_rows, no_cols, ref_frame_index_count, VBS_enable, Fast_ME_enable, Frac_ME_enable, lambda);
        %Buffering the output of each worker
        reconstructed_blocks(:,:,block_index) = reconstructed_block;
        residual_blocks_in_frame(:,:,block_index) = residual_block;
        encoded_data_per_frame(:,:, block_index) = encoded_data;
        total_bit_count_per_block(block_index) = bit_count_per_block;
    end
    %Processing the buffered outputs obtained after processing all the
    %blocks.
    for block_index = 1:size(block_list_currframe,3)
        % [row_start, row_end, col_start, col_end] = location_pointers(block_index,:);
        row_start = location_pointers(block_index, 1);
        row_end = location_pointers(block_index, 2);
        col_start = location_pointers(block_index, 3);
        col_end = location_pointers(block_index, 4);
        reconstructed_frames(row_start:row_end, col_start:col_end, k) = reconstructed_blocks(:,:,block_index);
        residual_blocks(:,:,block_index,k) = residual_blocks_in_frame(:,:,block_index);
        encoded_data_cell(:,:,block_index,k) = encoded_data_per_frame(:,:,block_index);
    end
    total_bit_count_per_frame(k) = sum(total_bit_count_per_block, 'all');
end
In the above code, the blocks don't have to communicate with each other. Now I need them to communicate at some point, because the processing of some blocks has to wait for a previous block to finish.
I think the image below will help make it clearer.
I have learned that there are two types of parallel processing available: multi-threading and multi-processing. I think multi-threading is the better fit for my use case. I have read about spmd and parfeval, but the examples I've come across are usually not very detailed. As I am new to parallel processing, these options feel confusing and it is difficult to choose which one to focus on. I think what I want is for the workers to be able to communicate with each other during execution, but I'm not sure. If you need a general idea of the data size:
video frame size = 288x352 (CIF format)
block size = 16
number of frames = 21
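To get a feel for how much parallelism is left once blocks depend on each other, here is a minimal sketch for the sizes above. It assumes (this is only an assumption, since the actual dependency pattern is in the image) that each block waits for its left and top neighbours. Under that assumption, all blocks on the same anti-diagonal are independent of each other and form one "wave". The variable names are made up for illustration.

```matlab
% Sizes taken from the question
frame_rows = 288;
frame_cols = 352;
block_size = 16;

n_block_rows = frame_rows / block_size;       % 18 block rows
n_block_cols = frame_cols / block_size;       % 22 block columns
total_blocks = n_block_rows * n_block_cols;   % 396 blocks per frame

% ASSUMPTION: a block depends only on its left and top neighbours.
% Then block (r, c) can start as soon as wave r + c - 2 is finished,
% so blocks sharing the same r + c - 1 can run concurrently.
[col_idx, row_idx] = meshgrid(1:n_block_cols, 1:n_block_rows);
wave_id = row_idx + col_idx - 1;              % 18 x 22 matrix of wave numbers

n_waves = max(wave_id(:));                    % 39 waves per frame
blocks_per_wave = accumarray(wave_id(:), 1).';% how many blocks each wave holds
max_parallel = max(blocks_per_wave);          % at most 18 blocks at once

fprintf('%d blocks, %d waves, up to %d blocks in parallel\n', ...
    total_blocks, n_waves, max_parallel);
```

So for a CIF frame with 16x16 blocks, this dependency pattern would never keep more than 18 workers busy at the same time, and the first and last few waves contain only a handful of blocks.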

Accepted Answer

Vishnu Pradeep on 2 Dec 2021
Someone on another forum helped me with this answer. In case it helps anyone, I'm posting it here. Feel free to ask any questions! :)
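You can use a parfor inside an ordinary (non-parallel) for loop. The outer loop steps through the color groups one after another, and the parfor processes all blocks of the same color at once, something like this:
previous_blocks = {};
for color = ["green", "red", "blue"]   % color groups, handled one after another
    % Pseudocode: input_blocks = cell array of the blocks with this color, extracted from the image
    processed_blocks = cell(1, numel(input_blocks));
    parfor i = 1:numel(input_blocks)
        % Blocks of the same color only read results from earlier groups
        processed_blocks{i} = process_based_on_previous_blocks(i, input_blocks{i}, previous_blocks);
    end
    previous_blocks = processed_blocks;
    % Pseudocode: place processed_blocks back at their original positions in the image
end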
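For anyone who wants something they can run directly, below is a small self-contained sketch of the same pattern. It is not the encoder itself. It assumes each block depends on its left and top neighbours, and the "processing" is a toy stand-in (1 + the larger of the two neighbour results) so that the outcome is easy to check. The names wave_id, finished, members and wave_out are made up for this example.

```matlab
% Block grid for a 288x352 frame with 16x16 blocks
n_block_rows = 18;
n_block_cols = 22;

% ASSUMPTION: block (r, c) needs the results of (r, c-1) and (r-1, c).
% Blocks with the same r + c - 1 are then independent of each other.
[col_idx, row_idx] = meshgrid(1:n_block_cols, 1:n_block_rows);
wave_id = row_idx + col_idx - 1;

finished = zeros(n_block_rows, n_block_cols);  % results of completed blocks

for w = 1:max(wave_id(:))                      % serial loop over the waves
    members = find(wave_id == w);              % linear indices of this wave
    wave_out = zeros(1, numel(members));       % buffer filled by the workers

    parfor i = 1:numel(members)                % blocks inside a wave in parallel
        [r, c] = ind2sub([n_block_rows, n_block_cols], members(i));
        left = 0;
        top = 0;
        if c > 1
            left = finished(r, c-1);           % result from an earlier wave
        end
        if r > 1
            top = finished(r-1, c);            % result from an earlier wave
        end
        % Toy stand-in for the real per-block encoding step
        wave_out(i) = 1 + max(left, top);
    end

    finished(members) = wave_out;              % publish results for later waves
end

% With this toy rule every block ends up holding its own wave number
disp(isequal(finished, wave_id));
```

Inside the parfor, finished is only read and wave_out is only written at index i, which is what lets MATLAB classify them as broadcast and sliced variables. The results are copied back into finished after the parfor ends, so the next wave sees them. In the real encoder the toy line would be replaced by the per-block function, and a different dependency pattern would only change how wave_id is computed.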
