Break an image file into multiple image files in MATLAB

I need to process a source image file (a .jpg) that contains a paragraph of text. I want to separate each line of text and save it to its own file.
For example, I have an image file which contains:
Chicken Sandwich
Heavy Motor
Sign Language
I want to create 3 files, each containing one line. I want to split the image according to the line spacing. How should I proceed?

 Accepted Answer

Try collapsing the image horizontally:
verticalProfile = sum(grayImage, 2);
This gives one value per image row. Then look for bright and dark regions in that profile; they indicate where the lines of text start and stop.
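As a rough illustration of that idea, here is a minimal sketch on a synthetic image. The image size, the pixel values, and the names isTextRow, backgroundSum, and padded are made up for this example. It assumes a perfectly white background; a real scanned JPEG will need a tolerance or a threshold instead of an exact comparison.

```matlab
% Synthetic 12-by-20 grayscale image: white background (255) with two dark
% "text" bands. Sizes and values are invented for illustration only.
grayImage = uint8(255 * ones(12, 20));
grayImage(3:5, 4:15) = 0;    % first line of "text"
grayImage(8:10, 2:18) = 0;   % second line of "text"

% Collapse horizontally: one sum per image row. Rows that contain dark
% text have a lower sum than pure-background rows.
verticalProfile = sum(grayImage, 2);

% Assumption: the background is exactly 255, so any lower sum means text.
backgroundSum = 255 * size(grayImage, 2);
isTextRow = verticalProfile < backgroundSum;

% Pad with false so runs touching the top or bottom edge are also found.
padded = [false; isTextRow; false];
rowStarts = find(diff(padded) > 0);      % first row of each text band
rowEnds = find(diff(padded) < 0) - 1;    % last row of each text band

disp([rowStarts, rowEnds]);   % one row per text band: [start, end]
```

For this synthetic image the two bands come out as rows 3 to 5 and rows 8 to 10.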

11 Comments

Can you please elaborate a little? I am sorry, I didn't understand. I am quite new to MATLAB.
Where did you upload your image to, so I can use it. Make it easy for me - don't make me go look for an image, because then you'll just say "but my image doesn't look like that."
Here is the image. I need to extract each line, or better, only the equations. However, I think I can extract the equations myself if I know how to extract the lines first.
Thanks in advance.
Wow, because building the equations would be the hardest part. Here, try this. I wrote code for you to extract every line of text from the image and show it to you: (Obviously you have to change the folder and the base file name of the image in the code to reflect where it is on your hard drive.)
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
imtool close all; % Close all imtool figures if you have the Image Processing Toolbox.
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 20;
% Check that user has the Image Processing Toolbox installed.
hasIPT = license('test', 'image_toolbox');
if ~hasIPT
    % User does not have the toolbox installed.
    message = sprintf('Sorry, but you do not seem to have the Image Processing Toolbox.\nDo you want to try to continue anyway?');
    reply = questdlg(message, 'Toolbox missing', 'Yes', 'No', 'Yes');
    if strcmpi(reply, 'No')
        % User said No, so exit.
        return;
    end
end
% Read in the image of the text. Change the folder and file name to match your own image.
folder = 'C:\Users\Swarnadeep\Documents\Temporary';
baseFileName = 'text 003.jpg';
% Get the full filename, with path prepended.
fullFileName = fullfile(folder, baseFileName);
% Check if file exists.
if ~exist(fullFileName, 'file')
    % File doesn't exist -- didn't find it there. Check the search path for it.
    fullFileName = baseFileName; % No path this time.
    if ~exist(fullFileName, 'file')
        % Still didn't find it. Alert user.
        errorMessage = sprintf('Error: %s does not exist in the search path folders.', fullFileName);
        uiwait(warndlg(errorMessage));
        return;
    end
end
grayImage = imread(fullFileName);
% Get the dimensions of the image.
% numberOfColorBands should be = 1.
[rows, columns, numberOfColorBands] = size(grayImage);
if numberOfColorBands > 1
    % It's not really gray scale like we expected - it's color.
    % Convert it to gray scale by taking only the green channel.
    grayImage = grayImage(:, :, 2); % Take green channel.
end
% Display the original gray scale image.
subplot(2, 3, 1);
imshow(grayImage, []);
axis on;
title('Original Grayscale Image', 'FontSize', fontSize);
% Enlarge figure to full screen.
set(gcf, 'units','normalized','outerposition',[0 0 1 1]);
% Give a name to the title bar.
set(gcf,'name','Demo by ImageAnalyst','numbertitle','off')
% Let's compute and display the histogram.
[pixelCount, grayLevels] = imhist(grayImage);
subplot(2, 3, 2);
bar(pixelCount);
grid on;
title('Histogram of original image', 'FontSize', fontSize);
xlim([0 grayLevels(end)]); % Scale x axis manually.
% Binarize the image.
binaryImage = grayImage < 210;
subplot(2, 3, 3);
imshow(binaryImage, []);
axis on;
title('Binary Image', 'FontSize', fontSize);
% Find the lines and plot them.
verticalProfile = any(binaryImage, 2);
subplot(2, 3, 4);
plot(verticalProfile);
grid on;
title('Vertical Profile', 'FontSize', fontSize);
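% Find out where each line starts and stops.
% Pad the profile with false at both ends so that text touching the top or
% bottom edge of the image still produces a matching start/end pair.
paddedProfile = [false; verticalProfile; false];
rowStarts = find(diff(paddedProfile)>0)
rowEnds = find(diff(paddedProfile)<0)-1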
subplot(2, 3, 5:6);
title('This Line', 'FontSize', fontSize);
for row = 1 : length(rowStarts)
    thisRow1 = rowStarts(row);
    thisRow2 = rowEnds(row);
    croppedImage = grayImage(thisRow1:thisRow2, :);
    imshow(croppedImage);
    caption = sprintf('Line of text #%d is between lines %d and %d (inclusive)', ...
        row, thisRow1, thisRow2)
    title(caption, 'FontSize', fontSize);
    promptMessage = sprintf('Showing image between lines %d and %d (inclusive).\nDo you want to Continue processing,\nor Cancel to abort processing?',...
        thisRow1, thisRow2);
    titleBarCaption = 'Continue?';
    button = questdlg(promptMessage, titleBarCaption, 'Continue', 'Cancel', 'Continue');
    if strcmpi(button, 'Cancel')
        return;
    end
end
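The loop above only displays each cropped line. Since the original question asked for one file per line, here is a minimal sketch of how the crops might be written to disk with imwrite. The output folder, the file name pattern, and the stand-in image data are assumptions made for this example, not part of the code above.

```matlab
% Stand-in data so this sketch runs on its own. In the script above,
% grayImage, rowStarts and rowEnds already exist.
grayImage = uint8(255 * ones(12, 20));
grayImage(3:5, 4:15) = 0;
grayImage(8:10, 2:18) = 0;
rowStarts = [3; 8];
rowEnds = [5; 10];

% Hypothetical output location and naming scheme; change to suit.
outputFolder = fullfile(tempdir, 'text_lines');
if ~exist(outputFolder, 'dir')
    mkdir(outputFolder);
end

for k = 1 : length(rowStarts)
    % Crop the full-width band of rows that holds this line of text.
    croppedImage = grayImage(rowStarts(k):rowEnds(k), :);
    % Build a name such as line_001.png and write the crop to disk.
    outputFileName = fullfile(outputFolder, sprintf('line_%03d.png', k));
    imwrite(croppedImage, outputFileName);
    fprintf('Saved rows %d to %d as %s\n', rowStarts(k), rowEnds(k), outputFileName);
end
```

PNG is used here because it is lossless, so the saved lines are not recompressed a second time as they would be with JPEG. Writing .jpg files works the same way by changing the extension.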
Wow, thank you so much, it works perfectly. I was using a more manual method (i.e. working directly on the image matrix), but it wasn't foolproof enough.
Now I need to work on how to get the equations.
Any tips regarding that?
No. Do you recall that you said "I think that I can extract the equations myself, if I know how to extract the lines first" and that I was surprised because that was the hardest part and you already knew how to do it? Well I extracted the lines for you, so now you can do the part you said you could do. Sorry, but I can't do every part of your project for you.
No no you don't need to. I got the approach and I will work on it.
I just want to ask you one thing: what is this line for? binaryImage = grayImage < 210;
I know it's to binarize the image, but I can't seem to understand how it works.
It gives a binary image (true or false) that says if the pixel is below or above 210. Look at the binary image and you'll see that the pixels that are darker than that will be true (white), so it thresholds the image to identify the text.
I got that, but why 210? Why not, say, 500 or even 10?
I plotted the histogram. Did you not notice it? What value do you think would split it into foreground and background the best?
Ah, got it now. Thanks a lot, sir, for your help.
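To make the threshold discussion concrete, here is a small sketch that applies the three values mentioned in this thread (10, 210 and 500) to a synthetic uint8 image. The image and its gray levels (240 for background, 40 for ink) are invented for illustration; the right threshold for a real scan depends on its own histogram.

```matlab
% Synthetic uint8 image: light background (240) with a block of dark ink (40).
grayImage = uint8(240 * ones(10, 10));
grayImage(4:6, 2:9) = 40;   % 3 rows by 8 columns = 24 ink pixels out of 100

thresholds = [10, 210, 500];
for t = thresholds
    % The comparison returns a logical array: true where the pixel is darker than t.
    binaryImage = grayImage < t;
    fprintf('Threshold %3d marks %3d of %d pixels as text\n', ...
        t, nnz(binaryImage), numel(binaryImage));
end
```

With these made-up values, a threshold of 10 selects nothing because no pixel is that dark, 500 selects everything because a uint8 pixel can never exceed 255, and 210 falls between the ink and the background and selects only the 24 ink pixels. If picking the value by eye from the histogram is not practical, the Image Processing Toolbox function graythresh computes a threshold automatically using Otsu's method; whether it suits a particular scan is something to check on the actual image.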
