Is there a way to get the results of this figure without the input data?

Hussien Alaa
Hussien Alaa on 10 Jul 2021
Edited: DGM on 23 Sep 2021
DGM
DGM on 10 Jul 2021
What exactly are you asking? The result is the figure. If you have the figure, you have the result already. Are you asking to back-calculate the input data from the figure?


Answers (2)

DGM
DGM on 10 Jul 2021
Edited: DGM on 23 Sep 2021
EDIT: I know this is a dead question, but I'm using this as a place for a reference answer.
I've posted similar answers before, and each time I have to point out that the quality of any extracted information will be questionable. You're working with a destructively compressed low-resolution representation of data that's been presented with enough ambiguity that certain details are simply unrecoverable (e.g. thick lines that obscure each other, using the same color for multiple overlapping curves). Starting with a very poor source of information and expecting to automagically extract the data without human effort is simply an exercise in tedious oversight. In other words, you either have to invest effort in transcribing the data, or you'll have to invest at least as much effort in babysitting a very fragile automation of the same.
Following are three approaches to recovering the data, starting with the worst and ending with the best. There may be better or worse approaches than these three, but these are the ones I'm presenting.
1: Direct raster image analysis
Let's attempt to extract the curve from the image directly. Since the red curves are ambiguous and the black curve is the most obscured, I'll use the blue curve.
inpict = imread('crank.jpg');
% try to extract the blue trace
mb = inpict(:,:,1)<100 & inpict(:,:,2)<100 & inpict(:,:,3)>50;
lpict = bwmorph(mb,'thin',100);
[y x] = find(lpict);
imshow(mb)
% data range from graph
xrange = [0 1000];
yrange = [-200 400];
% process with a really wide edge filter
% only select largest objects for each axis
w = 100;
fk = repmat([-1 1],[w 1])./w;
inpict = rgb2gray(inpict)<128;
a = bwareafilt(imfilter(inpict,fk),2);
b = bwareafilt(imfilter(inpict,fk.'),2);
% get plot box extents
S = regionprops(a,'centroid');
C = vertcat(S.Centroid);
xl = C(:,1);
S = regionprops(b,'centroid');
C = vertcat(S.Centroid);
yl = C(:,2);
% rescale to fit data range
x = xrange(1) + diff(xrange)*(x-xl(1))./diff(xl);
y = yrange(2) - diff(yrange)*(y-yl(1))./diff(yl);
% since this is the blue curve, shift back 180d
x = x-180;
% get rid of nonunique points; smooth
[x,idx,~] = unique(x);
y = y(idx);
xn = smooth(x);
yn = smooth(y);
plot(xn,yn); grid on
xlim(xrange)
ylim(yrange)
As lumpy and imperfect as this is, it doesn't usually work out even this well. For the rest of the curves, just offset the curve by 90 degrees. If there are any meaningful differences between the traces, that information was lost long ago. Again, images like this are not accurate data sources. They are merely crude visualizations of data.
2: Manual transcription to SVG with raster image analysis
These last two methods are what I've recommended before. Open the image in a vector image editor (I used Inkscape), overlay a rectangle on the plot box, and use a Bezier tool to manually fit a curve by placing smooth nodes at each inflection point. Select the rectangle and the curve, then export them as a PNG. Now at least half the problem has been eliminated. The compression artifacts, plot box, labels, grid lines, and overlapping nonsense are all gone. It's just a single thin curve. The geometry of the image itself corresponds directly to the plot box extents.
Like in the prior example, edge reduction and point sorting can get us a representation of the data.
% reduce curve, find points
lpict = rgb2gray(imread('cranktrace1.png'));
lpict = bwmorph(lpict>128,'thin',100);
[y x] = find(lpict);
% data range from graph
xrange = [0 1000];
yrange = [-200 400];
% rescale to fit data range
x = xrange(1) + diff(xrange)*x./size(lpict,2);
y = yrange(2) - diff(yrange)*y./size(lpict,1);
% get rid of nonunique points; smooth
[x,idx,~] = unique(x);
y = y(idx);
xn = smooth(x);
yn = smooth(y);
plot(xn,yn); grid on
xlim(xrange)
ylim(yrange)
I've attached the SVG file (in a zip archive, since the site doesn't know what an SVG file is) and the rasterized output for curve 1.
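As a sanity check on the pixel-to-data rescaling used in the raster methods, here is a rough sketch that draws a known curve into a synthetic one-pixel-wide logical image and then recovers it with the same find() and rescaling steps. The image size and the sine test curve are made up for illustration and have nothing to do with the original plot. The point is only that even a perfectly clean raster trace carries a quantization error of up to half a pixel.

```matlab
% synthetic image size and the same data ranges as above
w = 500; h = 300;
xrange = [0 1000];
yrange = [-200 400];

% a known test curve (arbitrary choice), sampled once per pixel column
xtrue = xrange(1) + diff(xrange)*(1:w)/w;
ytrue = 100 + 250*sin(2*pi*xtrue/1000);

% draw it as a 1px-wide trace (row 1 is the top of the image)
rows = round((yrange(2)-ytrue)*h/diff(yrange));
rows = min(max(rows,1),h);
lpict = false(h,w);
lpict(sub2ind([h w],rows,1:w)) = true;

% recover the points and rescale exactly as in method 2
[y,x] = find(lpict);
x = xrange(1) + diff(xrange)*x./size(lpict,2);
y = yrange(2) - diff(yrange)*y./size(lpict,1);

% worst-case error in data units vs. half the height of one pixel
maxerr = max(abs(y(:)-ytrue(:)));
halfpixel = diff(yrange)/h/2;
fprintf('max error: %.3f, half pixel: %.3f\n',maxerr,halfpixel)
```

With a real image, the error from segmentation and thinning adds to this floor, which is why the raster results need smoothing.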
3: Manual transcription to SVG with direct SVG import
Method 2 is only an incremental improvement on method 1. It leverages your ability to make intelligent decisions about the intended shape of potentially ambiguous lines among surrounding noise -- but it throws away a lot of that potential by converting back to a raster image. Without inbuilt SVG import tools, I was left assuming I would have to write one capable of path handling, but apparently I didn't see that there already was one on the FEX:
This simplifies everything. As in method 2, manually transcribe the image using a vector image editor. The objects of interest are a single rectangle corresponding to the plot box, and a path object corresponding to the plot trace. Now the SVG file itself can be imported with significantly reduced error (as good as the transcription itself, anyway).
% filename of manually-fit svg file
fname = 'cranktrace.svg';
% data range from original image axis labels
xrange = [0 1000];
yrange = [-200 400];
% spline discretization parameter [0 1]
coarseness = 0.001;
% get plot box geometry
str = fileread(fname);
str = regexp(str,'((?<=<rect)(.*?)(?=\/>))','match');
pbx = regexp(str,'((?<=x=")(.*?)(?="))','match');
pby = regexp(str,'((?<=y=")(.*?)(?="))','match');
pbw = regexp(str,'((?<=width=")(.*?)(?="))','match');
pbh = regexp(str,'((?<=height=")(.*?)(?="))','match');
pbrect = [str2double(pbx{1}{1}) str2double(pby{1}{1}) ...
str2double(pbw{1}{1}) str2double(pbh{1}{1})];
% get coordinates representing the curve
S = loadsvg(fname,coarseness,false);
x = S{1}(:,1); % assuming the first path is the correct one
y = S{1}(:,2);
% if there are multiple paths you want to extract
% you'll need to do the rescaling, etc. for each element of S
% rescale to fit data range
x = xrange(1) + diff(xrange)*(x-pbrect(1))/pbrect(3);
y = yrange(1) + diff(yrange)*(pbrect(4) - (y-pbrect(2)))/pbrect(4);
% get rid of nonunique points
[x,idx,~] = unique(x);
y = y(idx);
% plot
plot(x,y); grid on; hold on
xlim(xrange)
ylim(yrange)
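If the SVG contains more than one path, the rescaling has to be repeated for each element of S, as the comment in the code notes. A minimal sketch of that loop follows. To keep it runnable without the SVG file, S here is a made-up stand-in for the cell array of [x y] point lists that the code above indexes as S{1}, and pbrect is a made-up plot box. The curves variable is just a name I picked for the output.

```matlab
% stand-in for the imported paths: one Nx2 [x y] array per path (made-up values)
S = {[10 110; 60 60; 110 10], [10 10; 110 110]};
% stand-in plot box geometry in SVG units: [x y width height]
pbrect = [10 10 100 100];
% data range from original image axis labels
xrange = [0 1000];
yrange = [-200 400];

curves = cell(size(S));
for k = 1:numel(S)
    xk = S{k}(:,1);
    yk = S{k}(:,2);
    % rescale to fit data range (same expressions as above)
    xk = xrange(1) + diff(xrange)*(xk-pbrect(1))/pbrect(3);
    yk = yrange(1) + diff(yrange)*(pbrect(4) - (yk-pbrect(2)))/pbrect(4);
    % get rid of nonunique points
    [xk,idx] = unique(xk);
    curves{k} = [xk yk(idx)];
end
```

Each element of curves is then an Nx2 array in data units, so plot(curves{k}(:,1),curves{k}(:,2)) draws the k-th trace.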
Conclusion
By overlaying these three examples in the same plot (zoomed in), we can see the amount of jitter caused by the fact that two of the curves are derived from binarized raster images. The red trace (method 1) is the worst due to the segmentation defects, but the magenta trace (method 2) is still fairly close to the smooth curve provided by method 3.
Especially for smooth curves, direct transcription and SVG import yields smooth results which are as accurate as the user bothers to transcribe the image. The susceptibility to problems caused by compression, annotations, grid lines, overlapping curves, linetype, linewidth, or tight cusps is minimized. Relying on direct, automated analysis of crude and heavily compressed raster images is severely limited.
These examples are merely examples, not robust and generally applicable code. The raster methods especially are likely to require adjustment for use with other images.

Juan Navarro
Juan Navarro on 10 Jul 2021
See this:
https://www.mathworks.com/matlabcentral/answers/383567-how-to-extract-x-y-data-values-from-matlab-figure
DGM
DGM on 10 Jul 2021
It's worth noting that this works if the figure itself is available. If the figure window is open or is saved as a .fig file, the plotted data is embedded and can be extracted exactly. While this would be preferable, it's not an option if the only copy is a raster image. OP should probably clarify...
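For reference, a minimal sketch of the .fig case. It writes a throwaway .fig file first so that it runs on its own. With a real file, skip that part and point fname at the file. The plotted values are made up, and this assumes the figure contains ordinary line objects.

```matlab
% create a throwaway .fig file (stand-in for a real saved figure)
fname = [tempname '.fig'];
hf = figure('Visible','off');
plot(0:10,(0:10).^2);
savefig(hf,fname);
close(hf);

% reopen the figure without displaying it and find the line objects
hf = openfig(fname,'invisible');
hl = findobj(hf,'Type','line');

% the plotted data is stored in the line object itself
x = get(hl(1),'XData');
y = get(hl(1),'YData');

% clean up
close(hf);
delete(fname);
```

If the figure has several traces, hl will have several elements, and each has its own XData and YData.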
