How can I extract the data from this JPEG file? Any suggestions?

Answers (1)

DGM on 24 Sep 2021
Edited: DGM on 24 Sep 2021
Depending on what your expectations are, the answer is one of the following:
  1. You can't get what you want.
  2. You can, but you won't want to.
Let me explain. If you want the original data, you're out of luck. It's gone. Simply put, the image is not the data. All images of plotted data are simplified representations of the underlying data, intended for visualization only. The size of the image, the width of the lines, and the degree of destructive compression used all influence the precision of any recovery attempt. In this specific case, the image is a contour plot. A contour plot represents a height map which has been quantized. That means that the original height data is already lost. The best you can hope to recover is a roughly quantized version of it (with plenty of added inaccuracy in the steep areas).
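To make the loss concrete, here is a minimal sketch of what quantization does to a height map. It assumes peaks() as a stand-in for the unknown original data and an isoline spacing of 2; the names `step` and `Zq` are illustrative only, not anything taken from the image in question.

```matlab
% Stand-in height map (assumption: the real data is unknown)
[X, Y] = meshgrid(linspace(-3, 3, 200));
Z = peaks(X, Y);

% Quantize to an assumed isoline spacing by snapping each height
% down to the nearest level at or below it
step = 2;
Zq = step*floor(Z/step);

% Every height inside a band collapses to the same value, so the
% per-sample error can approach the full level spacing
maxErr = max(abs(Z(:) - Zq(:)));
nDistinct = numel(unique(Zq(:)));
fprintf('max error: %.3f, distinct heights remaining: %d\n', maxErr, nDistinct);
```

Whatever is recovered from the image can be no better than `Zq`, and in practice it will be worse once line width and JPG artifacts are accounted for.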
If a crude quantized map is adequate for your needs, then you'll have to deal with all of the issues with the image. You could try a number of things, but bear in mind that steep areas where the isolines are very close or tangent are ambiguous and unrecoverable without simply guessing. You could try segmentation by color, but given that it's a JPG and some of the regions are no wider than the isolines themselves, I wouldn't bother. As always, I recommend manually transcribing the data using a vector graphics editor of some sort and then processing the result. Since a lot of guesswork is required, you can do a better job of making meaningful guesses than any ad-hoc script can.
This is an example of recovering a smooth line plot, but something similar can be done to recover the isolines. Once you have the isolines (and have made sure they don't intersect each other), you can interpolate between them (perhaps using scatteredInterpolant() or similar).
Given that you have multiple curves (the isolines) and that x-values will be repeated, the third method would be the best. You will need to omit the part where nonunique points are removed.
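As a rough sketch of the interpolation step, assume the isolines have already been transcribed into point lists with a known height for each. The data below is entirely hypothetical: concentric circles on a cone z = 10 - r, plus one point for the peak. Real transcribed isolines would replace `x`, `y`, and `v`.

```matlab
% Hypothetical transcribed isolines: concentric circles on a cone z = 10 - r
theta = linspace(0, 2*pi, 61);
theta(end) = [];                 % drop the repeated 0 / 2*pi endpoint
radii = 1:5;                     % one isoline per radius (assumed)
[T, R] = meshgrid(theta, radii);
x = R(:).*cos(T(:));
y = R(:).*sin(T(:));
v = 10 - R(:);                   % height assigned to each isoline point

% Add the peak so the innermost region is not left flat
x = [x; 0];
y = [y; 0];
v = [v; 10];

% Linear interpolation between isolines; NaN outside the convex hull
F = scatteredInterpolant(x, y, v, 'linear', 'none');

% Resample onto a regular grid
[Xq, Yq] = meshgrid(linspace(-5, 5, 101));
Zq = F(Xq, Yq);

% Query a point lying between the r = 2 and r = 3 isolines
zMid = F(2.5, 0);
```

Here `zMid` should fall between the heights of the two neighboring isolines (7 and 8). Using 'none' for extrapolation keeps the result honest about where no isoline data exists.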
You may also need to bear in mind how exactly the image was generated in order to know how to translate the image+colorbar into the intended height values. If it came from MATLAB and was generated by contourf(), then consider that each isoline is adjacent to two colored regions. The height at the isoline is equal to the maximum of the two heights indicated by the colorbar. For example:
In other words, the height indicated by the color of each region is not an intermediate value between the isolines; it's the minimum value. Depending on where the plot came from, that's not always the case.
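The convention described above can be checked with a small sketch, assuming the plot was made with contourf() and an explicit level list. Here peaks() and the `levels` vector are assumptions used only for illustration; `lowerLevel` is the value the region's color indicates on the colorbar, and `upperLevel` is the height of the isoline bounding the region from above.

```matlab
levels = -6:2:8;                          % assumed isoline heights
[X, Y] = meshgrid(linspace(-3, 3, 200));
Z = peaks(X, Y);                          % stand-in for the unknown data

% Filled contour plot with a colorbar for visual comparison
contourf(X, Y, Z, levels);
colorbar

% Take a sample point and find the two levels that bracket its true height
zs = peaks(0, 0);                         % true height at the origin
lowerLevel = max(levels(levels <= zs));   % value indicated by the fill color
upperLevel = min(levels(levels > zs));    % height of the next isoline up
fprintf('true: %.3f, color indicates: %g, next isoline: %g\n', ...
    zs, lowerLevel, upperLevel);
```

When translating colors back into heights, this means a region's color gives a lower bound on the height inside it, and the isoline on its uphill side sits at the next level.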
