MATLAB Answers

Shifting Baselines of Raman Spectra

I have been trying to detrend Raman spectra to bring the baseline down and reduce the effects of fluorescence on the analysis, but when I use the code pasted below (with various adjustments based on the data), it does not come out right at all. I am trying to get an xlim of 0-800 with wavenumber (col1) on the x axis and intensity (col2, which is what I need corrected) on the y axis. I am relatively new to the program, so if anyone has advice I would greatly appreciate it! (I have also attached a CSV of one of the data sets in case that helps.)
t = 0:300;
dailyFluct = gallery('normaldata',size(t),2);
sdata = cumsum(dailyFluct) + 20 + t/100;
plot(t,sdata); % Original Data
legend('Original Data','Location','northwest');
xlabel('Time (days)');
ylabel('Stock Price (dollars)');
detrend_sdata = detrend(sdata);
trend = sdata - detrend_sdata;
hold on
plot(t,trend,':r') % Trend
plot(t,detrend_sdata,'m') % Detrended Data
plot(t,zeros(size(t)),':k') % Mean of Detrended Data
legend('Original Data','Trend','Detrended Data',...
'Mean of Detrended Data','Location','northwest')
xlabel('Time (days)');
ylabel('Stock Price (dollars)');



Accepted Answer

Star Strider on 17 Sep 2019
This is an interesting problem!
It took me a while to figure out the correct approach. It is deceptively simple, and uses only core MATLAB functions (although it requires R2017b or later for the ischange function). The essence of it is to use the 'linear' option of ischange to get the slopes of the various line segments, then use histcounts to find the most numerous slopes, which correspond to the linear baseline of your Raman signal data. After that, it's just a polyfit call to calculate the slope and intercept of the line to be removed. It requires a bit of interaction at the outset to do the initial thresholding, with the rest taking care of itself.
The Code —
D = csvread('raman-1 week.csv');
x = D(:,1);
y = D(:,2);
x = x(y >= 1E+4); % Threshold Data
y = y(y >= 1E+4); % Threshold Data
[Cp,Sl,Ic] = ischange(y,'linear'); % Detect Changes, Calculates Slopes (& Intercepts)
[Cts,Edg,Bin] = histcounts(Sl, 50); % Histogram Of Slopes
[Max,Binmax] = max(Cts); % Find Largest Bin
LinearRegion = (Bin==Binmax); % Logical Vector Of Values Corresponding To Largest Number Of Slopes
B = polyfit(x(LinearRegion), y(LinearRegion), 1) % Linear Fit
L = polyval(B, x); % Evaluate
yc = y - L; % Detrend
plot(x, yc)
The Plot —
That’s likely as close as it’s possible to get. I encourage you to experiment with it to fine-tune it to your requirements.
For comparison, the original signal is:


Csaba on 30 Sep 2019
It is a very elegant and clever solution. However, I have to add that this solution has some limitations. It assumes that
1./ most of the spectrum is baseline, and
2./ the baseline is linear.
If these conditions are met, then it is very nice. If the baseline is, e.g., quadratic, then you are stuck. If there are a lot of peaks and you can only assume a baseline, you are stuck as well.
Anyway, I do not want to criticize the solution. I got VERY good ideas from it, and thank you for that!
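To make the second limitation concrete, here is a minimal sketch on synthetic data (every number in it is invented for illustration, and it is not part of the accepted answer). It assumes the baseline points are already known through the mask isBase, which is only possible here because the peaks are synthetic; finding that mask in a real spectrum with a curved baseline is exactly the hard part. Given the mask, a first-order polyfit leaves the curvature behind, while a second-order polyfit removes it.

```matlab
% Synthetic spectrum with a quadratic baseline (all values are made up)
x = linspace(0, 800, 801).';                      % wavenumber axis
baseline = 1e4 + 8*x - 0.006*x.^2;                % curved baseline
peaks = 3e3*exp(-((x-250)/12).^2) + 5e3*exp(-((x-520)/20).^2);
y = baseline + peaks;

% Assumption: baseline-only points are known (true only for synthetic data)
isBase = peaks < 1;

% Fit first- and second-order polynomials to the baseline points.
% The third output (mu) centers and scales x for a well-conditioned fit.
[p1,~,mu1] = polyfit(x(isBase), y(isBase), 1);
[p2,~,mu2] = polyfit(x(isBase), y(isBase), 2);

res1 = y - polyval(p1, x, [], mu1);               % corrected with a line
res2 = y - polyval(p2, x, [], mu2);               % corrected with a parabola

% Largest leftover baseline error in the peak-free regions
errLinear = max(abs(res1(isBase)));
errQuad   = max(abs(res2(isBase)));
fprintf('Residual baseline, linear fit:    %.1f\n', errLinear);
fprintf('Residual baseline, quadratic fit: %.1f\n', errQuad);

plot(x, res1, x, res2)
legend('After linear fit', 'After quadratic fit')
```

The polynomial order is a choice the user has to make; raising it too far risks fitting the peaks themselves.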
Star Strider on 30 Sep 2019
@Csaba — My pleasure!
Thank you!


More Answers (2)

Lindsay Moon on 17 Sep 2019
That is perfect, thank you so much for your help!! I had a feeling it shouldn't have been that difficult, but spectra are hard to work with, and fluorescence is normally not an issue, which is why our software wasn't able to correct for it. This is great and will be very useful in the future.


Star Strider on 17 Sep 2019
As always, my pleasure!
Thank you!
The solution wasn’t immediately obvious to me, and the key was discovering how to identify the linearly-increasing baseline.


Lindsay Moon on 17 Sep 2019
Right, I understand, and what you did makes complete sense now that I see it, but would you mind briefly explaining why a normal linear regression didn't work? That still seems a little fuzzy in my mind, since the data should, more or less, automatically follow some sort of linear trend.


Star Strider on 17 Sep 2019
A linear regression does work. The problem is finding the values to use in the linear regression. That’s what the ‘LinearRegion’ variable (and the way I constructed it) does.
Doing a linear regression over the entire data set doesn’t work because it’s necessary to isolate the trend first. The trend applies to the entire data set, but unless the ‘LinearRegion’ subset of the data is defined first, the linear regression routine (I use polyfit and polyval here, since I’m only calling it once) regresses on all the data presented to it, including the peaks and offsets, not just the isolated region where the trend is most obvious. The peaks bias the fitted slope and intercept, so subtracting that line does not correctly detrend the spectrum.
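The effect is easy to see on a small synthetic spectrum. The sketch below is only an illustration with made-up numbers (a linear baseline with a known slope of 5 plus two Gaussian peaks near the high-wavenumber end), and the mask isBase stands in for the ‘LinearRegion’ vector; here it is built directly from the known synthetic peaks rather than with ischange and histcounts.

```matlab
% Synthetic spectrum (all values are invented for illustration)
x = linspace(0, 800, 801).';                      % wavenumber axis
trueSlope = 5;
baseline = 1e4 + trueSlope*x;                     % linear baseline
peaks = 4e3*exp(-((x-600)/15).^2) + 2.5e3*exp(-((x-700)/10).^2);
y = baseline + peaks;

% Stand-in for 'LinearRegion': points with no peak contribution
% (known here only because the peaks are synthetic)
isBase = peaks < 1;

Ball  = polyfit(x, y, 1);                         % regression on all the data
Bbase = polyfit(x(isBase), y(isBase), 1);         % regression on baseline points only

fprintf('True slope:               %.3f\n', trueSlope);
fprintf('Slope, all data:          %.3f\n', Ball(1));
fprintf('Slope, baseline points:   %.3f\n', Bbase(1));

% Detrend with each line and compare
plot(x, y - polyval(Ball, x), x, y - polyval(Bbase, x))
legend('Detrended with all-data fit', 'Detrended with baseline-only fit')
```

Because the peaks sit toward one end of the axis, the all-data fit is pulled upward there and the detrended result ends up tilted, while the baseline-only fit leaves the peak-free regions close to zero.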

