Shifting Baselines of Raman Spectra

Hi,
I have been trying to detrend Raman spectra to bring the baseline down and reduce the effects of fluorescence on the analysis, but when I use the code pasted below (with various adjustments based on the data), it does not come out right at all. I am trying to get an xlim of 0-800 with wavenumber (col1) on the x axis and intensity (col2, what I need corrected) on the y axis. I am relatively new to the program, but if anyone has advice I would greatly appreciate it! (I also attached a CSV of one of the data sets in case that helps.)
t = 0:300;
dailyFluct = gallery('normaldata',size(t),2);
sdata = cumsum(dailyFluct) + 20 + t/100;
mean(sdata)
figure
plot(t,sdata);
legend('Original Data','Location','northwest');
xlabel('Time (days)');
ylabel('Stock Price (dollars)');
detrend_sdata = detrend(sdata);
trend = sdata - detrend_sdata;
mean(detrend_sdata)
hold on
plot(t,trend,':r')
plot(t,detrend_sdata,'m')
plot(t,zeros(size(t)),':k')
legend('Original Data','Trend','Detrended Data',...
'Mean of Detrended Data','Location','northwest')
xlabel('Time (days)');
ylabel('Stock Price (dollars)');

Accepted Answer

Star Strider on 17 Sep 2019
This is an interesting problem!
It took me a while to figure out the correct approach, however it is deceptively simple, and uses only core MATLAB functions (although it requires R2017b or later for the ischange function). The essence of it is to use the 'linear' option of ischange to get the slopes of the various line segments, then use histcounts to find the most numerous slopes, corresponding to the linear baseline of your Raman signal data. After that, it’s just a polyfit call to calculate the slope of the line to be detrended. It requires a bit of interaction at the outset to do the initial thresholding, with the rest taking care of itself.
The Code —
D = csvread('raman-1 week.csv');
x = D(:,1);
y = D(:,2);
x = x(y >= 1E+4); % Threshold Data
y = y(y >= 1E+4); % Threshold Data
[Cp,Sl,Ic] = ischange(y,'linear'); % Detect Changes, Calculates Slopes (& Intercepts)
[Cts,Edg,Bin] = histcounts(Sl, 50); % Histogram Of Slopes
[Max,Binmax] = max(Cts); % Find Largest Bin
LinearRegion = (Bin==Binmax); % Logical Vector Of Values Corresponding To Largest Number Of Slopes
B = polyfit(x(LinearRegion), y(LinearRegion), 1) % Linear Fit
L = polyval(B, x); % Evaluate
yc = y - L; % Detrend
figure
plot(x, yc)
grid
The Plot —
That’s likely as close as it’s possible to get. I encourage you to experiment with it to fine-tune it to your requirements.
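For experimenting without the CSV file, here is a minimal sketch of the same steps run on a synthetic spectrum. The synthetic data (a linear baseline, two Gaussian peaks and a little noise) and the names yMin and nBins are assumptions made up for illustration, not taken from the attached data. Whether the largest slope bin really corresponds to the baseline depends on the data, so check the plot rather than trusting the fit blindly.

```matlab
rng(0);                                   % reproducible noise
x = (0:800).';                            % hypothetical wavenumber axis
baseline = 1.2e4 + 5*x;                   % assumed linear fluorescence baseline
peaks = 4e4*exp(-((x-300)/10).^2) + 2.5e4*exp(-((x-550)/15).^2);
y = baseline + peaks + 50*randn(size(x)); % synthetic "spectrum"

yMin  = 1e4;                              % threshold (tune for your data)
nBins = 50;                               % number of slope-histogram bins

keep = y >= yMin;                         % Threshold Data
x = x(keep);
y = y(keep);

[~,Sl] = ischange(y,'linear');            % Per-sample segment slopes
[Cts,~,Bin] = histcounts(Sl, nBins);      % Histogram Of Slopes
[~,Binmax] = max(Cts);                    % Find Largest Bin
LinearRegion = (Bin == Binmax);           % Samples In The Most Populated Slope Bin
B = polyfit(x(LinearRegion), y(LinearRegion), 1);   % Linear Fit
yc = y - polyval(B, x);                   % Detrend

figure
plot(x, y, x, yc)
legend('Synthetic signal','After baseline removal','Location','northwest')
xlim([0 800])
grid
```

Changing yMin or nBins and re-running gives a feel for how sensitive the estimated baseline is to those two choices.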
For comparison, the original signal is:
  2 Comments
Star Strider on 30 Sep 2019
@Csaba — My pleasure!
Thank you!


More Answers (2)

Lindsay Moon on 17 Sep 2019
That is perfect, thank you so much for your help!! I had a feeling it shouldn't have been that difficult, but spectra are a hard thing to work with, and fluorescence is normally not an issue, which is why our software wasn't able to correct for it! But this is great and will be very useful in the future.
  1 Comment
Star Strider on 17 Sep 2019
As always, my pleasure!
Thank you!
The solution wasn’t immediately obvious to me, and the key was discovering how to identify the linearly-increasing baseline.



Lindsay Moon on 17 Sep 2019
Right, I understand, and what you did makes complete sense now that I see it, but would you mind briefly explaining why a normal linear regression didn't work? That still seems a little fuzzy in my mind, since it should, more or less, automatically follow some sort of linear trend.
  1 Comment
Star Strider on 17 Sep 2019
Sure.
A linear regression does work. The problem is finding the values to use in the linear regression. That’s what the ‘LinearRegion’ variable (and the way I constructed it) does.
Doing a linear regression over the entire data set doesn’t work because it’s necessary to isolate the trend first. The trend applies to the entire data set, however without the ‘LinearRegion’ subset of the data being defined first, the linear regression routine (I use polyfit and polyval here, since I’m only calling it once) regresses on all the data presented to it, including the peaks and offsets, not just the isolated region where the trend is most obvious. That leads to a result that does not produce the parameters that when evaluated will correctly detrend the spectrum.
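To see the effect, here is a small sketch on made-up data: a known linear baseline plus one Gaussian peak (all numbers are assumptions chosen for illustration). Fitting every point lets the peak bias the line, while fitting only the peak-free points recovers the baseline. In this sketch the peak-free mask is known by construction; with real data, finding that mask is the job ‘LinearRegion’ does.

```matlab
x = (0:800).';
trueSlope  = 2;                           % assumed baseline slope
trueOffset = 1e4;                         % assumed baseline intercept
baseline = trueOffset + trueSlope*x;
peak = 5e4*exp(-((x-600)/20).^2);         % one hypothetical Raman peak
y = baseline + peak;

Ball = polyfit(x, y, 1);                  % Regress on everything, peak included
peakFree = abs(x-600) > 150;              % Known peak-free samples (by construction)
Bsub = polyfit(x(peakFree), y(peakFree), 1);   % Regress on the baseline-only subset

fprintf('All data:       slope %.4f, intercept %.1f\n', Ball(1), Ball(2));
fprintf('Peak-free only: slope %.4f, intercept %.1f\n', Bsub(1), Bsub(2));

residAll = y - polyval(Ball, x);          % Baseline region ends up tilted
residSub = y - polyval(Bsub, x);          % Baseline region ends up flat near zero
```

Because the peak here sits toward the high-wavenumber end, the all-data fit comes out steeper than the true baseline, so subtracting it leaves the flat parts of the spectrum tilted.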

