IntensitiesOut = msalign(MZ, Intensities, RefMZ)
... = msalign(..., 'Weights', WeightsValue, ...)
... = msalign(..., 'Range', RangeValue, ...)
... = msalign(..., 'WidthOfPulses', WidthOfPulsesValue, ...)
... = msalign(..., 'WindowSizeRatio', WindowSizeRatioValue, ...)
... = msalign(..., 'Iterations', IterationsValue, ...)
... = msalign(..., 'GridSteps', GridStepsValue, ...)
... = msalign(..., 'SearchSpace', SearchSpaceValue, ...)
... = msalign(..., 'ShowPlot', ShowPlotValue, ...)
[IntensitiesOut, RefMZOut] = msalign(..., 'Group', GroupValue, ...)
| MZ | Vector of mass/charge (m/z) values for a spectrum or set of spectra. The number of elements in the vector equals n or the number of rows in the matrix Intensities. |
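| Intensities | Either of the following: a column vector of intensity values for a single spectrum, or a matrix of intensity values for a set of spectra, with one column per spectrum. The number of rows equals n or the number of elements in vector MZ. |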
| RefMZ | Vector of m/z values of known reference masses in a sample spectrum. |
| WeightsValue | Vector of positive values, with the same number of elements as RefMZ. The default vector is ones(size(RefMZ)). |
| RangeValue | Two-element vector, in which the first element is negative and the second element is positive, that specifies the lower and upper limits of a range, in m/z units, relative to each peak. No peak will shift beyond these limits. Default is [-100 100]. |
| WidthOfPulsesValue | Positive value that specifies the width, in m/z units, for all the Gaussian pulses used to build the correlating synthetic spectrum. The point of the peak where the Gaussian pulse reaches 60.65% of its maximum is set to the width specified by WidthOfPulsesValue. Default is 10. |
| WindowSizeRatioValue | Positive value that specifies a scaling factor that determines the size of the window around every alignment peak. The synthetic spectrum is compared to the sample spectrum only within these regions, which saves computation time. The size of the window is given in m/z units by WidthOfPulsesValue * WindowSizeRatioValue. Default is 2.5, which means at the limits of the window, the Gaussian pulses have a value of 4.39% of their maximum. |
| IterationsValue | Positive integer that specifies the number of refining iterations. At every iteration, the search grid is scaled down to improve the estimates. Default is 5. |
| GridStepsValue | Positive integer that specifies the number of steps for the search grid. At every iteration, the search area is divided by GridStepsValue^2. Default is 20. |
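| SearchSpaceValue | String that specifies the type of search space. Choices are 'regular' (default), an evenly spaced lattice, or 'latin', a random Latin hypercube with GridStepsValue^2 samples. |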
| ShowPlotValue | Controls the display of a plot of an original and aligned spectrum over the reference masses specified by RefMZ. Choices are true, false, or I, an integer specifying the index of a spectrum in Intensities. If set to true, the first spectrum in Intensities is plotted. Default is false when return values are specified and true when return values are not specified. |
| GroupValue | Controls the creation of RefMZOut, a new vector of m/z values to be used as reference masses for aligning the peaks. This vector is created by adjusting the values in RefMZ, based on the sample data from multiple spectra in Intensities, such that the overall shifting and scaling of the peaks is minimized. Choices are true or false (default). |
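| IntensitiesOut | Either of the following: a vector of corrected intensity values (when Intensities is a vector), or a matrix of corrected intensity values (when Intensities is a matrix). The intensity values represent a shifting and scaling of the data. |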
| RefMZOut | Vector of m/z values of reference masses, calculated from RefMZ and the sample data from multiple spectra in Intensities, when GroupValue is set to true. |
IntensitiesOut = msalign(MZ, Intensities, RefMZ) aligns the peaks in a raw mass spectrum or spectra, represented by Intensities and MZ, to reference peaks, provided by RefMZ. First, it creates a synthetic spectrum from the reference peaks using Gaussian pulses centered at the m/z values specified by RefMZ. Then, it shifts and scales the m/z scale to find the maximum alignment between the input spectrum or spectra and the synthetic spectrum. (It uses an iterative multiresolution grid search until it finds the best scale and shift factors for each spectrum.) Once the new m/z scale is determined, the corrected spectrum or spectra are created by resampling their intensities at the original m/z values, creating IntensitiesOut, a vector or matrix of corrected intensity values. The resampling method preserves the shape of the peaks.
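The steps described above can be illustrated with a minimal sketch in base MATLAB. It is not the msalign implementation: it runs a single coarse grid search rather than an iterative multiresolution one, it uses synthetic data, and every variable name and numeric value in it is a hypothetical chosen for illustration. It assumes a MATLAB release with implicit expansion (R2016b or later).

```matlab
% Illustrative sketch only (hypothetical names); not the msalign implementation.
mz    = (3000:2:10000)';        % common m/z axis, column vector
refMZ = [4000 6000 8000];       % reference masses, row vector
sigma = 10;                     % pulse width, as in the WidthOfPulses default

% Synthetic spectrum: one unit-height Gaussian pulse per reference mass.
synth = @(x) sum(exp(-(x - refMZ).^2 ./ (2*sigma^2)), 2);

% Simulated measurement whose axis is off by a known scale and shift:
% a peak that belongs at m appears at (m - bTrue)/aTrue.
aTrue = 1.002;  bTrue = -25;
peaks = (refMZ - bTrue) ./ aTrue;
y     = sum(exp(-(mz - peaks).^2 ./ (2*5^2)), 2);

% Single-resolution grid search: score each candidate axis a*mz + b by the
% correlation between the measurement and the synthetic spectrum.
aGrid = linspace(0.99, 1.01, 41);
bGrid = linspace(-50, 50, 101);
best  = -Inf;
for a = aGrid
    for b = bGrid
        score = sum(y .* synth(a*mz + b));
        if score > best
            best = score;  aBest = a;  bBest = b;
        end
    end
end

% Resample the intensities back onto the original m/z values.
yAligned = interp1(aBest*mz + bBest, y, mz, 'linear', 0);
fprintf('Estimated scale %.4f, shift %.1f m/z\n', aBest, bBest);
```

Because the simulated distortion lies on the search grid, the search recovers it and the resampled peaks land on the reference masses. Real spectra would need the finer, iterated search that msalign performs.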
Note The msalign function works best with three to five reference peaks (marker masses) that you know will appear in the spectrum. If you use a single reference peak (internal standard), sample peaks might be aligned to the incorrect reference peaks, because msalign both scales and shifts the MZ vector. With a single reference peak, you might need to shift the MZ vector only. To do this, use IntensitiesOut = interp1(MZ, Intensities, MZ-(ReferenceMass-ExperimentalMass)). For more information, see Aligning Mass Spectrum with One Reference Peak.
... = msalign(..., 'PropertyName', PropertyValue, ...) calls msalign with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:
... = msalign(..., 'Weights', WeightsValue, ...) specifies the relative weight for each mass in RefMZ, the vector of reference m/z values. WeightsValue is a vector of positive values, with the same number of elements as RefMZ. The default vector is ones(size(RefMZ)), which means each reference peak is weighted equally, so that more intense reference peaks have a greater effect in the alignment algorithm. If you have a less intense reference peak, you can increase its weight to emphasize it more in the alignment algorithm.
... = msalign(..., 'Range', RangeValue, ...) specifies the lower and upper limits of the range, in m/z units, relative to each peak. No peak will shift beyond these limits. RangeValue is a two-element vector, in which the first element is negative and the second element is positive. Default is [-100 100].
Note Use these values to tune the robustness of the algorithm. Ideally, you should keep the range within the maximum expected shift. If you try to correct larger shifts by increasing the limits, you increase the possibility of picking incorrect peaks to align to the reference masses.
... = msalign(..., 'WidthOfPulses', WidthOfPulsesValue, ...) specifies the width, in m/z units, for all the Gaussian pulses used to build the correlating synthetic spectrum. The point of the peak where the Gaussian pulse reaches 60.65% of its maximum is set to the width specified by WidthOfPulsesValue. Choices are any positive value. Default is 10. WidthOfPulsesValue can also be a function handle. The function is evaluated at the respective m/z values and returns a variable width for the pulses. Its evaluation should give reasonable values between 0 and max(abs(Range)); otherwise, msalign returns an error.
Note Tuning the spread of the Gaussian pulses controls a tradeoff between robustness (wider pulses) and precision (narrower pulses). However, the spread of the pulses is unrelated to the shape of the observed peaks in the spectrum. The purpose of the pulse spread is to drive the optimization algorithm.
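As a hedged illustration of the function-handle form, the sketch below passes a width that grows linearly with mass. The rule 0.002*mz and the name widthFcn are assumptions made for this example, chosen only so that the widths (roughly 8 to 18 m/z at the reference masses) stay well inside the default range limit of 100. The data and reference masses are the ones used in the examples on this page.

```matlab
% Sketch: variable pulse width via a function handle (hypothetical rule).
load sample_lo_res                       % provides MZ_lo_res and Y_lo_res
R = [3991.4 4598 7964 9160];             % reference masses from the example

% Hypothetical width rule: wider pulses at higher m/z.
widthFcn = @(mz) 0.002 * mz;

% Requesting an output suppresses the plot (ShowPlot defaults to false).
YA = msalign(MZ_lo_res, Y_lo_res, R, 'WidthOfPulses', widthFcn);
```

A constant width is usually the simpler starting point. A mass-dependent width is worth trying only when the expected misalignment clearly grows with m/z.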
... = msalign(..., 'WindowSizeRatio', WindowSizeRatioValue, ...) specifies a scaling factor that determines the size of the window around every alignment peak. The synthetic spectrum is compared to the sample spectrum only within these regions, which saves computation time. The size of the window is given in m/z units by WidthOfPulsesValue * WindowSizeRatioValue. Choices are any positive value. Default is 2.5, which means at the limits of the window, the Gaussian pulses have a value of 4.39% of their maximum.
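The 60.65% and 4.39% figures quoted for WidthOfPulses and WindowSizeRatio follow from the shape of a Gaussian. The derivation below assumes a unit-height pulse centered at a reference mass μ with standard deviation σ equal to WidthOfPulsesValue, and assumes that the window limits lie at a distance of WidthOfPulsesValue * WindowSizeRatioValue from μ, which is the reading consistent with the quoted 4.39%.

```latex
g(x) = \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \qquad
g(\mu \pm \sigma) = e^{-1/2} \approx 0.6065, \qquad
g(\mu \pm 2.5\,\sigma) = e^{-2.5^2/2} = e^{-3.125} \approx 0.0439
```

With the defaults (σ = 10, ratio 2.5), the window limits therefore fall 25 m/z from each reference mass.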
... = msalign(..., 'Iterations', IterationsValue, ...) specifies the number of refining iterations. At every iteration, the search grid is scaled down to improve the estimates. Choices are any positive integer. Default is 5.
... = msalign(..., 'GridSteps', GridStepsValue, ...) specifies the number of steps for the search grid. At every iteration, the search area is divided by GridStepsValue^2. Choices are any positive integer. Default is 20.
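To give a feel for how Iterations and GridSteps interact, the short calculation below tabulates how quickly the search region would shrink. It rests on an assumption not stated on this page: that dividing the search area by GridStepsValue^2 means each of the two search dimensions is divided by GridStepsValue at every refinement. The variable names are hypothetical.

```matlab
% Sketch: shrinkage of the search region, ASSUMING every refinement divides
% each dimension of the region by GridSteps (and so the area by GridSteps^2).
gridSteps  = 20;                 % default GridStepsValue
iterations = 5;                  % default IterationsValue
shiftSpan  = 200;                % width of the default Range [-100 100], in m/z

k            = 1:iterations;
areaFraction = (gridSteps^2) .^ (-k);        % remaining fraction of the area
shiftRes     = shiftSpan ./ gridSteps .^ k;  % remaining shift span, in m/z

disp(table(k', areaFraction', shiftRes', ...
    'VariableNames', {'Refinement', 'AreaFraction', 'ShiftSpan_mz'}))
```

Under this assumption the defaults narrow the shift estimate far below typical instrument resolution, which suggests that fewer iterations or a coarser grid may be enough when computation time matters.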
... = msalign(..., 'SearchSpace', SearchSpaceValue, ...) specifies the type of search space. Choices are:
'regular' — Default. Evenly spaced lattice.
'latin' — Random Latin hypercube with GridStepsValue^2 samples.
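The difference between the two search spaces can be sketched in base MATLAB. This is an illustration of the two sampling designs on a unit square, not the code msalign uses internally, and the variable names are hypothetical. The Latin hypercube is built with randperm so that no additional toolbox is needed.

```matlab
% Sketch: two ways to place G^2 candidate points in a unit square.
G = 20;                 % GridSteps default
N = G^2;                % number of candidates in both designs

% 'regular': evenly spaced G-by-G lattice.
[s1, s2] = ndgrid(linspace(0, 1, G));
regular  = [s1(:) s2(:)];            % N-by-2, only G distinct values per axis

% 'latin': random Latin hypercube, exactly one point in each of the N
% strata along every dimension.
rng(0);                              % fixed seed so the sketch is repeatable
latin = zeros(N, 2);
for d = 1:2
    latin(:, d) = (randperm(N)' - rand(N, 1)) / N;
end

plot(regular(:,1), regular(:,2), 'k.', latin(:,1), latin(:,2), 'ro')
legend('regular', 'latin')
```

With the same number of samples, the lattice tests only G distinct values of each parameter, whereas the Latin hypercube tests G^2 distinct values of each, at the cost of less uniform coverage of the plane.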
... = msalign(..., 'ShowPlot', ShowPlotValue, ...) controls the display of a plot of an original and aligned spectrum over the reference masses specified by RefMZ. Choices are true, false, or I, an integer specifying the index of a spectrum in Intensities. If set to true, the first spectrum in Intensities is plotted. Default is:
false — When return values are specified.
true — When return values are not specified.
[IntensitiesOut, RefMZOut] = msalign(..., 'Group', GroupValue, ...) controls the creation of RefMZOut, a new vector of m/z values to be used as reference masses for aligning the peaks. This vector is created by adjusting the values in RefMZ, based on the sample data from multiple spectra in Intensities, such that the overall shifting and scaling of the peaks is minimized. Choices are true or false (default).
Tip Set GroupValue to true only if Intensities contains data for a large number of spectra, and you are not confident of the m/z values used for your reference peaks in RefMZ. Leave GroupValue set to false if you are confident of the m/z values used for your reference peaks in RefMZ.
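A minimal sketch of the two-output form follows, using the sample data and reference masses from the examples on this page. The sample set may not be large enough to meet the recommendation in the tip, so this only illustrates the calling pattern. The names YA and RNew are arbitrary.

```matlab
% Sketch: request adjusted reference masses along with the aligned spectra.
load sample_lo_res                       % provides MZ_lo_res and Y_lo_res
R = [3991.4 4598 7964 9160];             % initial reference masses

[YA, RNew] = msalign(MZ_lo_res, Y_lo_res, R, 'Group', true);

% Compare the supplied and the adjusted reference masses side by side.
disp([R(:) RNew(:)])
```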
Aligning Mass Spectrum with Three or More Reference Peaks
Load sample data, reference masses, and parameter data for synthetic peak width.
load sample_lo_res
R = [3991.4 4598 7964 9160];
W = [60 100 60 100];
Display a color image of the mass spectra before alignment.
msheatmap(MZ_lo_res,Y_lo_res,'markers',R,'range',[3000 10000])
title('before alignment')

Align spectra with reference masses and display a color image of mass spectra after alignment.
YA = msalign(MZ_lo_res,Y_lo_res,R,'weights',W);
msheatmap(MZ_lo_res,YA,'markers',R,'range',[3000 10000])
title('after alignment')

Aligning Mass Spectrum with One Reference Peak
It is not recommended to use the msalign function if you have only one reference peak. Instead, use the following procedure, which shifts the MZ vector, but does not scale it.
Load sample data and view the first sample spectrum.
load sample_lo_res
MZ = MZ_lo_res;
Y = Y_lo_res(:,1);
msviewer(MZ, Y)

Use the tall peak around 4000 m/z as the reference peak. To determine the reference peak's m/z value, click the zoom-in tool, and then click-drag to zoom in on the peak. Right-click in the center of the peak, and then click Add Marker to label the peak with its m/z value.

Shift a spectrum by the difference between RP, the known reference mass of 4000 m/z, and SP, the experimental mass of 4051.14 m/z.
RP = 4000;
SP = 4051.14;
YOut = interp1(MZ, Y, MZ-(RP-SP));
Plot the original spectrum in red and the shifted spectrum in blue and zoom in on the reference peak.
plot(MZ,Y,'r',MZ,YOut,'b:')
xlabel('Mass/Charge (M/Z)')
ylabel('Relative Intensity')
legend('Y','YOut')
axis([3600 4800 -2 60])
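One detail of the shift-only procedure is worth checking: interp1 returns NaN by default for query points outside the range of MZ, so a few samples at one end of the shifted spectrum are undefined. The sketch below reproduces the shift on a synthetic single-peak spectrum (hypothetical values, not the sample_lo_res data) and shows one possible way to fill those samples with zeros instead.

```matlab
% Sketch on synthetic data (hypothetical values, not the sample spectra).
MZ = (3000:0.5:5000)';
RP = 4000;  SP = 4051.14;                    % known and experimental masses
Y  = 50 * exp(-(MZ - SP).^2 / (2*8^2));      % single synthetic peak at SP

delta = RP - SP;                             % shift to apply, in m/z
YOut  = interp1(MZ, Y, MZ - delta);          % default: NaN outside range of MZ
YFill = interp1(MZ, Y, MZ - delta, 'linear', 0);  % same shift, zeros at the edge

[~, iMax] = max(YFill);
fprintf('Peak now at %.1f m/z; %d edge samples were NaN\n', ...
    MZ(iMax), nnz(isnan(YOut)));
```

Whether zero is an appropriate fill value depends on the downstream analysis. Leaving the NaN values in place makes the undefined region explicit.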

[1] Monchamp, P., Andrade-Cetto, L., Zhang, J.Y., and Henson, R. (2007) Signal Processing Methods for Mass Spectrometry. In Systems Bioinformatics: An Engineering Case-Based Approach, G. Alterovitz and M.F. Ramoni, eds. (Artech House Publishers).
Bioinformatics Toolbox™ functions: msbackadj, msheatmap, mspalign, mspeaks, msresample, msviewer