What is the relationship between noise and outliers?

I know the difference between noise and outliers, but I want to know how they are related and what effects noise has. Does noise cause outliers? Also, if you know of any data sets that contain outliers, please share the links. Thank you.

Accepted Answer

Walter Roberson on 9 Oct 2015
Yes and no. Outliers can arise in many ways, and noise is only one of them:
  • the measurement might have been contaminated
  • a measurement might have been entered in the wrong file
  • the wrong file might have been input (such as due to a small mistake in entering the file name)
  • the measurement device might not have been tuned properly for the measurement
  • the wrong sample might have been measured
  • the measuring device might not be working properly, perhaps because it is being used in conditions it is not designed for
  • some kinds of noise can smear readings by a large amount even if the probability is low ("1 in a million chances happen 9 times out of 10") -- a case where the noise "caused" the outlier in the sense you probably meant
  • there might be additional processes at work that were not accounted for and that do not always occur; for example, you might get a sudden spike in chemicals in water right as fish lay eggs
  • some kinds of noise can trigger large reactions, where the reading of what actually happened might be pretty accurate but the large reaction does not occur much -- a case where the noise "caused" an actual change, which is probably not what you meant about causing outliers.
An example of noise triggering a large reaction: Suppose you flip a coin onto a surface and time how long it takes to come to rest on one of its faces. If the process is truly random (uncertainty in the initial velocities overwhelms the prediction calculations), then from time to time the coin will land on its edge, may roll for a while, and may even end up stopping on edge. The probability of the system entering a dynamic state quite different from the typical one may be low, but it happens. This can actually be quite important in quantum mechanical processes.
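The point above about noise that can smear readings by a large amount with low probability can be illustrated with a small simulation. This is only a sketch under assumptions that are not part of the answer: the long-tailed noise is taken to be standard Cauchy (generated from uniform random numbers, so no toolbox is needed), and the cutoff of 5 for calling a reading an outlier is arbitrary.

```matlab
rng(0);                                    % fix the seed so the run is repeatable
n = 1e5;                                   % number of simulated readings (arbitrary)

gaussNoise  = randn(n,1);                  % short-tailed (Gaussian) noise
cauchyNoise = tan(pi*(rand(n,1) - 0.5));   % long-tailed (standard Cauchy) noise,
                                           % drawn via the inverse CDF

threshold = 5;                             % assumed cutoff for "outlier"
nGauss  = sum(abs(gaussNoise)  > threshold);
nCauchy = sum(abs(cauchyNoise) > threshold);

fprintf('Gaussian: %d of %d readings beyond %g\n', nGauss,  n, threshold);
fprintf('Cauchy:   %d of %d readings beyond %g\n', nCauchy, n, threshold);
```

With the same cutoff, the Gaussian noise should produce almost no readings beyond it, while the long-tailed noise should produce many. In that sense the noise itself "causes" the outliers.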

More Answers (2)

Bjorn Gustavsson on 9 Oct 2015
Edited: Bjorn Gustavsson on 9 Oct 2015
Outliers are (very loosely speaking) noise with very large deviations from the expected values. This might be because your noise distribution has very long tails, which can give you a few observations where the noise contribution becomes very large. From an observation viewpoint, outliers might also be caused by malfunctions in, or disturbances to, an observation system that otherwise has very nice and well-behaved noise characteristics. So in reality everything becomes complicated. To make yourself a data set you can do something like this:
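x = linspace(0,100,1001);                 % sample positions
d = 12 + x/10 + x.^2/100 + ...            % smooth quadratic trend
    2*sin(2*pi/100*3*x) + ...             % periodic component, 3 cycles over the range
    0.5*randn(size(x)) + ...              % Gaussian "noise"
    10*sprandn(1,length(x),0.03).^2;      % sparse "outliers" in about 3% of the samples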
Adjust the distributions of the "noise" and the "outliers" to suit your needs.
HTH

Thorsten on 9 Oct 2015
You need a model of your data. There is no general relation between noise and outliers. You may find the following useful: http://charuaggarwal.net/outlierbook.pdf
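As a rough illustration of why a model is needed, the sketch below builds a synthetic data set of the kind suggested elsewhere in this thread, fits an assumed model by least squares, and flags samples whose residuals are large compared with a robust estimate of the noise level. The choice of model terms, the median-absolute-deviation scale estimate, and the cutoff of 5 are all assumptions made for this example, not something prescribed in the answers.

```matlab
rng(1);                                    % repeatable run
x = linspace(0,100,1001)';                 % sample positions as a column vector

% Synthetic data: trend + sinusoid + Gaussian noise + sparse one-sided spikes
spikes = full(10*sprandn(numel(x),1,0.03).^2);
d = 12 + x/10 + x.^2/100 + 2*sin(2*pi/100*3*x) ...
    + 0.5*randn(size(x)) + spikes;

% Assumed model: quadratic trend plus a sinusoid of known frequency
A = [ones(size(x)), x, x.^2, sin(2*pi/100*3*x), cos(2*pi/100*3*x)];
coef = A \ d;                              % ordinary least-squares fit
r = d - A*coef;                            % residuals: noise plus outliers

% Robust noise scale from the median absolute deviation; the factor 1.4826
% makes it comparable to a Gaussian standard deviation
rc = r - median(r);
sigmaHat = 1.4826 * median(abs(rc));
flagged = abs(rc) > 5*sigmaHat;            % 5 is an arbitrary cutoff

fprintf('Estimated noise level: %.2f\n', sigmaHat);
fprintf('Flagged %d of %d samples as outliers\n', nnz(flagged), numel(x));
```

Without the model, the trend and the sinusoid would dominate the spread of `d` and the same cutoff would say little. A different model or a different cutoff would flag a different set of samples.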
