error message using polyfit (nonlinear regression)

Asked by Locks on 21 Apr 2013

hi,

I get the following error message using the polyfit function:

Warning: Polynomial is badly conditioned. Add points with distinct X
         values, reduce the degree of the polynomial, or try centering
         and scaling as described in HELP POLYFIT. 

Has anybody seen that before and has an idea what I need to do? I tried the help function, but I didn't understand what exactly could be wrong.

The code I am using is the following, in case it helps:

   if length(dataT(:,1))==1
       SlopeSkew(number)=0;
   elseif length(dataT(:,1))==2
       SlopeSkew(number)=0;
   else
       % x is the strike
       x = dataT(:,2);
       % y is the implied volatility
       y = dataT(:,10);
       p = polyfit(x,y,2);
       f = polyval(p,x);
       a = p(3);
       b = p(2);
       c = p(1);
       SlopeSkew(number) = b+2*c.*x;
       Slope = SlopeSkew';
   end

thanks!

3 Answers

Answer by Image Analyst on 21 Apr 2013
Accepted answer

So what is the length of x and y, and do you have any repeated x values?

4 Comments

Image Analyst on 21 Apr 2013

That did not answer the questions.

Locks on 21 Apr 2013

yes, I have repeated x and y values

the full code looks like this:

for putCall = 1:2
    dataPutCall = data16(data16(:,3)==putCall,:);
    dates = unique(dataPutCall(:,6));
    for i = 1:length(dates)
        date = dates(i);
        dataDate = dataPutCall(dataPutCall(:,6)==date,:);
        Ts = unique(dataDate(:,5));
        for indexT = 1:length(Ts)
            T = Ts(indexT);
            dataT = dataDate(dataDate(:,5)==T,:);
            number = dataT(:,13);
            if length(dataT(:,1))==1
                SlopeSkew(number) = 0;
            elseif length(dataT(:,1))==2
                SlopeSkew(number) = 0;
            else
                % x is the strike
                x = dataT(:,2);
                % y is the implied volatility
                y = dataT(:,10);
                p = polyfit(x,y,2);
                f = polyval(p,x);
                a = p(3);
                b = p(2);
                c = p(1);
                SlopeSkew(number) = b+2*c.*x;
                Slope = SlopeSkew';
            end
        end
    end
end

so there is no specific length of x, because x is different in each iteration, and from the error message I can't see which x is causing the problem.

do you know what the error message means?

Image Analyst on 21 Apr 2013

If you have two Y values for the same x value, then it doesn't like that and will complain. All your x values have to be unique. You can use

uniqueX = unique(x)

and see if length(uniqueX) is the same as length(x). If they're the same then there are no repeats. If unique() returns a shorter vector, then at least one x value is repeated and you need to decide how to handle that. You might be able to just add a very small amount to one of the x's, like 0.000001, just to make sure they are not the same anymore.
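
The check described above could be sketched like this. The x vector is made up for illustration (it is not the poster's data), and hasRepeats is just a hypothetical variable name.

```matlab
% Hypothetical strike values; the value 3 is repeated on purpose.
x = [1; 2; 3; 3; 4];

uniqueX = unique(x);                        % sorted, duplicates removed
hasRepeats = length(uniqueX) < length(x);   % true when any x value repeats

if hasRepeats
    fprintf('%d of %d x values are distinct\n', length(uniqueX), length(x));
end
```

Inside the loop, x would be dataT(:,2) for the current group rather than a fixed vector.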

Answer by Tom Lane on 22 Apr 2013

There was a time when this function issued an error asking you not to have repeated X values. But the new error message is more accurate. You don't need unique X values. It's just that repeated X values won't allow you to estimate higher-order polynomials. So for instance:

x = [1;2;3;3;4];
y = (1:5)';
polyfit(x,y,2)
polyfit(x,y,4)

The first call to polyfit works. The second would work if we had 5 points with distinct X values, but it doesn't work here because the 4 distinct X values allow polynomials only up to an exponent of 3.

In your example of fitting up to power 2, it seems like you either don't have 3 distinct points, or you have very ill-conditioned data.
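
One way to act on this, as a rough sketch rather than the poster's actual fix: guard the fit on the number of distinct x values instead of the number of rows. The data below is invented, and the zero fallback simply mirrors what the original code does for short groups.

```matlab
% Hypothetical group: three rows but only two distinct strikes.
x = [100; 100; 110];
y = [0.25; 0.24; 0.22];
degree = 2;

if numel(unique(x)) > degree
    % Enough distinct points for a polynomial of this degree.
    p = polyfit(x, y, degree);
    slope = polyval(polyder(p), x);   % derivative of the fit at each strike
else
    % Too few distinct x values; fall back to zero as the original code does.
    slope = zeros(size(x));
end
```

A row count of 3 passes the length(dataT(:,1)) test in the question, but this group would still be skipped here because only two strikes are distinct.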

2 Comments

Locks on 22 Apr 2013

I tried it first for a smaller sample and there I had no issues doing that, but for the larger sample the error message appeared. This part of the code here:

                if length(dataT(:,1))==1
                SlopeSkew(number)=0;
                elseif length(dataT(:,1))==2
                SlopeSkew(number)=0;
                else

should make sure that there are always at least 3 data points for the computation, so do you have any idea what else it could be?

Tom Lane on 23 Apr 2013

I don't know exactly. Of the following three, the first one works. The second does not because there are only two distinct x values. The third does not because it is very ill-conditioned.

polyfit([0;1;2],[10;20;30],2);
polyfit([0;1;1],[10;20;30],2);
polyfit([0;eps;1],[10;20;30],2);

I don't know what the issue is in your case. Try to boil this down to a specific call to polyfit, then examine the x and y values in that call and see how they look.
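
To narrow it down to a specific call, something along these lines might help. It is only a sketch: the two groups are made up to stand in for the per-iteration dataT slices, and flagged is a hypothetical bookkeeping variable. It relies on lastwarn, which returns the most recent warning message and identifier.

```matlab
% Hypothetical groups standing in for the x values of each iteration.
groups = { [0; 1; 2], [0; 1; 1] };
y = [10; 20; 30];
flagged = false(1, numel(groups));

for k = 1:numel(groups)
    lastwarn('');                      % clear the last-warning state
    p = polyfit(groups{k}, y, 2);      %#ok<NASGU> only the warning matters here
    [msg, id] = lastwarn;              % msg stays empty if nothing was issued
    if ~isempty(msg)
        flagged(k) = true;
        fprintf('Group %d triggered a warning: %s\n', k, id);
    end
end
```

Typing dbstop if warning at the command line before running the loop is another option; it pauses execution at the call that issues the warning so x and y can be inspected directly.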

Answer by Jan Simon on 23 Apr 2013

Instead of repeated values, did you test the condition of the problem already? The docs suggest using

[p, S, mu] = polyfit(x,y,n)

for a proper scaling. The matrix for the least-squares fit is ill-conditioned, when the values of x have a wide range and are far away from zero. Therefore the scaling does:

xx = (x - mean(x)) / std(x)

to get all data near zero. The conversion back to the original values in POLYVAL is trivial.
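
A minimal sketch of that workflow, assuming made-up strikes far from zero and an invented quadratic y. Passing S and mu to polyval applies the same centering and scaling; for the slope, the chain rule adds a division by mu(2).

```matlab
% Hypothetical strikes far from zero and a made-up quadratic "smile".
x = (1000:10:1100)';
y = 0.2 + 1e-5*(x - 1050).^2;

[p, S, mu] = polyfit(x, y, 2);   % fit is done in xx = (x - mu(1))/mu(2)
f = polyval(p, x, S, mu);        % polyval applies the same scaling to x

% Slope with respect to the original x: d/dx = (d/dxx) / mu(2).
xx = (x - mu(1)) / mu(2);
slope = polyval(polyder(p), xx) / mu(2);
```

Note that p now holds coefficients in the scaled variable, so an expression like b+2*c.*x from the question would need xx and the division by mu(2) as well.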

3 Comments

Image Analyst on 23 Apr 2013

I use S and mu when it specifically tells me to, not by default. It certainly can't hurt, though, as you said, there's the extra step of converting the output of polyval. He says, and MATLAB also says, that there are repeated values, so I think that will have to be fixed.

Locks on 23 Apr 2013

sorry, but I am not sure what exactly I need to change. I looked at the docs you're mentioning, http://www.mathworks.ch/ch/help/matlab/ref/polyfit.html, before starting with the code, and p = polyfit(x,y,n) should work as well, or am I mistaken? What is a bit strange is that everything works fine for different datasets with 8000 rows each, but for a dataset that consists of 30'000 rows it's not working. I am trying to find out at which point the problem is, but it would be really helpful to get a hint about what I need to focus on.

Locks on 24 Apr 2013

I have found the error, thanks for the support!
