Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
Simple optimisation question

Subject: Simple optimisation question

From: Frédéric Bergeron

Date: 28 Jun, 2010 13:36:23

Message: 1 of 15

Hi all,

First, thanks for the answer I got in the past in this forum, hope I'll get some today! :)

My situation:
I got a set of 1-dimensional data (data) and I want to find the scalar x that respect a previously defined ratio R:

R=sum(data(over)-x)/sum(x-data(under));
where:
data(over) represent all the data over x: data(over)=(data>=x);
data(under) represent all the data under x: data(under)=(data<x);

So R is the ratio of the sum of difference between data over x and x, over the sum of difference between data under x and x.

My problem is that I need the value of x to define data(over) and data(under). Is there's a function that could help me? I read about lsqlin() that can solve a problem with inequality constraint but I didn't get success at using it.

Thanks in advance!
Best regards,

Subject: Simple optimisation question

From: Alan Weiss

Date: 28 Jun, 2010 13:53:12

Message: 2 of 15

On 6/28/2010 9:36 AM, Frédéric Bergeron wrote:
> Hi all,
>
> First, thanks for the answer I got in the past in this forum, hope I'll
> get some today! :)
>
> My situation:
> I got a set of 1-dimensional data (data) and I want to find the scalar x
> that respect a previously defined ratio R:
>
> R=sum(data(over)-x)/sum(x-data(under));
> where:
> data(over) represent all the data over x: data(over)=(data>=x);
> data(under) represent all the data under x: data(under)=(data<x);
>
> So R is the ratio of the sum of difference between data over x and x,
> over the sum of difference between data under x and x.
>
> My problem is that I need the value of x to define data(over) and
> data(under). Is there's a function that could help me? I read about
> lsqlin() that can solve a problem with inequality constraint but I
> didn't get success at using it.
>
> Thanks in advance!
> Best regards,

I think you would do better to use fzero.

Alan Weiss
MATLAB mathematical toolbox documentation

Subject: Simple optimisation question

From: Matt J

Date: 28 Jun, 2010 14:56:23

Message: 3 of 15

"Frédéric Bergeron" <frederic.bergeron@logiag.com> wrote in message <i0a8gm$kju$1@fred.mathworks.com>...

> My problem is that I need the value of x to define data(over) and data(under). Is there's a function that could help me? I read about lsqlin() that can solve a problem with inequality constraint but I didn't get success at using it.
=======

I could be wrong, but it seems like you can solve this by simple algebra. Let's make the following definitions

N=length(data); %known
S=sum(data); %known
O=sum(data(over)-x)
U=sum(x-data(under))

Then you have 2 equations that allow you to solve for O and U:

O-R*U=0
O+U=S;

Once you know O and U, you have the relationship

O-U=S-N*x

which can be used to solve explicitly for x

x=(S+U-O)/N

Subject: Simple optimisation question

From: Matt J

Date: 28 Jun, 2010 15:40:26

Message: 4 of 15

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i0ad6n$91i$1@fred.mathworks.com>...

>
> I could be wrong, but it seems like you can solve this by simple algebra. Let's make the following definitions
>
> N=length(data); %known
> S=sum(data); %known
> O=sum(data(over)-x)
> U=sum(x-data(under))
>
> Then you have 2 equations that allow you to solve for O and U:
>
> O-R*U=0
> O+U=S;
========

Forget it. I pulled the equation O+U=S from my imagination somehow.

In any case, there are at most N different possibilities for the sets
data(over) and data(under). For each one, x ranges over a known interval separating data(over) from data(under) and in that interval O -R*U is a known linear function of x. You could simply loop over each of these N sets and test whether
O(x)-R*U(x) has a root in that interval.

 

Subject: Simple optimisation question

From: Matt J

Date: 28 Jun, 2010 16:14:05

Message: 5 of 15

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i0afpa$3is$1@fred.mathworks.com>...

> In any case, there are at most N different possibilities for the sets
> data(over) and data(under). For each one, x ranges over a known interval separating data(over) from data(under) and in that interval O -R*U is a known linear function of x. You could simply loop over each of these N sets and test whether
> O(x)-R*U(x) has a root in that interval.
===========

The following code implements this, assuming that all data points are unique. I leave the generalization as an exercise.

%data
R=0.75;
data=rand(1000,1);


%engine
data=sort(data);
S=sum(data);
N=length(data);

Sunder=cumsum(data(1:end-1));
Sover=S-Sunder;
Nunder=(1:N-1).';
Nover=N-Nunder;


X=(Sover+R*Sunder)./(Nover+R*Nunder);

x=X(X>=data(1:end-1) & X<=data(2:end))

%Check
over=(data>=x);
under=~over;
Rcheck=sum(data(over)-x)/sum(x-data(under))

Subject: Simple optimisation question

From: Frédéric Bergeron

Date: 28 Jun, 2010 16:25:30

Message: 6 of 15

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i0afpa$3is$1@fred.mathworks.com>...
> "Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i0ad6n$91i$1@fred.mathworks.com>...
>
> >
> > I could be wrong, but it seems like you can solve this by simple algebra. Let's make the following definitions
> >
> > N=length(data); %known
> > S=sum(data); %known
> > O=sum(data(over)-x)
> > U=sum(x-data(under))
> >
> > Then you have 2 equations that allow you to solve for O and U:
> >
> > O-R*U=0
> > O+U=S;
> ========
>
> Forget it. I pulled the equation O+U=S from my imagination somehow.
>
> In any case, there are at most N different possibilities for the sets
> data(over) and data(under). For each one, x ranges over a known interval separating data(over) from data(under) and in that interval O -R*U is a known linear function of x. You could simply loop over each of these N sets and test whether
> O(x)-R*U(x) has a root in that interval.


Hey thank you,


The O+U=S equation is fine for me, I just don't get the O-U=S-N*x.
Anyway, I decided to do a loop for a lot of values in the interval [min(data) max(data)], than calulate the ratio R for each of them and then take the value associated with the ratio R the closest than the previously defined one.

So I started by making a for loop, than I tought a while loop will be better and more efficient to locate the value with the least error on R. I redefine the upper or lower value of the interval for each while loop, and it's way faster. Here's my code if someone interested:

*****************************
clear all;

%Input
R=2; %Cut/fill ratio
datamin=55;
datamax=65;
data=datamin+(datamax-datamin)*rand(1,50);

%% First method
tic;
x=min(data):(max(data)-min(data))/200:max(data);
for i=1:length(x)
    y(i)=abs(R*sum(x(i)-data(data<x(i)))-sum(data(data>=x(i))-x(i)));
end
x_best1=x(y==min(y));
R2_1=sum(data(data>=x_best1)-x_best1)/sum(x_best1-data(data<x_best1));
erreur_ratio1=abs(R2_1-R)/R;
t1=toc;

%% Second method
tic;
erreur_ratio2=1; %error on R
x1=min(data); %Lower value of the interval around the x_best
x2=max(data); %Upper value
                    
while erreur_ratio2>0.01

    %Calculs
    x_best2=x1+(x2-x1)/2; %Taking the middle value of the interval

    R2_2=sum(data(data>=x_best2)-x_best2)/sum(x_best2-data(data<x_best2));
    if R2_2==R;
        break %If the good ratio is found
    elseif R2_2<R
        x2=x_best2;
    else %R2>R
        x1=x_best2;
    end
    erreur_ratio2=abs(R2_2-R)/R;
end
t2=toc;

%% Methods comparison
fprintf('\nt1= %g\nerreur_ratio1= %g\nx_best1= %g\n',t1,erreur_ratio1,x_best1);
fprintf('\nt2= %g\nerreur_ratio2= %g\nx_best2= %g\n\n',t2,erreur_ratio2,x_best2);

****************
and I get:

t1= 0.030075
erreur_ratio1= 0.0113545
x_best1= 59.3732

t2= 0.0003564
erreur_ratio2= 0.00205718
x_best2= 59.3845

Thanks again for the answers!!
Fred

Subject: Simple optimisation question

From: Frédéric Bergeron

Date: 28 Jun, 2010 16:37:05

Message: 7 of 15

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message
> The following code implements this, assuming that all data points are unique. I leave the generalization as an exercise.
> ...
> %data
> R=0.75;
> data=rand(1000,1);
>
>
> %engine
> data=sort(data);
> S=sum(data);
> N=length(data);
>
> Sunder=cumsum(data(1:end-1));
> Sover=S-Sunder;
> Nunder=(1:N-1).';
> Nover=N-Nunder;
>
>
> X=(Sover+R*Sunder)./(Nover+R*Nunder);
>
> x=X(X>=data(1:end-1) & X<=data(2:end))
>
> %Check
> over=(data>=x);
> under=~over;
> Rcheck=sum(data(over)-x)/sum(x-data(under))

=========

Hey,

Thanks for this code, but since I got a working code in my last message, I think I'll use the code with the while loop. It's maybe longer than your code but since I don't understand all the things you do, I'll use my code, while referring to yours if I get any problem. Thanks!

Subject: Simple optimisation question

From: Matt J

Date: 28 Jun, 2010 17:11:08

Message: 8 of 15

"Frédéric Bergeron" <frederic.bergeron@logiag.com> wrote in message <i0aj3h$e0m$1@fred.mathworks.com>...

> Thanks for this code, but since I got a working code in my last message, I think I'll use the code with the while loop. It's maybe longer than your code but since I don't understand all the things you do, I'll use my code, while referring to yours if I get any problem. Thanks!
==============

That's fine, as long as you only need a modestly accurate result. Just for your reference though, I reran your test, but substituting your Method #1 with my closed-form method. As you can see, the closed-form method accuracy achieves machine precision accuracy in approximately the same amount of time.

%Closed form
t1= 0.000141917
erreur_ratio1= 2.96059e-016
x_best1= 0.524563

%Iterative approx.
t2= 0.000346971
erreur_ratio2= 0.00396906
x_best2= 0.52504

Subject: Simple optimisation question

From: Frédéric Bergeron

Date: 28 Jun, 2010 18:33:05

Message: 9 of 15

> That's fine, as long as you only need a modestly accurate result. Just for your reference though, I reran your test, but substituting your Method #1 with my closed-form method. As you can see, the closed-form method accuracy achieves machine precision accuracy in approximately the same amount of time.
>
> %Closed form
> t1= 0.000141917
> erreur_ratio1= 2.96059e-016
> x_best1= 0.524563
>
> %Iterative approx.
> t2= 0.000346971
> erreur_ratio2= 0.00396906
> x_best2= 0.52504

Cool thanks again... Yes I only need modestly accurate result. This was for a GUI leveling agricultural fields. So I have just finished transforming my program in 3D and instead of finding a single scalar, it finds the constant of the plane equation, i.e. c in z=ax+by+c, that respect the ratio R.

Subject: Simple optimisation question

From: Roger Stafford

Date: 29 Jun, 2010 17:24:05

Message: 10 of 15

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i0ahod$f3u$1@fred.mathworks.com>...
> The following code implements this, assuming that all data points are unique. I leave the generalization as an exercise.
>
> R=0.75;
> data=rand(1000,1);
> %engine
> data=sort(data);
> S=sum(data);
> N=length(data);
> Sunder=cumsum(data(1:end-1));
> Sover=S-Sunder;
> Nunder=(1:N-1).';
> Nover=N-Nunder;
> X=(Sover+R*Sunder)./(Nover+R*Nunder);
> x=X(X>=data(1:end-1) & X<=data(2:end))
- - - - - - - - - -
  Hello Matt J. I want to congratulate you on that solution you came up with so quickly. I worked for quite a while on the problem before looking at yours and came up with something that was based on the same principles but a lot more complicated. That is, I directly solved for all the R values that correspond to boundaries between the discrete (and sorted) data values, and then checked to see which pair of these the given R lay in. Then I solved for x using the appropriate Sunder/Sover pair. Your way is more efficient.

  However, I notice that both our methods are subject to a possible error which, with an unfortunate choice of R, can actually occur. Because of round off errors it is possible that the x you obtain in

 x=X(X>=data(1:end-1) & X<=data(2:end))

can come up with either two answers which are essentially the same or no answer. I have checked and found this to happen when the given R value is very near one of the R boundary values mentioned above. The case of two values is easily dealt with by just choosing the first one since they are essentially equal. However when no answer is provided, that requires a little more effort. Perhaps the two inequalities above can be slightly widened by a sufficient "epsilon" amount based on possible round off errors to always provide at least one value. Or possibly using a more robust method of defining an "interval distance" function which is zero within an interval but equal to the distance from the closer end otherwise, and then choosing the nearest interval with the min function.

  Hello Fred. My advice to you is to make use of Matt's solution. It is a very efficient and accurate algorithm. The rare difficulty I have discussed here is quite easily overcome.

Roger Stafford

Subject: Simple optimisation question

From: Matt J

Date: 29 Jun, 2010 18:49:21

Message: 11 of 15

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i0da7l$e7j$1@fred.mathworks.com>...

> However, I notice that both our methods are subject to a possible error which, with an unfortunate choice of R, can actually occur. Because of round off errors it is possible that the x you obtain in
>
> x=X(X>=data(1:end-1) & X<=data(2:end))
>
> can come up with either two answers which are essentially the same or no answer. I have checked and found this to happen when the given R value is very near one of the R boundary values mentioned above. The case of two values is easily dealt with by just choosing the first one since they are essentially equal. However when no answer is provided, that requires a little more effort. Perhaps the two inequalities above can be slightly widened by a sufficient "epsilon" amount based on possible round off errors to always provide at least one value. Or possibly using a more robust method of defining an "interval distance" function which is zero within an interval but equal to the distance from the closer end otherwise, and then choosing the nearest interval with the min function.
======================

Roger - I see what you mean. In the case where x=[], my solution would probably be to assign x to the X(i) that minimizes norm(X(i)-data(i)).

The idea being that this numerical problem happens when the computed X(i) values are barely distinguishable from one of the data points.

Subject: Simple optimisation question

From: Roger Stafford

Date: 29 Jun, 2010 19:22:15

Message: 12 of 15

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i0df7h$br3$1@fred.mathworks.com>...
> Roger - I see what you mean. In the case where x=[], my solution would probably be to assign x to the X(i) that minimizes norm(X(i)-data(i)).
>
> The idea being that this numerical problem happens when the computed X(i) values are barely distinguishable from one of the data points.
- - - - - - - -
  How about this for a fix, Matt? It seems the most robust. There is no way it can obtain other than a single value and we don't have to go through messy round-off error estimates. Also you don't have to make a special case of x = [] beforehand.

 [ignore,ix] = min((max(data(1:end-1)-X,0)+max(X-data(2:end),0));
 x = X(ix)

Roger Stafford

Subject: Simple optimisation question

From: Matt J

Date: 29 Jun, 2010 20:55:26

Message: 13 of 15

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i0dh57$hnq$1@fred.mathworks.com>...

> - - - - - - - -
> How about this for a fix, Matt? It seems the most robust. There is no way it can obtain other than a single value and we don't have to go through messy round-off error estimates. Also you don't have to make a special case of x = [] beforehand.
>
> [ignore,ix] = min((max(data(1:end-1)-X,0)+max(X-data(2:end),0));
> x = X(ix)
>
============

I guess that'll do it. I'm assuming the approach you started out with was something like the following? It works with the boundary values of R like you were describing. If so, I'm not sure why you abandoned it, since it seems a lot faster and also numerically well behaved as long as R is not too close to 0


if R==0
   
  x=max(data);
    
elseif R==inf
    
  x=min(data);
    
else % min(data) < x < max(data)
    
data=sort(data(:));
S=sum(data);
N=length(data);

Xl=data(1:end-1);

Sunder=cumsum(Xl);
Sover=S-Sunder;
Nunder=(1:N-1).';
Nover=N-Nunder;


Rgrid=(Sover-Xl.*Nover)./(Xl.*Nunder-Sunder);

idx=find(R-Rgrid>0,1);


x=(Sover(idx)+R*Sunder(idx))./(Nover(idx)+R*Nunder(idx));

end

Subject: Simple optimisation question

From: Roger Stafford

Date: 29 Jun, 2010 21:26:06

Message: 14 of 15

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i0dmju$g1k$1@fred.mathworks.com>...
> I guess that'll do it. I'm assuming the approach you started out with was something like the following? It works with the boundary values of R like you were describing. If so, I'm not sure why you abandoned it, since it seems a lot faster and also numerically well behaved as long as R is not too close to 0
> ........
- - - - - - - -
  Well, it was something like that except that I fooled around with transforming R to W by w = 1/(R+1) which maps R onto [0,1], thinking that might simplify things. As it turned out it was no particular advantage, and about that time I saw your solution and said, "Aw heck, Matt J has beaten me to it again, and stopped work." So it's all your fault for being so quick - no, I'm kidding.

Roger Stafford

Subject: Simple optimisation question

From: Matt J

Date: 29 Jun, 2010 22:10:23

Message: 15 of 15

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i0dode$eg3$1@fred.mathworks.com>...

> Well, it was something like that except that I fooled around with transforming R to W by w = 1/(R+1) which maps R onto [0,1], thinking that might simplify things. As it turned out it was no particular advantage, and about that time I saw your solution and said, "Aw heck, Matt J has beaten me to it again, and stopped work." So it's all your fault for being so quick - no, I'm kidding.
============

Well, just goes to show. You shouldn't give up so quickly...

Tags for this Thread

No tags are associated with this thread.

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us