Thread Subject: the mathematic relationship between two series of data

Subject: the mathematic relationship between two series of data

From: ZHANG Hong

Date: 13 Jul, 2008 04:44:02

Message: 1 of 11

Hi, everyone,

I am trying to find a mathematical expression relating
the two datasets listed below:

A B
0.0772 99.92%
0.104191429 99.61%
0.131182857 98.99%
0.158174286 98.37%
0.185165714 97.37%
0.212157143 96.36%
0.239148571 95.20%
0.26614 93.65%
0.293131429 92.34%
0.320122857 90.63%
0.347114286 88.85%
0.374105714 87.00%
0.401097143 85.91%
0.428088571 84.29%
0.45508 82.35%
0.482071429 81.11%
0.509062857 76.55%
0.536054286 72.37%
0.563045714 68.34%
0.590037143 65.87%
0.617028571 62.15%
0.64402 59.44%
0.671011429 57.28%
0.698002857 54.26%
0.724994286 51.01%
0.751985714 49.30%
0.778977143 46.36%
0.805968571 44.43%
0.83296 42.96%
0.859951429 40.40%
0.886942857 39.01%
0.913934286 36.84%
0.940925714 34.06%
0.967917143 33.05%
0.994908571 30.96%

If A is the independent variable and B is the dependent
variable, how can I find their mathematical relationship in
MATLAB? I have tried in Excel, but it is not a simple linear,
exponential, or log relationship.

Thank you very much.






Subject: the mathematic relationship between two series of data

From: Matt Fig

Date: 13 Jul, 2008 06:21:02

Message: 2 of 11


>
> A B
> 0.0772 99.92%
> (snip)
> 0.994908571 30.96%
>
> If A is the independent variable and B is the dependent
> variable, how can I find their mathematical relationship in
> MATLAB? I have tried in Excel, but it is not a simple linear,


Certainly an 11th order polynomial fits the data quite well,
but a piecewise quadratic isn't bad either:

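% A and B are the data vectors from the first post (B as numbers, not text).
% Fit one quadratic to the first 15 points and another to the rest.
P1 = polyfit(A(1:15),B(1:15),2);
X1 = A(1):.001:A(15);
Y1 = polyval(P1,X1);
P2 = polyfit(A(16:end),B(16:end),2);
X2 = A(16):.001:A(end);
Y2 = polyval(P2,X2);
plot(A,B,X1,Y1,X2,Y2)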


I guess it depends on what you expect, and what you want out
of the data.

Subject: the mathematic relationship between two series of data

From: ZHANG Hong

Date: 13 Jul, 2008 08:44:02

Message: 3 of 11

Hi, Matt,

Thank you very much for your help; it really works.

I am still puzzled about one thing: once we have the scatter
plot of two datasets and want to find a function to fit it,
the choice seems to depend to some extent on our experience.
Is that right, or are there other factors to consider?

Cheers!

Hong ZHANG

Subject: the mathematic relationship between two series of data

From: John D'Errico

Date: 13 Jul, 2008 09:03:01

Message: 4 of 11

"Matt Fig" <spamanon@yahoo.com> wrote in message
<g5c6se$rhd$1@fred.mathworks.com>...
>
> >
> > A B
> > 0.0772 99.92%

(snip)

> > 0.994908571 30.96%
> >
> > If A is the independent variable and B is the dependent
> > variable, how can I find their mathematical relationship in
> > MATLAB? I have tried in Excel, but it is not a simple linear,
>
>
> Certainly an 11th order polynomial fits the data quite well,
> but a piecewise quadratic isn't bad either:
>
> P1 = polyfit(A(1:15),B(1:15),2);
> X1 = A(1):.001:A(15);
> Y1 = polyval(P1,X1);
> P2 = polyfit(A(16:end),B(16:end),2);
> X2 = A(16):.001:A(end);
> Y2 = polyval(P2,X2);
> plot(A,B,X1,Y1,X2,Y2)
>
>
> I guess it depends on what you expect, and what you want out
> of the data.

The OP should accept that a piecewise quadratic
done in this way will not even be a continuous
function overall.
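
To make that concrete, here is a minimal sketch, not part of the original
posts. It assumes B is entered in percent exactly as listed, and that A is
evenly spaced (it appears to be, so linspace stands in for the typed values).
It measures the mismatch between the two separately fitted quadratics at a
point between the segments, then shows one possible continuous alternative:
a single least squares fit with a knot at that point. The names xb, basis,
and fitfun are only illustrative.

```matlab
% Data as posted (B in percent). A looks evenly spaced, so linspace is an
% assumption used here to avoid retyping every value.
A = linspace(0.0772, 0.994908571, 35)';
B = [99.92 99.61 98.99 98.37 97.37 96.36 95.20 ...
     93.65 92.34 90.63 88.85 87.00 85.91 84.29 ...
     82.35 81.11 76.55 72.37 68.34 65.87 62.15 ...
     59.44 57.28 54.26 51.01 49.30 46.36 44.43 ...
     42.96 40.40 39.01 36.84 34.06 33.05 30.96]';

% Two independent quadratics, split as in the earlier post.
P1 = polyfit(A(1:15), B(1:15), 2);
P2 = polyfit(A(16:end), B(16:end), 2);

% Evaluate both pieces at one point between the two segments.
xb = (A(15) + A(16))/2;
jump = polyval(P2, xb) - polyval(P1, xb);
fprintf('Mismatch between the two pieces at x = %.4f: %.4f\n', xb, jump);

% Continuous alternative: one least squares fit with a knot at xb.
% The max(x - xb, 0) terms vanish at the knot, so the value is continuous
% there while the slope and curvature are free to change.
basis = @(x) [ones(size(x)), x, x.^2, max(x - xb, 0), max(x - xb, 0).^2];
c = basis(A) \ B;
fitfun = @(x) basis(x) * c;
fprintf('RMS residual of the continuous fit: %.4f\n', ...
        sqrt(mean((fitfun(A) - B).^2)));
```

Where to put the knot is itself a modeling choice; the midpoint between
points 15 and 16 is used only because that is where the two fits were split.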

John

Subject: the mathematic relationship between two series of data

From: John D'Errico

Date: 13 Jul, 2008 09:32:27

Message: 5 of 11

"ZHANG Hong" <oceanzhhd@gmail.com> wrote in message
<g5cf8i$eg8$1@fred.mathworks.com>...
> Hi, Matt,
>
> Thank you very much for your help; it really works.
>
> I am still puzzled about one thing: once we have the scatter
> plot of two datasets and want to find a function to fit it,
> the choice seems to depend to some extent on our experience.
> Is that right, or are there other factors to consider?

I looked at your data. Your question is not an
uncommon one at all: what is "the" function
that represents my data?

The problem is that you need to bring along
much information that you have as the person
who generated this data, and as the person
who has a need to find a model for the data.

- Is there noise in your measurements?

- Is that noise significant, and must it be
smoothed out?

- What form of a function is acceptable to you?

- Will the resulting model be used for simple
prediction, or do you wish to write down
that model and study it, perhaps for a paper?

- What are your needs for the resulting model?

- How accurately must the model fit your data?

- What assumptions are you willing to make
about that model?

- What knowledge do you have about the
system that generated this data? For
example, do you know it to be monotone?

- Will you try to extrapolate this curve to
some point(s)?

Some questions that are specific to the data
you listed might focus on what I saw when I
plotted it.

- It appears that the curve might have a
small break in the derivative near the middle.
Is this something that you know exists, or is
that merely noise? I've often seen artifacts
like this created when an instrument is
recalibrated in the middle of an experiment.

There are entire realms of mathematics that
try to deal with these questions, and the
issues that arise from those questions. You
may find those realms referred to by the
various names modeling, approximation,
curvefitting, and interpolation. In fact, there
are complete toolboxes from the MathWorks
that attempt to help you with these problems,
in the form of the splines toolbox, as well as
the curvefitting, optimization, neural net
toolboxes, etc. You will also find large
numbers of submissions on the File Exchange,
entire categories of tools, that will help you
too.

But first, you must resolve some of the
questions I've posed above.

John

Subject: the mathematic relationship between two series of data

From: ZHANG Hong

Date: 13 Jul, 2008 10:44:01

Message: 6 of 11

Hi, John,

I have read your reply repeatedly, and I do appreciate your
help and suggestions.

Yes, as you say, both the data itself (including its
origin/background and noise) and what we expect it to be will
affect the final result of the data fitting. It is not easy to
relate the fitted function to the real-life meaning
of the dataset, especially when you get an unexpected result
where you first thought it would match what many
experimental studies had shown. In that situation, I
always look back at my data to see whether it is the
geographic conditions, the data collection process, or other
factors that affect it, or whether it is really an emergent
property that many other systems might share.

Cheers!

Hong






Subject: the mathematic relationship between two series of data

From: John D'Errico

Date: 13 Jul, 2008 11:26:02

Message: 7 of 11

"ZHANG Hong" <oceanzhhd@gmail.com> wrote in message
<g5cm9h$r70$1@fred.mathworks.com>...
> Hi, John,
>
> I have read your reply repeatedly, and I do appreciate your
> help and suggestions.
>
> Yes, as you say, both the data itself (including its
> origin/background and noise) and what we expect it to be will
> affect the final result of the data fitting. It is not easy to
> relate the fitted function to the real-life meaning
> of the dataset, especially when you get an unexpected result
> where you first thought it would match what many
> experimental studies had shown. In that situation, I
> always look back at my data to see whether it is the
> geographic conditions, the data collection process, or other
> factors that affect it, or whether it is really an emergent
> property that many other systems might share.
>
> Cheers!
>
> Hong

As a continuation of my last response,
normally, I'd suggest a spline fit as a
starting point. Is an interpolating spline
appropriate for you? It all depends on
what you will do with the curve and what
your goals are.
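
As a starting point, an interpolating fit might look like the sketch below.
It is not from the original posts. It uses only base MATLAB (spline and
pchip), assumes B in percent as listed, and assumes A is evenly spaced so
that linspace can stand in for the typed values. Both curves pass through
every data point, so any noise in the data is reproduced, not smoothed.

```matlab
% Data as posted (B in percent); linspace for A is an assumption.
A = linspace(0.0772, 0.994908571, 35)';
B = [99.92 99.61 98.99 98.37 97.37 96.36 95.20 ...
     93.65 92.34 90.63 88.85 87.00 85.91 84.29 ...
     82.35 81.11 76.55 72.37 68.34 65.87 62.15 ...
     59.44 57.28 54.26 51.01 49.30 46.36 44.43 ...
     42.96 40.40 39.01 36.84 34.06 33.05 30.96]';

xq = linspace(A(1), A(end), 500)';   % fine grid for plotting

ys = spline(A, B, xq);   % cubic spline, not-a-knot end conditions
yp = pchip(A, B, xq);    % shape-preserving piecewise cubic Hermite

% Both interpolants reproduce the data at the original points.
sAtData = spline(A, B, A);
pAtData = pchip(A, B, A);
errS = max(abs(sAtData(:) - B));
errP = max(abs(pAtData(:) - B));
fprintf('Max error at the data points: spline %.2g, pchip %.2g\n', errS, errP);

plot(A, B, 'o', xq, ys, '-', xq, yp, '--')
legend('data', 'spline', 'pchip')
```

If the relationship is known to be monotone, pchip is the safer of the two,
since it does not overshoot monotone data while a cubic spline can.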

A least squares spline or a smoothing
spline are also options to be considered.
Since it sounds as if you do not have any
mechanistic or physical model for your
data, splines are often a good choice. But
even then there are issues to consider. Do
you have important information in your
knowledge of the process? Must it be
monotone? Do you know something about
the curvature of the relationship to be
modeled? Do you have a measure of the
noise variance on this data?

For example, my own utility, estimatenoise,
estimates the standard deviation to be
roughly 0.005. But is this consistent with
your own knowledge of the process?

sqrt(estimatenoise(A,B))
ans =
    0.0047606

John

Subject: the mathematic relationship between two series of data

From: Hong Zhang

Date: 13 Jul, 2008 13:45:03

Message: 8 of 11

Hi, John,

"it sounds as if you do not have any mechanistic or
physical model for your data": that is exactly the point.
Actually, in the dataset I posted, A is the interval of a
series of measurement values and B is the corresponding
cumulative probability. As there is no reference for the
distribution of such measurements, at first I thought it
would be a power law, which would be in accord with
real-life conditions.

I had no clear idea until now. Why does it appear to be a
piecewise quadratic? Why does it have a small break? In fact, A
is the result of another MATLAB program, which points
to an adjacency matrix.

Your suggestions do give me some hints and clues. I need to
think carefully about the mechanism of the independent
variable and its real-life meaning.

BTW, is estimatenoise used to evaluate or improve the
curve-fitting precision? That is obviously important to
consider. But for me, this curve-fitting process is
something like data mining: I am concerned with what relationship
emerges from the dataset and why it looks like that.

Cheers!

Hong



Subject: the mathematic relationship between two series of data

From: John D'Errico

Date: 13 Jul, 2008 14:58:01

Message: 9 of 11

"Hong Zhang" <oceanzhhd@gmail.com> wrote in message
<g5d0sv$2un$1@fred.mathworks.com>...
> Hi, John,
>
> "it sounds as if you do not have any mechanistic or
> physical model for your data": that is exactly the point.
> Actually, in the dataset I posted, A is the interval of a
> series of measurement values and B is the corresponding
> cumulative probability. As there is no reference for the
> distribution of such measurements, at first I thought it
> would be a power law, which would be in accord with
> real-life conditions.

If it should be some sort of a power law,
think about the form. You might consider
reading through my nonlinear shapes
submission:

http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=10864&objectType=FILE
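
One quick way to test the simplest power-law form, B = a*A^b, is a straight
line fit in log-log space. The sketch below is not from the original posts.
It assumes B in percent as listed and an evenly spaced A (hence linspace).
Other power-law forms, with an offset or a shift, would need a nonlinear fit
and are not covered here.

```matlab
% Data as posted (B in percent); linspace for A is an assumption.
A = linspace(0.0772, 0.994908571, 35)';
B = [99.92 99.61 98.99 98.37 97.37 96.36 95.20 ...
     93.65 92.34 90.63 88.85 87.00 85.91 84.29 ...
     82.35 81.11 76.55 72.37 68.34 65.87 62.15 ...
     59.44 57.28 54.26 51.01 49.30 46.36 44.43 ...
     42.96 40.40 39.01 36.84 34.06 33.05 30.96]';

% Pure power law: log(B) = log(a) + b*log(A), a straight line in log-log.
p = polyfit(log(A), log(B), 1);
b = p(1);
a = exp(p(2));

% Goodness of fit in log space. Residuals that curve systematically
% suggest the data do not follow a pure power law.
res = log(B) - polyval(p, log(A));
R2 = 1 - sum(res.^2) / sum((log(B) - mean(log(B))).^2);
fprintf('B ~ %.3f * A^(%.3f), R^2 in log space = %.3f\n', a, b, R2);

loglog(A, B, 'o', A, a*A.^b, '-')
legend('data', 'power-law fit')
```

If the points bend away from the straight line on the log-log plot, that is
a hint to look for a different functional form.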


> I had no clear idea until now. Why does it appear to be a
> piecewise quadratic? Why does it have a small break? In fact, A
> is the result of another MATLAB program, which points
> to an adjacency matrix.
>
> Your suggestions do give me some hints and clues. I need to
> think carefully about the mechanism of the independent
> variable and its real-life meaning.

Exactly. It is this introspection that is very
important when you do modeling. It helps
you to learn about your process, and perhaps
discover things that you know about the
system that you might not have seen
otherwise.

 
> BTW, is estimatenoise used to evaluate or improve the
> curve-fitting precision? That is obviously important to
> consider.

Estimatenoise might help you if you are
using a smoothing spline to approximate
the relationship, since they can use that
information.


> But for me, this curve-fitting process is
> something like data mining: I am concerned with what relationship
> emerges from the dataset and why it looks like that.

Curvefitting can be a voyage of discovery,
helping you to learn about the process you
will fit. Or it can be as simple as a brute
force interpolation, or polynomial curve fit.
You may receive returns that are directly
related to the effort you expend in the
modeling process.

John

Subject: the mathematic relationship between two series of data

From: Per Sundqvist

Date: 13 Jul, 2008 18:05:03

Message: 10 of 11

"John D'Errico" <woodchips@rochester.rr.com> wrote in
message <g5d55p$a3$1@fred.mathworks.com>...
> "Hong Zhang" <oceanzhhd@gmail.com> wrote in message
> <g5d0sv$2un$1@fred.mathworks.com>...
> (snip)

It looks like you should have y(1)=30, y(0)=100, and
y'(0)=0. I think you could use the mechanical fourth-order
differential equation in 1D to model this, using appropriate
boundary conditions and stiffness parameters. Then fit your data to this
analytic formula; it is related to linear combinations of
sinh, cosh, sin, and cos in some way.

Subject: the mathematic relationship between two series of data

From: John D'Errico

Date: 13 Jul, 2008 22:52:01

Message: 11 of 11

"Per Sundqvist" <sunkan@fy.chalmers.se> wrote in message
<g5dg4f$f6n$1@fred.mathworks.com>...

> It looks like you should have y(1)=30, y(0)=100, and
> y'(0)=0. I think you could use the mechanical fourth-order
> differential equation in 1D to model this, using appropriate
> boundary conditions and stiffness parameters. Then fit your data to this
> analytic formula; it is related to linear combinations of
> sinh, cosh, sin, and cos in some way.

An interesting point is that this is just a spline.

The 4th-order differential equation described
is the same one that generates a cubic spline.

An axial tension term merely turns this into a
tension spline, the solutions to which can be
written in terms of tanh.

A smoothness term allows you to turn it into a
smoothing spline, minimizing a combination
of the residual errors plus the potential energy
due to bending stored in the spline. And of
course, the end conditions described are also
modeled easily as a spline.

So the idea of posing a model in terms of a
differential equation, then fitting the result,
is achieved more simply by just using a least
squares spline in some form. The boundary
conditions described are all achievable using
splines.
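
A minimal sketch of that idea in base MATLAB follows. It is not from the
original posts. It fits a least squares cubic spline with a single interior
knot and builds the conditions y(0)=100 and y'(0)=0 into the basis. The knot
location k is a guess made by eye, B is taken in percent as listed, and A is
assumed evenly spaced (hence linspace). The condition y(1)=30 is not
enforced here; the sketch only reports what the fit gives at x=1.

```matlab
% Data as posted (B in percent); linspace for A is an assumption.
A = linspace(0.0772, 0.994908571, 35)';
B = [99.92 99.61 98.99 98.37 97.37 96.36 95.20 ...
     93.65 92.34 90.63 88.85 87.00 85.91 84.29 ...
     82.35 81.11 76.55 72.37 68.34 65.87 62.15 ...
     59.44 57.28 54.26 51.01 49.30 46.36 44.43 ...
     42.96 40.40 39.01 36.84 34.06 33.05 30.96]';

y0 = 100;   % value imposed at x = 0, as suggested in the thread
k  = 0.5;   % assumed interior knot, chosen by eye

% Every basis function has zero value and zero slope at x = 0, so
% y(0) = y0 and y'(0) = 0 hold for any coefficients. The truncated cubic
% term keeps the curve twice continuously differentiable at the knot.
basis = @(x) [x.^2, x.^3, max(x - k, 0).^3];
c = basis(A) \ (B - y0);
model = @(x) y0 + basis(x) * c;

fprintf('y(0) = %.2f, y(1) = %.2f\n', model(0), model(1));
fprintf('RMS residual: %.4f\n', sqrt(mean((model(A) - B).^2)));
```

More knots, or a penalty on curvature, would move this toward the least
squares and smoothing splines described above.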

John

