How to make one 53228x1 double matrix a function of another 53228x1 double matrix?

I have two matrices, GalList.mass and GalList.dist, that I need to create a single function out of.
Both have dimensions of a 53228x1 double matrix.
GalList.mass is essentially a collective matrix of the masses of galaxies in the universe, while GalList.dist is a collective matrix of the distances of those galaxies. I need to make mass a function of distance, or M(r), but I do not know how to do that. I have tried nested functions, but that did not help.
SIDE NOTE: X1 = GalList.dist Y1= GalList.mass
I am fairly new to Matlab, so any help would be appreciated!

Accepted Answer

Star Strider on 8 Mar 2015
Is the mass predicted by the equation:
Mass = rho*(4/3)*pi*(X1).^3
the mean mass or the total mass as a function of distance?
If it’s the total mass (since the equation is similar to the volume of a sphere), then fitting the equation to the cumulative sum of the ‘Y1’ masses as a function of the ‘X1’ distances might well follow the cubic relation.
If you then are confident of ‘rho’ and you only want to estimate the exponent, you can use a nonlinear regression function (such as nlinfit) to estimate the parameter of:
M = @(b,x) rho*(4/3)*pi*x.^b;   % b is the exponent; x is the distance vector nlinfit passes in
and if you want to estimate both ‘rho’ and the exponent as parameters, use:
M = @(b,x) b(1).*(4/3).*pi.*x.^b(2);   % b(1) takes the place of rho, b(2) is the exponent
fitting the appropriate version of ‘M’ as your objective function in nlinfit, depending on what you want to do.
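A minimal sketch of the two-parameter fit described above, run on synthetic data because the asker's catalogue is not available. The density value, noise level, and starting guess are assumptions chosen for illustration, and nlinfit requires the Statistics and Machine Learning Toolbox. Note that the handle uses its own argument x rather than the workspace variable X1, so that nlinfit controls the input.

```matlab
% Synthetic stand-in data (assumption: not the asker's catalogue).
rng(0);                                   % reproducible noise
X1      = linspace(1, 30, 200)';          % distances, arbitrary units
rhoTrue = 2.5;                            % hypothetical density used to generate the data
Mtot    = rhoTrue*(4/3)*pi*X1.^3 .* (1 + 0.02*randn(size(X1)));  % 2% multiplicative noise

% Two-parameter model: b(1) plays the role of rho, b(2) is the exponent.
M = @(b,x) b(1).*(4/3).*pi.*x.^b(2);

% Initial guesses: rho = 1, exponent = 3.
bFit = nlinfit(X1, Mtot, M, [1 3]);

rhoEst = bFit(1);
expEst = bFit(2);
fprintf('rho estimate: %.3f, exponent estimate: %.3f\n', rhoEst, expEst);
```

With real data the starting values matter more than they do here; a rough log-log slope is one reasonable way to pick them.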
Just a guess on my part. This is far from my areas of expertise.
  2 Comments
jgillis16 on 8 Mar 2015
Yes, mass is predicted by that equation.
So I set Y3 = cumsum(Mass) [mass being defined as rho*(4/3)*pi*(X1).^3] Then, produced a scatter plot of scatter(X1,Y3). But there was no relation derived from the graph again.
How would I use nlinfit to estimate the parameter M = @(b,x) rho*(4/3)*pi*(X1).^p;?
I'm sorry for the basic questions, but you seem to be on the right track of my line of thinking. I'm just having trouble translating this all into code.
Star Strider on 8 Mar 2015
I don’t have your data, so I can’t do my own analyses, but thinking about it a bit more, if the masses in your data are ordered (or sorted) by increasing distance, and there is a 1:1 ordered correspondence between ‘X1’ and ‘Y1’, then instead of cumsum, the cumtrapz function would likely be more appropriate:
Mcum = cumtrapz(X1, Y1); % Cumulative Mass As Function Of Distance
If my idea is correct (no guarantees), this plot:
plot(X1, Mcum)
should show the cubic relation, or something close to it.
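A minimal sketch of building the cumulative curve, again on synthetic data (the sample size, distance range, and mass distribution are assumptions). It sorts by distance first, since both cumsum and cumtrapz assume ordered input, and computes both quantities so they can be compared. cumsum adds up the individual masses, while cumtrapz integrates Y1 with respect to X1, so its result carries units of mass times distance.

```matlab
% Synthetic stand-in data (assumption: not the asker's catalogue).
rng(1);
X1 = 30*rand(500,1).^(1/3);     % distances, uniform in volume out to 30 units
Y1 = exp(randn(500,1));         % log-normal masses, independent of distance

% Sort both vectors by increasing distance, keeping the 1:1 pairing.
[Xs, idx] = sort(X1);
Ys = Y1(idx);

% Enclosed mass as a running sum of the individual masses.
McumSum = cumsum(Ys);

% Trapezoidal cumulative integral of mass with respect to distance.
McumTrapz = cumtrapz(Xs, Ys);
```

Either curve can then be inspected with plot(Xs, McumSum) or plot(Xs, McumTrapz).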
Using nlinfit (that I prefer in situations appropriate to it because it has more statistics options), you would use this function:
M = @(b,x) rho*(4/3)*pi*x.^b;   % b is the exponent; x is the distance vector nlinfit passes in
as:
B = nlinfit(X1, Mcum, M, 3);
where ‘3’ is your initial estimate of the exponent.
If I understand correctly what you want to do, that should produce an exponent you can compare to the value 3. Note that you can have nlinfit output more data:
[beta,R,J,CovB,MSE,ErrorModelInfo] = nlinfit(___)
of which ‘CovB’ might be the most useful in determining the probability that the estimated parameter ‘B’ (in my example) is (or is not) statistically different from 3. With one parameter and more than 30 observations, you can use the normal distribution to estimate that probability, assuming that is the hypothesis you want to test.
As I mentioned, astronomy is far from my areas of expertise.
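A minimal sketch of the test described above: fit only the exponent, then use CovB to ask whether the estimate differs from 3 under a normal approximation. The data are synthetic and the density value is an assumption; nlinfit and nlparci require the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-in data (assumption: not the asker's catalogue).
rng(2);
X1   = linspace(1, 30, 200)';
rho  = 2.5;                                % hypothetical, treated as known
Mcum = rho*(4/3)*pi*X1.^3 .* (1 + 0.02*randn(size(X1)));

% One-parameter model: only the exponent b is estimated.
M = @(b,x) rho*(4/3)*pi*x.^b;
[B, R, J, CovB] = nlinfit(X1, Mcum, M, 3);

% With a single parameter, CovB is a 1x1 variance.
seB = sqrt(CovB);                          % standard error of the exponent
z   = (B - 3)/seB;                         % distance from 3 in standard errors
pValue = erfc(abs(z)/sqrt(2));             % two-sided p-value, normal approximation

% 95% confidence interval for the exponent.
ci = nlparci(B, R, 'covar', CovB);
```

A small pValue would suggest the exponent differs from 3; an interval ci that contains 3 is consistent with the cubic relation.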


More Answers (2)

Image Analyst on 8 Mar 2015
What kind of function do you expect? Is there a relationship? Did you try scatter(X1, Y1) to check?
  4 Comments
Image Analyst on 8 Mar 2015
I don't see any reason why the mass would change with distance from us unless we held some special place in the universe, like at the center where the big bang was. If you think you see more spread as distance gets higher, I think that's just because there are more data points out there. Of course there will be more galaxies between 20 and 25 Mpc than between 0 and 5 Mpc, because that range covers more actual volume, so you will have more data points out there that you will see when you plot them. If you were able to have more data points in close, then you might see a similar spread. Just because data points are missing does not mean the probability distribution function would be different.
For a given distance, if the probability of some mass is an inverse/decreasing function, it will probably be that same function no matter where in space you're looking - close to us or far away from us, unless like I said we have a special place like at the center or edge of the universe. Is there any theory that says that galaxies closer to us tend to be smaller than galaxies farther away? If you chopped up the universe into, say, a 10 by 10 by 10 array of boxes, do you think that the mean galaxy size in the boxes varies by more than randomness would suggest, and varies in a deterministic way that depends on how far the box is away from the box where our Milky Way lies?
I could see that possibly the galaxy mass mean might vary from how far it is from the center of the universe (like big chunks from the big bang ended up closer or farther away from the epicenter), but from our galaxy? I just don't see a reason. But I could be wrong - I'm not an astrophysicist.
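The volume argument can be made concrete with a short calculation. This is only a sketch that assumes galaxies are spread uniformly through space, so that counts scale with the volume of each spherical shell; the helper name shellVolume is made up for illustration.

```matlab
% Volume of a spherical shell between inner radius rIn and outer radius rOut.
shellVolume = @(rIn, rOut) (4/3)*pi*(rOut.^3 - rIn.^3);

vNear = shellVolume(0, 5);      % 0 to 5 Mpc
vFar  = shellVolume(20, 25);    % 20 to 25 Mpc

% Ratio of the two volumes: (25^3 - 20^3)/5^3 = 7625/125 = 61.
ratio = vFar / vNear;
fprintf('The 20-25 Mpc shell holds about %.0f times the volume of the 0-5 Mpc sphere.\n', ratio);
```

Under that uniformity assumption, roughly 61 times as many galaxies would fall in the outer shell, which alone would make the far end of a scatter plot look more spread out.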
John D'Errico on 8 Mar 2015
I do agree with the point that Image makes, that unless we are special, then no relationship can hold.
However, there MAY easily be a censoring bias. That is, if we can't see the low mass objects that lie far away, then they won't appear in the data set, so there might be a false relationship inferred. In fact, you would need to show that such a censoring problem was NOT happening were you to try to convince me that you had found something of any significance.
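A small simulation can illustrate how such a censoring bias might arise. This is a sketch under loudly stated assumptions: masses are drawn independently of distance, apparent brightness is taken to scale as mass/distance^2, and the detection limit of 0.005 is an arbitrary made-up value. None of these come from the asker's data.

```matlab
rng(3);
n    = 20000;
dist = 30*rand(n,1).^(1/3);        % distances, uniform in volume out to 30 units
mass = exp(randn(n,1));            % log-normal masses, independent of distance

% Assumption: brightness proxy scales as mass/distance^2.
fluxProxy = mass ./ dist.^2;
seen = fluxProxy > 0.005;          % hypothetical detection limit

% Mean log-mass of detected objects, nearby versus far away.
meanLogNear = mean(log(mass(seen & dist < 10)));
meanLogFar  = mean(log(mass(seen & dist > 25)));

% Correlation of log-mass with log-distance, full versus detected sample.
cAll  = corrcoef(log(dist), log(mass));
cSeen = corrcoef(log(dist(seen)), log(mass(seen)));
fprintf('corr (all): %.3f, corr (detected only): %.3f\n', cAll(1,2), cSeen(1,2));
```

In the full sample the correlation is near zero by construction, while the detected subset shows a positive trend that comes purely from the selection cut.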



John D'Errico on 8 Mar 2015
Edited: John D'Errico on 8 Mar 2015
This is NOT what nested functions are for. :) In fact, your question was not at all about functions in a MATLAB context.
It seems that you have two variables, distance and mass. You wish to postulate that they follow a cubic relationship. I'll assume that you are saying that galaxy(1) has measured mass(1) and distance(1), and so on for each of the 53228 objects in this sample. If those two vectors do NOT follow that relationship, then you can do absolutely nothing here.
But if we wish to show that mass is proportional to the cube of distance, then just plot them, using a log-log scale. That is, if we had
mass = k*distance^3
then
log(mass) = log(k) + 3*log(distance)
So IF that plot follows a straight line, (yeah, a noisy straight line, but a straight one) with a slope of 3, then you have essentially shown that power law to be a viable model.
IF there is curvature in that plot, then you could not make such a claim. So I would use a modeling tool to estimate a simple polynomial model (say quadratic or at most cubic) in the log-log domain. If the higher order power(s) of distance have a statistically significant coefficient, then you would need to reconsider your model. Likewise, if the coefficient of distance is itself statistically different from 3, then again, your model fails to hold up.
Finally, because of the rather noisy data, with clear outliers, you will need to use a tool that can handle robust fitting. I'd start with robustfit from the stats toolbox.
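A minimal sketch of the log-log approach with robustfit, using synthetic data with a few injected outliers (the constant k = 0.7, the scatter, and the outlier fraction are assumptions). robustfit requires the Statistics and Machine Learning Toolbox and adds the intercept term itself.

```matlab
% Synthetic stand-in data (assumption: not the asker's catalogue).
rng(4);
distance = 1 + 29*rand(1000,1);
mass     = 0.7*distance.^3 .* exp(0.3*randn(1000,1));   % power law with scatter
bad      = randperm(1000, 20);
mass(bad) = mass(bad)*50;                               % a few gross outliers

x = log(distance);
y = log(mass);

% Straight-line model in the log-log domain: y = log(k) + slope*x.
[b, stats] = robustfit(x, y);
slope  = b(2);
tSlope = (slope - 3)/stats.se(2);      % how many standard errors the slope is from 3

% Curvature check: add a quadratic term and look at its p-value.
[b2, stats2] = robustfit([x, x.^2], y);
pQuad = stats2.p(3);                   % small value suggests curvature

fprintf('slope = %.3f (t vs 3 = %.2f), quadratic p = %.3f\n', slope, tSlope, pQuad);
```

A slope close to 3 with no significant quadratic term would be consistent with the cubic power law; plotting with loglog(distance, mass, '.') gives the visual check described above.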
