Data points in regression with different colors and shapes

Hi everyone,
I would like to know whether there is a way to give the data points different colors and shapes.
Each data point has 2 attributes. For example, I would color all points whose attribute 1 equals "x" red, and give all points whose attribute 2 equals "y" a different shape (square, circle, star, etc.).
Thanks!
Extra details for the plot:
I have a matrix M whose rows give the first "attribute" and whose columns give the second "attribute". I collapse it into one vector using M=M(:), of size (n x 1).
The attributes correspond to the number of hubs and the number of nodes that a newly added node is attached to in a scale-free network; the values in the matrix give the "alpha" parameter of the power law associated with the degree distribution.
I feed the scale-free network with both attributes to a function that returns the value "Y" (vector of size 1 x n) on the y-axis.
Then given M and Y, I do the following:
mdl=fitlm(M,Y)
plot(mdl)

8 Comments

You need to tell us how you plotted the data.
Thanks, I just edited my post!
"... "Y" (vector of size 1 x n) on the y-axis. ... mdl=fitlm(M,Y) ... plot(mdl)"
Line attributes are constant for a given line object, and plot creates one line for each vector in the argument list. Hence, the plot you have created contains only one line object for the data, so the data points can have only one set of attributes.
In your description of the attributes in the array M, you state that rows are one attribute while columns are another. That implies there are seven (7) row attributes, let's say A, B, C ... G, and eight (8) column attributes, 1 through 8. Thus, every element in the array has a unique combination of attributes:
A1, A2, A3, ... A7,A8
B1, B2, B3, ... B7,B8
...
G1, G2, G3, ... G7,G8
So, the only way to put a unique marker on each element would be to plot each array element separately or use scatter which lets you set a color by element.
Hi, yes, that is correct: each element has a unique combination of attributes.
But I was thinking, for example:
Consider row 1 of the matrix M. It corresponds to attribute "h=5", so I give all of those data points A1, A2, ..., A8 the color "yellow".
Now consider column 1 of the matrix M. It corresponds to attribute "c=1", so for all of those data points A1, B1, ..., G1 I use the marker '*'. For column 2 I use the marker '+', etc.
So entry A1 will have the color yellow and the marker '*'. Entry A2 will also be yellow but will have the marker '+'.
OK, I will plot them separately, thanks.
Attach your data and code to read it in if you need more help.
Yes, even though all of one row or column may have one attribute in common, the combination is different, so each point needs its own handle or the property must accept an array of values.
As noted above, plot has only one value each for the marker type and the color of a line, so you would have to plot each point separately to use it.
scatter accepts an array of color triplets, so you could use it to draw all the elements in a column with one call, using the color for each row and a single marker. That cuts the number of calls down by a factor of the number of rows.
That TMW didn't allow these properties to be set per point within the line object is a real shortcoming. They half-fixed it with scatter, but only half: the markers can't be set by point. The limited number of markers is annoying, too.
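A minimal sketch of the scatter approach described above, assuming a 7x8 layout like the one in the question. The matrices M and Y here are made-up stand-ins rather than the asker's data, and the marker list is an arbitrary choice:

```matlab
% Sketch only: M and Y are hypothetical stand-ins for the real data.
M = magic(8);
M = M(1:7, :);                        % hypothetical 7x8 predictor matrix
Y = 2*M + 5*randn(size(M));           % hypothetical response, same shape

rowColors  = lines(size(M, 1));       % one RGB triplet per row attribute
colMarkers = 'o+*xsd^v';              % one marker per column attribute (8 needed)

figure()
hold on
for c = 1:size(M, 2)
    % One scatter object per column: a single marker, one color per point
    scatter(M(:, c), Y(:, c), 60, rowColors, colMarkers(c), 'LineWidth', 1.5)
end
hold off
xlabel('M')
ylabel('Y')
```

This needs 8 scatter calls instead of 56 individual plot calls, because the color varies within each call and only the marker has to change between calls.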
Thank you all very much! I plotted the data separately and used a colormap; the answer below seems more efficient than what I have done.


 Accepted Answer

Since you collapsed M into a column vector using M(:), and Y apparently has the same size and shape, your linear model has 1 predictor and 1 response variable. That is why plot(mdl) returns a simple scatter plot with a fitted line instead of an added variable plot.
Here's an example using a dataset provided by MATLAB, where W is a 5x6 matrix of car weight and MPG is a 5x6 matrix of fuel efficiency. You can imagine that each row and each column represents some attribute, such as manufacturer and year.
load carsmall
W = reshape(Weight(1:30), 5, 6);
MPG = reshape(MPG(1:30), 5, 6);
Now we'll collapse them vertically into column vectors,
W = W(:);
MPG = MPG(:);
and run them through a linear model using weight (W) as a predictor of fuel efficiency (MPG),
mdl = fitlm(W,MPG);
and plot the results.
h = plot(mdl);
Note that this produces 4 line objects: two dotted lines for the confidence bounds, one solid line for the fit, and one line object containing just the markers showing the raw data.
get(gca, 'Children')
ans =
  4×1 Line array:
  Line
  Line    (Confidence bounds)
  Line    (Fit)
  Line    (Data)
In fact, we could plot the raw data directly on top of the simple scatter plot (see caveat at the end of this answer),
figure()
plot(mdl)
hold on
plot(W,MPG, 'go')
But if you want to use different marker colors for each column and different marker shapes for each row of the original predictor matrix, you'll need to plot each point individually or you could use scatter().
figure()
plot(mdl)
hold on
% Define a color for each column
color = lines(6);
% Define a marker for each row
marker = 'so*ph';
% I prefer to use plot at the moment but you could
% also use scatter with a different setup
for i = 1:numel(W)
    [row, col] = ind2sub([5,6], i); % Get row and column number
    plot(W(i), MPG(i), 'Marker', marker(row), 'Color', color(col,:), ...
        'LineWidth', 2, 'MarkerSize', 12)
end
% Remove legend (to keep the demo simple)
delete(findobj(gcf, 'type', 'legend'))
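If a legend is wanted rather than removed, one possible approach (not part of the original answer) is to hide the individual points from the legend and add invisible proxy lines, one per marker and one per color. The sketch below is self-contained and leaves out the fitted model for brevity; the variable names mpg, proxy, and names, and the labels "row k" and "column k", are placeholders chosen for illustration:

```matlab
load carsmall                          % ships with the Statistics and Machine Learning Toolbox
W   = reshape(Weight(1:30), 5, 6);
mpg = reshape(MPG(1:30), 5, 6);
color  = lines(6);                     % one color per column
marker = 'so*ph';                      % one marker per row

figure()
hold on
for i = 1:numel(W)
    [row, col] = ind2sub(size(W), i);
    % HandleVisibility 'off' keeps the 30 single points out of the legend
    plot(W(i), mpg(i), 'Marker', marker(row), 'Color', color(col,:), ...
        'LineStyle', 'none', 'HandleVisibility', 'off')
end

% Proxy handles: NaN data draws nothing but still provides a legend entry
proxy = gobjects(5 + 6, 1);
names = cell(5 + 6, 1);
for r = 1:5
    proxy(r) = plot(NaN, NaN, 'k', 'Marker', marker(r), 'LineStyle', 'none');
    names{r} = sprintf('row %d', r);
end
for c = 1:6
    proxy(5 + c) = plot(NaN, NaN, '-', 'Color', color(c,:), 'LineWidth', 2);
    names{5 + c} = sprintf('column %d', c);
end
hold off
legend(proxy, names, 'Location', 'bestoutside')
```

The legend then has 11 entries (5 markers plus 6 colors) instead of one entry per point.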
An important caveat is that if your model contains more than 1 predictor, the result of plot(mdl) will be an added variable plot, which does not have a direct relationship to the raw predictor/response variables. In that case, you'd need to use the model along with your predictors to compute those scatter point positions.
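Reproducing the added variable plot coordinates is beyond a short example. As a simpler alternative for the multi-predictor case, and not what plot(mdl) itself draws, the sketch below plots observed against fitted values with the same per-row markers and per-column colors. Using Horsepower as the second predictor and reusing the 5x6 layout are assumptions made only for illustration:

```matlab
load carsmall                               % needs the Statistics and Machine Learning Toolbox
X = [Weight(1:30), Horsepower(1:30)];       % two predictors, picked only for illustration
y = MPG(1:30);
mdl  = fitlm(X, y);
yhat = predict(mdl, X);                     % fitted response at the observed predictors

color  = lines(6);                          % one color per column of the assumed 5x6 layout
marker = 'so*ph';                           % one marker per row

figure()
hold on
for i = 1:numel(y)
    [row, col] = ind2sub([5, 6], i);
    plot(yhat(i), y(i), 'Marker', marker(row), 'Color', color(col,:), ...
        'LineStyle', 'none', 'LineWidth', 2, 'MarkerSize', 12)
end
lims = [min([y; yhat]), max([y; yhat])];    % min and max skip NaN entries
plot(lims, lims, 'k--')                     % reference line: observed equals fitted
hold off
xlabel('Fitted MPG')
ylabel('Observed MPG')
```

Points close to the dashed line are well described by the model, and the marker and color still identify the row and column each point came from.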

1 Comment

Thank you very much for the detailed answer, it's perfect!



Asked:

Sha
on 22 Oct 2020

Edited:

Sha
on 23 Oct 2020
