Can I color my PCA data by column, and if so, how?
I have a data set in a table with 11 columns. The first column is a string identifying a group of orthologous genes (basically, genes that do the same thing in different species). The other ten columns are numbers describing how strongly that gene is expressed in a given species at a given point in time: say, species 1 at timepoints 1-5, then species 2 at timepoints 1-5.
I want to perform a PCA on it, but I want to color the data on it by column, such that I can figure out which species/timepoint a given datapoint belongs to.
Is this possible, and if so, how can I do it?
Answers (1)
Image Analyst
on 15 Jun 2018
I'm not sure this makes sense, or at least it doesn't to me. If you run PCA on your 10 numeric columns, each observation (row) gets a score on each of 10 principal components. If you want a scatterplot where each point is colored according to the range that one of its PC scores falls in, you can use gscatter() to do that. But I don't know what it means to "color the data on it by column". My guess is that you want machine learning, perhaps discriminant analysis or KNN classification, rather than PCA. See the chart at https://www.mathworks.com/help/stats/machine-learning-in-matlab.html
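For illustration only, here is a minimal sketch of combining pca() and gscatter(). It takes one possible reading of "color by column", namely that each column (one species at one timepoint) should become a point in the plot, which means running PCA on the transposed matrix. That reading is an assumption, as are the random stand-in data and all variable names (X, species, timepoint, labels). Both functions require the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical stand-in data: 200 gene groups (rows) x 10 expression columns
% (species 1 timepoints 1-5, then species 2 timepoints 1-5).
% With a real table T, something like X = T{:, 2:11} would extract the numbers.
rng(0);
X = randn(200, 10);

% Assumed interpretation: transpose so each column of the original data
% (one species/timepoint) is treated as one observation.
[coeff, score, latent, tsquared, explained] = pca(X');

% One group label per original column, in column order.
species   = [repmat({'Species 1'}, 5, 1); repmat({'Species 2'}, 5, 1)];
timepoint = repmat((1:5)', 2, 1);

% Scatter the first two principal component scores, colored by species.
figure;
gscatter(score(:, 1), score(:, 2), species);
xlabel(sprintf('PC1 (%.1f%% of variance)', explained(1)));
ylabel(sprintf('PC2 (%.1f%% of variance)', explained(2)));

% Annotate each point with its timepoint so species and time are both visible.
labels = arrayfun(@(t) sprintf('t%d', t), timepoint, 'UniformOutput', false);
text(score(:, 1), score(:, 2), labels, 'VerticalAlignment', 'bottom');
```

If the genes, rather than the samples, are meant to be the points, pca(X) without the transpose returns one row of scores per gene, and the grouping variable passed to gscatter() would then need one label per gene.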
2 Comments
Image Analyst
on 15 Jun 2018
I'm still not sure I'm visualizing it correctly. So you have 10 columns of data, plus 1 column that says which gene group, of two possible gene groups (species), each row of the table belongs to? And for each column, you want to get the 2 PCs for that column? Attach your .mat file if you want more help.