I have a data set in a table object with 11 columns. The first column is a string identifying a group of orthologous genes (basically, genes that do the same thing in different species). The other ten columns are numbers that describe how strongly that gene is expressed in species 1 or 2 at a given point in time. Let's say we've got columns for species 1 timepoints 1-5 and species 2 timepoints 1-5.
I want to perform a PCA on it, but I want to color the data on it by column, so that I can tell which species/timepoint a given data point belongs to.
Is this possible, and if so, how can I do it?
I'm not sure this makes sense; at least, it doesn't to me. If you run PCA on your data, you'll get up to 10 principal components, and each observation (row) gets a score on each of them. If you want a scatterplot where each point is colored according to which range of a certain PC it falls into, you can bin that PC's scores and pass the bins to gscatter() as the grouping variable. But I don't know what it means to "color the data on it by column": in this PCA the columns are the variables, not the data points. My guess is that you want machine learning, perhaps discriminant analysis or KNN classification, rather than PCA. See the chart on https://www.mathworks.com/help/stats/machine-learning-in-matlab.html
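For what it's worth, here is a minimal sketch of the PCA-plus-gscatter() idea. It assumes the Statistics and Machine Learning Toolbox is installed. The matrix X is random placeholder data standing in for the ten numeric columns of your table (something like T{:, 2:end}, where T is a hypothetical name for your table), and the choice of PC3 and four bins is arbitrary.

```matlab
% Placeholder data: 200 ortholog groups (rows) x 10 expression columns.
% With a real table T (hypothetical name), use X = T{:, 2:end} instead.
rng(0);                              % make the random data reproducible
X = randn(200, 10);

% PCA treats rows as observations and columns as variables.
% score(i, k) is the coordinate of observation i on principal component k.
[coeff, score, latent] = pca(X);

% Bin the PC3 scores into 4 equal-width ranges (bin indices 1..4).
edges = linspace(min(score(:, 3)), max(score(:, 3)), 5);
bins  = discretize(score(:, 3), edges);

% Plot PC1 against PC2, with one color per PC3 bin.
gscatter(score(:, 1), score(:, 2), bins);
xlabel('PC1');
ylabel('PC2');
```

Each point in this plot is one row of the table (one ortholog group), which is why there is no single column to color it by.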