1. It depends on the classifier and on the data and application you are working with. For example, if you are using a linear classifier to predict whether cancer is present or not, adding a third category ('Unknown') is not going to help. If, on the other hand, you are grouping the data into clusters, then marking the missing values as 'NaN' or 'Unknown' can help.
2. Principal component analysis (PCA) is a quantitatively rigorous method for achieving this simplification. The method generates a new set of variables, called principal components. Each principal component is a linear combination of the original variables. All the principal components are orthogonal to each other, so there is no redundant information, and together they form an orthogonal basis for the space of the data. For example, if you use x predictors in the cancer dataset, PCA in MATLAB reduces them to y components (y <= x). These components are not the actual columns of your data; they are columns derived from the predictors by MATLAB. If you want to see the data behind these 7 components, you can inspect the PCA information stored in the trained classifier.
Also, to see how to use the trained classifier, display the model's 'HowToPredict' field. It gives a full description of how this particular model should be used and how to predict the response variable from the input data.
3. The trained MATLAB model knows whether PCA was used, so it handles the conversion itself; you only need to pass the observation that you want to test. However, if you want to confirm whether the trained classifier used PCA, you can check the 'HowToPredict' field mentioned above.
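As a rough illustration of points 2 and 3, here is a minimal sketch of what happens when PCA is applied before training and again before prediction. It is not the code that a trained classifier runs internally. The data is synthetic, the variable names (X, y, xNew, numComponents) are made up for the example, and fitcdiscr is just one possible linear classifier. It assumes the Statistics and Machine Learning Toolbox is installed.

```matlab
% Sketch only: synthetic data stands in for the real predictors.
rng(0);                                   % reproducible random numbers
X = randn(200, 10);                       % 200 observations, x = 10 predictors
X(:, 2) = X(:, 1) + 0.1 * randn(200, 1);  % make two predictors nearly redundant
y = double(X(:, 1) + X(:, 3) > 0);        % hypothetical binary response

% PCA: coeff holds the component directions (one per column),
% score holds the derived columns, mu holds the predictor means.
[coeff, score, ~, ~, ~, mu] = pca(X);

% The component directions are orthonormal, so this is close to zero.
orthoError = norm(coeff' * coeff - eye(size(coeff, 2)));

% Keep the first 7 components (y <= x) and train on the derived columns,
% not on the original predictors.
numComponents = 7;
model = fitcdiscr(score(:, 1:numComponents), y);

% A new observation has to be centered and projected in the same way
% before prediction. A trained model that used PCA does this for you.
xNew = randn(1, 10);
scoreNew = (xNew - mu) * coeff(:, 1:numComponents);
label = predict(model, scoreNew);
```

The last three lines are the "conversion" mentioned in point 3: the classifier never sees the original columns, only their projection onto the retained components.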
See the MATLAB documentation on PCA for a more detailed explanation.