Stepwiselm and removing predictors with negative coefficients

I am doing regression analysis with stepwiselm on some flow data. As predictors I have the flow from a previous pumping station, rain data, the hour, and the day of the week. For flow and rain I have 25 predictors each with different lag options (i.e. 25 columns each for rain and flow, delayed from 0 to 24 hours). This is all in a 4704-by-80 table called 'DataTable', and I would like to use stepwiselm to pick the best combination of these predictors. The code is:
mdl = stepwiselm(DataTable,'linear')
However, rain cannot have a negative effect on the flow, so I would like to exclude from the stepwiselm results any rain predictors that have negative coefficients. Somehow I get rain predictors with low p-values (less than 0.05) and negative coefficients. How do I remove those?

Accepted Answer

the cyclist on 17 Sep 2015
Edited: the cyclist on 17 Sep 2015
I think you need to be careful in interpreting those negative coefficients. It seems likely to me that the rain data are autocorrelated, which means the lagged rain predictors are correlated with one another and the individual coefficients are trickier to interpret (although I don't think this strictly violates the OLS assumptions). For example, the model coefficient for yesterday's rain might be positive, but then the coefficient for two-days-ago rain might be negative, because of the autocorrelation. The "true" effect is actually some linear combination of those coefficients.
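To make the point about correlated lags concrete, here is a minimal sketch on synthetic data (not the poster's DataTable). The series, the three lags, and the "true" coefficients are all made up for illustration. It shows how to check the correlation between lag columns and how to look at the summed rain effect rather than at any single lag coefficient.

```matlab
% Synthetic, hypothetical data: an autocorrelated "rain" series and three lags.
rng(0);                                                % reproducible random numbers
n    = 2000;
rain = filter(1, [1 -0.9], max(randn(n + 2, 1), 0));   % AR(1)-style smoothing
R    = [rain(3:end), rain(2:end-1), rain(1:end-2)];    % columns: lag 0, 1, 2
flow = R * [1.0; 0.5; 0.2] + randn(n, 1);              % all true effects >= 0

% Neighbouring lags are strongly correlated with each other.
C = corrcoef(R);

% Ordinary least squares; the first design column is the intercept.
b         = [ones(n, 1), R] \ flow;
lagCoefs  = b(2:end);
netEffect = sum(lagCoefs);                             % combined effect of all rain lags

disp(C);
disp(lagCoefs');
fprintf('Net rain effect: %.3f\n', netEffect);
```

With strongly correlated columns, the individual lag estimates can be noisy while their sum is usually much more stable, which is one reason to judge the rain effect as a group.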
There are a few ways to deal with this, but it's difficult to explain it all here. (You could google search some of the keywords here.) There are models that explicitly deal with lagged variables. (I'm not an expert here. ARIMA, I think?) Another possibility is to build a model of the rain itself (maybe using principal components analysis?) that gives you better-behaved explanatory variables for your flow model. (This might still be difficult to interpret, though.)
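As a rough sketch of the principal components idea, the lagged rain columns could be replaced by a few uncorrelated component scores before fitting. Everything below is synthetic and hypothetical (the series, the five lags, the 90% variance cutoff); pca and fitlm are from the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic, hypothetical data: five lags of an autocorrelated rain series.
rng(1);
n    = 1000;
nLag = 5;
rain = filter(1, [1 -0.8], max(randn(n + nLag - 1, 1), 0));
R    = zeros(n, nLag);
for lag = 0:nLag-1
    R(:, lag + 1) = rain(nLag - lag : nLag - lag + n - 1);   % column for this lag
end
flow = R * [1.0; 0.6; 0.3; 0.1; 0.0] + randn(n, 1);

% Principal components of the lag matrix; scores are mutually uncorrelated.
[coeff, score, ~, ~, explained] = pca(R);

% Keep enough components to cover 90% of the variance (arbitrary cutoff).
nKeep = find(cumsum(explained) >= 90, 1);
Z     = score(:, 1:nKeep);

% Regress flow on the retained component scores instead of the raw lags.
mdlPC = fitlm(Z, flow);
disp(mdlPC.Coefficients);
```

The component coefficients are not constrained in sign either, and each component mixes several lags (see coeff), so this trades collinearity for interpretability.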
Another possibility is to use some other framework than linear regression. Various machine learning techniques are well suited to this. I think a lot depends on whether you are primarily interested in the prediction capability, or the interpretation of the coefficients.
To explicitly answer your question, though ...
I am not aware of a way to tell MATLAB to restrict the coefficient range. I think you will just need to use stepwiselm to guide you, and then manually remove the terms that don't make sense in your model.
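A minimal sketch of that manual clean-up, on a small synthetic table rather than the real DataTable. The variable names (rain_lag0, upstream, flow) are hypothetical, and the sketch assumes every rain predictor name starts with 'rain'. It also sets 'Upper' to 'linear' so that no interaction terms are added, which differs from the original call. Terms are removed one at a time because the remaining coefficients change after each refit.

```matlab
% Synthetic, hypothetical stand-in for DataTable.
rng(2);
n    = 1000;
rain = filter(1, [1 -0.9], max(randn(n + 2, 1), 0));
DemoTable = table(rain(3:end), rain(2:end-1), rain(1:end-2), randn(n, 1), ...
    'VariableNames', {'rain_lag0', 'rain_lag1', 'rain_lag2', 'upstream'});
DemoTable.flow = 1.0 * DemoTable.rain_lag0 + 0.5 * DemoTable.rain_lag1 ...
               + 2.0 * DemoTable.upstream + randn(n, 1);

% Stepwise fit restricted to main effects (assumption: no interactions wanted).
mdl = stepwiselm(DemoTable, 'linear', 'ResponseVar', 'flow', ...
                 'Upper', 'linear', 'Verbose', 0);

% Repeatedly drop the most negative rain term, then re-check the refitted model.
while true
    names = mdl.CoefficientNames(:);
    est   = mdl.Coefficients.Estimate;
    bad   = find(startsWith(names, 'rain') & est < 0);
    if isempty(bad)
        break;
    end
    [~, k] = min(est(bad));
    mdl = removeTerms(mdl, names{bad(k)});
end

disp(mdl.Coefficients);
```

Dropping terms this way changes the model after selection, so the reported p-values should be treated with some caution.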
