I am trying to model the relationship between Load and the variables X and T1 to T6 according to the following equation:

Load = alpha(X) + B1*T1 + B2*T2 + B3*T3 + B4*T4 + B5*T5 + B6*T6, for X = 1 to 672

1) I have Load in the form of 15 minute interval data for a few months

2) X is a variable that is defined like this based on time:

Monday 00.00 am to 00.15 am = 1

Monday 00.15 am to 00.30 am = 2

.

.

Sunday 11.45 pm to 00.00 am = 672

Note:

This repeats again from 1 to 672 for the next week and is not a running number
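The weekly index described above can be computed from timestamps. Here is a minimal sketch, assuming the interval start times are available as a datetime vector; the variable name t and the example start date are hypothetical and not taken from the attached file.

```matlab
% Sketch: map interval start times to the time-of-week index X (1..672).
% t is a hypothetical datetime vector; here it covers two weeks from a Monday.
t = datetime(2021,7,19,0,0,0) + minutes(15)*(0:1343)';

dayIdx  = mod(weekday(t) - 2, 7);            % weekday: 1 = Sunday, so Monday -> 0, Sunday -> 6
slotIdx = hour(t)*4 + floor(minute(t)/15);   % 0..95 within the day
X = dayIdx*96 + slotIdx + 1;                 % 1..672, restarting every Monday
```

Because X depends only on the weekday and the time of day, it restarts at 1 every Monday instead of running on.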

T1 T2 T3 T4 T5 T6 are temperatures at each 15 min interval

Additional Info :

I can feed in Load, X, and T1 to T6. How can I perform regression on my equation to get the coefficients alpha and B1 to B6? Note that B1 to B6 do not change with X, but alpha does. So my regression output needs to be a vector of alpha coefficients, one for each X from 1 to 672, plus a single value each for B1, B2, B3, B4, B5, and B6, since they don't change with X. I tried various approaches and looked online, but everything I found only explains how to fit this:

Load = Alpha*X + B1*T1 + B2*T2 + B3*T3 + B4*T4 + B5*T5 + B6*T6

I have attached a subset of the data - about 8 weeks

- OK, let me go into detail. I have several months of load data for a chiller at 15-minute intervals. The assumption is that chiller load depends not only on temperature but also on time of week.
- For example, say that on a Wednesday from 10.00 to 10.15 am there is generally less occupancy, so the chiller load might be lower than on some other day with a similar outside air temperature. So the chiller load depends not purely on outside temperature but also on time of week.
- The temperature at each interval is broken down into 6 components to get a piecewise-continuous linear equation (the details are not important here). Those are the T1 to T6 you see.
- Then, to incorporate time of week, we break a week into 672 15-minute intervals, the first (X = 1) being Monday 00.00 am to 00.15 am, and so on up to X = 672.
- So the chiller load equation is modelled as:

Load = Alpha(X) + B1*T1 + B2*T2 + B3*T3 + B4*T4 + B5*T5 + B6*T6, for X = 1 to 672,

where Alpha (a function of the time-of-week variable X) and B1 to B6 are regression coefficients.

A week contains 7 days * 24 hours * 60 minutes / 15 minutes = 672 15-minute intervals.

- So I want to feed in Load, X, and T1 to T6 using several months of data. The sample file has 8 weeks of data.
- In 8 weeks we will have 8 data points for Monday 00.00 to 00.15 am (X = 1), and so on. These are used to estimate alpha at X = 1, and similarly for X = 2 through 672. This is just a sample set. With only 8 data points per 15-minute interval, a separate alpha for each X will likely be overfit. I am not sure of this, just FYI.
- The 8 weeks of data give many more data points for estimating B1 to B6, since these have no time-of-week (X) dependency.
- The load curve over time will look roughly like the positive half of a sine curve.
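One way to get a separate alpha for each X while sharing B1 to B6 is to give every time-of-week slot its own indicator (dummy) column and solve a single least-squares problem. The sketch below uses only base MATLAB and synthetic data; all variable names and numbers are made up for illustration and do not come from the attached file.

```matlab
% Sketch: one alpha per time-of-week slot, shared B1..B6, via indicator columns.
% Synthetic data stands in for the real Load, X, and T1..T6 (values are made up).
rng(0);
nSlots = 672;  nWeeks = 8;  n = nSlots*nWeeks;
X = repmat((1:nSlots)', nWeeks, 1);             % time-of-week index, repeats weekly
T = 10*rand(n, 6);                              % stand-in for [T1 ... T6]
alphaTrue = 50 + 20*sin(2*pi*(1:nSlots)'/96);   % made-up daily pattern
BTrue = (1:6)'/10;
Load = alphaTrue(X) + T*BTrue + 0.01*randn(n, 1);

% Row i of D has a single 1 in column X(i), so D*alpha picks out alpha(X(i)).
D = sparse((1:n)', X, 1, n, nSlots);
A = [D, T];                   % n-by-678 design matrix, no separate intercept
coef  = A \ Load;             % least-squares solution
alpha = coef(1:nSlots);       % 672 time-of-week offsets
B     = coef(nSlots+1:end);   % 6 shared temperature coefficients
```

Each alpha is estimated from only nWeeks samples, whereas B1 to B6 use every row, which matches the concern about overfitting alpha. If the temperature columns turn out to be linearly dependent on the indicator columns in the real data, the system becomes rank deficient and would need an extra constraint.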

It's based on this paper, if anyone is interested: https://buildings.lbl.gov/publications/quantifying-changes-building

Again, thank you all!

Matt J
on 25 Jul 2021

we have 8 weeks of data.

If the same parameters are to be used every week, then you can equivalently just average together Load data samples that were taken at the same time-of-week, reducing the fitting problem to just one week of data.

Load= mean( reshape(Load,672,[]) ,2);

Again, though, without further constraints on alpha, it is a trivial result. Just set all the B variables to zero and alpha(X)=Load.
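To make the point concrete, here is a small sketch with synthetic data. It assumes the series starts at X = 1 and spans whole weeks; the variable names are illustrative only. After averaging there is one sample per time-of-week slot, so 672 equations face 678 unknowns and the trivial choice fits exactly.

```matlab
% Sketch: average by time of week, then show that the averaged fit is trivial.
rng(1);
nSlots = 672;  nWeeks = 8;
Load = 100 + randn(nSlots*nWeeks, 1);   % made-up load series
T    = 10*rand(nSlots*nWeeks, 6);       % stand-in for [T1 ... T6]

% Each column of the reshaped array is one week; average across weeks.
LoadAvg = mean(reshape(Load, nSlots, []), 2);                % 672-by-1
TAvg    = squeeze(mean(reshape(T, nSlots, nWeeks, 6), 2));   % 672-by-6

% Trivial solution: all B equal to zero and alpha(X) equal to the averaged load.
alpha = LoadAvg;
B = zeros(6, 1);
residual = LoadAvg - (alpha + TAvg*B);   % zero in every slot
```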

Scott MacKenzie
on 25 Jul 2021

Edited: Scott MacKenzie
on 25 Jul 2021

This is probably too simple to be correct, but I'll toss it out there anyway. Admittedly, I haven't considered anything you've written about time intervals and such, because I think this is already captured by the time variable, but I might be wrong.

Bottom line: You've got empirical data for eight variables (load, X or time, T1, T2, T3, T4, T5, and T6) and you want to build a model with one of the variables as the response variable and the other seven as predictors. Here's your model:

load = alpha*X+ b1*T1 + b2*T2 + b3*T3 + b4*T4 + b5*T5 + b6*T6

The script below generates a regression model using mvregress (which requires the Statistics and Machine Learning Toolbox):

f = 'https://www.mathworks.com/matlabcentral/answers/uploaded_files/694834/Data%20Subset%20for%20Matlab%20Central.xlsx';

T = readtable(f);

% predictor variables, i.e. the design matrix (Note: time is 'X' in the question)

X = [T.time, T.t1, T.t2, T.t3, T.t4, T.t5, T.t6];

% dependent/response variable

Y = T.load;

format longg;

% mvregress expects the design matrix first and the response second

beta = mvregress(X,Y)

The seven model coefficients (alpha, b1, b2, etc.) are returned in beta. Visit the documentation for mvregress for other options you might want to explore. Good luck.

the cyclist
on 25 Jul 2021

It's definitely an interesting modeling problem. Here is a plot of your data, where I used errorbar to plot the mean and error of the mean.

chillerData = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/694834/Data%20Subset%20for%20Matlab%20Central.xlsx');

chillerData = chillerData(1:6048,:); % Only doing this step out of laziness, to get a multiple of 672

chillerLoad = chillerData.load;

chillerLoad = reshape(chillerLoad,672,9)'; % one row per week, one column per time-of-week interval

figure

errorbar(mean(chillerLoad),std(chillerLoad)/sqrt(size(chillerLoad,1)))

This does look close to sinusoidal (but I don't think only the positive portion?), so I think my first pass at a model would be one that varies sinusoidally in your X variable (scaled so that one cycle is 24 hours). And of course include the other terms.
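A first pass at such a model could look like the sketch below. It assumes a single daily harmonic (sine and cosine with a period of 96 intervals) plus a constant offset in place of the 672 separate alphas, and it uses synthetic data with made-up coefficients.

```matlab
% Sketch: replace the 672 free alphas with a daily sinusoid in X.
rng(2);
nSlots = 672;  nWeeks = 8;  n = nSlots*nWeeks;
X = repmat((1:nSlots)', nWeeks, 1);   % time-of-week index
T = 10*rand(n, 6);                    % stand-in for [T1 ... T6]
phase = 2*pi*X/96;                    % 96 intervals per day -> one cycle per 24 hours
Load = 50 + 20*sin(phase) + 5*cos(phase) + T*((1:6)'/10) + 0.01*randn(n, 1);

% The sine and cosine pair lets the fit choose both amplitude and phase.
A = [ones(n,1), sin(phase), cos(phase), T];
coef = A \ Load;                        % [offset; sin coef; cos coef; B1..B6]
amplitude = hypot(coef(2), coef(3));    % amplitude of the daily cycle
```

This uses 9 parameters instead of 678, at the cost of forcing every day of the week to follow the same daily shape.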

I would not recommend averaging over the days, because you will then lose the ability to estimate the error. Just include all the data.
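As an illustration of what keeping all the samples buys, the sketch below estimates the residual standard deviation and the coefficient standard errors from an ordinary least-squares fit. It uses synthetic data and a deliberately simplified model (a constant plus six temperature terms); none of the numbers come from the attached file.

```matlab
% Sketch: residual scatter and coefficient standard errors from unaveraged data.
rng(3);
n = 5376;                          % e.g. 8 weeks of 15-minute samples
T = 10*rand(n, 6);                 % stand-in for [T1 ... T6]
A = [ones(n,1), T];                % simplified model: constant + temperature terms
sigmaTrue = 2;                     % made-up noise level
Load = A*[50; (1:6)'/10] + sigmaTrue*randn(n, 1);

coef = A \ Load;                              % least-squares fit
res  = Load - A*coef;                         % residuals
dof  = n - size(A, 2);                        % degrees of freedom
sigmaHat = sqrt(sum(res.^2)/dof);             % estimated noise standard deviation
covCoef  = sigmaHat^2 * ((A'*A) \ eye(size(A, 2)));
se = sqrt(diag(covCoef));                     % standard error of each coefficient
```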
