gusl | the reality of really real data

I'm currently working with some really messy time-series data on power consumption in office buildings. There's missing data, multiple periodicities (daily, weekly, yearly), freakish outliers (e.g. holidays), and bursty anomalies (summer days generally use less power, except during heat waves, when ACs use *tons* of power). The task is daunting.

There are some things I want to do with the data for which I have no probabilistic interpretation (e.g. filter out certain frequencies).
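As a rough illustration of what "filtering out certain frequencies" could look like, here is a minimal sketch using NumPy's FFT. It assumes evenly spaced hourly readings with no gaps (so missing values would have to be filled in first), and the function name `remove_band` and the synthetic series are made up for the example.

```python
import numpy as np

def remove_band(signal, sample_spacing_hours, low_period_hours, high_period_hours):
    """Zero out Fourier components whose period lies in [low, high] hours.

    Assumes an evenly sampled signal with no missing values.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=sample_spacing_hours)  # cycles per hour
    with np.errstate(divide="ignore"):
        periods = np.where(freqs > 0, 1.0 / freqs, np.inf)  # hours per cycle
    in_band = (periods >= low_period_hours) & (periods <= high_period_hours)
    spectrum[in_band] = 0.0
    return np.fft.irfft(spectrum, n=n)

# Synthetic example (not real data): four weeks of hourly readings
# with a daily and a weekly cycle on top of a constant baseline.
hours = np.arange(24 * 28)
weekly_part = 100 + 3 * np.sin(2 * np.pi * hours / 168)
series = weekly_part + 10 * np.sin(2 * np.pi * hours / 24)

# Remove the daily cycle; the baseline and the weekly cycle remain.
without_daily = remove_band(series, 1.0, 23.5, 24.5)
```

Hard-zeroing a band like this is the crudest possible filter. On real data, where the record length is not a whole number of cycles, the energy of a cycle leaks into neighboring frequencies, so a wider band or a proper filter design would probably be needed.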

I've spent the first several days exploring the data, making scatterplots, etc. I've seen some weird patterns and puzzling clusters. Modeling these would entail non-parametric density estimation, but that wouldn't tell me how to make actual predictions.

I should get some basic predictions working.

But there are so many possible models! Even though I'm only considering past power usage! (I'm not even looking at temperature.)

Here are some basic ideas that have been floated:
* model the function using Gaussian Processes (for some kernel(s))
* model [prev n hours, next k hours] as a multivariate Gaussian (maybe this is the same as the above idea)
* autoregressive models, e.g. ridge regression on a subset of past times (including polynomial basis expansion, etc.)
* nearest neighbor (for some geometry(s))
* parameterized functional forms: model variation in daily bumps as a parameterized family of bumps, e.g. height, fatness, tail skewness, etc., using splines
* State-Space Models, a.k.a. continuous-state HMMs (for some family of functions)
* Gaussian Lilypads (I need to read up on this)
* ... and of course, ensembles of the above.
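To make one of these concrete, here is a minimal sketch of the autoregressive idea: ridge regression on a hand-picked subset of past values. The lag choice (1 hour, 24 hours, 168 hours), the penalty `alpha`, and the helper names are all assumptions for illustration, and the series is synthetic. There is no basis expansion here, just a linear fit on the raw lags.

```python
import numpy as np

def make_lagged(series, lags):
    """Build (X, y) where each row of X holds series[t - lag] for each lag."""
    series = np.asarray(series, dtype=float)
    max_lag = max(lags)
    X = np.column_stack(
        [series[max_lag - lag : len(series) - lag] for lag in lags]
    )
    y = series[max_lag:]
    return X, y

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the intercept is left unpenalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def predict_next(series, lags, coef, intercept):
    """One-step-ahead prediction from the most recent observations."""
    series = np.asarray(series, dtype=float)
    features = np.array([series[-lag] for lag in lags])
    return float(features @ coef + intercept)

# Synthetic hourly series with a daily cycle (not real data).
t = np.arange(24 * 30)
hourly = 50 + 10 * np.sin(2 * np.pi * t / 24)

lags = (1, 24, 168)  # previous hour, same hour yesterday, same hour last week
X, y = make_lagged(hourly, lags)
coef, intercept = fit_ridge(X, y, alpha=1e-3)
next_hour = predict_next(hourly, lags, coef, intercept)
```

On this toy series the 24-hour and 168-hour lags are perfectly collinear, which is exactly the situation where the ridge penalty earns its keep: plain least squares would have no unique solution. Missing readings are not handled; any row containing a gap would have to be dropped or imputed first.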

Following the principle of starting really simple, I plan to start by modeling daily totals, rather than hourly data.
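A minimal sketch of the aggregation step, assuming the raw data arrives as (timestamp, value) pairs with `None` standing in for a missing reading. The policy of discarding any day that lacks all 24 readings is my assumption for the example, not a settled choice; the function name and sample readings are made up.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def daily_totals(readings, expected_per_day=24):
    """Sum hourly readings per calendar day.

    readings: iterable of (datetime, value or None).
    Returns {date: total}, with None for days that have missing readings,
    so an incomplete day is never mistaken for a low-usage day.
    """
    by_day = defaultdict(list)
    for timestamp, value in readings:
        values = by_day[timestamp.date()]  # registers the day even if value is missing
        if value is not None:
            values.append(value)
    return {
        day: (sum(values) if len(values) == expected_per_day else None)
        for day, values in sorted(by_day.items())
    }

# Made-up readings: two days of hourly data, one reading missing on day two.
start = datetime(2020, 2, 1)
readings = [(start + timedelta(hours=h), 2.0) for h in range(48)]
readings[30] = (readings[30][0], None)

totals = daily_totals(readings)
```

Here the first day gets a total and the second day comes out as `None`. Scaling up the partial sum or imputing the missing hour would be alternatives, but each one quietly bakes in a model of its own.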


Gustavo Lacerda


February 2020
