Earlier today our monitor stopped working and left us without heat when it was −35°F outside. I drove home and swapped the broken heater with our spare, but the heat was off for several hours and the temperature in the house dropped into the 50s until I got the replacement running. While I waited for the house to warm up, I took a look at the heat loss data for the building.

To do this, I experimented with the “Python scientific computing stack”: the
`IPython` shell (I used the notebook functionality to produce the majority of
this blog post), `Pandas` for data wrangling, `matplotlib` for plotting, and
`NumPy` in the background. Ordinarily I would have performed the entire
analysis in R, but I’m much more comfortable in Python and the IPython notebook
is pretty compelling. What is lacking, in my opinion, is the solid graphics
provided by the `ggplot2` package in R.

First, I pulled the data from the database for the period the heater was off (and probably a little extra on either side):

```
import psycopg2
from pandas.io import sql

# Connect to the local PostgreSQL database holding the Arduino weather data
con = psycopg2.connect(host = 'localhost', database = 'arduino_wx')

# lead() looks ahead to the next observation, so each rate is
# (next temperature - this temperature) / elapsed seconds * 3600,
# which gives degrees F per hour
temps = sql.read_frame("""
    SELECT obs_dt, downstairs,
        (lead(downstairs) over (order by obs_dt) - downstairs) /
            interval_to_seconds(lead(obs_dt) over (order by obs_dt) - obs_dt)
            * 3600 as downstairs_rate,
        upstairs,
        (lead(upstairs) over (order by obs_dt) - upstairs) /
            interval_to_seconds(lead(obs_dt) over (order by obs_dt) - obs_dt)
            * 3600 as upstairs_rate,
        outside
    FROM arduino
    WHERE obs_dt between '2013-03-27 07:00:00' and '2013-03-27 12:00:00'
    ORDER BY obs_dt;""", con, index_col = 'obs_dt')
```

SQL window functions calculate the rate the temperature is changing from one observation to the next, and convert the units to the change in temperature per hour (Δ°F/hour).
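The same lead-and-difference calculation can be sketched in plain Python. This is only an illustration, not the query used above: the `hourly_rates` helper is hypothetical, and the sample readings are the rounded downstairs values from the first three rows of the table further down, so the results won’t exactly match the rates calculated from the unrounded values in the database.

```python
from datetime import datetime

def hourly_rates(observations):
    """Return the rate of change (units per hour) between consecutive observations.

    observations is a list of (timestamp, value) pairs sorted by timestamp.
    The last observation has no successor, so its rate is None, just as
    lead() returns NULL for the final row in SQL.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(observations, observations[1:]):
        seconds = (t1 - t0).total_seconds()
        rates.append((v1 - v0) / seconds * 3600)
    rates.append(None)
    return rates

# Rounded sample readings, used here only for illustration
fmt = "%Y-%m-%d %H:%M:%S"
sample = [
    (datetime.strptime("2013-03-27 07:02:32", fmt), 65.60),
    (datetime.strptime("2013-03-27 07:07:32", fmt), 65.68),
    (datetime.strptime("2013-03-27 07:12:32", fmt), 65.73),
]
rates = hourly_rates(sample)
print(rates)
```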

Passing the `index_col` argument to `sql.read_frame()` is important: without
it, the Pandas data frame gets an arbitrary numerical index. When plotting,
the index is typically used for the x-axis (the independent variable).

Next, calculate the difference between the indoor and outdoor temperatures, which is important in any heat loss calculations (the greater this difference, the greater the loss):

```
temps['downstairs_diff'] = temps['downstairs'] - temps['outside']
temps['upstairs_diff'] = temps['upstairs'] - temps['outside']
```

A quick look at the data showed that the downstairs temperatures are smoother, so I subset the data to only the downstairs (and outside) temperature records.

```
# Despite the name, temps_up holds the downstairs (and outside) columns
temps_up = temps[['outside', 'downstairs', 'downstairs_diff', 'downstairs_rate']]
print(u"Minimum temperature loss (deg F/hour) = {0}".format(
    temps_up['downstairs_rate'].min()))
# prints: Minimum temperature loss (deg F/hour) = -3.7823079517
temps_up.head(10)
```

obs_dt | outside (°F) | downstairs (°F) | diff (°F) | rate (°F/hour) |
---|---|---|---|---|
2013-03-27 07:02:32 | -33.09 | 65.60 | 98.70 | 0.897 |
2013-03-27 07:07:32 | -33.19 | 65.68 | 98.87 | 0.661 |
2013-03-27 07:12:32 | -33.26 | 65.73 | 98.99 | 0.239 |
2013-03-27 07:17:32 | -33.52 | 65.75 | 99.28 | -2.340 |
2013-03-27 07:22:32 | -33.60 | 65.56 | 99.16 | -3.782 |
2013-03-27 07:27:32 | -33.61 | 65.24 | 98.85 | -3.545 |
2013-03-27 07:32:31 | -33.54 | 64.95 | 98.49 | -2.930 |
2013-03-27 07:37:32 | -33.58 | 64.70 | 98.28 | -2.761 |
2013-03-27 07:42:32 | -33.48 | 64.47 | 97.95 | -3.603 |
2013-03-27 07:47:32 | -33.28 | 64.17 | 97.46 | -3.780 |

You can see from the first bit of data that when the heater first went off, the differential between inside and outside was almost 100 degrees, and the temperature was dropping at a rate of 3.8 degrees per hour. Starting at 65°F, we’d be below freezing in just under nine hours at this rate, but as the differential drops, the rate that the inside temperature drops will slow down. I'd guess the house would stay above freezing for more than twelve hours even with outside temperatures as cold as we had this morning.
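The arithmetic behind that estimate, plus a rough check on the idea that the rate slows as the differential drops, can be sketched as follows. The second calculation assumes Newton’s law of cooling (loss proportional to the differential) with a constant outside temperature and no solar gain, which is a simplification rather than anything measured. The function names are hypothetical, and the inputs are rounded from the table above.

```python
import math

FREEZING = 32.0  # deg F

def hours_to_freezing_constant(inside, rate):
    """Hours until freezing if the temperature keeps dropping at a fixed rate.

    rate is in deg F/hour and is negative while the house is cooling.
    """
    return (inside - FREEZING) / -rate

def hours_to_freezing_newton(inside, outside, rate):
    """Hours until freezing if the loss rate stays proportional to the differential.

    Assumes Newton's law of cooling, a constant outside temperature, and no
    other heat sources. The cooling constant k is estimated from the initial
    rate and the initial differential.
    """
    k = -rate / (inside - outside)  # per hour
    return math.log((inside - outside) / (FREEZING - outside)) / k

# Rounded starting values (inside, outside, initial rate of change)
inside, outside, rate = 65.0, -33.5, -3.8
constant_hours = hours_to_freezing_constant(inside, rate)
newton_hours = hours_to_freezing_newton(inside, outside, rate)
print("Constant rate: {0:.1f} hours".format(constant_hours))
print("Newton's law:  {0:.1f} hours".format(newton_hours))
```

Under these assumptions the second figure is always longer than the first, since the loss rate shrinks along with the differential. The idealized model ignores solar gain and the heat stored in the building and its contents, so treat it as a rough guide only.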

Here’s a plot of the data. The plot looks pretty reasonable with very little code:

```
import matplotlib.pyplot as plt
plt.figure()
temps_up.plot(subplots = True, figsize = (8.5, 11),
title = u"Heat loss from our house at −35°F",
style = ['bo-', 'ro-', 'ro-', 'ro-', 'go-', 'go-', 'go-'])
plt.legend()
# plt.subplots_adjust(hspace = 0.15)
plt.savefig('downstairs_loss.pdf')
plt.savefig('downstairs_loss.svg')
```

You’ll notice that even before I came home and replaced the heater, the temperature in the house had started to rise. This is almost certainly due to solar heating: it was a clear day with more than twelve hours of sunlight.

The plot shows what looks like a relationship between the rate of change inside and the temperature differential between inside and outside, so we’ll test this hypothesis using linear regression.

First, get the data where the temperature in the house was dropping.

```
cooling = temps_up[temps_up['downstairs_rate'] < 0]
```

Now run the regression between rate of change and outside temperature:

```
import pandas as pd
results = pd.ols(y = cooling['downstairs_rate'], x = cooling.ix[:, 'outside'])
results
```

Summary of regression analysis (formula: Y ~ x + intercept): 38 observations, 2 degrees of freedom (model 1, residual 36), R-squared 0.9214, adjusted R-squared 0.9192, RMSE 0.2807, F-stat (1, 36) 421.7806, p-value 0.0000.

Variable | Coef | Std Err | t-stat | p-value | CI 2.5% | CI 97.5% |
---|---|---|---|---|---|---|
x | 0.1397 | 0.0068 | 20.54 | 0.0000 | 0.1263 | 0.1530 |
intercept | 1.3330 | 0.1902 | 7.01 | 0.0000 | 0.9603 | 1.7057 |

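You can see there’s a very strong positive relationship between the outside temperature and the rate that the inside temperature changes (R-squared of 0.92). As it warms outside, the drop in inside temperature slows: each degree of warming outside is associated with the inside temperature dropping about 0.14°F/hour more slowly.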
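`pd.ols` and the `.ix` indexer were both removed from later pandas releases, so the call above won’t run on a current install. As a dependency-free sketch of what a simple one-variable regression computes, here is ordinary least squares written out by hand. The `simple_ols` helper is hypothetical, and the data are synthetic points placed exactly on an assumed line, not the measurements from the house.

```python
def simple_ols(x, y):
    """Fit y = slope * x + intercept by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared

# Synthetic data (an assumption for illustration): points on y = 0.5 * x + 2
x = [-30.0, -25.0, -20.0, -15.0, -10.0]
y = [0.5 * xi + 2.0 for xi in x]
slope, intercept, r_squared = simple_ols(x, y)
print(slope, intercept, r_squared)
```

In a current environment, a library routine such as `scipy.stats.linregress` or `numpy.polyfit` would be the more usual choice.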

The real relationship is more likely to be with the differential between inside and outside. In this case, though, the relationship isn’t quite as strong (R-squared of 0.90 versus 0.92). I suspect that the heat from the sun is confounding the analysis.

```
results = pd.ols(y = cooling['downstairs_rate'], x = cooling.ix[:, 'downstairs_diff'])
results
```

Summary of regression analysis (formula: Y ~ x + intercept): 38 observations, 2 degrees of freedom (model 1, residual 36), R-squared 0.8964, adjusted R-squared 0.8935, RMSE 0.3222, F-stat (1, 36) 311.5470, p-value 0.0000.

Variable | Coef | Std Err | t-stat | p-value | CI 2.5% | CI 97.5% |
---|---|---|---|---|---|---|
x | -0.1032 | 0.0058 | -17.65 | 0.0000 | -0.1146 | -0.0917 |
intercept | 6.6537 | 0.5189 | 12.82 | 0.0000 | 5.6366 | 7.6707 |
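To make the fitted line concrete, the coefficients from the summary above can be plugged back in to predict the rate of change for a given differential. This is just the regression equation written as code. The `predicted_rate` function name is hypothetical, and predictions outside the range of differentials observed that morning are extrapolation and shouldn’t be trusted.

```python
# Coefficients copied from the regression summary (rate ~ downstairs_diff)
SLOPE = -0.1032      # deg F/hour per deg F of differential
INTERCEPT = 6.6537   # deg F/hour

def predicted_rate(diff):
    """Predicted rate of inside temperature change (deg F/hour)
    for an inside-outside differential in deg F."""
    return SLOPE * diff + INTERCEPT

# Differential when the heater first went off (rounded from the table)
rate_at_start = predicted_rate(98.7)
print("Predicted rate at 98.7 deg F differential: {0:.2f} deg F/hour".format(
    rate_at_start))
```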

```
con.close()
```

I’m not sure how much information I really got out of this, but I am pleasantly surprised that the house held its heat as well as it did even with the very cold temperatures. It might be interesting to intentionally turn off the heater in the middle of winter and examine these relationships for a longer period and without the influence of the sun.

And I’ve enjoyed learning a new set of tools for data analysis. Thanks to my friend Ryan for recommending them.