We carefully monitor the temperatures in the server room at work, and watching them change over time suggests that the server room behaves like a multistable climate system.
Our interest in temperatures
On 23 December 2006 the air conditioning system in the server room at work failed completely. It was one of the worst days of the year for this to happen. We did not find out until 24 December, when our NetApp fileserver e-mailed us that its main board was becoming too hot. A co-worker then went over, opened windows and called support for the air conditioning system. No replacement parts would be available before 2 January 2007, so during that week a lot of work went into makeshift cooling with fans and open windows.
We changed the monitoring system to notify us when the temperature in the server room goes above 35 degrees. It has had to notify us twice in 2007, on 20 January and on 11 March.
The easiest way to get temperature readings is from the temperature sensors in the UPSes at the bottom of each rack. We use the Network UPS Tools package, which lets us read those sensors over the network from a central monitoring system.
The air conditioning system blows cool air through the space below the raised floor into the racks, so the temperature measured by the UPS units is closely related to the output temperature of the air conditioning system.
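As a rough illustration of reading a UPS temperature through Network UPS Tools and comparing it with the 35 degree limit, here is a minimal Python sketch. It assumes the `upsc` client is installed on the monitoring host and that the UPS driver exposes a variable named `ups.temperature`. The UPS name `rack1@upshost` and the helper names are hypothetical, and this is not the monitoring setup we actually run.

```python
import subprocess

ALERT_THRESHOLD = 35.0  # degrees; the notification limit mentioned in the text


def parse_upsc(output):
    """Parse `upsc` listing output (one "name: value" pair per line) into a dict."""
    values = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines that do not contain a colon
            values[name.strip()] = value.strip()
    return values


def read_temperature(ups, variable="ups.temperature"):
    """Ask upsd for all variables of one UPS and return one of them as a float.

    `ups` is a NUT identifier such as "rack1@upshost" (hypothetical name).
    The variable name is an assumption; not every driver reports it.
    """
    result = subprocess.run(
        ["upsc", ups], capture_output=True, text=True, check=True
    )
    return float(parse_upsc(result.stdout)[variable])


def too_hot(temperature, threshold=ALERT_THRESHOLD):
    """True when the reading is above the notification threshold."""
    return temperature > threshold
```

A monitoring job could call `read_temperature("rack1@upshost")` for each rack at a fixed interval, store the value for graphing, and send a notification whenever `too_hot()` returns true.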
Watching ntp servers and temperatures
As we are big fans of statistics and nice graphs, we also started graphing the temperatures. We already watch the ntp servers very closely (you can view our public ntp graphs) and noticed interesting connections between the temperatures and the PLL value. Any change in temperature due to doors being opened, changes in hardware or the weather outside shows up as a change in the PLL value. Usually the ntp daemon stabilizes at a new PLL value after a while.
The multistable system
The bigger picture is that the entire server room seems to behave like a multistable climate system. Any change to the input parameters, even one as simple as opening the door of a rack, perturbs the system, which then slowly moves to a new stable state.
On 21 December 2007 I added temperature sensors to the top of each rack. These sensors are more precise than those inside the UPSes and more exposed to the air temperature.
These sensors show even better how any change influences the system. After a change, the system settles at a new stable temperature within a few hours. On 3 January I moved a floor tile that directs the airflow into one rack, and the temperature at all rack tops rose; at the top of the affected rack it rose by nearly a degree Celsius. After nearly a day I moved the tile back, which reversed most of the change.
The current plan is to add many more temperature sensors: in the racks, near the inlet and outlet of the air conditioning unit, below the raised floor in several places, and on the inside and outside of the server room walls.
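One way to make "a new stable temperature" measurable is to look for the first stretch of readings that stays within a small band. The sketch below is only an illustration under assumptions of our own: the function name, the window length, the tolerance and the sample readings are all hypothetical and are not measurements from our racks.

```python
def settling_index(readings, window=6, tolerance=0.2):
    """Return the index where the series first settles, or None.

    "Settled" here means: `window` consecutive readings whose spread
    (max - min) is at most `tolerance` degrees. This definition is an
    assumption made for this sketch.
    """
    for start in range(len(readings) - window + 1):
        chunk = readings[start:start + window]
        if max(chunk) - min(chunk) <= tolerance:
            return start
    return None


# Hypothetical rack-top readings taken at a fixed interval after a change.
readings = [22.0, 22.4, 22.7, 22.9, 23.0, 23.0, 23.1, 23.0, 23.0, 23.1, 23.0]
print(settling_index(readings, window=4, tolerance=0.15))
```

Multiplying the returned index by the sampling interval gives an estimate of how long the room took to reach its new state, which makes it possible to compare different disturbances, such as moving a tile or opening a rack door.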
Study of this climate system
If you are interested in studying this climate system, get in touch!
But the server-room climate is stable!
According to Current Weather Conditions in the CSL, the server room should have a stable temperature and humidity. We know better.