2011-02-28
Phew.. most of the day lost to getting the Xen cluster at work working again. It's based on Red Hat cluster 2, and that part failed miserably. Both nodes were in a state that reminded me of zombie cowboys: constantly shooting each other and rebooting. In the end I physically disabled one node (shut it down, removed the power cables) and configured the other one to run alone. I think the cause was all the virtual machines starting at once and all hitting their disk images hard over iSCSI (probably each doing an fsck because of the unclean shutdown). That caused delays and blocked processes, so the node stopped responding to cluster communications, so the other node fenced it, which caused more problems. Repeat.