With Ubuntu 8.04 server on the home server greenblatt I got a daily mail:

    Subject: Cron test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )

    /etc/cron.daily/logrotate:
    Re-opening all log files
    Re-opening all log files
    Re-opening all log files

And I couldn't really find the source. But a Google search for logrotate mail 're-opening' helps: it is caused by logrotate and mailman, filed as Bug #244233 in mailman (Ubuntu): “Logrotate is noisy with: Re-opening all log files”. The fix is simple: make mailman be quiet in /etc/logrotate.d/mailman. A patch is attached to the Ubuntu bug.
After adding surge protection to the ISDN line coming into the asterisk server I went out and got the Trust surge guard PW-3500. Note the "This product is no longer available in the current Trust assortment" on the Trust website. I bought it at MyCom: Trust UPS & Overspanningsbeveiliging Surge PW-3500 - MyCom.nl, and I noticed that the outside of the package does not mention whether it can be used with an ADSL modem; it just mentions phones and fax machines. A note about suppressing high-frequency interference made me wonder whether it might filter high-frequency signals on the phone line a bit too much, killing ADSL throughput. The salesperson at MyCom could not tell either. But MyCom has customer-friendly policies for returning articles which aren't what you expected. Even "I could not make it run with Linux" is a valid reason for returning it. So I took it home, and the first thing I noticed in the manual inside the package is that it states that this surge protector is also designed for ADSL. They could have mentioned that on the outside.
So now it is living in the phone line (and power supply) for the ADSL line.
After reading When Lightning Strikes - by Johannes Ullrich - ISC SANS diary I was reminded that I could do a bit more about surge protection. We don't live in Florida, but we do have lightning storms and they can damage equipment.

The first step was easy: I rerouted a few cables, and now the incoming ISDN line goes through the network/phone/ISDN surge protector on the APC Back-UPS CS 350VA. The surge protector is there, so it is a better idea to use it. I have seen the inside of an ISDN/ADSL splitter after a lightning strike and that was not a good sight.
Maybe I'll get a surge protector for the adsl modem too.
A kernel panic because of some tweaking in the ISDN driver made it time for event-driven maintenance on the home server greenblatt. So I shut down the server, did the last syncs and removed the remaining two parallel ata disks and the promise IDE controller which was no longer needed. I re-enabled the "Cool'n'quiet" bios option so linux power saving works again, which should result in a drop in power use.
Update: As the graph shows, removing the disks and re-enabling powersave has helped reduce power use.
I found the probable cause of the not so great power saving: when I installed the first new disk I also updated the bios. And the message I get when trying to load the powernow-k8 cpu driver is:

    powernow-k8: Found 1 AMD Athlon(tm) Dual Core Processor 4850e processors (2 cpu cores) (version 2.20.00)
    powernow-k8: MP systems not supported by PSB BIOS structure
    powernow-k8: MP systems not supported by PSB BIOS structure

So the cpu keeps running at maximum speed without throttling. Searching for the error message finds Ubuntu Bug #33116: powernow-k8 refuses to load, and Ubuntu Bug #398109: powernow-k8: Your BIOS does not provide ACPI _PSS objects in a way that Linux understands, which suggests that I need to check the bios settings to enable "Cool'n'Quiet", enable ACPI APIC and disable MCP61 ACPI HPET Table. That's planned for the next hardware changes.
I noticed that the new Western Digital WD15EADS disk spun down way too fast. After some serious testing I found: when I set the "Advanced Power Management" level (using hdparm -B) to 127 or less, the "standby (spindown) timeout" (set using hdparm -S) is ignored and the drive spins down after about 58 seconds of inactivity. Way too soon when playing a movie: with mplayer the movie stalls about every 10 seconds because a new bit of movie has to be read from disk, which causes another start/stop. The smartctl start/stop counter goes up at the same rate. Feels like a firmware bug to me, or a difference of opinion between hdparm and the disk. But the hdparm report suggests that these settings should work on the disk:

    ATA device, with non-removable media
        Model Number: WDC WD15EADS-00S2B0
        Firmware Revision: 04.05G04
    Standby timer values: spec'd by Standard, with device specific minimum
        Advanced power management level: 126
        * Power Management feature set

I asked Western Digital customer help about this but the first (standard?) answer is from Support for WD products in LINUX or UNIX, which comes down to "we don't support anything else than jumper settings for these operating systems".

A lot of further searching with Google suggests to me that the 'IntelliPark' feature is causing the drive to park its heads after 8 seconds of inactivity, which is not a useful default when streaming video from it with a reasonable cache. And the 'Load Cycle Count' will go up fast, which may result in the drive reaching the 'suggested maximum' within a year. I don't need to test the warranty that fast.
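To put a rough number on the Load Cycle Count worry, here is a small back-of-the-envelope sketch. The rated limit of 300,000 cycles used in the example is an assumption for illustration, not a figure from this post or from Western Digital support; check the drive's datasheet for the real value. The function name and parameters are made up for this sketch.

```python
def days_until_load_cycle_limit(rated_cycles, seconds_between_parks,
                                active_hours_per_day=24.0):
    """Estimate how many days it takes to use up a load cycle rating.

    rated_cycles: the drive's rated load/unload cycles (assumed value).
    seconds_between_parks: how often the heads park while the disk is in use.
    active_hours_per_day: hours per day the disk sees this stop/start pattern.
    """
    cycles_per_day = active_hours_per_day * 3600 / seconds_between_parks
    return rated_cycles / cycles_per_day


# Streaming around the clock with a park every 10 seconds (hypothetical rating):
print(round(days_until_load_cycle_limit(300000, 10), 1))
# The same pattern for only two hours of movies per day:
print(round(days_until_load_cycle_limit(300000, 10, active_hours_per_day=2), 1))
```

Under those assumptions, a couple of hours of streaming per day already lands in the "about a year" range, and continuous use gets there in weeks.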
As a workaround I set the Advanced Power Management level back to 128 and installed spindown, a utility which watches the disk activity from userspace and issues a spindown command when no activity was measured over the configured period of time (activity is read from /proc/diskstats, so for linux at the device level). Now the disk spins down when the filesystems have been idle for 10 minutes, which is a lot more usable.
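The idea behind such a userspace watcher can be sketched in a few lines of Python. This is not the code of the spindown utility itself, just a minimal illustration of the technique: sample the per-device I/O counters from /proc/diskstats and call the disk idle once they have not changed for the configured time. The names read_io_counters and IdleTracker are made up for this sketch.

```python
def read_io_counters(diskstats_text, device):
    """Return (reads_completed, writes_completed) for a device, or None.

    Each /proc/diskstats line starts with: major minor name reads_completed,
    and the fifth statistic after the name is writes_completed.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 8 and fields[2] == device:
            return int(fields[3]), int(fields[7])
    return None


class IdleTracker:
    """Remembers when the counters last changed."""

    def __init__(self, idle_limit_seconds):
        self.idle_limit = idle_limit_seconds
        self.last_counters = None
        self.last_change = None

    def update(self, counters, now):
        """Record a sample; return True once the disk has been idle long enough."""
        if counters != self.last_counters:
            self.last_counters = counters
            self.last_change = now
            return False
        return now - self.last_change >= self.idle_limit
```

A real daemon would call update() every minute or so with the contents of /proc/diskstats and time.time(), and run something like hdparm -y on the device when it returns True.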
Update: Official answer from Western Digital customer help is that it's not possible to change this 8 second timeout. So I'll stick to the spindown solution.
The resulting power save from adding a new sata disk, moving the data and removing the old pata disks is not spectacular (yet): the 5 pata disks (all with activated automatic spindown) had the UPS at a 40% load, the current 2 sata and 2 pata disks (also with automatic spindown) have the UPS at a 42% load. It'll be interesting to see what happens when the 2 pata disks can be removed. The main original idea was to save a bit of power and make the system less complicated, let's see if that first part works out in the end.
Update: Found the cause of the not so great power saving: probably the recent bios update.
Filesystems have been moved to the new huge sata disk in home server greenblatt and I found time this evening to remove three old ones. There may be a race condition in the startup scripts where lvm2 is not completely up and running when the filesystems are mounted from the fstab but I saw that happen only once.
The new disk in the homeserver greenblatt was another case of a disk not wanting to go to sleep after the set period. Some searching found two answers. The first: spindown, a daemon to monitor disks for inactivity and spin them down with sg_start --stop or hdparm -y. But the other answer was a better answer: hdparm standby timeout not working for WD raptors? has as answer:

    * I also know of quite a number of drives where hdparm -B settings override the -S settings, even if you set the -S settings after the hdparm -B settings. You could try combinations with various values of hdparm -B, especially 1 and 255.

And the manpage of hdparm has this bit:

    -B  Set Advanced Power Management feature, if the drive supports it. A low value means aggressive power management and a high value means better performance. Possible settings range from values 1 through 127 (which permit spin-down), and values 128 through 254 (which do not permit spin-down). The highest degree of power management is attained with a setting of 1, and the highest I/O performance with a setting of 254. A value of 255 tells hdparm to disable Advanced Power Management altogether on the drive (not all drives support disabling it, but most do).

Default on the WD drives is indeed 128, which does not permit spindown on idle. I changed it to 127 to see if that helps. I prefer it if the drives decide for themselves when to spin down.
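The ranges from that manpage text are easy to get wrong by one, so here they are as a tiny Python sketch. It only encodes the wording of the hdparm manpage quoted here; what a given drive's firmware really does with a value can differ. The function name is made up for this illustration.

```python
def describe_apm_level(level):
    """Classify an hdparm -B value according to the hdparm manpage ranges."""
    if not 1 <= level <= 255:
        raise ValueError("APM level must be between 1 and 255")
    if level == 255:
        return "APM disabled"
    if level <= 127:
        return "spin-down permitted"
    return "spin-down not permitted"


print(describe_apm_level(128))  # the WD default mentioned in this post
print(describe_apm_level(127))  # the value it was changed to
```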
Update: Yes, the changed advanced power management setting helps, now the drive spins down when not in use.
The sensors at home are updated with data from the new disk. The cause of the relatively high temperatures is that 3 disks (2 pata and 1 new sata) are in one cage together. I hope to rearrange disks so the airflow improves and they cool better. I might need a bit longer sata cable to make that happen.
Work on home server greenblatt: time for fewer disks with more storage. So I bought two sata disks, one huge one to store the camera archive and scratch files, and one for the system and home directories. The choice for two disks is so the one with the camera archive and the scratch files can fall asleep when not in use, to save a bit of power. Installing both new disks at once wasn't going to happen due to space and cabling considerations, so I started with the big one. When that one is done I can remove three pata disks from the system. I also updated the system bios to the latest version, which made the system clock a lot more stable: ntpd now runs without having to use tickadj. Bios updates are easy these days: this bios can update itself from a USB stick. I chose logical volume management (lvm2) again for managing the big disks so it will be easy to expand storage when needed without getting a big tree of filesystem mounts.
The tapedrive-with-changer on the homeserver found itself in a wedged state, with at the bottom of the dmesg output:

    [105715.017656] ch 0:0:1:1: Attempting to queue a TARGET RESET message
    [105715.017658] CDB: 0x1b 0x20 0x0 0x0 0x2 0x0
    [105715.017663] ch 0:0:1:1: Command not found
    [105715.017664] aic7xxx_dev_reset returns 0x2002
    [105718.936191] target0:0:1: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 15)

And still no access to the scsi tape drive. But there is a bigger hammer nowadays named sg_reset which can fix this:

    # mt -f /dev/nst0 status
    /dev/nst0: Input/output error
    # sg_reset -b /dev/sg0
    sg_reset: starting bus reset
    sg_reset: completed bus reset
    # mt -f /dev/nst0 status
    SCSI 2 tape drive:
    File number=0, block number=0, partition=0.
    Tape block size 0 bytes. Density code 0x25 (DDS-3).
    Soft error count since last status=0
    General status bits on (41010000):
     BOT ONLINE IM_REP_EN

and it's back, not needing a reboot. The list of options says it all:

    Usage: sg_reset [-b] [-d] [-h] [-V] DEVICE
      where:
        -b  attempt a SCSI bus reset
        -d  attempt a SCSI device reset
        -h  attempt a host adapter reset
        -V  print version string then exit
      {if no switch given then check if reset underway}
    To reset use '-d' first, if that is unsuccessful, then use '-b', then '-h'
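The escalation advice in that usage text ('-d' first, then '-b', then '-h') could be wrapped in a small script. The sketch below is one possible way to do it in Python, not something I run in production: it assumes that mt exits with a non-zero status while the drive is wedged, and the function names and the default device paths are just the ones from the session above.

```python
import subprocess

# Order taken from the sg_reset usage text: device, then bus, then host adapter.
RESET_ORDER = ["-d", "-b", "-h"]


def tape_responds(tape_device="/dev/nst0"):
    """Assumption: 'mt status' returns exit code 0 only when the drive answers."""
    return subprocess.call(["mt", "-f", tape_device, "status"]) == 0


def reset_until_responsive(sg_device, is_responsive, run=subprocess.call):
    """Escalate through the sg_reset options until the device responds.

    Returns the flag that brought the device back, or "" if no reset was
    needed. Raises RuntimeError when even the host adapter reset did not help.
    """
    if is_responsive():
        return ""
    for flag in RESET_ORDER:
        run(["sg_reset", flag, sg_device])
        if is_responsive():
            return flag
    raise RuntimeError("device still not responding after all resets")
```

Usage would be something like reset_until_responsive("/dev/sg0", tape_responds), run as root. The run parameter exists only so the escalation logic can be tried out without touching real hardware.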
Yesterday evening I installed a 6-tape DDS-3 changer in the homeserver greenblatt and activated the latest ubuntu kernel updates. The tape changer works great but the mISDN drivers got confused because the 'loading drivers' stage at boot loads them without the right parameters which results in confused drivers (hardware not found) which I can't unload because that causes a kernel panic. Workaround: remove the mISDN drivers, reboot the system, reinstall the mISDN drivers and let /etc/init.d/mISDN load the drivers in the correct way.
Free ups test! It seems the power company decided not to deliver at all for 9 minutes. Interesting is that they don't mention a failure on their own website.
I like my home server usually boring and stable, but virus prevention should be at the bleeding edge, especially when it handles mail for multiple domains where other people can receive it. So I don't like messages in the clamav logfile:

    WARNING: Your ClamAV installation is OUTDATED!
    WARNING: Local version: 0.92.1 Recommended version: 0.95

Using the Ubuntu backports I was able to get a less outdated version of clamav running. I updated the home server greenblatt documentation with the exact details of just using the clamav backport and no other backports.
The Virtual Bookcase is back online too, and mail is flowing again for all the domains. Lots of typing, checking and everything to move the stuff to the home server. But, finished (I think).
Update: and all the web statistics are working again and updated. Finished?
And it is back! Idefix 4 broke in a major way: the power supply let out the magic smoke, and the hosting company called me to let me know the server was smelling funny and did not want to start up at all. Since the end of idefix 4 in a rack was near anyway, the decision was made to move the server home. There I used another power supply to get access to my data again. The old power supply was a 300 Watt unit, which seems to be way underrated for a dual Xeon system. My best guess is that the instability the system had came from the power supply anyway. So, time to move more domains home. Content from idefix.net is now here at home and virtualbookcase will be next when I find time. I had started migrating Camp Wireless so I finished that migration fast. Mail is diverted to a different place so I have a bit of time to configure all the mailing lists and other things.
Ok, the imap storage for asterisk voicemail works like the proverbial charm. I needed some work on the home dialplan and setup before I could test it, but I was able to leave a message in the home mailbox, see it stored in the voicemail imap box, and retrieve and delete it using a telephone connected to the ISDN port accessing the VoicemailMain application. The access number for voicemail is now set to 0140-1233 to (sort of) stay in line with the Dutch numbering plan. There is no customer-service at 0140-1200 planned...
Ok, got that bit fixed too: asterisk uses imap as storage backend for voicemail. In modules.conf:

    noload => app_voicemail_odbc.so
    noload => app_voicemail.so
    load => app_voicemail_imap.so

This is with the ubuntu package recompiled to use misdn, so the selection of voicemail storage is a question of which .so to load. In voicemail.conf:

    [general]
    imapserver=koos.idefix.net
    imapfolder=INBOX.calls

    [default]
    9911 => 19999,House mailbox,,,Tz=european|imapuser=housemail|imappassword=S3cr1t

Now voicemail is saved only on the imap-server, so I can view it with Thunderbird. Or use the asterisk voicemail application to retrieve and delete it. That bit is not tested yet.

After all the testing of drivers including heavy torture it's now time to set up a dialplan for the home pbx. Rule 1 of playing with the phones at home is that normal dialing still has to work so my wife can call the numbers without having to dial '0' for an outside line or other tricks, and that the phone in the living room rings when a call comes in. So I have to set up a 'number plan' which allows for special things but also makes all normal numbers work as they should. Solution: I use the 0140 area code, which is reserved (in the Netherlands) for test-numbers for the telecom provider. I am my own telecom provider so I can divert 0140 and do stuff with it, like provide voicemail or internal dialing.
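Coming back to the mailbox line in voicemail.conf: it packs a lot into one line, so here is a small Python sketch that takes it apart. It assumes the usual layout of such a line (mailbox => password,name,email,pager,options, with the options separated by '|'); the function name parse_mailbox_line is made up for this illustration and is not part of asterisk.

```python
def parse_mailbox_line(line):
    """Split a voicemail.conf mailbox definition into its parts."""
    mailbox, arrow, rest = line.partition("=>")
    if not arrow:
        raise ValueError("not a mailbox definition: %r" % line)
    # At most five comma-separated fields; the last one holds the options.
    fields = [field.strip() for field in rest.split(",", 4)]
    fields += [""] * (5 - len(fields))
    password, name, email, pager, raw_options = fields
    options = {}
    for item in raw_options.split("|"):
        key, equals, value = item.partition("=")
        if equals:
            options[key.strip()] = value.strip()
    return {
        "mailbox": mailbox.strip(),
        "password": password,
        "name": name,
        "email": email,
        "pager": pager,
        "options": options,
    }


entry = parse_mailbox_line(
    "9911 => 19999,House mailbox,,,Tz=european|imapuser=housemail|imappassword=S3cr1t"
)
print(entry["options"]["imapuser"])
```

The two empty fields are the e-mail and pager addresses, which are not needed here because the message goes straight into the imap folder.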
I took the plunge and migrated from the old homeserver gosper to the new homeserver greenblatt. The physical migration was several hours of de-installing and installing hardware in the big tower case. Most software came up as planned, some minor nits to fix after stuff started running. Most statistics were only fixed after I got things running again, but the assorted sensors at home are available again.