2021-09-15 Linux, serial devices that aren't modems and modemmanager
I always noticed that I had to plug in the USB cable for the remote radio with the radio switched off, otherwise the Kenwood TS480 would switch into transmit mode and stay there until I powered the radio off. Annoying, and I thought it was something in the serial initialization. Recently I was thinking about this and remembered something about query sequences on serial devices triggering weird behaviour in other devices. From what I read about the Kenwood serial protocol, it is quite possible for a few stray characters to change something in the radio. So I considered what Linux software could do a query as soon as a serial port is added to the system. Well, modemmanager was the ideal candidate for this:
Package: modemmanager
[..]
Description-en: D-Bus service for managing modems
 ModemManager is a DBus-activated daemon which controls mobile broadband (2G/3G/4G) devices and connections. Whether built-in devices, USB dongles, Bluetooth-paired telephones or professional RS232/USB devices with external power supplies, ModemManager is able to prepare and configure the modems and setup connections with them.
And indeed, simply removing modemmanager made the problem go away. I can now plug in the USB cable when the radio is on and nothing happens.
2021-09-11 Adding physical hardware temperatures in telegraf/influxdb/grafana
After starting the collection of a lot of the system data I wanted with telegraf/influxdb/grafana one small part was missing: the system temperature sensors. I like these, so I had a look and found the inputs.temp plugin in telegraf which is normally disabled. Enabling it on hosts that have actual hardware to measure worked ok. On the Raspberry Pi systems it gives one temperature:
> SHOW TAG VALUES ON "telegraf" WITH key="sensor" WHERE host='joy'
name: temp
key    value
---    -----
sensor cpu_thermal_input
On the home server conway it gives quite a lot of temperatures:
> SHOW TAG VALUES ON "telegraf" WITH key="sensor" WHERE host='conway'
name: temp
key    value
---    -----
sensor coretemp_core0_crit
sensor coretemp_core0_critalarm
sensor coretemp_core0_input
sensor coretemp_core0_max
sensor coretemp_core1_crit
sensor coretemp_core1_critalarm
sensor coretemp_core1_input
sensor coretemp_core1_max
sensor coretemp_core2_crit
sensor coretemp_core2_critalarm
sensor coretemp_core2_input
sensor coretemp_core2_max
sensor coretemp_core3_crit
sensor coretemp_core3_critalarm
sensor coretemp_core3_input
sensor coretemp_core3_max
sensor coretemp_core4_crit
sensor coretemp_core4_critalarm
sensor coretemp_core4_input
sensor coretemp_core4_max
sensor coretemp_core5_crit
sensor coretemp_core5_critalarm
sensor coretemp_core5_input
sensor coretemp_core5_max
sensor coretemp_physicalid0_crit
sensor coretemp_physicalid0_critalarm
sensor coretemp_physicalid0_input
sensor coretemp_physicalid0_max
For the dashboard showing all relevant temperatures for a system this is a bit overkill and makes the dashboard hard to read.
Solution: go for all the temperature sensors that end in 'input', with the variable in the dashboard defined as 'ending in input':
> SHOW TAG VALUES ON "telegraf" WITH key="sensor" WHERE host='conway' AND sensor=~/input$/
name: temp
key    value
---    -----
sensor coretemp_core0_input
sensor coretemp_core1_input
sensor coretemp_core2_input
sensor coretemp_core3_input
sensor coretemp_core4_input
sensor coretemp_core5_input
sensor coretemp_physicalid0_input
So far this works with all physical systems.
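The 'ending in input' match can also be tried outside InfluxDB. This is a small sketch, not part of the dashboard itself: it runs the same anchored pattern with grep over a few sensor names copied from the conway output. The variable names are only for illustration.

```shell
# A few sensor names copied from the conway output in this entry.
sensors='coretemp_core0_crit
coretemp_core0_critalarm
coretemp_core0_input
coretemp_core0_max
coretemp_physicalid0_input'

# Keep only the names that end in "input", like sensor=~/input$/ in the query.
selected=$(printf '%s\n' "$sensors" | grep -E 'input$')
printf '%s\n' "$selected"
```

The `$` anchor is what matters here: without it a sensor with 'input' elsewhere in its name would also be selected.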
2021-09-09 Collecting more system data with Telegraf for Influxdb/Grafana
I have been collecting certain system data for ages with rrdtool, but now I see what is possible with the Telegraf collecting agent and after some initial attempts I'm all in favour and data is flowing. All the data I collected is already standard in telegraf, including entropy! Other data is also collected that is good to keep an eye on for performance. I made some tweaks to the standard telegraf configuration: collect every 5 minutes, and not exactly on the clock, since I read 'The mystery of load average spikes', which reminded me of my own experience in 'Be very careful of what you measure'. I also avoid gathering data on nfs filesystems (which come and go thanks to autofs). I rolled out telegraf over all systems at home, and now there is a nice 'System info' dashboard in Grafana.
Grafana host dashboard with telegraf data including entropy. The dip in entropy is caused by the dnssec-signzone process
2021-09-01 Wildcard certificates and zerossl via acme protocol
I'm personally not a huge fan of wildcard TLS certificates (risks with reuse of the private key) so I didn't try those yet, but after my experiences with zerossl certificates with multiple names I got a response from Stephen Harris on Twitter: "Do they support wildcards" and I just had to try. I requested a certificate:
Requested Extensions:
    X509v3 Subject Alternative Name:
        DNS:gosper.idefix.net, DNS:*.gosper.idefix.net
And indeed it worked:
Issuer: C = AT, O = ZeroSSL, CN = ZeroSSL ECC Domain Secure Site CA
Validity
    Not Before: Sep 1 00:00:00 2021 GMT
    Not After : Nov 30 23:59:59 2021 GMT
Subject: CN = gosper.idefix.net
[..]
X509v3 Subject Alternative Name:
    DNS:gosper.idefix.net, DNS:*.gosper.idefix.net
So that works too! The choice for gosper.idefix.net is because I already had dns records set up for dns-01 based verification of that name.
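The entry doesn't show how the certificate request was made. As a hedged sketch, one way to create a request with the same two names is the -addext option of openssl req (OpenSSL 1.1.1 or newer). The EC key type, the temporary directory and the file names are assumptions for illustration only.

```shell
# Work in a throwaway directory; all file names here are hypothetical.
dir=$(mktemp -d)

# Generate an EC private key (assumption: the key type is not in the entry).
openssl ecparam -name prime256v1 -genkey -noout -out "$dir/key.pem"

# Create a CSR with the plain name and the wildcard in subjectAltName.
openssl req -new -key "$dir/key.pem" \
    -subj "/CN=gosper.idefix.net" \
    -addext "subjectAltName=DNS:gosper.idefix.net,DNS:*.gosper.idefix.net" \
    -out "$dir/request.csr"

# Show the requested names, as in the 'Requested Extensions' output.
san=$(openssl req -in "$dir/request.csr" -noout -text | grep 'DNS:')
printf '%s\n' "$san"
```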
2021-08-30 Going all the way with zerossl: requesting a certificate with multiple names
I assumed the free tier of zerossl doesn't allow for certificates with multiple names but I guess I assumed wrong, because I just got issued a certificate with multiple names. After debugging my earlier issues with zerossl and finding out I forgot the CAA record, this time I tried a certificate with the subjectAltName extension in use with more than one name.
$ openssl req -in httprenewable/webserver-devvirtualbookcase.csr -noout -text
[..]
Attributes:
Requested Extensions:
    X509v3 Subject Alternative Name:
        DNS:developer.virtualbookcase.com, DNS:perl.virtualbookcase.com
And the certificate dance went fine with dehydrated:
$ ./dehydrated/dehydrated --config /etc/dehydrated/config.zerossl -s httprenewable/webserver-devvirtualbookcase.csr > tmp/certificate.crt
 + Requesting new certificate order from CA...
 + Received 2 authorizations URLs from the CA
 + Handling authorization for developer.virtualbookcase.com
 + Handling authorization for perl.virtualbookcase.com
 + 2 pending challenge(s)
 + Deploying challenge tokens...
 + Responding to challenge for developer.virtualbookcase.com authorization...
 + Challenge is valid!
 + Responding to challenge for perl.virtualbookcase.com authorization...
 + Challenge is valid!
 + Cleaning challenge tokens...
 + Requesting certificate...
 + Order is processing...
 + Checking certificate...
 + Done!
$ openssl x509 -in tmp/certificate.crt -noout -text | less
[..]
X509v3 Subject Alternative Name:
    DNS:developer.virtualbookcase.com, DNS:perl.virtualbookcase.com
The /etc/dehydrated/config.zerossl has the EAB_KID and EAB_HMAC_KEY values set to the ones associated with my account. This means zerossl works as a complete secondary certificate issuer and I could switch over completely in case LetsEncrypt isn't available. Choice is good!
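For reference, a minimal sketch of what such a config file could look like. dehydrated reads its config as shell variable assignments. The CA preset line is an assumption (the command in this entry has no --ca option, so the CA is presumably selected in the config), and both EAB values are placeholders, not real credentials.

```shell
# Sketch of /etc/dehydrated/config.zerossl; all values are placeholders.

# Assumption: select the zerossl preset here instead of passing --ca.
CA="zerossl"

# External Account Binding credentials from the zerossl account page.
EAB_KID="replace-with-eab-kid"
EAB_HMAC_KEY="replace-with-eab-hmac-key"
```

Keeping this in a separate file next to the normal config makes switching issuers a matter of the --config option only.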
2021-08-19 Trying zerossl as backup certificate provider
Based on the recent article Here's another free CA as an alternative to Let's Encrypt! I decided to check my options for having an alternative to LetsEncrypt. Not because I have or had any problems with LetsEncrypt, but I like having a backup option. So I started with zerossl as an option. So far I did the whole registration and certificate request dance purely with the dehydrated client, but that gives an error on a certificate request:
 + Requesting new certificate order from CA...
 + Received 2 authorizations URLs from the CA
 + Handling authorization for developer.virtualbookcase.com
 + Handling authorization for perl.virtualbookcase.com
 + 2 pending challenge(s)
 + Deploying challenge tokens...
 + Responding to challenge for developer.virtualbookcase.com authorization...
 + Challenge is valid!
 + Responding to challenge for perl.virtualbookcase.com authorization...
 + Challenge is valid!
 + Cleaning challenge tokens...
 + Requesting certificate...
 + Order is processing...
ERROR: Order in status invalid
Creating a zerossl account with a web browser and setting the EAB_KID and EAB_HMAC_KEY to the values from my zerossl account also doesn't help, that also ends with
$ ./dehydrated/dehydrated --ca zerossl --config /etc/dehydrated/config.zerossl -s httprenewable/webserver-devvirtualbookcase.csr > tmp/certificate.crt
 + Requesting new certificate order from CA...
 + Received 2 authorizations URLs from the CA
 + Handling authorization for developer.virtualbookcase.com
 + Handling authorization for perl.virtualbookcase.com
 + 2 pending challenge(s)
 + Deploying challenge tokens...
 + Responding to challenge for developer.virtualbookcase.com authorization...
 + Challenge is valid!
 + Responding to challenge for perl.virtualbookcase.com authorization...
 + Challenge is valid!
 + Cleaning challenge tokens...
 + Requesting certificate...
 + Order is processing...
ERROR: Order in status invalid
I realized a certificate for multiple names isn't supported by the free tier of zerossl. Removing one of the names from the certificate still made it end up in status 'invalid'. Also re-creating the account in dehydrated after creating the zerossl account and setting the EAB_KID and EAB_HMAC_KEY variables correctly didn't solve things yet. The same request works fine with LetsEncrypt so the issue is something with dehydrated / zerossl.
Update: Sharing my woes gave a suggestion from Stephen Harris on Twitter: "@khoos You have a CAA record for virtualbookcase.com that might be blocking it." And Stephen is absolutely right: I set up CAA records ages ago for all my domains. And the zerossl CAA document I can find absolutely agrees I need to add a CAA record allowing certificates by sectigo.com.
Update: And after waiting for DNS propagation and trying again I now have a zerossl.com certificate:
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 4e:7b:c8:e9:ad:fd:14:ad:5c:ae:a2:57:fe:45:d9:41
        Signature Algorithm: ecdsa-with-SHA384
        Issuer: C = AT, O = ZeroSSL, CN = ZeroSSL ECC Domain Secure Site CA
        Validity
            Not Before: Aug 19 00:00:00 2021 GMT
            Not After : Nov 17 23:59:59 2021 GMT
        Subject: CN = perl.virtualbookcase.com
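A quick way to catch this before requesting a certificate is to look at the CAA records first. The sketch below assumes records in the format printed by `dig +short CAA <domain>`. The function name caa_allows and the sample records are made up for illustration; they are not the actual records of virtualbookcase.com.

```shell
# Succeeds when the CAA records on stdin contain an 'issue' entry
# for the given CA identifier. Fixed-string match, no regex.
caa_allows() {
    grep -Fq "issue \"$1\""
}

# Illustrative records only, in 'dig +short CAA' format.
sample='0 issue "letsencrypt.org"
0 issue "sectigo.com"'

if printf '%s\n' "$sample" | caa_allows sectigo.com; then
    result=allowed
else
    result=blocked
fi
echo "sectigo.com: $result"
```

On a live system the sample would be replaced by the output of `dig +short CAA` for the domain. A domain without any CAA records does not restrict issuance at all, so an empty answer is not a problem by itself.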
2021-07-27 Less logging in zigbee2mqtt to save the MicroSD in the Raspberry Pi
The recent MicroSD failure in the Raspberry Pi made me look at the logging in zigbee2mqtt, as it has been running for a long time and the default logging includes every received message, which would give a lot of wear on the MicroSD in the Raspberry Pi. So I changed the configuration to only log to console. This is something that can't be changed via an mqtt message, which is logical (otherwise it would have security implications). I may also look at less system logging to the MicroSD. Someone suggested having a look at log2ram for this. This creates a ramdisk for logging which is synchronized to persistent storage every day or on shutdown.
2021-07-26 MicroSD failure in a Raspberry Pi
The Raspberry Pi in the attic running mainly dump1090 and some other software wasn't showing up in the system monitoring. On checking it turned out the MicroSD card was failing. This is a known issue in the Raspberry Pi which uses MicroSD as root filesystem. As far as I can tell this card has been running continuously since February 2016, so over five years. I do have a different MicroSD card which is the old card from the Raspberry Pi in the utility closet which became available after I used a different card to reinstall the Raspberry Pi for smart meter monitoring. But that card has seen some wear since it has been running since installing the smart meter and starting energy monitoring on it in August 2016, so maybe it's not a good idea to rescue a system with it. It's also five years old. Time to order some new MicroSD cards! In the meantime I noticed I could get a bit of access to the broken card, but things stopped on mounting the linux root filesystem. It turned out that mount tries to write to the card to update the ext4 journal and the card stops completely on a write. When I mount it really readonly with
# mount -o ro,noload /dev/sdb2 /mnt/scratch/
almost all files are readable, so I recovered the dump1090 software and other configuration items. Yes, I need to add the Raspberry Pi systems to the backups. It would be really nice if I could monitor the health of the MicroSD card like I monitor other disks (including SSD) with smartmontools.
Update: Three MicroSD cards ordered so I can replace this one and have a few spares ready. The size of those cards does mean I now have to make small bags labeled 'spare' or 'old card from system X' so I can see what they are without trying to mount them.
2021-07-12 Checking the rcu_sched messages finds repeated mention of cdrom scans
I was going through some rcu_sched messages and noticed kernel routines related to the cdrom drive showed up a few times in the tasks that were 'behind'.
[335894.319961] [<ffffffffc03d864a>] ? scsi_execute+0x12a/0x1d0 [scsi_mod]
[335894.320702] [<ffffffffc03da586>] ? scsi_execute_req_flags+0x96/0x100 [scsi_mod]
[335894.321820] [<ffffffffc04a7703>] ? sr_check_events+0xc3/0x2c0 [sr_mod]
[335894.322551] [<ffffffffb58224a5>] ? __switch_to_asm+0x35/0x70
[335894.323256] [<ffffffffb58224b1>] ? __switch_to_asm+0x41/0x70
[335894.323906] [<ffffffffc047d05a>] ? cdrom_check_events+0x1a/0x30 [cdrom]
[335894.324545] [<ffffffffc04a8289>] ? sr_block_check_events+0x89/0xe0 [sr_mod]
[335894.325186] [<ffffffffb551a9a9>] ? disk_check_events+0x69/0x150
Because the virtual machines don't do anything with the virtual cdrom after the first installation, I'm removing it from all virtual machines to see what that does for these messages.
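Spotting these frames by eye gets tedious with longer traces. As a small sketch, grep can count the stack frames that come from the cdrom-related kernel modules. The sample is a subset of the trace in this entry; the variable names are only for illustration.

```shell
# A few frames copied from the trace in this entry.
trace='[335894.319961] [<ffffffffc03d864a>] ? scsi_execute+0x12a/0x1d0 [scsi_mod]
[335894.321820] [<ffffffffc04a7703>] ? sr_check_events+0xc3/0x2c0 [sr_mod]
[335894.322551] [<ffffffffb58224a5>] ? __switch_to_asm+0x35/0x70
[335894.323906] [<ffffffffc047d05a>] ? cdrom_check_events+0x1a/0x30 [cdrom]'

# Count the frames whose module tag at the end of the line is sr_mod or cdrom.
cdrom_frames=$(printf '%s\n' "$trace" | grep -cE '\[(sr_mod|cdrom)\]$')
echo "cdrom related frames: $cdrom_frames"
```

On a running system the same grep can be fed from the kernel log instead of a sample string.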
2021-07-08 Another panic in a virtual machine
At the end of this morning I noticed the root filesystem of the shell server on the homeserver had turned itself read-only. Another DRIVER_TIMEOUT error in the kernel messages. And I didn't want to get to a situation with half of the filesystem in lost+found like the previous time. This time I decided to use a different approach in the hope of getting back to a working system faster, and it worked this time:
- echo s > /proc/sysrq-trigger to force a sync
- echo u > /proc/sysrq-trigger to force a read-only remount of all filesystems
- I killed the virtual machine with virsh destroy (the virtualization equivalent of pulling the plug)
- I created a snapshot of the virtual machine disk to have a filesystem state to return to in case of problems in the next steps
- I booted the virtual machine and it did indeed have filesystem issues
- So I rebooted in maintenance mode and did a filesystem check
- After that it booted fine and the filesystem was fine, nothing in lost+found
After things ran ok for a while I removed the snapshot. I also changed the configuration to use virtio disks and not IDE emulation. Disks with IDE emulation have a timeout (DRIVER_TIMEOUT) after which the I/O request is given up. The fact that (emulated) I/O hangs for 30 seconds is bad, but maybe related to the rcu_sched messages. Maybe time for some more updates.
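The recovery steps from this entry can be written down as a checklist script. This is a dry-run sketch that only prints the commands and runs nothing. The domain name and root device are hypothetical placeholders, and the snapshot step is left open because the entry doesn't say which storage backend is in use.

```shell
#!/bin/sh
# Dry run: prints the commands, nothing is executed.
# VM and ROOTDEV are hypothetical placeholders, not from the entry.
VM="shellserver"
ROOTDEV="/dev/vda1"

recovery_plan() {
    # Inside the guest: SysRq 's' syncs dirty buffers to disk.
    echo "guest# echo s > /proc/sysrq-trigger"
    # Inside the guest: SysRq 'u' remounts all filesystems read-only.
    echo "guest# echo u > /proc/sysrq-trigger"
    # On the host: hard stop, like pulling the plug.
    echo "host#  virsh destroy $VM"
    # On the host: snapshot the disk; the tool depends on the storage backend.
    echo "host#  (snapshot the disk of $VM)"
    # On the host: boot again, then check the filesystem in maintenance mode.
    echo "host#  virsh start $VM"
    echo "guest# fsck -f $ROOTDEV"
}

plan=$(recovery_plan)
printf '%s\n' "$plan"
```

Printing instead of executing is deliberate: the sysrq triggers act immediately on the machine they are run on, so these are best typed by hand when needed.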