For a long time I noticed that I had to plug in the USB cable for the remote radio with
the radio switched off; otherwise the Kenwood TS480 would switch into transmit
mode and stay there until I powered the radio off.
Annoying, and I assumed it was something in the serial port initialization.
Recently I was thinking about this again and remembered something about query
sequences on serial devices triggering weird behaviour in other devices.
From what I read about the Kenwood serial protocol it is quite possible for a few
stray characters to change something in the radio: the CAT commands are short ASCII
sequences, and for example TX; switches the radio to transmit.
So I considered which Linux software could send a query as soon as a serial
port is added to the system. ModemManager turned out to be the ideal candidate
for this:
Package: modemmanager
[..]
Description-en: D-Bus service for managing modems
ModemManager is a DBus-activated daemon which controls mobile broadband
(2G/3G/4G) devices and connections. Whether built-in devices, USB dongles,
Bluetooth-paired telephones or professional RS232/USB devices with external
power supplies, ModemManager is able to prepare and configure the modems and
setup connections with them.
And indeed, simply removing modemmanager made the problem go away. I can
now plug in the USB cable when the radio is on and nothing happens.
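Since nothing here uses mobile broadband, removing the package was the simple fix.
If ModemManager were still needed for other devices, a udev rule telling it to skip
the radio's USB-serial adapter should also work; a sketch, where the vendor/product
IDs are placeholders that would have to match the actual adapter:
# the simple fix: remove the package
apt purge modemmanager
# alternative sketch: /etc/udev/rules.d/99-ts480-noprobe.rules
# tell ModemManager to leave this USB-serial adapter alone
SUBSYSTEM=="tty", ATTRS{idVendor}=="0403", ATTRS{idProduct}=="6001", ENV{ID_MM_DEVICE_IGNORE}="1"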
After starting to collect a lot of the system data I wanted with telegraf/influxdb/grafana
one small part was still missing: the system temperature sensors. I like having those, so I
had a look and found the inputs.temp plugin in telegraf, which is
disabled by default.
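Turning it on is just a matter of adding the plugin section to the telegraf
configuration; a minimal sketch (the drop-in file name is an assumption):
# /etc/telegraf/telegraf.d/temp.conf
[[inputs.temp]]
  # no further options needed; reads the hwmon sensors the kernel exposes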
Enabling it on hosts that have actual hardware to measure worked ok. On the
Raspberry Pi systems it gives one temperature:
> SHOW TAG VALUES ON "telegraf" WITH key="sensor" WHERE host='joy'
name: temp
key value
--- -----
sensor cpu_thermal_input
On hosts with more sensors the plugin reports a lot of values per sensor, such as
the critical and maximum temperatures, which is a bit much for a dashboard showing
all relevant temperatures for a system and makes it hard to read. The solution: only
use the temperature sensors whose name ends in 'input', with the variable in the
dashboard defined as 'ending in input':
> SHOW TAG VALUES ON "telegraf" WITH key="sensor" WHERE host='conway' AND sensor=~/input$/
name: temp
key value
--- -----
sensor coretemp_core0_input
sensor coretemp_core1_input
sensor coretemp_core2_input
sensor coretemp_core3_input
sensor coretemp_core4_input
sensor coretemp_core5_input
sensor coretemp_physicalid0_input
[Screenshot: Grafana host dashboard with telegraf data, including entropy. The dip in
entropy is caused by the dnssec-signzone process.]
I have been collecting certain system data for ages with rrdtool, but now I have
seen what is possible with the Telegraf collecting agent,
and after some initial attempts I'm all in favour: data is flowing.
All the data I collected before is already standard in telegraf, including entropy!
Other data that is good to keep an eye on for performance is also collected.
I made some tweaks to the standard telegraf configuration: collect every
5 minutes, and not exactly on the clock, since I read The mystery of load average spikes,
which reminded me of my own experience in Be very careful of what you measure.
I also avoid gathering data on NFS filesystems (which come and go thanks to
autofs).
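A sketch of what those tweaks look like in the telegraf configuration (not my
literal config; the jitter value is just an example):
[agent]
  interval = "300s"            # collect every 5 minutes
  round_interval = false       # don't align collection exactly on the clock
  collection_jitter = "30s"    # spread collection out a bit further

[[inputs.disk]]
  # skip NFS mounts that come and go via autofs
  ignore_fs = ["nfs", "nfs4", "autofs", "tmpfs", "devtmpfs"]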
I rolled out telegraf over all systems at home, and now there is a nice
'System info' dashboard in Grafana.
The next test was a certificate with a wildcard name. The certificate signing request contains:
Requested Extensions:
X509v3 Subject Alternative Name:
DNS:gosper.idefix.net, DNS:*.gosper.idefix.net
And indeed it worked:
Issuer: C = AT, O = ZeroSSL, CN = ZeroSSL ECC Domain Secure Site CA
Validity
Not Before: Sep 1 00:00:00 2021 GMT
Not After : Nov 30 23:59:59 2021 GMT
Subject: CN = gosper.idefix.net
[..]
X509v3 Subject Alternative Name:
DNS:gosper.idefix.net, DNS:*.gosper.idefix.net
So that works too! The choice of gosper.idefix.net was because I
already had DNS records set up for dns-01 based verification of that name.
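For reference, a CSR with such a subjectAltName can be created directly on the
command line; a sketch, assuming OpenSSL 1.1.1 or newer, with placeholder key and
output file names:
openssl req -new -key gosper.key \
  -subj "/CN=gosper.idefix.net" \
  -addext "subjectAltName = DNS:gosper.idefix.net, DNS:*.gosper.idefix.net" \
  -out gosper.csr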
I assumed the free tier of zerossl doesn't allow for certificates with multiple
names, but it turns out I assumed wrong: I just got issued a certificate with
multiple names.
After debugging my earlier issues with zerossl and finding out I had forgotten the
CAA record, this time I tried a certificate with the subjectAltName extension
in use with more than one name.
And the certificate dance went fine with dehydrated:
$ ./dehydrated/dehydrated --config /etc/dehydrated/config.zerossl -s httprenewable/webserver-devvirtualbookcase.csr > tmp/certificate.crt
+ Requesting new certificate order from CA...
+ Received 2 authorizations URLs from the CA
+ Handling authorization for developer.virtualbookcase.com
+ Handling authorization for perl.virtualbookcase.com
+ 2 pending challenge(s)
+ Deploying challenge tokens...
+ Responding to challenge for developer.virtualbookcase.com authorization...
+ Challenge is valid!
+ Responding to challenge for perl.virtualbookcase.com authorization...
+ Challenge is valid!
+ Cleaning challenge tokens...
+ Requesting certificate...
+ Order is processing...
+ Checking certificate...
+ Done!
$ openssl x509 -in tmp/certificate.crt -noout -text | less
[..]
X509v3 Subject Alternative Name:
DNS:developer.virtualbookcase.com, DNS:perl.virtualbookcase.com
The /etc/dehydrated/config.zerossl has the EAB_KID and
EAB_HMAC_KEY values set to the ones associated with my account.
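For completeness, the relevant part of that configuration looks roughly like this
(a sketch; the actual values come from the zerossl dashboard and are not shown):
# /etc/dehydrated/config.zerossl (sketch)
CA="zerossl"                 # named CA alias in recent dehydrated versions
CHALLENGETYPE="http-01"
EAB_KID="<key id from the zerossl account>"
EAB_HMAC_KEY="<HMAC key from the zerossl account>"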
This means zerossl works as a full secondary certificate issuer and I could
switch over completely in case LetsEncrypt isn't available. Choice is good!
Based on the recent article Here's another free CA as an alternative to Let's Encrypt!
I decided to check my options for having an alternative to LetsEncrypt.
Not because I have or had any problems with LetsEncrypt, but I like having
a backup option. So I started with zerossl as an option.
So far I did the whole registration and certificate request dance purely with
the dehydrated client, but that gives an error on a certificate
request:
+ Requesting new certificate order from CA...
+ Received 2 authorizations URLs from the CA
+ Handling authorization for developer.virtualbookcase.com
+ Handling authorization for perl.virtualbookcase.com
+ 2 pending challenge(s)
+ Deploying challenge tokens...
+ Responding to challenge for developer.virtualbookcase.com authorization...
+ Challenge is valid!
+ Responding to challenge for perl.virtualbookcase.com authorization...
+ Challenge is valid!
+ Cleaning challenge tokens...
+ Requesting certificate...
+ Order is processing...
ERROR: Order in status invalid
Creating a zerossl account with a web browser and setting the EAB_KID
and EAB_HMAC_KEY to the values from my zerossl account also doesn't
help; that attempt also ends with:
$ ./dehydrated/dehydrated --ca zerossl --config /etc/dehydrated/config.zerossl -s httprenewable/webserver-devvirtualbookcase.csr > tmp/certificate.crt
+ Requesting new certificate order from CA...
+ Received 2 authorizations URLs from the CA
+ Handling authorization for developer.virtualbookcase.com
+ Handling authorization for perl.virtualbookcase.com
+ 2 pending challenge(s)
+ Deploying challenge tokens...
+ Responding to challenge for developer.virtualbookcase.com authorization...
+ Challenge is valid!
+ Responding to challenge for perl.virtualbookcase.com authorization...
+ Challenge is valid!
+ Cleaning challenge tokens...
+ Requesting certificate...
+ Order is processing...
ERROR: Order in status invalid
My first guess was that a certificate for multiple names isn't supported by the
free tier of zerossl, but removing one of the names from the request still made
it end up in status 'invalid'.
Re-creating the account in dehydrated after creating the zerossl
account and setting the EAB_KID and EAB_HMAC_KEY variables
correctly didn't solve things either. The same request works fine with LetsEncrypt,
so the issue is somewhere in the combination of dehydrated and zerossl.
Update:
Sharing my woes gave a suggestion, from Stephen Harris on Twitter: "@khoos You have a CAA record for virtualbookcase.com that might be blocking it."
And Stephen is absolutely right: I set up CAA records ages ago for all my
domains. The zerossl CAA documentation I can find
absolutely agrees that I need to add a CAA record allowing certificates issued by
sectigo.com.
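In zone file syntax the result looks something like this (a sketch; the
letsencrypt.org record was already in place):
virtualbookcase.com.    IN    CAA    0 issue "letsencrypt.org"
virtualbookcase.com.    IN    CAA    0 issue "sectigo.com"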
Update:
And after waiting for DNS propagation and trying again I now have a zerossl.com
certificate:
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
4e:7b:c8:e9:ad:fd:14:ad:5c:ae:a2:57:fe:45:d9:41
Signature Algorithm: ecdsa-with-SHA384
Issuer: C = AT, O = ZeroSSL, CN = ZeroSSL ECC Domain Secure Site CA
Validity
Not Before: Aug 19 00:00:00 2021 GMT
Not After : Nov 17 23:59:59 2021 GMT
Subject: CN = perl.virtualbookcase.com
The recent MicroSD failure in the Raspberry Pi
made me look at the logging in zigbee2mqtt, as it has been running for a long time
and the default logging includes every received message, which would cause a lot
of wear on the MicroSD card in the Raspberry Pi. So I changed the configuration
to only log to the console.
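The change is in the zigbee2mqtt configuration.yaml; a sketch of the relevant part:
# configuration.yaml (sketch of the logging section)
advanced:
  log_output:
    - console        # only log to the console, no log files on the MicroSD
  log_level: info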
The logging configuration is something that can't be changed via an MQTT message,
which is logical (otherwise it would have security implications).
I may also look at reducing system logging to the MicroSD. Someone suggested
having a look at log2ram
for this: it creates a ramdisk for logging which is synchronized to
persistent storage every day and on shutdown.
The Raspberry Pi in the attic, running mainly dump1090 and some other software,
wasn't showing up in the system monitoring. On checking, it turned out the
MicroSD card was failing. This is a known issue with the Raspberry Pi, which
uses a MicroSD card as the root filesystem. As far as I can tell this card has been
running continuously since February 2016, so over five years.
I do have a different MicroSD card: the old card from the Raspberry
Pi in the utility closet, which became available after I used a different
card to reinstall the Raspberry Pi for smart meter monitoring.
But that card has seen its share of wear too, since it has been running since
installing the smart meter and starting energy monitoring on it in August 2016,
so it's also five years old and maybe not a good idea to rescue a system with.
Time to order some new MicroSD cards! In the meantime I noticed I could get a
bit of access to the broken card, but things stopped when mounting the Linux root
filesystem. It turned out that mount tries to write to the card to replay the
ext4 journal, and the card stops completely on a write. When I mount it really
read-only with
# mount -o ro,noload /dev/sdb2 /mnt/scratch/
(the noload option skips loading the ext4 journal, so nothing is written)
almost all files are readable, and I recovered the dump1090 software and other
configuration items. Yes, I need to add the Raspberry Pi systems to the
backups.
It would be really nice if I could monitor the health of the MicroSD card like
I monitor other disks (including SSD) with smartmontools.
Update:
Three MicroSD cards ordered so I can replace this one and have a few spares
ready. The size of those cards does mean I now have to make small bags
labeled 'spare' or 'old card from system X' so I can see what they are
without trying to mount them.
I was going through some rcu_sched messages and noticed that kernel
routines related to the cdrom drive showed up a few times in the tasks that
were 'behind'.
Because the virtual machines don't do anything with the virtual
cdrom after the first installation, I'm removing it from all virtual machines
to see what that does for these messages.
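The removal can be done with virsh; a sketch, with a placeholder guest name and
assuming the cdrom shows up as IDE target hdc in the domain definition:
# find the target device of the virtual cdrom
virsh domblklist guestname
# remove it from the persistent configuration
virsh detach-disk guestname hdc --config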
At the end of this morning I noticed the root filesystem of the shell server
on the homeserver had turned itself read-only: another DRIVER_TIMEOUT
error in the kernel messages. And I didn't want to end up in a situation
with half of the filesystem in lost+found
like the previous time.
This time I decided to use a different approach, in the hope of getting back
to a working system faster, and it worked:
- echo s > /proc/sysrq-trigger to force a sync
- echo u > /proc/sysrq-trigger for an emergency remount of all filesystems read-only
- I killed the virtual machine with virsh destroy (the virtualization equivalent of pulling the plug)
- I created a snapshot of the virtual machine disk to have a filesystem state to
  return to in case of problems in the next steps (see the sketch after this list)
- I booted the virtual machine and it did indeed have filesystem issues
- So I rebooted into maintenance mode and did a filesystem check
- After that it booted fine and the filesystem was fine, nothing in lost+found
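The snapshot step as a sketch: this assumes the guest disk is a qcow2 image at a
made-up path; with the guest destroyed the image is not in use, so qemu-img can
work on it directly.
# safety snapshot before the filesystem check
qemu-img snapshot -c before-fsck /var/lib/libvirt/images/shellserver.qcow2
# and remove it again once everything has been stable for a while
qemu-img snapshot -d before-fsck /var/lib/libvirt/images/shellserver.qcow2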
After things ran fine for a while I removed the snapshot. I also changed the
configuration to use virtio disks instead of IDE emulation. IDE-emulated disks
have a timeout (DRIVER_TIMEOUT) after which the request is given up on. The fact that
(emulated) I/O hangs for 30 seconds is bad in itself, but it may be related to the
rcu_sched messages. Maybe time for some more updates.
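The disk change itself is done in the domain XML (via virsh edit); a sketch of the
before and after, with a made-up image path. The guest will then typically see the
disk as /dev/vda rather than /dev/sda, so fstab and the bootloader configuration
need to reference filesystems by UUID or label for a painless switch.
<!-- before: emulated IDE disk -->
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/shellserver.qcow2'/>
  <target dev='hda' bus='ide'/>
</disk>
<!-- after: virtio disk -->
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/shellserver.qcow2'/>
  <target dev='vda' bus='virtio'/>
</disk>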