The DSM had been rebooting intermittently, so I visited the site around 4:30 pm MT on 8/19.
While there, wondering if there was a way to unlock and open the battery box, I heard the TRH fan drop off and then start up again immediately, so the power outage was brief. I also noticed the ttstation Ubiquiti wifi was up, which only happens for the first 15 minutes after boot. I connected to the Ubiquiti and confirmed an uptime of 10 minutes.
Wiggling the plug in the GFI outlet in the weatherproof box did nothing.
I verified PoE passthrough is off on the radio, so I was able to connect to the DSM through the radio lan2 port. Just to remove the switch as a possible failure point, I disconnected the power from the switch and connected the Ubiquiti directly to the Pi.
I had to restart the json data service after the system date updated, and then the dashboard looked good.
systemctl --user restart json_data_stats
The power outages could be caused by a problem in the Victron, so I looked for messages from the Victron in the raw data which might indicate an error.
daq@tt:~ $ lsu
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        58G  1.8G   53G   4% /media/usbdisk
-r--r--r-- 1 daq eol  117670 Aug 19 19:25 tt_20100201_000129.dat
-r--r--r-- 1 daq eol  189288 Aug 19 19:40 tt_20100201_000130.dat
-r--r--r-- 1 daq eol  124325 Aug 19 19:52 tt_20100201_000131.dat
-r--r--r-- 1 daq eol  119383 Aug 19 20:04 tt_20100201_000132.dat
-r--r--r-- 1 daq eol  118837 Feb  1  2010 tt_20100201_000133.dat
-r--r--r-- 1 daq eol  121911 Feb  1  2010 tt_20100201_000134.dat
-r--r--r-- 1 daq eol  119202 Aug 19 22:28 tt_20100201_000135.dat
-r--r--r-- 1 daq eol  116753 Aug 19 22:40 tt_20100201_000136.dat
-r--r--r-- 1 daq eol   49322 Jun 24 23:01 tt_20100201_000851.dat
-r--r--r-- 1 daq eol 8806338 Aug 19 23:04 tt_20210819_224007.dat
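The modification times in the listing hint at how often the DSM was rebooting. Here is a minimal sketch, assuming each tt_20100201_* file was closed shortly after a boot (so its modification time roughly marks that boot) and that the times are UTC on 8/19. The two files still dated Feb 1 2010 are left out because their times were never corrected. The name gaps_minutes is only for illustration.

```python
from datetime import datetime

# Modification times (assumed UTC, 2021-08-19) of the tt_20100201_* files
# in the listing that carry a real timestamp.
mtimes = ["19:25", "19:40", "19:52", "20:04", "22:28", "22:40"]

def gaps_minutes(times):
    """Minutes between consecutive HH:MM times on the same day."""
    parsed = [datetime.strptime(t, "%H:%M") for t in times]
    return [int((b - a).total_seconds() // 60) for a, b in zip(parsed, parsed[1:])]

gaps = gaps_minutes(mtimes)
print(gaps)  # [15, 12, 12, 144, 12]
```

The 144 minute gap spans the two files with uncorrected dates, and the listing may not show every boot, so only the 12 to 15 minute gaps say anything about the spacing between back-to-back reboots.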
According to chrony/tracking.log, the reboot after 22:38 UTC (16:38 MT) took 1 minute 12 seconds to sync to PPS. I think that means the ublox is keeping leap seconds, and I'd bet PPS starts sooner than that, so maybe we need to change some chrony settings to sync to it faster. Or maybe that's how long it takes chrony to start up, in which case a battery-backed system clock really makes sense.
2021-08-19 22:38:32  PPS         1   6.860     0.009     1.660e-07   N  1   5.438e-08  -2.242e-08  0.000e+00   9.651e-06   2.594e-05
Date (UTC) Time      IP Address  St  Freq ppm  Skew ppm  Offset      L  Co  Offset sd  Rem. corr.  Root delay  Root disp.  Max. error
2010-02-01 00:00:08  0.0.0.0     0   7.225     0.008     0.000e+00   ?  0   0.000e+00  -2.891e-16  1.000e+00   1.000e+00   1.500e+00
2010-02-01 00:01:12  PPS         1   7.225     0.008     -3.644e+08  N  1   1.209e-06  -2.989e-10  0.000e+00   4.525e+02   1.500e+00
2021-08-19 22:40:23  PPS         1   7.225     0.010     -9.001e-07  N  1   1.396e-06  -7.381e-11  0.000e+00   9.621e-06   3.644e+08
2021-08-19 22:40:39  PPS         1   7.224     0.017     -9.389e-07  N  1   1.322e-06  5.131e-07   0.000e+00   1.112e-05   2.937e-05
2021-08-19 22:40:55  PPS         1   7.223     0.030     5.480e-07   N  1   1.656e-06  3.884e-07   0.000e+00   1.168e-05   3.100e-05
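As a cross-check on the numbers in that log excerpt, here is a small sketch that recomputes the sync delay and the size of the clock step from the timestamps alone. It assumes the system clock starts at exactly 2010-02-01 00:00:00 on boot, which is what the 1 minute 12 second figure implies but is not shown in the log itself.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Assumption: the unsynced clock starts here at boot.
boot_clock = datetime.strptime("2010-02-01 00:00:00", FMT)
# Timestamps copied from the tracking.log excerpt.
first_entry = datetime.strptime("2010-02-01 00:00:08", FMT)  # first entry after reboot
first_pps = datetime.strptime("2010-02-01 00:01:12", FMT)    # first PPS entry
after_step = datetime.strptime("2021-08-19 22:40:23", FMT)   # first entry with corrected time

# Delay to the first PPS entry, measured from boot and from chrony's first log line.
sync_from_boot = (first_pps - boot_clock).total_seconds()
sync_from_first_entry = (first_pps - first_entry).total_seconds()

# Clock step implied by the timestamps; it includes one polling interval,
# which is negligible at this scale.
step_seconds = (after_step - first_pps).total_seconds()

print(sync_from_boot, sync_from_first_entry)  # 72.0 64.0
print(f"{step_seconds:.3e}")                  # 3.644e+08
```

The step matches the magnitude of the -3.644e+08 offset logged at 00:01:12. Since chrony wrote its first tracking entry at 00:00:08, about 64 of the 72 seconds passed with chrony already running.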
There are no non-zero ERR reports in the latest data file:
data_dump -i 1,60 /media/usbdisk/projects/LOTOS2021/raw_data/tt_20210819_224007.dat |& egrep ERR | egrep -v 'ERR\\t0\\r' | less
I looked for a way to get an uptime from the Victron, to see if the Victron itself was power cycling, but I could not find one.
Looking at the wiki page for the Victron (https://wiki.ucar.edu/pages/viewpage.action?pageId=398004758), the Victron lowers the charging power by 50% every 10 minutes. Could that correlate with the outages?
daq@tt:~ $ data_dump -i -1,60 /media/usbdisk/projects/LOTOS2021/raw_data/tt_20210819_224007.dat | grep PPV | cut -c 50-70 | sed -e 's/\\t/ /g' -e 's/\\r\\n//' -e 's/PPV//' | sort -n | uniq
Exception: EOFException: /media/usbdisk/projects/LOTOS2021/raw_data/tt_20210819_224007.dat: open: EOF
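The PPV is always in the range 21-28 (in the VE.Direct text protocol PPV is panel power in W, so these are probably watts rather than volts), so there is no sign that it is dropping off. However, if it dropped off suddenly, that would kill everything before it could be logged.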
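The same range check can be done with a min/max rather than sort and uniq. This is a minimal sketch, assuming the dumped lines show the Victron fields with literal escape sequences (PPV, then a backslash-t, then the number), as the sed expressions in the command above suggest. The sample lines are made up for illustration and are not real output from the data file, and ppv_range is a hypothetical helper name.

```python
import re

# Matches a field printed with literal escapes, e.g. the text PPV\t23
PPV_FIELD = re.compile(r"PPV\\t(-?\d+)")

def ppv_range(lines):
    """Return (min, max) of the PPV values found in lines, or None if there are none."""
    values = [int(m.group(1)) for line in lines for m in PPV_FIELD.finditer(line)]
    return (min(values), max(values)) if values else None

# Hypothetical sample lines, NOT real data_dump output.
sample = [
    r"... PPV\t21\r\n",
    r"... PPV\t28\r\n",
    r"... ERR\t0\r\n",
    r"... PPV\t24\r\n",
]
print(ppv_range(sample))  # (21, 28)
```

A real run would feed it the text output of data_dump instead of the sample list.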
I installed the VictronConnect app on my phone and got the Bluetooth connection working, but I couldn't see any settings which could be causing the output to cut. There is a setting to cut output according to battery temperature, but that is disabled. The output mode is "always on".