The logs of the check_trh process on flux show the following entries since it was started on April 9.  For some reason the TRHs at the higher levels (200m, 250m, 300m) had problems yesterday (April 23).

TRH problems
Times in MDT:
 
fgrep cycling /var/log/messages*
Apr 18 18:49:09 flux check_trh.sh: 300m temperature is 137.88 . Power cycling port 5
Apr 23 13:03:58 flux check_trh.sh: 300m temperature is 174.1 . Power cycling port 5
Apr 23 13:06:58 flux check_trh.sh: 300m temperature is 174.28 . Power cycling port 5
Apr 23 13:08:18 flux check_trh.sh: 200m temperature is 181.61 . Power cycling port 5
Apr 23 13:08:38 flux check_trh.sh: 300m temperature is 174.28 . Power cycling port 5
Apr 23 13:08:58 flux check_trh.sh: 200m temperature is 181.53 . Power cycling port 5
Apr 23 13:09:48 flux check_trh.sh: 300m temperature is 174.06 . Power cycling port 5
Apr 23 13:16:48 flux check_trh.sh: 200m temperature is 179.15 . Power cycling port 5
Apr 23 13:19:18 flux check_trh.sh: 250m temperature is 173.33 . Power cycling port 5
Apr 23 13:30:38 flux check_trh.sh: 200m temperature is 177.04 . Power cycling port 5
Apr 23 13:48:38 flux check_trh.sh: 250m temperature is 171.63 . Power cycling port 5
Apr 23 13:50:48 flux check_trh.sh: 250m temperature is 173.08 . Power cycling port 5

 

Yesterday (April 23) I reworked things so that the check script runs on each DSM, including the bao station.  The only entries since then are from 300m. The DSM logs are in UTC; subtracting 6 hours puts these at 13:27-13:29 MDT.

Times in UTC:
 
ssh 300m fgrep cycling /var/log/isfs/dsm.log

Apr 23 19:27:33 300m root: temperature is -62.52 . Power cycling port 5
Apr 23 19:28:25 300m root: temperature is -62.52 . Power cycling port 5
Apr 23 19:29:41 300m root: temperature is -62.52 . Power cycling port 5
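
The UTC-to-MDT conversion above can be double-checked with a few lines of Python. This is only a sketch: the helper name utc_to_mdt is made up for illustration, the year 2015 is taken from the data_dump timestamps, and the fixed 6-hour offset is assumed to hold only while daylight time is in effect.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-6 offset (assumption: daylight time is in effect, as in late April).
MDT = timezone(timedelta(hours=-6), "MDT")

def utc_to_mdt(stamp, year=2015):
    """Convert a syslog-style UTC stamp such as 'Apr 23 19:27:33' to MDT."""
    utc = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return utc.replace(tzinfo=timezone.utc).astimezone(MDT)

if __name__ == "__main__":
    # First and last 300m entries from the dsm.log excerpt.
    for stamp in ("Apr 23 19:27:33", "Apr 23 19:29:41"):
        print(utc_to_mdt(stamp).strftime("%b %d %H:%M:%S %Z"))
```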

For example, here is the hiccup on 200m at 19:30:22 UTC.  Note that after the first power cycle the data look good for 5 seconds; then the sensor reports a bad temperature of 89.92 at 19:30:50.1491, is power cycled again, and works after that.

200m
data_dump -i 4,20 -A 200m_20150423_160000.dat | more
...
2015 04 23 19:30:17.3598   1.001      37 TRH30 15.13 27.28 34 0 1377 56 107\r\n
2015 04 23 19:30:18.3691   1.009      37 TRH30 15.09 27.28 33 0 1376 56 105\r\n
2015 04 23 19:30:19.3691       1      37 TRH30 15.13 27.28 34 0 1377 56 108\r\n
2015 04 23 19:30:20.3692       1      37 TRH30 15.09 27.28 33 0 1376 56 103\r\n
2015 04 23 19:30:21.3790    1.01      37 TRH30 15.09 27.28 34 0 1376 56 108\r\n
2015 04 23 19:30:22.6191    1.24      40 TRH30 177.00 260.02 36 0 5510 886 112\r\n
2015 04 23 19:30:23.6290    1.01      40 TRH30 177.00 260.18 35 0 5510 885 109\r\n
2015 04 23 19:30:24.6290       1      40 TRH30 177.04 260.21 34 0 5511 885 106\r\n
2015 04 23 19:30:25.6398   1.011      40 TRH30 177.08 260.40 33 0 5512 884 105\r\n
...
2015 04 23 19:30:37.6898   1.001      40 TRH30 177.26 260.19 32 0 5517 886 102\r\n
2015 04 23 19:30:38.6900       1      40 TRH30 177.23 260.33 34 0 5516 885 108\r\n
2015 04 23 19:30:39.6991   1.009      38 TRH30 177.30 260.87 5 0 5518 882 16\r\n
2015 04 23 19:30:43.7398   4.041       2 \n
2015 04 23 19:30:43.7408 0.001042      80 \r Sensor ID30   I2C ADD: 11   data rate: 1 (secs)  fan(0) max current: 80 (ma)\n
2015 04 23 19:30:43.8292 0.08842      44 \rresolution: 12 bits      1 sec MOTE: off\r\n
2015 04 23 19:30:43.8806 0.05133      28 calibration coefficients:\r\n
2015 04 23 19:30:43.9098 0.02924      21 Ta0 = -4.129395E+1\r\n
2015 04 23 19:30:43.9398 0.02995      21 Ta1 =  4.143320E-2\r\n
2015 04 23 19:30:43.9691 0.02937      21 Ta2 = -3.293163E-7\r\n
2015 04 23 19:30:43.9899 0.02073      21 Ha0 = -7.786594E+0\r\n
2015 04 23 19:30:44.0191 0.02922      21 Ha1 =  6.188832E-1\r\n
2015 04 23 19:30:44.0449 0.02582      21 Ha2 = -5.069766E-4\r\n
2015 04 23 19:30:44.0691 0.02418      21 Ha3 =  9.665616E-2\r\n
2015 04 23 19:30:44.0991    0.03      21 Ha4 =  6.398342E-4\r\n
2015 04 23 19:30:44.1191 0.02001      21 Fa0 =  3.222650E-1\r\n
2015 04 23 19:30:45.1098  0.9907      37 TRH30 15.17 26.14 32 0 1378 54 102\r\n
2015 04 23 19:30:46.1191   1.009      37 TRH30 15.17 26.14 33 0 1378 54 103\r\n
2015 04 23 19:30:47.1291    1.01      37 TRH30 15.17 26.14 34 0 1378 54 108\r\n
2015 04 23 19:30:48.1290  0.9999      37 TRH30 15.17 26.14 32 0 1378 54 101\r\n
2015 04 23 19:30:49.1390    1.01      37 TRH30 15.17 26.14 33 0 1378 54 105\r\n
2015 04 23 19:30:50.1491    1.01      32 TRH30 89.92 0.90 0 0 3251 0 0\r\n
2015 04 23 19:30:53.5790    3.43       2 \n
2015 04 23 19:30:53.5801 0.001042      80 \r Sensor ID30   I2C ADD: 11   data rate: 1 (secs)  fan(0) max current: 80 (ma)\n
2015 04 23 19:30:53.6699 0.08981      44 \rresolution: 12 bits      1 sec MOTE: off\r\n
2015 04 23 19:30:53.7213 0.05139      28 calibration coefficients:\r\n
2015 04 23 19:30:53.7491  0.0278      21 Ta0 = -4.129395E+1\r\n
2015 04 23 19:30:53.7790 0.02995      21 Ta1 =  4.143320E-2\r\n
2015 04 23 19:30:53.8083 0.02925      21 Ta2 = -3.293163E-7\r\n
2015 04 23 19:30:53.8290 0.02075      21 Ha0 = -7.786594E+0\r\n
2015 04 23 19:30:53.8601 0.03103      21 Ha1 =  6.188832E-1\r\n
2015 04 23 19:30:53.8898 0.02971      21 Ha2 = -5.069766E-4\r\n
2015 04 23 19:30:53.9108 0.02107      21 Ha3 =  9.665616E-2\r\n
2015 04 23 19:30:53.9398 0.02892      21 Ha4 =  6.398342E-4\r\n
2015 04 23 19:30:53.9691 0.02932      21 Fa0 =  3.222650E-1\r\n
2015 04 23 19:30:54.9590    0.99      37 TRH30 15.17 26.14 34 0 1378 54 107\r\n
2015 04 23 19:30:55.9598   1.001      37 TRH30 15.17 26.14 33 0 1378 54 103\r\n
2015 04 23 19:30:56.9691   1.009      37 TRH30 15.21 26.15 34 0 1379 54 108\r\n
2015 04 23 19:30:57.9691       1      37 TRH30 15.17 26.14 33 0 1378 54 103\r\n

Notice the delta-T column after the date and time. I've looked at a few of these, and I think the first bad sample always arrives with a larger delta-T (in this case 1.24 sec instead of 1.0), which might help in debugging.
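
One way to test the delta-T observation over a whole file is to scan the data_dump output for the first out-of-range record of each glitch and list its delta-T. The sketch below is a minimal illustration and not part of check_trh: the function name, the regular expression (inferred from the sample records above) and the -40 to 60 temperature limits are all assumptions.

```python
import re

# Assumed layout of a `data_dump -A` TRH record, inferred from the sample above:
# date, time, delta-T (sec), length, "TRH<id>", temperature, ...
RECORD = re.compile(
    r"^(\d{4} \d{2} \d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"  # date and time
    r"(\S+)\s+\d+\s+"                                    # delta-T, record length
    r"TRH\d+\s+(-?\d+\.\d+)"                             # sensor tag, temperature
)

def find_glitch_onsets(lines, t_min=-40.0, t_max=60.0):
    """Return (timestamp, delta_t, temperature) for the first bad record of each glitch.

    t_min and t_max are hypothetical limits, not the ones check_trh uses.
    """
    onsets = []
    prev_ok = True
    for line in lines:
        m = RECORD.match(line)
        if not m:
            continue  # skip boot banner, calibration lines and "..." elisions
        stamp, delta_t, temp = m.group(1), float(m.group(2)), float(m.group(3))
        ok = t_min <= temp <= t_max
        if prev_ok and not ok:
            onsets.append((stamp, delta_t, temp))  # transition from good to bad
        prev_ok = ok
    return onsets
```

Run over the lines of a data_dump listing like the one above, this returns one entry per glitch onset, so the delta-T values can be compared against the nominal 1 sec.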

9am, Apr 25: Some more glitches since yesterday. Notice again that the problems on different sensors seem to occur at nearly the same times:

ck_trh
200m
Apr 24 20:56:05 200m root: temperature is 170.15 . Power cycling port 5
Apr 24 20:57:10 200m root: temperature is 170.22 . Power cycling port 5

300m
Apr 24 20:51:47 300m root: temperature is -62.52 . Power cycling port 5
Apr 24 20:56:52 300m root: temperature is -62.52 . Power cycling port 5
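
To put a number on "nearly the same times", the power-cycle entries from the per-DSM logs can be paired up whenever two different DSMs logged an event within some window. This is a rough sketch under stated assumptions: the function names and the 60-second window are arbitrary choices for illustration, the year is assumed to be 2015, and the parser expects the per-DSM line format shown above (not the flux format, where the level appears later in the line).

```python
from datetime import datetime, timedelta
from itertools import combinations

def parse_event(line, year=2015):
    """Parse 'Apr 24 20:56:05 200m root: temperature is ...' into (time, host)."""
    fields = line.split()
    stamp = datetime.strptime(
        f"{year} {fields[0]} {fields[1]} {fields[2]}", "%Y %b %d %H:%M:%S"
    )
    return stamp, fields[3]

def coincident_pairs(lines, window=timedelta(seconds=60)):
    """Return (host1, time1, host2, time2) for events on different hosts within window."""
    events = sorted(parse_event(l) for l in lines if "Power cycling" in l)
    pairs = []
    for (t1, h1), (t2, h2) in combinations(events, 2):
        if h1 != h2 and t2 - t1 <= window:  # events are sorted, so t2 >= t1
            pairs.append((h1, t1, h2, t2))
    return pairs
```

The window is the main knob here: widening it catches looser coincidences but also makes chance pairings more likely, so it is worth trying a few values before drawing conclusions.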