tnw07b is responding to pings, but ssh connections are reset. Oops, I made a mistake: the nmap output does look like a DSM. (Port 22 is ssh, 8888 is tinyproxy, and 30000 is dsm. The xinetd check-mk port, 6556, is not listed, I suspect because nmap does not scan it by default.)
[daq@ustar raw_data]$ nmap tnw07b
Starting Nmap 7.40 ( https://nmap.org ) at 2017-03-10 17:40 WET
Nmap scan report for tnw07b (192.168.1.146)
Host is up (0.055s latency).
Not shown: 997 closed ports
PORT STATE SERVICE
22/tcp open ssh
8888/tcp open sun-answerbook
30000/tcp open ndmps
And data are coming in:
tnw07b:/dev/gps_pty0 7 2 32 2017 03 10 17:41:25.874 03 10 17:41:41.036 2.04 0.142 0.925 69 80
tnw07b:/dev/ttyUSB0 7 22 16 2017 03 10 17:41:25.828 03 10 17:41:40.958 0.99 1.008 1.009 39 39
tnw07b:/dev/ttyUSB4 7 102 16 2017 03 10 17:41:26.368 03 10 17:41:41.434 1.00 1.004 1.005 38 38
tnw07b:/dev/ttyUSB7 7 32768 4 2017 03 10 17:41:26.049 03 10 17:41:41.049 0.20 4.772 5.228 17 30
The data connection to dsm_server is in fact from the right IP address, so tnw07b appears to be configured correctly:
[root@ustar daq]# netstat -ap | grep tnw07b
tcp 0 0 ustar:51968 tnw07b:43666 ESTABLISHED 9440/dsm_server
According to nagios, tnw07b was responding to check-mk requests until 2017-03-09 10:52 UTC, so something happened then which now causes network connections to be reset. Probably this system needs to be rebooted.
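The nmap run above covered 1000 ports (997 closed plus 3 open), so whether 6556 was among them is only a suspicion. One way to settle it would be to probe the ports of interest directly. The following is a minimal Python sketch, not a tool that was used here; the function name probe_ports and the state labels are made up for illustration, and the port list is taken from the entry above.

```python
import socket

def probe_ports(host, ports, timeout=3.0):
    """Try a plain TCP connect to each port and report what happened.

    Returns a dict {port: state}, where state is one of
    'open', 'closed' (connection refused) or 'no response'.
    """
    results = {}
    for port in ports:
        try:
            # A successful connect only shows that something is listening.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = "open"
        except ConnectionRefusedError:
            # The host answered, but nothing is listening on this port.
            results[port] = "closed"
        except OSError:
            # Timeouts, unreachable hosts and name lookup failures land here.
            results[port] = "no response"
    return results

# Hypothetical usage, with the port meanings given in the log entry:
#   probe_ports("tnw07b", [22, 8888, 30000, 6556])
```

A connect-only probe like this would most likely still report port 22 as open on tnw07b, since nmap lists it as open; it cannot show a reset that happens after the connection is accepted.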
Steve noticed that ports ttyS11 and ttyS12 are no longer reporting any data on rsw04. After getting rsw04 updated and clearing off the USB yesterday and restarting DSM, those ports are still not reporting. They were working until Feb 25. ttyS10 was out for a while also, but it came back this morning at 2017 03 10 12:16:47.339, before the reboot.
[daq@ustar raw_data]$ data_stats rsw04_20170*.dat
Exception: EOFException: rsw04_20170310_155028.dat: open: EOF
sensor dsm sampid nsamps |------- start -------| |------ end -----| rate minMaxDT(sec) minMaxLen
rsw04:/dev/gps_pty0 35 10 3944084 2017 02 03 09:44:08.995 03 10 15:58:31.569 1.29 0.015 1090606.000 51 73
rsw04:/var/log/chrony/tracking.log 35 15 133438 2017 02 03 09:44:53.133 03 10 15:58:25.024 0.04 0.000 1090616.750 100 100
rsw04:/dev/ttyS11 35 100 38021353 2017 02 03 09:44:08.517 02 25 09:48:48.136 20.00 -0.016 0.992 60 77
rsw04:/dev/ttyS12 35 102 38021782 2017 02 03 09:44:12.831 02 25 09:48:48.206 20.00 -0.107 1.390 40 125
rsw04:/dev/dmmat_a2d0 35 208 39114544 2017 02 03 09:44:08.570 03 10 15:58:31.363 12.84 0.034 1090604.875 4 4
rsw04:/dev/ttyS10 35 32768 767733 2017 02 03 09:44:13.130 03 10 15:58:31.137 0.25 -0.031 1132080.875 12 104
Steve tried connecting to the ports directly yesterday and did not see anything. After the reboot, I still don't see anything either. This is a viper, so I'm thinking ports 11 and 12 are on the second emerald serial card, and these log messages are relevant:
[ 41.641774] emerald: NOTICE: version: v1.2-522
[ 41.842945] emerald: INFO: /dev/emerald0 at ioport 0x200 is an EMM=8
[ 41.871947] emerald: WARNING: /dev/emerald1: Emerald not responding at ioports=0x240, val=0x8f
[ 41.881346] emerald: WARNING: /dev/emerald1: Emerald not responding at ioports=0x2c0, val=0x8f
[ 41.890757] emerald: WARNING: /dev/emerald1: Emerald not responding at ioports=0x300, val=0x8f
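Picking out the ports that have gone quiet from a data_stats listing like the one above can be done by comparing each sensor's end time with the latest end time in the listing. The following Python sketch assumes the 16-column row layout seen in the output above and assumes the end time falls in the same year as the start time (the listing does not print an end year). The names parse_line and stale_sensors and the one-hour threshold are illustrative choices, not part of data_stats.

```python
from datetime import datetime, timedelta

def parse_line(line):
    """Parse one data_stats sensor row into (sensor, sampid, end_time).

    Returns None for header, exception and other non-sensor lines.
    """
    tok = line.split()
    if len(tok) != 16 or ":" not in tok[0]:
        return None
    try:
        # Assumption: the end year equals the start year in tok[4].
        end = datetime.strptime(
            f"{tok[4]} {tok[8]} {tok[9]} {tok[10]}", "%Y %m %d %H:%M:%S.%f"
        )
        return tok[0], int(tok[2]), end
    except ValueError:
        return None

def stale_sensors(lines, max_lag=timedelta(hours=1)):
    """Return rows whose end time trails the latest end time by more than max_lag."""
    rows = [r for r in map(parse_line, lines) if r is not None]
    if not rows:
        return []
    latest = max(end for _, _, end in rows)
    return [row for row in rows if latest - row[2] > max_lag]

# Rows copied from the rsw04 listing above.
SAMPLE = """\
sensor dsm sampid nsamps |------- start -------| |------ end -----| rate minMaxDT(sec) minMaxLen
rsw04:/dev/gps_pty0 35 10 3944084 2017 02 03 09:44:08.995 03 10 15:58:31.569 1.29 0.015 1090606.000 51 73
rsw04:/dev/ttyS11 35 100 38021353 2017 02 03 09:44:08.517 02 25 09:48:48.136 20.00 -0.016 0.992 60 77
rsw04:/dev/ttyS12 35 102 38021782 2017 02 03 09:44:12.831 02 25 09:48:48.206 20.00 -0.107 1.390 40 125
rsw04:/dev/ttyS10 35 32768 767733 2017 02 03 09:44:13.130 03 10 15:58:31.137 0.25 -0.031 1132080.875 12 104
"""

for sensor, sampid, end in stale_sensors(SAMPLE.splitlines()):
    print(sensor, sampid, "last sample", end)
```

On these rows the sketch flags ttyS11 and ttyS12, which matches what Steve noticed. It does not catch a port like ttyS10 that dropped out and then came back, since only the end time is compared.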
(I assume Preban also was there...)
Working very late (until at least 10:30pm), the DTU crew got all of our installed masts on the network, though a few DSMs didn't come up. We're very grateful!
From several Per emails:
The fiber for the internet is still not working, but José C. has promised that someone will come on Tuesday to have a look at it. I can see that the media converter reports an error on one of the fibers.
We have brought a Litebeam 5 AC 23dBi with us and we have placed it on the antenna pole of the ops center. That has helped significantly on the performance and stability of the link to the ridges. So I don’t think it’ll be necessary for you to manufacture any special brackets.
We have then placed the “old” litebeam from the ops center according to Ted’s plan at rNE_06. We have also placed the 19 dBi spare NanoBeam on RiNE_07 and reconfigured Tower 10 to match the new NanoBeam. So now we only need to replace the last of the 3 Prisms, which I noticed is now mounted in tower 37. The Litebeam that Ted has ordered could maybe then replace that one?
We have gained some more bandwidth from the ops center to tower 29 by moving the frequencies further away from the ones being used by the two sector antennas at tower 29. It seemed like these three antennas close by each other were interfering.
As you have already discovered, the fiber was fixed today. It turned out that we had two issues with the connection out of here. The Rx fiber was broken close to the first junction box they have, apparently a couple of kilometers from here. The Tx fiber also had a problem with too sharp a bend in the very first electricity pole outside the building. The latter could explain the changing performance we were seeing on the line.
The last 100m tower was successfully instrumented today, and your DSMs should with a little luck be visible on the network.
We have changed the Ubiquiti config in the 4 army alu towers behind riNE07. They should now be online.
A few of the ubiquities on the towers were not set up with the proper wireless security rules; some were locked on the MAC address of the old AP we replaced (the Prism), and the last one was set in the wrong network mode.
We have moved a few towers from the planned access point to another where the signal quality was higher. I still need to correct it on the spreadsheet; I’ll do that asap.
The ARL ubiquities all had the wrong PSK. José C. forwarded me a mail from a Sean, where he says there’s an IP conflict in one of his units, but they all seemed to have the IP address stated to the far right in the spreadsheet, and not the .110 to .113 stated in the mail. I was not able to access the web config page as described in his mail either, but since the IPs matched Ted’s spreadsheet I put them on the network.
This was reporting all NA. pio got it to work. I'm actually surprised, since I thought we had seen this problem in Jan and had even sent people up the tower to check the sonic head connection, with no success then...
Now that the network is up and we can look at things, I'm finding lots of TRHs with ifan=0:
tse06.2m: #67, no response to ^r, responded to pio (after power cycle, responds to ^r)
tse06.10m: #43, no response to ^r, pio didn't restart fan (after power cycle, responds to ^r)
tse06.60m: #8, responds to ^r and pio, but didn't restart fan
tse09b.2m: #103, ^r worked
tse11.2m: #120, no response to ^r, responded to pio
tse11.20m: #116, responds to ^r and pio, but didn't restart fan
tse11.40m: #110, responds to ^r and pio, but didn't restart fan
tse11.60m: #121, was in weird cal land, no response to ^r, responded to pio
tse13.2m: #119, no response to ^r, pio didn't restart fan (after power cycle, responds to ^r)
tse13.100m: #111, no response to ^r, pio didn't restart fan, reset CUR 200, now running at 167mA (and T dropped by 0.2 C). WATCH THIS! (has been running all day today)
tnw07.10m: #42, no response to ^r, responded to pio
tnw07.60m: #125, ^r killed totally! pio doesn't bring back. dead.