For some reason, the iss2 DSM lost the ability to keep its IP address between boots. Many days ago I found that the DSM had come up using DHCP instead of 192.168.0.100, so I ran set_ip again to set the static address. After that the interface was configured correctly and the DSM was running fine. When the DSM was power-cycled a few days ago in an attempt to fix missing surface data, it came back up with DHCP again and got address 192.168.0.132. The realtime network stream to the data manager still worked, but the DSM looked down because it was not at the expected address, and rsync did not work. In the end I fixed the DSM by reverting to an older interface configuration, and it now boots with the right address. (The details will be in JIRA, since the underlying problem is not fixed.)

I also ran apt-get update and apt upgrade on the DSM, in case that would fix the problem.

While investigating that problem, I discovered an unidentified host at 192.168.0.120. It turns out the GAUS PC's IPMI IPv4 interface was enabled in firmware (Ctrl-E during BIOS boot) and set to 192.168.0.120, which caused it to respond to pings even when the PC was off. I've disabled that, and hopefully it was not enabled for a reason. I'm a little concerned about how it got enabled in the first place.

I fixed the expected DSM sample IDs on iss2, so the Nagios checks are all green now and should stay green. Surface ingest and GAUS broadcast messages are working again.

I also noticed the 20170526_06z launch was mislabelled as 20170527_06z, so that will have to be sorted out later.
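
For sorting that out later, here is a minimal sketch of a label consistency check. It assumes the launch label is simply the nominal launch time in UTC written as YYYYMMDD_HHz, which is only inferred from the two labels above; `launch_label` and `label_matches` are hypothetical names, not part of any existing sounding software.

```python
from datetime import datetime, timezone


def launch_label(nominal_time):
    """Format a UTC nominal launch time as a label such as 20170526_06z.

    The YYYYMMDD_HHz convention is an assumption inferred from the labels
    mentioned in this log entry.
    """
    if nominal_time.tzinfo is None:
        raise ValueError("nominal_time must be timezone-aware (UTC)")
    utc_time = nominal_time.astimezone(timezone.utc)
    # "%H" is the two-digit hour; the trailing "z" is a literal character.
    return utc_time.strftime("%Y%m%d_%Hz")


def label_matches(label, nominal_time):
    """Return True if an existing label agrees with the nominal launch time."""
    return label == launch_label(nominal_time)


if __name__ == "__main__":
    nominal = datetime(2017, 5, 26, 6, tzinfo=timezone.utc)
    for label in ("20170526_06z", "20170527_06z"):
        status = "ok" if label_matches(label, nominal) else "MISLABELLED"
        print(label, status)
```

Run against the nominal 06z launch time on 26 May 2017, this flags the 20170527_06z label as not matching.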

 
