Gordon Oct 23

I believe the issue Chris saw this morning on 6 is similar to what I've seen from time to time on other systems. When I've logged into a system and found it very slow, I've run top and seen that ntpd is taking 100% of the CPU. I believe that is because the tee_tty process has died. This message was in the system log for 6:

Oct 23 07:21:35 Ah6 tee_tty: 2012-10-23,07:21:35|NOTICE|received signal Interrupt(2), si_signo=2, si_errno=0, si_code=128

This happens about once every week or two on one of the 22 systems, so it is hard to diagnose.

tee_tty reads from the physical serial port that the GPS is connected to and sends that data to two pseudo-terminals, one read by ntpd and one read by the dsm process. It appears that if tee_tty dies, ntpd goes into an infinite loop, either mishandling or failing to detect the error on its input port.
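To illustrate the data flow described above, here is a minimal Python sketch of one read-and-copy step. It is not the tee_tty source. The names open_raw_pty and tee_once are made up for this example, any readable file descriptor stands in for the GPS serial port, and the pseudo-terminals are put in raw mode here only so the bytes pass through unmodified.

```python
import os
import pty
import tty


def open_raw_pty():
    """Create a pseudo-terminal pair and put the slave side in raw mode.

    Returns (master_fd, slave_fd). A reader such as ntpd would open the
    slave side; the copying process writes to the master side.
    """
    master_fd, slave_fd = pty.openpty()
    tty.setraw(slave_fd)  # no echo, no line editing, no signal characters
    return master_fd, slave_fd


def tee_once(source_fd, sink_fds, bufsize=1024):
    """Read one chunk from source_fd and copy it to every fd in sink_fds."""
    data = os.read(source_fd, bufsize)
    for fd in sink_fds:
        os.write(fd, data)
    return data
```

A real program would call tee_once in a loop and handle end-of-file and write errors. If that loop exits, the readers on the slave sides stop receiving data, which is the situation ntpd seems to handle badly.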

Previously I tried to change ntpd so that it might catch the error, but without success.

Now I might know why tee_tty is being sent the SIGINT signal. It turns out the serial port is opened in "cooked" mode, which means that if a ctrl-C somehow arrives on that port, the process is sent a SIGINT. I guess a ctrl-C could also be received on any pseudo-terminal that is opened for reading.
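The difference between the two modes can be checked from the terminal attributes. The sketch below is only an illustration in Python, using a pseudo-terminal in place of the real serial port. The helper names are made up for this example. In termios terms, the ISIG local flag is what makes the terminal driver turn the interrupt character (normally ctrl-C) into a SIGINT, and raw mode clears it.

```python
import os
import pty
import termios
import tty

LFLAG = 3  # index of the local-mode flags in the list returned by tcgetattr


def signals_on_ctrl_c(fd):
    """Return True if the terminal on fd has ISIG set."""
    return bool(termios.tcgetattr(fd)[LFLAG] & termios.ISIG)


def enable_cooked(fd):
    """Turn on canonical input and signal characters, as in cooked mode."""
    attrs = termios.tcgetattr(fd)
    attrs[LFLAG] |= termios.ISIG | termios.ICANON
    termios.tcsetattr(fd, termios.TCSANOW, attrs)


def demo():
    """Return the ISIG state of a pty in cooked mode and then in raw mode."""
    master_fd, slave_fd = pty.openpty()
    try:
        enable_cooked(slave_fd)
        cooked = signals_on_ctrl_c(slave_fd)
        tty.setraw(slave_fd)  # clears ISIG, ICANON, ECHO and more
        raw = signals_on_ctrl_c(slave_fd)
        return cooked, raw
    finally:
        os.close(master_fd)
        os.close(slave_fd)
```

Under these assumptions demo() reports ISIG as set in cooked mode and cleared in raw mode, so a stray 0x03 byte in the GPS data would be passed through as data rather than interpreted.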

Today around 3 pm I logged into all the stations and updated the file /etc/gps.conf to change the "c" in GPS_TEE_OPTS to "r", so that the serial port is opened in "raw" mode:

GPS_TEE_OPTS="4800n81lnrxx -p 60 -l pps"

Next I'm rebuilding tee_tty so that if the real serial port is opened in raw mode, the pseudo-terminals are opened in raw mode as well.
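The idea of carrying the port's mode over to the pseudo-terminals could look roughly like the Python sketch below. This is not the tee_tty change itself. The names is_raw and match_mode are hypothetical, and treating "raw" as "ICANON and ISIG both cleared" is a simplification made for this example.

```python
import termios
import tty

LFLAG = 3  # index of the local-mode flags in the list returned by tcgetattr


def is_raw(fd):
    """Simplified test: both canonical input and signal characters are off."""
    lflag = termios.tcgetattr(fd)[LFLAG]
    return not (lflag & (termios.ICANON | termios.ISIG))


def match_mode(port_fd, pty_fds):
    """If the port is raw, put each pseudo-terminal in raw mode too."""
    if is_raw(port_fd):
        for fd in pty_fds:
            tty.setraw(fd)
```

With this in place, a ctrl-C byte arriving on either the serial port or a pseudo-terminal would be treated as ordinary data.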
