Gordon Oct 25

Cockpit data was not coming in.

Traced this down to a system clock problem on flux. Its clock gradually lost time, so incoming samples appeared to be timetagged in the future, and the dsm_server process started throwing them away, with this message in /var/log/isfs/isfs.log:

Code Block
Oct 25 14:54:54 flux dsm_server[30058]: WARNING|sample with timetag in future by 2.305084 secs. time: 2012 Oct 25 20:54:56.861 id=10,20 total future samples=39039001
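
To watch for this condition, the warning can be parsed for its offset and running count. The following is a minimal Python sketch, assuming the message layout matches the single line quoted above; the function name parse_future_warning and the regular expression are illustrative and not part of dsm_server.

```python
import re

# Pattern inferred from the one dsm_server warning quoted in this entry.
# Other dsm_server versions may word the message differently (assumption).
FUTURE_RE = re.compile(
    r"sample with timetag in future by (?P<secs>\d+(?:\.\d+)?) secs\."
    r".*?id=(?P<id>[\d,]+)"
    r".*?total future samples=(?P<total>\d+)"
)


def parse_future_warning(line):
    """Return (offset_secs, sample_id, total_future_samples), or None."""
    m = FUTURE_RE.search(line)
    if m is None:
        return None
    return float(m.group("secs")), m.group("id"), int(m.group("total"))


line = (
    "Oct 25 14:54:54 flux dsm_server[30058]: WARNING|sample with timetag "
    "in future by 2.305084 secs. time: 2012 Oct 25 20:54:56.861 "
    "id=10,20 total future samples=39039001"
)
print(parse_future_warning(line))
```

An offset that keeps growing between successive warnings would point at clock drift on the server, rather than at one misbehaving sensor.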

The problem was in /etc/chrony.conf on flux and gully. The chrony processes were configured to use keys, which were different on the two hosts, so flux was not able to get time information from gully. Commented out the

Code Block
# keyfile /etc/chrony.keys

statement in both files, and restarted chronyd:

Code Block
systemctl restart chronyd.service

chronyd on flux is configured to get time from gully, via the "server 192.168.0.12" directive in /etc/chrony.conf on flux.

The server directives in /etc/chrony.conf on gully now look like so:

Code Block
# server 0.fedora.pool.ntp.org iburst
# server 1.fedora.pool.ntp.org iburst
# server 2.fedora.pool.ntp.org iburst
# server 3.fedora.pool.ntp.org iburst
server ntp.colostate.edu
server c20
server m21
server m22
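
To confirm which directives are actually in effect after edits like these, the file can be scanned for lines that are not commented out. This is a rough Python sketch; the helper name active_directives is hypothetical, and the sample text is assembled from the fragments quoted in this entry rather than copied from the full file on gully.

```python
def active_directives(conf_text):
    """Return (directive, arguments) pairs for non-comment lines.

    chrony treats a line as a comment when its first non-blank
    character is one of '#', '!', ';' or '%'.
    """
    result = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!;%":
            continue
        parts = line.split(None, 1)
        result.append((parts[0], parts[1] if len(parts) > 1 else ""))
    return result


# Illustrative sample only, built from the lines shown in this entry.
sample = """\
# keyfile /etc/chrony.keys
# server 0.fedora.pool.ntp.org iburst
server ntp.colostate.edu
server c20
server m21
server m22
"""

directives = active_directives(sample)
servers = [args for name, args in directives if name == "server"]
print(servers)
print(any(name == "keyfile" for name, _ in directives))
```

Running the same check against the files on both flux and gully would show whether a keyfile directive is still active on either host.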

chrony was never able to connect to any of the fedora.pool.ntp.org servers. The chronyc sources command always returned lines like this for the pool servers:

Code Block
MS Name/IP address           Stratum Poll LastRx Last sample
============================================================================
^? 64.73.32.134                  0    6    10y     +0ns[   +0ns] +/-    0ns
^? ns1.your-site.com             0    6    10y     +0ns[   +0ns] +/-    0ns
^? 64.73.32.135                  0    6    10y     +0ns[   +0ns] +/-    0ns
^? cheezum.mattnordhoff.net      0    6    10y     +0ns[   +0ns] +/-    0ns

The traffic is probably blocked by a firewall somewhere.

As a reference check for our DSMs, I added ntp.colostate.edu as a chrony server for gully. That works, which means our router is not what is blocking the traffic.

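chronyc sources on gully with the above server configuration looks like so. The DSMs (with PPS from a GPS) show offsets under 1.3 ms with error bounds of about 2 to 3 ms, against +75 ms +/- 173 ms for ntp.colostate.edu, which indicates that our DSMs have better clocks than ntp.colostate.edu: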

Code Block
# chronyc sources
210 Number of sources = 4
MS Name/IP address           Stratum Poll LastRx Last sample
============================================================================
^+ yuma.acns.colostate.edu       2    6      5    +75ms[  +75ms] +/-  173ms
^+ C20                           3    6      5  -1281us[-1287us] +/- 3162us
^+ M21                           3    6      5   +470us[ +464us] +/- 2506us
^* Mu22                          3    6      5    -30us[  -36us] +/- 1799us
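
The comparison can also be made mechanically by converting the "Last sample" column to seconds and ranking the sources by error bound. This is a hedged Python sketch, assuming the column layout shown above; parse_sources, LINE_RE and UNIT are names introduced here for illustration, and other chrony versions may format the output differently.

```python
import re

UNIT = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}

# Matches one source row: mode, state, name, stratum, poll, LastRx,
# adjusted offset, [measured offset], and the +/- error bound.
LINE_RE = re.compile(
    r"^(?P<mode>[\^=#])(?P<state>[*+\-?x~])\s+(?P<name>\S+)\s+"
    r"(?P<stratum>\d+)\s+(?P<poll>-?\d+)\s+(?P<lastrx>\S+)\s+"
    r"(?P<adj>[+-]\d+)(?P<adj_unit>ns|us|ms|s)\["
    r"\s*(?P<meas>[+-]\d+)(?P<meas_unit>ns|us|ms|s)\]\s+\+/-\s*"
    r"(?P<err>\d+)(?P<err_unit>ns|us|ms|s)\s*$"
)


def parse_sources(text):
    """Return one dict per source row; header lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # prompt, count, header or separator line
        rows.append({
            "name": m.group("name"),
            "state": m.group("state"),
            "offset_s": int(m.group("adj")) * UNIT[m.group("adj_unit")],
            "error_s": int(m.group("err")) * UNIT[m.group("err_unit")],
        })
    return rows


output = """\
# chronyc sources
210 Number of sources = 4
MS Name/IP address           Stratum Poll LastRx Last sample
============================================================================
^+ yuma.acns.colostate.edu       2    6      5    +75ms[  +75ms] +/-  173ms
^+ C20                           3    6      5  -1281us[-1287us] +/- 3162us
^+ M21                           3    6      5   +470us[ +464us] +/- 2506us
^* Mu22                          3    6      5    -30us[  -36us] +/- 1799us
"""

rows = sorted(parse_sources(output), key=lambda r: r["error_s"])
for r in rows:
    print(f'{r["name"]:26s} {r["state"]} +/- {r["error_s"] * 1e3:.3f} ms')
```

Sorted this way, the three DSM sources come first and the university server last. A state of "?" in the second column, as in the earlier pool-server listing, marks a source that chronyd cannot use.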

chronyd