11:45   Low lost communication with its Emerald boards around 11:00.

Rebooted low and everything is back up.

...

Jun 23 16:49:50 low kernel: i2c i2c-0: i2c_pxa: timeout waiting for bus free
Jun 23 16:49:53 low last message repeated 5 times
Jun 25 09:27:58 low kernel: handle_IRQ_event called 4 times for IRQ 3
Jun 25 17:46:14 low kernel: handle_IRQ_event called 4 times for IRQ 3

So it turns out those messages are old and aren't much help.

Those were the only messages before the reboot, and they occurred at least 23 hours earlier, which means the problem is not due to a kernel oops or any other atypical event that the kernel could detect. It is just the good ol' situation where there seems to be a very small possibility that a PC104 interrupt can be missed and not retriggered, even though the PC104 IRQ line is high, so that the interrupt handler is never called again.

I believe restarting the dsm process with a ddn/dup, which closes and re-opens the serial ports, will bring it back. I believe an adn/aup will bring it back too.

I may add a timeout to every serial sensor on low, something like 20 seconds (longer than the mote reporting interval). That should help to recover more quickly.

Just updated the xml on the low DSM so that every sensor has a timeout. The dsm process should then close and reopen each port after detecting the timeout, which should also help to recover from this situation more quickly.
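
The close-and-reopen behavior described above can be sketched roughly as follows. This is not the dsm code. It is a minimal Python illustration that assumes a POSIX system, and the names read_with_timeout, sample_loop and open_port are hypothetical, made up for the example.

```python
import os
import select


def read_with_timeout(fd, timeout_s, nbytes=256):
    """Return bytes read from fd, or None if nothing arrives within timeout_s."""
    ready, _, _ = select.select([fd], [], [], timeout_s)
    if not ready:
        return None
    return os.read(fd, nbytes)


def sample_loop(open_port, timeout_s, max_reads):
    """Read from a port, closing and reopening it whenever a read times out.

    open_port is a hypothetical callable that returns a readable file
    descriptor. Returns (data_chunks, reopen_count).
    """
    fd = open_port()
    chunks = []
    reopens = 0
    try:
        for _ in range(max_reads):
            data = read_with_timeout(fd, timeout_s)
            if data is None:
                # No data within the timeout: assume the port is stuck,
                # so close it and open it again.
                os.close(fd)
                fd = open_port()
                reopens += 1
            else:
                chunks.append(data)
    finally:
        os.close(fd)
    return chunks, reopens
```

With a timeout around 20 seconds, longer than the mote reporting interval, a healthy port should never trip the reopen path, so the extra close and open only happens when data has actually stopped.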

Seems that I need to install a PC104 interrupt watchdog module. There is some indication this has happened on the aircraft, also quite infrequently. A test is being set up out at RAF.
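
As a rough illustration of what such a watchdog has to notice, here is a userspace sketch in Python. It is not the kernel module mentioned above. It assumes interrupt totals are available as text in a simplified /proc/interrupts-like layout (IRQ label, colon, one count per CPU, then the device name), and parse_interrupt_counts and stalled_irqs are hypothetical helper names.

```python
def parse_interrupt_counts(text):
    """Parse /proc/interrupts-style text into {irq_label: total_count}.

    Assumes each data line looks like "  3:   1376   ISA  serial", with
    one or more per-CPU counts right after the colon.
    """
    counts = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header line such as "CPU0" has no colon
        total = 0
        seen_count = False
        for token in rest.split():
            if not token.isdigit():
                break  # reached the controller or device name
            total += int(token)
            seen_count = True
        if seen_count:
            counts[label.strip()] = total
    return counts


def stalled_irqs(before, after, watched):
    """Return the watched IRQ labels whose count did not change between samples."""
    return [
        irq
        for irq in watched
        if irq in before and irq in after and after[irq] == before[irq]
    ]
```

Taking two snapshots a few seconds apart and passing the IRQ that normally runs at a few hundred interrupts per second as watched would flag the missed-interrupt condition. What to do next (for example, closing and reopening the ports) is left out of the sketch.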

When the PC104 interrupts are being handled, the irqs listing looks like so, showing 275 interrupts/sec from the Emerald cards:

root@low root# irqs
Counting interrupts over 5 seconds ...

IRQ      Interrupt Type            Total Int  Int/sec
------------------------------------------------------
3:       ISA serial:               1376       275.2
24:      GPIO-l eth0:              62         12.4
25:      GPIO-l GPIO1-PC104:       1376       275.2
36:      SC serial:                15         3
37:      SC serial:                101        20.2
42:      SC ost0:                  509        101.8
114:     GPIO isp116x-hcd:usb1:    90         18
115:     GPIO serial:              228        45.6
116:     GPIO serial:              102        20.4
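
The Int/sec column in the listing is just the total divided by the 5 second counting interval, for example 1376 / 5 = 275.2 for IRQ 3. The following Python sketch parses a listing in this layout and checks that arithmetic. It is not the irqs script itself, and parse_irqs_listing and rate_mismatches are hypothetical names.

```python
import re

# One row: IRQ number, interrupt type, total count, rate per second.
ROW = re.compile(r"^(\d+):\s+(.*?):\s+(\d+)\s+([\d.]+)\s*$")


def parse_irqs_listing(text):
    """Return a list of (irq, type, total, per_sec) tuples from an irqs-style listing."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            irq, kind, total, per_sec = match.groups()
            rows.append((int(irq), kind, int(total), float(per_sec)))
    return rows


def rate_mismatches(rows, interval_s, tolerance=0.05):
    """Return the rows whose reported rate differs from total / interval_s."""
    return [
        row
        for row in rows
        if abs(row[2] / interval_s - row[3]) > tolerance
    ]
```

In the listing above, IRQ 3 (ISA serial) and IRQ 25 (GPIO1-PC104) show the same total of 1376. That is what one would expect if every PC104 interrupt is delivered as an ISA serial interrupt.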