Gordon, Nov 29

Ssh'd into the systems and checked the interrupt rates with the "intcount" command.

On vipers and titans, ttyS1, ttyS2 and ttyS3 are the serial ports on the CPU board. On both systems ttyS1 is served by interrupt 37 and ttyS2 by interrupt 36. On vipers, ttyS3 is interrupt 116. On titans, ttyS3 is interrupt 122.

The Emerald serial expansion board(s), serving serial ports ttyS5 to ttyS12 (and ttyS13-ttyS20 on m21), are configured to use ISA interrupt 3. The PC104 interrupts (in this case just IRQ 3) are multiplexed by a CPLD on the CPU board into a GPIO interrupt. On vipers, the PC104/GPIO interrupt is number 25, on GPIO line 01. On titans, the PC104/GPIO interrupt is number 129, on GPIO 17.
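As a rough illustration of how a per-IRQ rate can be derived (this is not the intcount source, which is not shown here), the sketch below parses one line of /proc/interrupts and converts two counter samples into interrupts/second. The function names and the ncpus parameter are made up for this example; it assumes the usual "<irq>: <count per CPU> ... <chip> <name>" line layout and that the counter does not wrap between samples.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts line.
 * Returns 0 and stores the total in *count if the line is for `irq`,
 * or -1 if it is a different IRQ, a non-numeric line (e.g. "Err:"),
 * or has fewer than `ncpus` counter columns.
 */
int irq_count_from_line(const char *line, int irq, int ncpus,
                        unsigned long *count)
{
    char *end;
    long num = strtol(line, &end, 10);   /* skips leading blanks */

    if (end == line || *end != ':' || num != irq)
        return -1;

    const char *p = end + 1;
    unsigned long total = 0;
    for (int i = 0; i < ncpus; i++) {
        unsigned long v = strtoul(p, &end, 10);
        if (end == p)
            return -1;                   /* no number where a counter should be */
        total += v;
        p = end;
    }
    *count = total;
    return 0;
}

/* Interrupts/second between two samples taken `seconds` apart. */
double irq_rate(unsigned long before, unsigned long after, double seconds)
{
    assert(seconds > 0.0);
    return (double)(after - before) / seconds;
}
```

On a viper, for example, one would sample the line for IRQ 25 (PC104/GPIO) twice with ncpus set to 1 and pass the two totals to irq_rate().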

Interrupts/second

| DSM | CPU | ID#'s (top/middle/bottom) | Emerald | ttyS5-20, IRQ 3 | PC104/GPIO | ttyS2, IRQ 36 | ttyS1, IRQ 37 | ttyS3, IRQ 116/122 | USB | kernel | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|
| a1 | V2 | | 8P 330002 | 150-900 | 150-900 | 20 | 3.2 | 20 | 1018 | 2.6.35.9-ael1-1-viper Sep 14 10:54:19 MDT 2012 | 1,3 |
| a2 | V6 | 607-00655-005-106-39-01857 / Rev I Dec04 SN:W250043 / Rev G Jun09 SN:W327962 | | 5.4 | 5.4 | 20 | 3 | 20 | 1019 | 2.6.35.9-ael1-1-viper Sep 14 10:54:19 MDT 2012 | 3 |
| a3 | T7 | 6570-00703-002-101-39-00481 / Rev I May04 SN:W241864 / Rev M Jan01 SN:W390480 | 8X 241864 | 18.8 | 18.8 | 20 | 2 | | 13.4 | 2.6.35.9-ael1-2-titan Oct 13 12:45:33 MDT 2012 | |
| a4 | T5/T1 | 6570-00703-002-101-39-00088 / Rev C Aug09 SN:W329992 / Rev H Jul12 SN:441913 | | | | 20 | 3 | 4 | 13.4 | 2.6.35.9-ael1-2-titan Sep 14 11:29:27 MDT 2012 | |
| a5 | V13 | 607-00655-005-106-39-02050 / Rev C Aug09 SN:W329987 / Rev G Dec08 SN:W317328 | 8P W329987 | 2300-3900 | 2300-3900 | 20 | 3 | 20 | 1020 | 2.6.35.9-ael1-1-viper Oct 4 13:21:09 MDT 2012 | 1,3 |
| a6 | V14 | 607-00655-005-106-39-02003 / Rev I May04 SN:W242045 / Rev E May04 SN:W240(9?)44 | | 5.4 | 5.4 | 20 | 3 | 20 | 1044 | 2.6.35.9-ael1-1-viper Sep 14 10:54:19 MDT 2012 | 3 |
| a7 | T11 | 6570-00703-002-101-39-00495 / (no middle) / Rev M Nov09 SN:W388778 | | | | 20 | 2 | 4 | 15 | 2.6.35.9-ael1-2-titan Sep 14 11:29:27 MDT 2012 | |
| a8 | V12 | | 8P W274095 | 1500-2550 | 1500-2550 | 20 | 2 | 20 | 1020 | 2.6.35.9-ael1-1-viper Sep 14 10:54:19 MDT 2012 | 1,3 |
| a9 | V1 | 607-00655-005-106-39-01994 / Rev C Aug09 SN:W329969 / Rev F Jan06 “A1501P1” | | | | 20 | 2 | 20 | 1020 | 2.6.35.9-ael1-1-viper Sep 14 10:54:19 MDT 2012 | 3 |
| a10 | V9 | | 8P W274177 | | | 20 | 13 | 20 | 1020 | 2.6.35.9-ael1-1-viper Sep 14 10:54:19 MDT 2012 | 3 |
| a11 | T16 | 6570-00703-002-101-39-01006 / (no middle) / Rev M Nov09 SN:W388781 | | | | 20 | 3 | 4 | 18.6 | 2.6.35.9-ael1-2-titan Sep 14 11:29:27 MDT 2012 | |
| a12 | T12 | 6570-00703-002-101-39-00519 / (no middle) / Rev F Jan 06 (no serial) | | | | 20 | 2 | 3 | 15.6 | 2.6.35.9-ael1-2-titan Sep 14 11:29:27 MDT 2012 | |
| a13 | T9 | 6570-00703-002-101-39-00493 / (no middle) / Rev M Jan01 SN:W390510 | | | | 20 | 3 | 4 | 12.8 | 2.6.35.9-ael1-2-titan Sep 14 11:29:27 MDT 2012 | |
| a14 | V16 | 607-00655-005-106-39-01841 / Rev C Aug09 SN:W329990 / Rev E Sep04 SN:W246380 | 8P 329990 | | | 20 | 3 | 20 | 1020 | 2.6.35.9-ael1-1-viper Sep 14 10:54:19 MDT 2012 | 3 |
| a15 | T6 | 6570-00703-002-101-39-00477 / Rev B Aug06 SN:W274092 / Rev M Jan01 SN:W390486 | | | | 20 | 3 | 4 | 13.2 | 2.6.35.9-ael1-2-titan Sep 14 11:29:27 MDT 2012 | |
| a16 | T8 | 6570-00703-002-101-39-00482 / (no middle) / Rev M Jan01 SN:W390479 | | | | 20 | 3 | 4 | 13.4 | 2.6.35.9-ael1-2-titan Sep 14 11:29:27 MDT 2012 | |
| a17 | T14 | 6570-00703-002-101-39-00521 / (no middle) / Rev M Jan01 SN:W390497 | | | | 20 | 3 | 4 | 18.2 | 2.6.35.9-ael1-2-titan Sep 14 11:29:27 MDT 2012 | |
| a18 | T15 | 6570-00703-002-101-39-00522 / (no middle) / Rev G Dec08 SN:W317315 | | | | 20 | 3 | 4 | 13.8 | 2.6.35.9-ael1-2-titan Sep 14 11:29:27 MDT 2012 | |
| a19 | T13 | 6570-00703-002-101-39-00520 / (no middle) / (no bottom) | | | | 20 | 3 | 4 | 14.6 | 2.6.35.9-ael1-2-titan Sep 14 11:29:27 MDT 2012 | |
| c20 | V11 | 607-00655-005-106-39-01995 / Rev H Apr04 SN:W240067 / Rev G Dec08 SN:W317334 | | 52 | 52 | 3 | 2 | 20 | 7.2 | 2.6.35.9-ael1-1-viper Sep 14 10:54:19 MDT 2012 | 4 |
| m21 | | 607-00655-005-106-39-01999 / (couldn't read 2 middle boards) / Rev E Oct04 SN:W248073 | | 350 | 6000-17500 | 3 | 21 | 20 | 14 | 2.6.35.9-ael1-1-viper Oct 3 12:12:46 MDT 2012 | 1,2,4 |
| m22 | T3 | | 8P 330000 | 64 | 64 | 2 | 3 | | 3 | 2.6.35.9-ael1-2-titan Oct 2 21:50:26 MDT 2012 | 4 |

Notes:
1: These vipers have large anomalous pc104 interrupt rates, which occur with both Sep 14 (a1,a8) and the Oct 4 (a5) kernels.
2: Not completely sure why the IRQ3 rate on m21 is less than the pc104 load. I may have installed a kernel on that system with a pc104 irq routine that exits if the pending bits are 0, instead of attempting to serve the unmasked interrupts.
3: High USB interrupts are seen on vipers with bluetooth radios interfaced via USB. High USB interrupts are not seen on Titans, even though on "A" site titans, USB serves both the bluetooth radios and flash drives. On "A" site Vipers, USB has only bluetooth radios. Viper USB driver is isp116x_hcd. Titan USB driver is ohci_hcd.
4: On c20, m21 and m22, USB is used only for flash drives.
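The guess in note 2 can be made concrete with a toy model of the PC104 demultiplexer. This is not the real kernel handler: the struct, function name and exit_if_idle switch are all hypothetical, and bit n of each mask simply stands for ISA IRQ n. It only shows why an "exit if nothing pending" routine would count fewer IRQ 3 calls than PC104/GPIO interrupts.

```c
/* Toy model only; names and layout are assumptions for illustration. */
struct pc104_stats {
    unsigned long gpio_irqs;      /* times the multiplexed GPIO interrupt fired */
    unsigned long isa_calls[16];  /* times each ISA IRQ handler was called */
};

void pc104_demux(struct pc104_stats *st, unsigned int pending,
                 unsigned int unmasked, int exit_if_idle)
{
    st->gpio_irqs++;

    if (pending == 0) {
        if (exit_if_idle)
            return;               /* spurious: only the GPIO count goes up */
        pending = unmasked;       /* otherwise try every unmasked line */
    }

    for (int irq = 0; irq < 16; irq++)
        if (pending & unmasked & (1u << irq))
            st->isa_calls[irq]++;
}
```

With only IRQ 3 unmasked and mostly empty pending bits, the exit_if_idle variant leaves isa_calls[3] well below gpio_irqs, while the other variant keeps the two counts equal. That matches the pattern seen on m21 versus a1, a5 and a8, but it does not prove that this is the cause.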

The big question is why the PC104 interrupt rates are so high on the vipers at a1, a5, a8 and m21. I previously traced this to a floating CTS/RTS line, but I'm not sure that is the current reason. I saw a high PC104 interrupt rate on m21 even with the ribbon cables disconnected from the Emerald headers! Will have to investigate those systems back in the lab.

The only titans with PC104 serial are a3 and m22. They're both OK.

PC104 interrupts for vipers at a2, a6 and c20 are OK.

USB Interrupts

The viper kernel has a patch to use an assembler delay function for the isp116x instead of the kernel ndelay() function. It is not known whether this is related to the high interrupt rate. An incorrect delay may result in the USB interface failing completely.

The driver code that sets up the interrupt uses two configuration values, int_act_high and int_edge_triggered, to configure the interface:

        if (board->int_act_high)
                val |= HCHWCFG_INT_POL;
        if (board->int_edge_triggered)
                val |= HCHWCFG_INT_TRIGGER;

The current isp116x driver is configured for level-triggered (int_edge_triggered=0, the default) and active-high (int_act_high=1) interrupts. However, the setting for the corresponding GPIO interrupt is IORESOURCE_IRQ_HIGHEDGE, i.e. rising-edge triggered. This mismatch might mean the interrupt is not being acknowledged correctly.
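A small user-space sketch of that mismatch follows. It is a stand-alone model, not kernel code: struct board_cfg and the helper functions are made up for illustration, and the bit values are copied from memory of the kernel's isp116x.h and linux/ioport.h, so they should be checked against the actual 2.6.35 tree.

```c
#include <stdbool.h>

/* Values as recalled from the kernel headers; verify before relying on them. */
#define HCHWCFG_INT_TRIGGER      (1 << 1)  /* set: edge triggered */
#define HCHWCFG_INT_POL          (1 << 2)  /* set: active high */

#define IORESOURCE_IRQ_HIGHEDGE  (1 << 0)
#define IORESOURCE_IRQ_LOWEDGE   (1 << 1)
#define IORESOURCE_IRQ_HIGHLEVEL (1 << 2)
#define IORESOURCE_IRQ_LOWLEVEL  (1 << 3)

/* Hypothetical stand-in for the two platform-data fields used here. */
struct board_cfg {
    unsigned int int_act_high : 1;
    unsigned int int_edge_triggered : 1;
};

/* Same logic as the driver fragment above. */
unsigned int hwcfg_irq_bits(const struct board_cfg *board)
{
    unsigned int val = 0;

    if (board->int_act_high)
        val |= HCHWCFG_INT_POL;
    if (board->int_edge_triggered)
        val |= HCHWCFG_INT_TRIGGER;
    return val;
}

/* IRQ resource flag that describes the signal the chip is set to produce. */
unsigned long expected_irq_flag(const struct board_cfg *board)
{
    if (board->int_edge_triggered)
        return board->int_act_high ? IORESOURCE_IRQ_HIGHEDGE
                                   : IORESOURCE_IRQ_LOWEDGE;
    return board->int_act_high ? IORESOURCE_IRQ_HIGHLEVEL
                               : IORESOURCE_IRQ_LOWLEVEL;
}

/* True when the chip-side and GPIO-side settings agree. */
bool irq_config_consistent(const struct board_cfg *board,
                           unsigned long irq_flags)
{
    return expected_irq_flag(board) == irq_flags;
}
```

With int_act_high=1 and int_edge_triggered=0, irq_config_consistent() returns false for IORESOURCE_IRQ_HIGHEDGE. Either setting int_edge_triggered=1 or changing the resource to IORESOURCE_IRQ_HIGHLEVEL makes the two sides agree; which of the two, if either, actually lowers the interrupt rate is still to be tested.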

Planned Testing

  • Figure out what might be causing the high PC104 interrupt load on some vipers. By swapping cards and looking at IRQ3 level on a scope, figure out whether the problem is associated with the Emerald serial cards, or some of the Vipers, or is related to the RS232 connections.
  • Try various USB interrupt configurations on Vipers and see if we can get the USB interrupt rate to drop, while sending data through a USB Bluetooth radio at data rates that were used at SCP. Could this be related to the issue of data corruption on USB flash drives that we see on Vipers, when two or more USB devices are connected?