Blog from September, 2012

Gordon, Sep 30, 12:15
Built a new kernel for titans, with a two-line fix that I hope will fix the issue where pc104 interrupts cease to be serviced.

Installed it on Aph3 from Boulder, and rebooted it. It came up, and data from the Emerald serial ports 5 (TRH24), 6 (TRH51) and 7 (Handar), as well as the sonic on port 2, is coming in. Previously the issue was seen within an hour or two. Wait and see...

Didn't work. Emerald data quit after about 1/2 hour. So, then made another small change to the interrupt handler, installed the new kernel and rebooted Aph3 around 12:58 pm. It's been up for about 6 hours and still getting Emerald data, which is longer than it had ever run before, so things might be fixed.

Note that the barometer on port 1 is not reporting, needs a cable.

Gordon, Oct 2

The above attempts didn't fix the problem. Emerald data still died after several hours. It took several more iterations, but I think it has finally been fixed.

WIFI Etherants

Gordon, Sep 29

One of the Etherants being used by OSU was knocked out by the lightning strike. Chris says the power light does not come on, and the status light on the injector goes out when the Etherant is plugged in.

We have pulled EantA from the main tower. Its status is unknown, likely dead.

By my record, OSU was using EantD, 00:20:F6:05:16:CD, at the Sodar, and EantJ, 00:20:F6:05:24:4F, at the tower.

I presume it was the one at the OSU tower that died, but as of now, Sep 29, these Etherants are reporting:

interface wireless registration-table print 
 # INTERFACE        RADIO-NAME       MAC-ADDRESS       AP  SIGNAL... TX-RATE UPTIME              
 0 wlan1-Int-Ant                     00:20:F6:05:1E:D5 no  -26dBm... 11Mbps  4d23h2m11s          
 1 wlan1-Int-Ant                     00:20:F6:05:24:85 no  -32dBm... 11Mbps  1d3h51m6s           
 2 wlan1-Int-Ant                     00:20:F6:05:24:4F no  -95dBm... 11Mbps  1h33m13s            
 3 wlan1-Int-Ant                     00:20:F6:05:24:56 no  -95dBm... 11Mbps  3m4s                

Unfortunately the AP24 doesn't show the radio names. From my earlier post I thought EantJ was on the OSU tower, but I guess not, since EantJ is currently reporting, so it must have been on the Sodar at that time. From the registration table printout above, here is what is currently reporting:

MAC    name   site        IP
1E:D5  EantE  ISFS C      192.168.0.137
24:85  EantH  ISS Sodar   192.168.0.140
24:56  EantI  OSU Tower?  192.168.0.141
24:4F  EantJ  OSU Sodar?  192.168.0.142
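
Since the AP24 printout does not show radio names, the MAC-to-name lookup can be scripted. This is only a sketch: the dictionary is copied from the table above (including its question marks), and the function name label_registrations is made up for illustration.

```python
# Lookup copied from the MAC/name/site/IP table above,
# keyed by the last two octets of the MAC address.
ETHERANTS = {
    "1E:D5": ("EantE", "ISFS C", "192.168.0.137"),
    "24:85": ("EantH", "ISS Sodar", "192.168.0.140"),
    "24:56": ("EantI", "OSU Tower?", "192.168.0.141"),
    "24:4F": ("EantJ", "OSU Sodar?", "192.168.0.142"),
}


def label_registrations(printout):
    """Return (mac, name, site, signal) for each data row of the printout."""
    rows = []
    for line in printout.splitlines():
        fields = line.split()
        # Data rows start with a row number; skip command and header lines.
        if not fields or not fields[0].isdigit():
            continue
        # The MAC address is the only field with five colons.
        mac = next((f for f in fields if f.count(":") == 5), None)
        if mac is None:
            continue
        signal = next((f for f in fields if "dBm" in f), "")
        name, site, _ip = ETHERANTS.get(mac[-5:].upper(), ("unknown", "unknown", ""))
        rows.append((mac, name, site, signal.rstrip(".")))
    return rows
```

Feeding it the registration-table printout above should label all four rows; any MAC not in the table comes back as "unknown".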

Gordon, Sep 29.

The DSMs on the main tower stopped reporting on Sep 27, 21:03:05 UTC (15:03:05 MDT). This was probably the moment just before the lightning strike.
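
The UTC/MDT pairs quoted in these entries can be checked with a few lines of Python. A minimal sketch, assuming a fixed UTC-6 offset for MDT (fine for late September); the helper name utc_to_mdt is only for illustration.

```python
from datetime import datetime, timedelta, timezone

# MDT is UTC-6; a fixed offset is enough for dates in late September.
MDT = timezone(timedelta(hours=-6), "MDT")


def utc_to_mdt(stamp, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a UTC timestamp string to an MDT timestamp string."""
    utc = datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)
    return utc.astimezone(MDT).strftime(fmt + " %Z")


print(utc_to_mdt("2012-09-27 21:03:05"))  # 2012-09-27 15:03:05 MDT
```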

Two of the sensors at C also ceased reporting at this time:

data_stats isfs_20120928_000000.dat.bz2 
C20:/var/tmp/gps_pty0     20     30     28952 2012 09 27 20:00:00.613  09 27 23:59:59.735    2.01  0.085   6.849   51   73
C20:/dev/ttyS1            20     50     14341 2012 09 27 20:00:00.446  09 27 23:59:59.296    1.00  0.842   7.001   20   20
C20:/dev/ttyS2            20     60     11172 2012 09 27 20:00:00.646  09 27 23:59:59.726    0.78  0.949   6.421   36   36
C20:/dev/ttyS5            20    100    286914 2012 09 27 20:00:00.020  09 27 23:59:59.940   19.92  0.001   5.850   12   12
C20:/dev/ttyS6            20    110     11171 2012 09 27 20:00:00.306  09 27 23:59:59.826    0.78  0.917   7.211   38   38
C20:/dev/ttyS7            20    150      2925 2012 09 27 20:00:00.306  09 27 21:03:05.006    0.77  1.278   2.581   37   37
C20:/dev/ttyS8            20    200    286910 2012 09 27 20:00:00.021  09 27 23:59:59.900   19.92  0.003   5.850   12   12
C20:/dev/ttyS9            20    250     11176 2012 09 27 20:00:01.126  09 27 23:59:58.626    0.78  1.073   6.722   36   38
C20:/dev/ttyS10           20    300      3772 2012 09 27 20:00:00.986  09 27 21:03:04.576    1.00  0.952   2.023   20   20

Id 20,150 is the 1.5m TRH. 20,300 is the Vaisala PTB.
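
Picking out the sensors that quit early can be automated by comparing each sensor's last sample time with the latest one in the file. The sketch below assumes the column order seen in the listing above (device, DSM id, sensor id, sample count, start date and time, end month, day and time, then rate and other statistics). The function name and the 5 minute threshold are arbitrary choices.

```python
from datetime import datetime, timedelta


def early_quitters(stats_text, gap=timedelta(minutes=5)):
    """Return [(device, dsm_id, sensor_id, last_time)] for sensors that stopped early."""
    rows = []
    for line in stats_text.splitlines():
        f = line.split()
        # Skip the command line and anything that is not a sensor row.
        if len(f) < 11 or ":" not in f[0]:
            continue
        # The end timestamp omits the year, so reuse the year of the start
        # timestamp (this would be wrong for a file spanning New Year).
        end = datetime.strptime(
            f"{f[4]} {f[8]} {f[9]} {f[10]}", "%Y %m %d %H:%M:%S.%f"
        )
        rows.append((f[0], int(f[1]), int(f[2]), end))
    if not rows:
        return []
    latest = max(r[3] for r in rows)
    return [r for r in rows if latest - r[3] > gap]
```

Run on the listing above, this should flag ids 20,150 and 20,300, both ending at about 21:03:05.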

The 1.5m TRH (TRH34) resumed reporting 7.5 hours later on Sep 28 04:29 UTC (Sep 27 22:29 MDT). I believe this was a miraculous resurrection (smile), with no human intervention.

This morning three TRHs are reporting at C:

data_dump -i 20,-1 | fgrep TRH
2012 09 29 16:02:03.8064 0.01448 20, 110      35 TRH19 18.57 58.28 0 0 1460 114 0\r\n
2012 09 29 16:02:04.1965 0.005246 20, 250      38 TRH28 17.02 63.57 33 0 1420 125 104\r\n
2012 09 29 16:02:04.2564 0.01515 20, 150      35 TRH34 18.84 56.44 0 0 1468 110 0\r\n

These are the 1m, 1.5m and 2.5m TRHs. The 0.5m TRH is not reporting. According to a run of data_stats on the C20_20120928_120000.dat archive file, it quit on Sep 28, 14:54 MDT.
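
The serial numbers can be pulled out of the data_dump text with a regular expression. This is a sketch that assumes the line layout shown above (timestamp, time delta, "dsm, sensor" id, record length, then the ASCII record starting with TRH and the serial number); trh_serials is a made-up name.

```python
import re

# Timestamp, delta-T, "dsm, sensor" id, record length, then "TRH<serial>".
TRH_LINE = re.compile(
    r"^\d{4} \d\d \d\d \d\d:\d\d:\d\d\.\d+\s+\S+\s+"
    r"(?P<dsm>\d+),\s*(?P<sensor>\d+)\s+\d+\s+TRH(?P<sn>\d+)\b"
)


def trh_serials(dump_text):
    """Map (dsm id, sensor id) -> TRH serial number seen in the dump."""
    serials = {}
    for line in dump_text.splitlines():
        m = TRH_LINE.match(line.strip())
        if m:
            serials[(int(m["dsm"]), int(m["sensor"]))] = int(m["sn"])
    return serials
```

A height that is missing from the result, such as the 0.5m TRH here, is simply one that produced no record in the dumped interval.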

Lightning damage on Sept 27

Sept 29

A lightning strike on the afternoon of Sept 27 knocked out power to the base trailer and damaged equipment on the 20m main tower.

In the real-time archive, the DSMs at the main tower ceased reporting on Sep 27 21:03:05 UTC (15:03:05 MDT).

Station 12 also quit reporting at that time. A TRH and barometer at C20 and a TRH at Ah6 also quit at that time.

Kurt restored power mid-morning Sept 28 and we assessed damage to the main tower.  Our best guess is that the lightning struck near the base trailer and a current surge flowed down the power cable to DSM M21, trashing the charge controller, the PC104 stack, and the power interface panel.  Many of the sensors associated with M21 were found inoperative, but M21 appears to have acted as an expensive 'fuse', limiting damage to M22 and its associated sensors.  Gordon wrote:

"The most 'striking' damage was in the battery box at the main NCAR tower, the power interface panel on the lower DSM at the tower, and to systems at the OSU tower.  In general, the sensors and DSM higher on the main NCAR tower suffered less damage.

"The breaker in the conference center for power to the trailer was tripped, as was the breaker in the trailer which (I believe) is the circuit for power to the transformers at 'C' and 'M'.  The breaker that was tripped in the trailer is the lowest one on the panel, and is labelled 'class'.  Kurt saw burn marks on one lug of a power cable connector at the trailer. I believe he said it was the socket at the end of the power cable from the conference center.

"All sensors except for one at the C site are reporting.  Sodars are OK, I hear.  The Picarro was not connected to AC.  It apparently suffered damage via the serial cable from the main tower DSM. So the damage was on the systems connected to the AC circuit to the main and OSU tower.  I think Tom and Kurt agree that it seems the lightning didn't strike either tower but the surge came in from the AC circuit.

"I like Kurt's suggestion that if a lightning storm threatens, the power cables to the transformers be disconnected at the trailer. (Probably a good idea to throw the "class" breaker first.)  That will shut down the Sodars and Picarro. The ISFS systems and sensors at C and M should still run from their batteries.  I believe everything on the OSU tower is on battery?"

The sensor damage on the main tower is listed in the following tables.  Sensors marked NG (Not Good) were removed and transported to Boulder.  We also found that one of the four sensors at each of the two soil sites was bad and brought down its mote when connected.  I recall that the soil temperature probe was bad at Grass, but the connector labels at Cactus were inadequate to determine the bad sensor (not the TP01, in my recollection).  The bad soil sensors were disconnected from the motes but not disinterred.

M21

ht (m)  serial port  sensor            status
0.5     s1           CSAT              ok
0.5     s2           TRH               NG, fan running, high current (>1A), no data
        s3           gps               ok
1       s5           CSAT              ok
1       s6           Licor             NG, blows fuses
1       s7           PTB220 p          ok
1.5     s8           TRH               NG, fan speed unsteady, no data
2       s9           CSAT              NG, blows fuses
2       s10          Licor             NG. No data
2       s11          TRH               NG, 1.2A, no data
3       s12          CSAT              ok
3       s13          TRH               NG, 1.3A
4       s14          CSAT              ok
4       s15          TRH               NG. 0.1 A but no data, no fan
5       s16          CSAT              NG 0.3 A, no data
5       "            kh2o/serializer   ?
5       s17          Paroscientific p  NG. 0.05 A, no data, tried *0100P4/r/n init
        s18          Picarro PC        NG

M22

ht (m)  serial port  sensor              status
6       S1           TRH                 NG 0.1 A, no fan, no data
6       S2           Handar              ok
        S3           gps                 ok
8       S5           TRH (replaced SHT)  bad data (-40C), replacement OK
8       S6           Handar              ok
10      S7           CSAT                ok
10      "            KH2O/serializer     NG. Raw data: 0xffff
15      S8           TRH (replaced SHT)  bad data (-40C), replacement OK
15      S9           Handar              ok
20      S10          CSAT                ok
20      S11          PTB 220 p           ok

data_stats isfs_20120927_200000.dat.bz2 indicates that sensors at other sites quit reporting at 21:03:05, including two sensors at C: the 1.5m TRH and the barometer. The TRH resumed reporting on Sep 28 04:29 UTC.

Station 12 quit reporting at 21:03:05, but is now working after replacing the charging controller. However data from the TRHs at 12 via the bluetooth mote is not coming in.

0.5m TRH at Ah6 also stopped.

Site  Serial Port  Sensor             Status
C20   7            1.5m TRH           quit, but resumed on Sep 28 04:29 UTC
C20   10           PTB                NG no data
Ah6   1            0.5m TRH           NG, not reporting
Ap12  btmote12     0.5 m TRH, 2m TRH  unknown status, no data from Bluetooth mote

Other Damaged Equipment

  • WIFI antenna (EantA) on main tower. Unknown status, probably dead.
  • WIFI antenna (EantD) on OSU tower. Unknown status.
  • USB disk drive (pocketec) on M21 DSM is not recognized by host systems. This contained 21 hours of data for Sep 27, but we have the same data that was received in real-time over the network.
  • 5-port network switch in M21 is toast

Pocketec USB disk drive in Mu22, upper DSM on NCAR tower, is OK.

Gordon, Sep 29.

Ap12 had not worked well since it was deployed. It died early every morning due to low voltage. During the night the measured station voltage, Vdsm, would dive down to a cutoff at 11.3 V, from an apparent healthy 14 V when being charged during the day.

Replaced the battery on Sep 26. Also checked that the lugs on the controller were tight.

The station was dead the evening of Sep 27, so it was feared it had been damaged in the lightning storm. The measured voltage at the interface panel was about 3 V when the system was turned on, and 12 V when off.

Brought the system to the trailer and could not find anything wrong. Voltages all OK. Decided that the problem must be that the charge controller could not deliver sufficient power. Kurt replaced the controller and the system powered up at 16:48 MDT (per the times in the archive). It has run through the night.

Gordon, later on Sep 29:

Turns out that 12 was affected by the lightning. It quit reporting, both via the network and to its local storage, right at the time of the strike, Sep 27, 21:03:05 UTC. After the charge controller was replaced, all sensors except the bluetooth mote serving the TRHs resumed reporting.

Sonic serial numbers

Sept 27

The CSAT serial numbers are found in the log files on each DSM.

Station  0.5m Handar  1m CSAT  CSAT cal
1        ?            0923     15aug12
2        ?            0833     29aug12
3        ?            0743     05jul12
4        x            1120     19jul12
5        ?            0732     09jul12
6        ?            0800     08dec11
7        x            0673     18jul12
8        x            0176     22aug12
9        x            1121     08dec11
10       x            0677     17jul12
11       x            0674     13aug12
12       x            0855     12sep12
13       x            0745     09jul12
14       x            1124     19jul12
15       x            1122     08dec11
16       x            0740     28aug12
17       x            0856     31jan12
18       x            0744     06jul12
19       x            0672     17jul12

C Tower

Ht     Handar  CSAT  CSAT cal
0.5 m  ?       x     x
1 m    x       0200  27jun12
2 m    x       0197  03jul12

Main Tower

Ht   Handar  CSAT  CSAT cal
0.5  x
1    x
2    x
3    x
4    x
5    x
6    ?       x     x
8    ?       x     x
10   x
15   ?
20   x

Sept 27

Kurt and I installed NCAR barometer 0001 from the Manitou Beachon tower at station A3 today.  We did not have a serial cable at the time, but Kurt will take one to the site this afternoon.

Sept 27:

Kurt and I installed CSAT 0800 from the Manitou Beachon tower at station 6.  Unfortunately we do not have a CSAT cable, but Jielun will bring one from Boulder on Friday.

TRH serial numbers

Serial TRHs at the Ah (1,2,3,5,6), C and M towers report their serial numbers in every data record, and so they can be displayed in real-time with rserial, or from the data archive with data_dump. At the Ah sites, sensor id N,40 is the 0.5m TRH and N,50 is the 2m TRH, where N is the station number:

data_dump -i 1,40 -A isfs_20120927_080000.dat.bz2 
data_dump -i 1,50 -A isfs_20120927_080000.dat.bz2 

TRHs that are sampled by Wisard motes report their serial numbers periodically in the Wisard message block. NIDAS processing, such as statsproc, logs the Wisard sensor serial numbers that it finds in the data archive. One can grep the output of statsproc for the string TRH. The 0.5m TRH reports as sensorType 0x10, the 2 m as 0x11.

2012-09-27,09:16:36|INFO|A4:/dev/ttyS1: 2012 09 20 21:34:05.932, mote=4, sensorType=0x10 SN=52, typeName=TRH
2012-09-27,09:16:36|INFO|A4:/dev/ttyS1: 2012 09 20 21:34:05.932, mote=4, sensorType=0x11 SN=59, typeName=TRH
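
Instead of reading the grep output by eye, the serial numbers can be tabulated with a short script. A sketch only: it assumes statsproc log lines look exactly like the two examples above, and the function name is invented for illustration.

```python
import re

# sensorType -> height, as described in the text above.
HEIGHT = {0x10: "0.5m", 0x11: "2m"}

LOG_LINE = re.compile(
    r"\|(?P<dsm>[^|:]+):\S+: .*?mote=(?P<mote>\d+), "
    r"sensorType=(?P<stype>0x[0-9a-fA-F]+) SN=(?P<sn>\d+), typeName=TRH"
)


def wisard_trh_serials(log_text):
    """Map (dsm name, height) -> serial number from statsproc log lines."""
    found = {}
    for line in log_text.splitlines():
        m = LOG_LINE.search(line)
        if m:
            # Unknown sensor types keep their hex string as the "height".
            height = HEIGHT.get(int(m["stype"], 16), m["stype"])
            found[(m["dsm"], height)] = int(m["sn"])
    return found
```

Later log lines overwrite earlier ones for the same key, so after a unit swap the result holds the most recently logged serial number.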

The TRH serial numbers of the initial deployment on Sep 20/21 are as follows. Where a unit was later swapped, the UTC date and time of the swap and the new serial number follow, in the form "date time=SN". The ids of the TRHs on the C and M towers are also shown.

site  SN at 0.5m                           SN at 2m
1     2                                    9
2     14                                   17
3     24, Nov 7 20:27=48, Nov 12 20:00=52  51
4     52, Oct 6 18:57=56                   59
5     15                                   11
6     21                                   16
7     58                                   47
8     50                                   39
9     26                                   49
10    8, Oct 6 19:02=38                    33
11    64                                   63
12    66                                   23
13    32                                   27
14    40                                   31
15    62                                   41
16    54                                   57
17    68, Oct 6 19:47=60                   43
18    42                                   55
19    46                                   65, Oct 9 16:04=7

tower   id       SN
C 0.5m  20,60    7, Oct 3 17:33=25
C 1m    20,110   19
C 1.5m  20,150   34
C 2.5m  20,250   28
M 0.5m  21,60    13, Oct 3 00:49=5
M 1.5m  21,150   37, Oct 3 00:49=22
M 2.5m  21,250   12, Oct 3 00:49=44
M 3m    21,310   3, Oct 3 00:49=18
M 4m    21,410   6, Oct 3 00:49=1
M 6m    22,600   4, Oct 3 14:44=35
M 8m    22,800   20, Oct 3 14:44=53
M 15m   22,1500  10, Oct 3 14:44=29
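
When matching archive data to a sensor, SN entries such as "24, Nov 7 20:27=48, Nov 12 20:00=52" have to be resolved to the serial number in place at a given time. The sketch below reads each entry as an initial serial number followed by "UTC date time=new SN" swaps. It assumes all dates are in 2012 and that month abbreviations parse in the current locale; both helper names are made up.

```python
from datetime import datetime


def parse_sn_history(entry, year=2012):
    """Return (initial_sn, [(swap_time_utc, new_sn), ...]) for one SN entry."""
    parts = [p.strip() for p in entry.split(",")]
    initial = int(parts[0])
    swaps = []
    for part in parts[1:]:
        when, new_sn = part.split("=")
        # The entries carry no year, so the deployment year is assumed.
        stamp = datetime.strptime(f"{year} {when.strip()}", "%Y %b %d %H:%M")
        swaps.append((stamp, int(new_sn)))
    return initial, swaps


def sn_at(entry, when, year=2012):
    """Serial number in place at naive UTC datetime `when`."""
    current, swaps = parse_sn_history(entry, year)
    for stamp, new_sn in sorted(swaps):
        if when >= stamp:
            current = new_sn
    return current
```

For example, sn_at("52, Oct 6 18:57=56", datetime(2012, 10, 7)) resolves to the replacement unit, while any time before the swap resolves to the original one.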

The 0.5m TRH on 19, serial number 46, has failed. It worked initially after deployment from 9/20 16:00 MDT til 9/22 06:30 MDT, and then didn't report for 2 hours, then ran for 16 hours, then didn't report for 8.5 hours. It was removed from A19 yesterday, 9/26.


Sept 26, 2012

Gordon noticed a power dropout at stn 12 during the night.  He replaced the 12 V battery.

TRH fixes, Sept 26

Sept 26, 2012

Gordon and I investigated three stations with a bad TRH:

Stn 5: The TRH on serial port 5 at 2.5m was dead on arrival; no fan running; it had blown the 12V power fuse.  We replaced it with a spare unit but used the original SHT sensor.   Returned bad TRH to the base.

Stn 19: The TRH with mote id 10 at 0.5m was bad (but fan running).  Tried replacing SHT but did not help.  Returned bad TRH to base with original SHT.

Tower C: The TRH on serial port 6 at 1m had a bad cable.  Replaced cable.

sonics in motion

Today:

- Sonics 672 and 1123 were FedEx'd from Boulder to Campbell, after determining (through swapping head/electronics) that 1123's head was bad (and we knew that 672's head was loose).  Ed thought it was possible that these could be returned by mid next week.

- Chris and Lisa removed sonics from the 8m and 30m levels at MFO (sorry, Ned) and brought them back to Boulder.  Kurt will take them to SCP tomorrow to (finally) install A6 and replace A17.  We'll look at the data to see if sending A17 back to Campbell is justified.

- While at MFO, Chris and Lisa also replaced a PTB220 barometer with the MPL solid-state barometer.  Kurt also will take the PTB220 to SCP tomorrow to install at A3.  I verified that data were coming in from the MPL.  Note that I deleted the reference to a qc_file P.dat in manitou.xml (only on this DSM) since the default $QC didn't point to it and thus produced NANs.

Status, Sept 25

SCP status 9/25 & 9/26 (rad & soil)

I cycled through the DSMs and used rserial to examine each serial port.

Summary:

Known problem with Emerald card on A3.

Missing barometer at A3

TRH not reporting: s5 on A5 (no dc power on s5) & S6 on C20.

TRH data is nan, s1 on A19.

CSAT missing on A6

Ah1  s1 trh ok

        s2 csat ok (esc-h)
        s5 trh ok
        s6 handar ok
        s7 power ok (mote_dump 1,35)

Ah2  s1 trh ok
        s2 csat ok
        s5 trh ok
        s6 handar ok    
        s7 power ok 

Aph3 s1 p ** no data **
        s2 csat ok
        s5 trh ** emerald problem **
        s6 trh ** emerald problem **
        s7 handar ** emerald problem **

A4    s1 trh 10 ok (mote_dump 4,4)
        s1 trh 11 ok
        s1 power ok
        s2 csat ok

Ah5  s1 trh ok
        s2 csat ok
        s5 trh ** not reporting **
        s6 handar ok    
        s7 power ok

Ah6  s1 trh ok
        s2 ** csat missing **
        s5 trh ok
        s6 handar ok    
        s7 power ok

A7    s1 trh 10 ok (mote_dump 4,4)
        s1 trh 11 ok
        s1 power 13.8V
        s2 csat ok

Ap8  s1 p ok
        s2 csat ok
        bt trh 10 ok
        bt trh 11 ok

Ap9  s1 p ok
        s2 csat ok
        bt trh 10 ok
        bt trh 11 ok

Ap10 s1 p ok
        s2 csat ok
        bt trh 10 ok
        bt trh 11 ok
        bt power 13.8V

Ars11 s2 csat ok
        bt trh 10 ok
        bt trh 11 ok
        bt power 13.2V
        bt radiation ok (40)
        bt wetness ok (40)
        bt soil grass ok (41)
        bt soil cactus ok (42)

Ap12 s1 p ok
        s2 csat ok
        bt trh 10 ok
        bt trh 11 ok
        bt power 13.5V

A13  s1 trh 10 ok 
        s1 trh 11 ok
        s1 power 14.0V
        s2 csat ok

Ap14 s1 p ok
        s2 csat ok
        bt trh 10 ok
        bt trh 11 ok
        bt power 13.8V

A15  s1 trh 10 ok
        s1 trh 11 ok
        s1 power 14.0V
        s2 csat ok

A16  s1 trh 10 ok
        s1 trh 11 ok
        s1 power 14.0V
        s2 csat ok

A17  s1 trh 10 ok
        s1 trh 11 ok
        s1 power 13.5V
        s2 csat ok

A18  s1 trh 10 ok
        s1 trh 11 ok
        s1 power 13.8V
        s2 csat ok

A19  s1 trh 10 ** nan **
        s1 trh 11 ok
        s1 power 13.9V
        s2 csat ok

C20     s1 handar ok
        s2 trh ok
        s5 csat ok
        s6 trh ** not reporting **
        s7 trh ok
        s8 csat ok
        s9 trh ok
        s10 p ok

M21  s1 csat ok
        s2 trh ok
        s5 csat ok
        s6 licor ok
        s7 p ok
        s8 trh ok
        s9 csat ok
        s10 licor ok
        s11 trh ok
        s12 csat ok
        s13 trh ok
        s14 csat ok
        s15 trh ok
        s16 csat ok
        s17 p ok
        s18 Picarro ** not reporting **

M22  s1 trh ok
        s2 handar ok
        s5 trh ok
        s6 handar ok
        s7 csat ok
        s8 trh ok
        s9 handar ok
        s10 csat ok
        s11 p ok

Sensor I.D.'s for the main tower:

20m    CSAT        0741
       PRES        B4
15m    Handar      678
       TRH         10
10m    CSAT        1119
       KH2O        1393
8m     Handar      0370003
       TRH         20
6m     Handar      1528
       TRH         4
5m     CSAT        0671
       KH2O        1389
       Micro Baro  ?
4m     CSAT        0738
       TRH         6
3m     CSAT        0540
       TRH         ?
2.5m   TRH         12
2m     CSAT        0538
       Licor       1163
1.5m   TRH         37
1m     CSAT        1117
       Licor       1167
       PRES        B7
0.5m   CSAT        1455
       TRH         13

sonic fixes

Sept 24:

Five sites had bad/missing/no-serial-cable sonics:

A6 - missing sonic: no change

A12 - bad sonic: replaced S/N 1123 with S/N 0855

A17 - (formerly) bad sonic:  seems to have fixed itself for the moment, no action taken

A19 - missing sonic: installed S/N 0178

M, serial port 12: missing cable:  installed new cable

Kurt is taking 1123 back to Boulder for Chris to test and likely ship to Campbell.

Thus we need only one good sonic to install at A6.