Blog from March, 2015

Status update Tuesday

Sensor status:

T: ok

RH: ok

Ifan: ok

spd: ok

P: ok

co2/h2o: ok

csat u,v,w: ok

csat ldiag: couple flags

Tsoil: Tsoil.0.6cm.ehs had a small discontinuity at midnight (will have to check the high-rate to get a better look)

Wetness: ok

Rsw/Rlw/Rpile: ok

Voltages: ok

sstat outputs: ok, ok

Qsoil.ehs NA's

Looking at the high-rate data for Qsoil.ehs, I've noticed that the signal has a lot of dropped values. I would have assumed this was a result of the moisture a couple of days ago, but the dropouts seem to precede that event. The second plot is data from yesterday, the 29th.
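For anyone wanting to quantify this, a minimal pandas sketch along these lines would give the fraction of dropped samples per hour and show whether the dropouts really predate the rain. The file name and column names here are assumptions, not the actual archive layout.

# Sketch: fraction of missing Qsoil.ehs samples per hour.
# "qsoil_ehs_highrate.csv", "time", and "Qsoil.ehs" are assumed names.
import pandas as pd
df = pd.read_csv("qsoil_ehs_highrate.csv", parse_dates=["time"], index_col="time")
na_frac = df["Qsoil.ehs"].isna().resample("1H").mean()
print(na_frac[na_frac > 0.1])   # hours with more than 10% dropped samples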

 

The IRGA sonics on the bao and ehs flux stations are installed at 5 m, not 3 m as stated in the configuration.

So I updated the XML, ran a script to change the height in the names of the affected variables, and restarted the statsproc services on flux and porter2.
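For reference, the rename step amounted to something like the sketch below. This is only a rough reconstruction, not the actual script: the file pattern is made up, and it assumes the existing NetCDF statistics files just need ".3m." swapped for ".5m." in the affected bao/ehs variable names.

# Rough sketch of the variable-rename step (not the actual script).
# Assumes the netCDF4-python module; the glob pattern is hypothetical.
import glob
import netCDF4
for path in glob.glob("/data/netcdf/isfs_*.nc"):   # hypothetical location
    nc = netCDF4.Dataset(path, "a")                # open for in-place edits
    for name in list(nc.variables):                # copy the list before renaming
        if ".3m.bao" in name or ".3m.ehs" in name:
            nc.renameVariable(name, name.replace(".3m.", ".5m."))
    nc.close()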

Status update Monday

Sensor status:

T: ok

RH: ok

Ifan: ok

spd: ok

P: ok

co2/h2o: ok

csat u,v,w: ok

csat ldiag: couple flags

Tsoil: Tsoil.0.6cm.ehs has reappeared; needs time to cure because it's a bit out of the profile at the moment

Wetness: ok

Rsw/Rlw/Rpile: ok

Voltages: ok

sstat outputs: ok, ok

Daily update Sunday

Sensor status:

T: ok

RH: ok

Ifan: ok

spd: ok

P: ok

co2/h2o: ok

csat u,v,w: ok

csat ldiag: couple flags

Tsoil: ok

Wetness: ok

Rsw/Rlw/Rpile: random spike in the Rpile.in.bao at 9am

Voltages: ok

sstat outputs: ok, ok

bao sonic heights

During our site visit to bao today, we measured the following heights:

ground-to-middle-of-EC150-paths: 79+415+8 = 502cm (pretty darn close to Julie's request of 5m!)

ground-to-middle-of-2D-sonic-paths: 79+667+281 = 1027cm (close to 10m)

We have every reason to believe that ehs is similar.

 

Quickie status

With the data outages yesterday, it was hard to do much else.  However, at the end of the day I reran all statsproc and wwwplots (on porter2), to pick up the fix that Gary made a few days ago to the EC150 calibration routine.  Units for Pirga and co2 should now be correct in the NetCDF files.

However, Rudy pointed out that h2o.3m.bao has been negative since a rain storm last week.  co2 has a large offset as well.  Kurt and I will run out today to see what is wrong.  Obvious possibilities are dirty optics or used-up scrubber chemicals.  We'll bring the newly-fixed EC150 head with us just in case.

"Aside from that, how did you like the play, Mrs. Lincoln?"

 

Data outage last 2 days

Okay, this will attempt to document issues during the past 2 days.

  1. ehs USB stick died sometime between the 24 Mar and 25 Mar 00Z rsyncs.  fdisk reports no partitions on this stick.  I replaced this with a new USB stick about an hour ago and it is now working.
  2. Santiago rebuilt/started eol-rt-data yesterday.  Apparently, in the process the ssh configuration was modified to disallow (some) connections to porter2.  
  3. Yesterday, in an attempt to solve the eol-rt-data issue by myself, I tried restart_process of ssh_tunnel on flux.  This killed ssh to flux from the outside.  SRS rebooted flux last night at about 1700 and I rebooted it again at about 1200 today (2 separate trips to the tower), which restored the connection from flux to eol-rt-data, but connections still failed to get all the way through to the eol machines due to eol-rt-data configuration issues.  I don't know whether this was a red herring that would have fixed itself when eol-rt-data was fixed.
  4. Today, we also tried 2 porter2 reboots.  At least the second one was justified, since sstat reported most services not running and restart_service didn't bring them back.  All was well after the reboot.
  5. Even once eol-rt-data was fixed (by Ted restoring an old image of the virtual machine!), which restored the ssh tunnel to flux, connections to bao and ehs were still broken.  We found that Gordon's check_udp_... only reported errors and didn't restart the udp_ tunnel process.  Running this by hand finally got data flowing.  We had also run this process earlier in the day, so a couple of hours of udp data did make it to porter2.
  6. Even with everything fixed, rsync_flab_loop.sh failed with a PATH issue (it couldn't find rsync_flab.sh).  This is really strange, since this script has always worked.  I manually added a PATH setting to this script.  In the meantime, I also ran the nightly rsync manually (which was hideously slow – about 3 hours to bring 2 days of data back – so I cheated on some of it by removing the bandwidth limit).

Lessons:

  1. Santiago/Ted were unaware that CABL even existed and didn't think to check with Gordon (though he wasn't available) before proceeding with system work.  They did ask (vacationing) Gary, who didn't educate them.
  2. We still need to fix check_udp_... to restart automatically – I'm guessing that this is just a path issue.
  3. Ted thinks that virtual machine "snapshots" were the cause of the config errors and has decreed that they should be avoided in the future.
  4. Santiago wants to better document eol-rt-data and is thinking about splitting up some of its services onto other virtual machines. 

Impact:

  1. BAO tower data should be fine since the DSMs kept running and flux was up to rsync their data.  All of these data should be rsynced (and new statsproc files and wwwplots generated) soon.
  2. bao data also should be fine since its DSM was up and saving to local storage.  It appears that porter2 was able to rsync its data from 0325 and will be able to run tonight.  flux wasn't able to get these data and has a gap from 0325_195959 to 0326_202418.
  3. ehs data were archived on flux (via udp transmission) until 0325_195959 and are missing until 0326_202418, as with bao.  The porter2 files are even worse, with no data from 0325, undoubtedly due to the bad USB stick.  No more data will be filled in by the nightly merge.  About 28 hours of data were lost.  Note that the NetCDF statistics files have 1–2 hours of data for which we don't have raw_data.  If these NetCDF files are ever regenerated from scratch, we'll lose this (short) period of data.

Whew!


BAO site visit

replaced trh at 11:00 AM local time,

checked to make sure the leaf wetness sensor was installed properly (it was), and took site photos

 

The boxplot below shows the IQR over the past two days of data for each soil sensor. Despite missing the adjacent 0.6cm sensor for reference, I think it's safe to conclude that the Tsoil.x.0.6cm.ehs sensor is starting to go haywire.
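For anyone reproducing the comparison, the IQR is just the 25th-to-75th percentile spread of each sensor over the two-day window; a minimal sketch (with made-up file and column names) would be:

# Sketch: interquartile range (IQR) of each soil-sensor column over the
# most recent two days.  A sensor whose IQR is far larger than its
# neighbors' is a candidate for going haywire.  Names are assumptions.
import pandas as pd
df = pd.read_csv("ehs_soil_5min.csv", parse_dates=["time"], index_col="time")
recent = df[df.index >= df.index.max() - pd.Timedelta("2D")]
iqr = recent.quantile(0.75) - recent.quantile(0.25)
print(iqr.sort_values(ascending=False))   # largest spread first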

 

Quickie status

As far as I know, everything is working.  Over the weekend, data dropouts from the ehs soil sensors became large, so today I removed the RF links to these motes and simply cabled them into the DSM.  The connections now appear to be reliable.

Julie had mentioned that the bao wetness sensor didn't report a wet event.  However, the data show several large (snow) events up through 1 Mar, and some small (rain) events on the night of 12 Mar at both bao and ehs.  Thus, it appears to me that they are working.  I could still test them...

Wind direction is important for this project.  Unlike in the previous year's wind rose, south and east winds have occurred during the last week.  I note huge vertical velocity and co2 variance at ehs in these cases (neither of which is surprising).  Also, I earlier noted high temperature variance for sonics in the wake of the BAO tower.

Gsoil.ehs has an odd feature at midday that isn't seen at bao.  My guess is that this is the effect of the darkhorse shadow (since I installed this sensor in a plot to the NORTH of the darkhorse).  I had hoped that this effect would be smaller and/or of a shorter duration.  Oh well...


Not sure why it restarted, but I got a call from Bruce that dhcpd was running again.  Just like Gary before, I did:

systemctl stop dhcpd.service

systemctl disable dhcpd.service

After this, ps -ef | grep dhcpd no longer showed a dhcpd process, and systemctl list-unit-files | grep dhcpd showed dhcpd.service as disabled (rather than enabled).

Hopefully, we won't have to do this again...

 

ehs motes rewired

Rudy had noted that Vmotes at ehs were dropping out.  It appears that this is an issue with RF transmission (even from one leg of the darkhorse to the other!).  I decided to bypass the whole radio stuff by wiring each of the motes as a serial mote.  This involved:

  • cabling from each DSM port to the mote console port
  • applying the pp=0 command to each mote (after first hitting the white scan button twice to enable the hardwired console)
  • changing the config to look at 3 serial ports

All this was done from about 1130 to 1200 today.  Now all is well except that "ds" only reports data from tty9 (and not tty6 and tty8).  This must be caused by my setting all the mote IDs to 0x8000, so data_stats is displaying just the last tty with this ID.  The data seem to be okay.

Other stuff:

  • took photos
  • used my GPS to "shoot" boom angles.  Got 146degN into the CSAT, 234degN into the 2D.
  • GPS station location: 40d 02.940'N, 105d 01.052'W.

 

EHS Vmote.soil Problem

Looking at the high-rate data, Vmote.soil.ehs and Vmote.soil2.ehs have a growing number of NA's in the early afternoon hours. Any thoughts on this?

March 13th:

And March 14th:

Today's data seems to be even worse...
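To pin down the early-afternoon pattern, one could count the missing Vmote samples by day and hour of day; a rough sketch follows (file and column names are assumptions).

# Sketch: count missing Vmote samples by (day, hour of day) to see
# whether the early-afternoon dropouts are growing.  Names are assumed.
import pandas as pd
df = pd.read_csv("ehs_vmote_highrate.csv", parse_dates=["time"], index_col="time")
for col in ["Vmote.soil.ehs", "Vmote.soil2.ehs"]:
    missing = df[col].isna()
    counts = missing.groupby([missing.index.date, missing.index.hour]).sum()
    print(col)
    print(counts[counts > 0])   # (date, hour) pairs with dropped samples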

Quickie status

As far as I know, everything* is now working for CABL (and has been for the past 2 days).  Some minor issues:

  • TP01 values at ehs get removed when Vpile.off is slightly negative (see previous entry)
  • Our new aspirated Tirga seems to be worse than the default – I don't know why
  • Still need to clean up dat.P and make WWW plots
  • Document sites (still need ehs boom angles?)