Relatively calm day.

After checking station statuses, we looked into why the DSMs so often write more than two files per day.  Some observations and a potential fix follow.

  • The file rewrites seem to be caused by events that disconnect and reconnect the USB devices.  These seem to originate from the Ethernet device.  Steve can provide the exact error message.
  • The DSMs that reset the most are lconvm, relm, and relt.  This is based on looking at the files from the past 5 days (10/2 - 10/6).
  • DSMs that reset the least are lconv1, lconv2, lconvt, p1-thru-p6, uconv2, and uconvm.
  • One possible cause is that high traffic on the DSMs with network switches causes the USB resets.
  • Potential fix:  Steve looked up the problem online and found that a potential fix is to configure the Ethernet switch to work at USB 1 speeds (I forget the exact syntax or config file that needs editing, but that's basically what it does).  We've tested the fix on P6, just to see if it causes any adverse effects.  It doesn't.  Gary, should we try this on the problem DSMs?
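
The per-DSM comparison above (which DSMs write more than two files per day) could be tallied with a short script.  This is only a sketch: the file-name layout `<dsm>_<YYYYMMDD>_<HHMMSS>.dat` and the function names are assumptions made for illustration, not necessarily how the DSM files are actually named.

```python
import re
from collections import Counter

# Hypothetical file-name layout: <dsm>_<YYYYMMDD>_<HHMMSS>.dat
# (an assumption for illustration; adjust the pattern to the real names).
NAME_RE = re.compile(r"^(?P<dsm>[A-Za-z0-9-]+)_(?P<day>\d{8})_\d{6}\.dat$")


def files_per_day(names):
    """Count data files per (dsm, day) from an iterable of file names."""
    counts = Counter()
    for name in names:
        m = NAME_RE.match(name)
        if m:  # names that do not fit the assumed layout are skipped
            counts[(m.group("dsm"), m.group("day"))] += 1
    return counts


def excess_files(counts, expected=2):
    """Sum, per DSM, the files written beyond `expected` per day."""
    extra = Counter()
    for (dsm, _day), n in counts.items():
        if n > expected:
            extra[dsm] += n - expected
    return extra
```

Feeding it the directory listing for the five days and sorting `excess_files(...)` by value would rank the DSMs by how often they restarted a file, which is a rough proxy for USB resets.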

We went to the U of IL Atmospheric sciences picnic, invited by Francina who didn't attend.

I took three soil samples for moisture measurement.  

  • Release tower cornfield:  tare weight = 21.30 g, wet soil weight = 134.1 g, tin 4
  • Upper Conv grass:  tare weight = 21.45 g, wet soil weight = 130.62 g, tin 3
  • Lower Conv soyfield:  tare weight = 21.26 g, wet soil weight = 134.55 g, tin 2

Steve made theodolite measurements of Init and Release tower sensors.

Dinner afterwards with Francina at the Sunsinger.  It isn't cheap, but has a decent wine selection.

3 Comments

  1. Can you let me know where you found the problem online and this potential fix, and what leads you to think the problem is the ethernet device?  This sounds like a known problem (ISFS-145, ISFS-189, ISFS-224, ISFS-191, ISFS-270), but maybe not.  Usually the ethernet device is the first to report being disconnected in the log, followed by all the other USB devices, and I've always assumed they were just reported in order of enumeration when the entire bus was reset.  I never investigated the ethernet device as the source of the problem, so that's new information.

    I have no objection to you trying out the change, since you're right there to recover anything that fails.  It would be great to find a workaround for this problem.  I don't see how DSMs with network switches have any higher traffic than any other DSMs, so that sounds suspicious.  In the past I have not found any correlations between the DSMs which have this problem and those that don't.  My best guess has been some kind of low tolerance on the USB power draw, since the rumour I've heard is that Pis have limitations on supplying bus power and are sensitive to that.  Maybe having the switch drawing power from the DSM is affecting the power to the Pi USB.  In RELAMPAGO we have seen this problem even on DSMs which are not using the ethernet device, assuming this is the same problem.  Either way, it's worth finding out if changing the ethernet configuration can improve the USB reliability.  I rebooted a couple of DSMs over the weekend because the USB had reset and the flash disk came back as the wrong device (i.e., sdb instead of sda).


  2. Unknown User (gilmer) AUTHOR

    pi_usb_issues_troubleshooting.docx


    Sorry, I'm just getting back to this.  The file above has a snippet from /var/log/messages, which shows the USB disconnecting and resetting.  The times this happens are correlated with the data file write times.  The file above also has a couple of notes at the end with some potential fixes.  I didn't find the exact link that Steve had found a couple of days ago, but found the same file to edit (/boot/cmdline.txt) and the same text to add, which switches the USB speed to 1.1.

    I just added the text "dwc_otg.speed=1" to /boot/cmdline.txt on RELT, then rebooted (12:18PM local time).  It pings, so that didn't damage anything.  Now time to wait and see if RELT's USB stays up.

    The ISFS-189 Jira update by Isabel seems most similar to what we've been seeing.  

    I think you may be right on the USB power draw being the main problem.  


  3. Unknown User (gilmer) AUTHOR

    Update:  Setting USB speed to 1 seemed to significantly reduce the number of USB resets on RELT.  Dan switched the rest of the DSMs to speed 1.  We'll continue to monitor, but this appears to be a good solution.
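
For rolling the /boot/cmdline.txt change out to several DSMs, the edit described in the comments could be scripted roughly as follows.  This is a sketch, not the procedure that was actually used: the function name `add_kernel_param` is hypothetical, and it relies on the kernel command line being a single line of space-separated parameters, so the new setting has to be appended to that line rather than added on a line of its own.

```python
def add_kernel_param(cmdline_text, param="dwc_otg.speed=1"):
    """Return the cmdline text with `param` present exactly once.

    The input is left untouched if the parameter is already set to the
    requested value; an existing setting of the same key is replaced.
    """
    params = cmdline_text.split()
    if param in params:
        return cmdline_text
    key = param.split("=", 1)[0]
    # Drop any earlier setting of the same key, then append the new one.
    params = [p for p in params if p.split("=", 1)[0] != key]
    params.append(param)
    # Everything must stay on one line.
    return " ".join(params) + "\n"
```

In use, one would read /boot/cmdline.txt, keep a backup copy, write the returned text back as root, and reboot for the setting to take effect.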