Data Analysis Services Group - March 2013

News and Accomplishments

Miles Rufat-Latre, a freshman CS student at the University of Colorado, has joined the VAPOR team as a student assistant. We are excited to welcome Miles aboard!

VAPOR Project

Project information is available at: http://www.vapor.ucar.edu

KISTI Award:

John continued to work with KISTI to develop an SOW for year-three activities. KISTI has requested that we change the project type from the current "contract" to a "collaboration". This change would improve our chances of continuing the relationship beyond the upcoming year, but would require that NCAR provide goods or services as part of the collaboration. In particular, KISTI has requested 500K hours of computing time on Yellowstone. Al has tentatively approved the request, subject to acceptable language in the contract, which the NCAR contracts office will begin preparing shortly.

Development:

  • John and Alan are developing project plans for 2013 and KISTI deliverables. We have a list of features to release in 2013 (as VAPOR 2.2.2 and VAPOR 2.3.0) and features to release after 2.3.
  • To support multiple active development trees with easier branching and merging, Alan converted our CVS repository to Bazaar.
  • Miles Rufat-Latre, our new student assistant, is rebuilding our dependent libraries so that they will work in the new NWSC environment.
  • We migrated our data to the GLADE file system in Wyoming, after pruning and archiving it to reduce its size.
  • John fixed the spherical volume renderer, which was broken in the 2.2.0 release.
  • John continued refactoring of the MOM, POP, and ROMS data translators.

Administrative:

Alan and Rick Brownrigg (in 2012) proposed a SIParCS project involving visualization of WRF data in Google Earth.  Our preferred applicant for this position (Mohammed) has accepted.

We interviewed applicants for the SEII position and selected Nick Edmonds (from Indiana U.) as the best candidate.  We are waiting to see whether he accepts our offer.

ASD Support

John continued working with Peter Ireland on his particle-turbulence interactions data. Peter wants to produce an animation of a lengthy time sequence (500 time steps) that involves over 100 TB of data. John developed scripts to parallelize the transformation of the data from the raw model outputs to a VAPOR VDC. Only 10% of the wavelet coefficients were saved, reducing the storage requirements to a more manageable 10 TB.
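
For reference, the general shape of that parallelization is sketched below: a pool of workers, each converting one time step of raw model output into the VDC with VAPOR's command-line tools. The .vdf path, variable name, file naming, and raw2vdf options shown here are illustrative placeholders, not the actual scripts, which handle many more details.

    #!/usr/bin/env python
    # Sketch: convert one raw field per time step into a VAPOR VDC in parallel.
    # The .vdf path, variable name, file naming, and raw2vdf options are
    # placeholders; the production scripts are more elaborate.
    import subprocess
    from multiprocessing import Pool

    VDF = "/glade/scratch/pireland/turb.vdf"      # hypothetical VDC metadata file
    VAR = "vx"                                    # hypothetical variable name
    TIMESTEPS = range(500)

    def convert(ts):
        """Convert one time step; return (time step, exit code)."""
        raw = "/glade/scratch/pireland/raw/{}_{:04d}.raw".format(VAR, ts)
        cmd = ["raw2vdf", "-ts", str(ts), "-varname", VAR, VDF, raw]
        return ts, subprocess.call(cmd)

    if __name__ == "__main__":
        with Pool(16) as pool:                    # 16 concurrent conversions
            for ts, rc in pool.imap_unordered(convert, TIMESTEPS):
                if rc != 0:
                    print("time step {} failed".format(ts))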

Production Visualization Services & Consulting

  • Pete Johnsen has been running WRF for Hurricane Sandy at NCSA and sending the results to NCAR. He is now using a 150-layer grid with 500 m horizontal resolution, on an approximately 7k x 7k horizontal grid. We expect that the increased vertical resolution has produced more accurate images of the hurricane’s landfall. Pete created a sequence of 26 half-hour timesteps of Sandy, from which we have made several animations, shown at http://vis.ucar.edu/~alan/shapiro/sandy500m150lev/
  • Pete Johnsen is preparing an article for Supercomputing 2013 entitled “Petascale WRF simulation of Hurricane Sandy”, describing the computation he has performed on Blue Waters.
  • Mel Shapiro showed Tom Bogdan the animations we have created from Hurricane Sandy data. Bogdan plans to present this work to FEMA to demonstrate the value of higher-accuracy weather prediction for emergency planning, and to see whether FEMA is interested in funding a continuation of this effort.

Publications, Papers & Presentations

  • The VAPOR team is preparing presentations and a demo of VAPOR for EGU in Vienna, April 11-12.

Systems Projects

Data Services

  • Started testing the GridFTP HPSS Data Storage Interface (DSI) with Bill Anderson. We were able to get basic transfers to/from HPSS via GridFTP working. Significantly more work would be needed if we choose to go down this path, but the basic code appears workable at this point (a rough transfer-test sketch follows this list).
  • Started testing the feasibility of using LXC (Linux Containers in user space) for hosting restricted environments for the data-access class of users, as well as GridFTP services. Preliminary results look encouraging, with stronger isolation from the host environment than chroot jails provide (also sketched below).
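
As a rough illustration of the kind of basic transfer test described above, the sketch below wraps globus-url-copy in a small Python script. The endpoint host name, port, and paths are placeholders, not our actual configuration; only globus-url-copy's basic source/destination usage is assumed.

    #!/usr/bin/env python
    # Minimal sketch of a GridFTP <-> HPSS transfer test.  Host names, paths,
    # and port are hypothetical placeholders.
    import subprocess
    import sys

    SRC = "file:///glade/scratch/testuser/testfile.dat"   # local test file
    DST = "gsiftp://hpss-gridftp.example.ucar.edu:2811/home/testuser/testfile.dat"

    def transfer(src, dst):
        """Run a single transfer and return True on success."""
        cmd = ["globus-url-copy", "-vb", src, dst]        # -vb: show progress
        return subprocess.call(cmd) == 0

    if __name__ == "__main__":
        ok = transfer(SRC, DST)                           # put to HPSS via the DSI
        ok = ok and transfer(DST, SRC + ".back")          # pull it back for comparison
        sys.exit(0 if ok else 1)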
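
Similarly, the LXC evaluation comes down to standing up a minimal container and checking what a user inside it can and cannot see. A minimal sketch using the userspace lxc command-line tools; the container name and template are placeholders:

    #!/usr/bin/env python
    # Sketch: create and start a minimal LXC container for a restricted
    # data-access environment.  Container name and template are placeholders;
    # only the basic lxc-create/lxc-start/lxc-attach commands are assumed.
    import subprocess

    NAME = "dataaccess-test"

    def sh(cmd):
        print("+", " ".join(cmd))
        return subprocess.call(cmd)

    # Build a container from a stock template, start it detached, then run a
    # quick check from inside to see what the restricted environment exposes.
    sh(["lxc-create", "-n", NAME, "-t", "centos"])   # template name is a placeholder
    sh(["lxc-start", "-n", NAME, "-d"])
    sh(["lxc-attach", "-n", NAME, "--", "df", "-h"])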

Accounting & Statistics

  • Added graphs from Ganglia to the GLADE Usage Report produced every 4 hours on the wiki.
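
For reference, pulling a Ganglia graph into a periodically generated report can be as simple as fetching the rendered PNG from the Ganglia web front end. The sketch below shows one way a cron-driven report script might do that; the server name, cluster name, and graph parameters are illustrative placeholders.

    #!/usr/bin/env python
    # Sketch: pull rendered Ganglia graphs for the 4-hourly GLADE usage report.
    # Server, cluster, and graph parameters are placeholders.
    import urllib.request

    GANGLIA = "http://ganglia.example.ucar.edu/ganglia/graph.php"
    GRAPHS = {
        "glade_network": "c=GLADE&g=network_report&r=hour",
        "glade_load":    "c=GLADE&g=load_report&r=hour",
    }

    for name, query in GRAPHS.items():
        url = "{}?{}".format(GANGLIA, query)
        with urllib.request.urlopen(url) as resp, open(name + ".png", "wb") as out:
            out.write(resp.read())        # saved PNGs get embedded in the wiki page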

Security & Administration

  • Continued work on account automation tools. Tested tools that clean up after user accounts have been deactivated, including a final archive of the home directory contents to HPSS (see the first sketch after this list). Added picnic support.
  • Agreed with USS to run ingests of the SAM/LDAP account/group information twice daily M-F. Internal consistency issues within the SAM/LDAP account/group information, while not completely gone, have been reduced to a low enough level that detailed review of the ingest is no longer necessary.
  • Rewrote artoms to use the GPFS Policy Engine and HTar for the VAPOR backups.
  • Wrote several scripts to manage access to GPFS filesystems, trace errors through IB routes, and convert GPFS node IDs into hostnames.
  • Wrote several scripts to check the health of remote GPFS nodes. The scripts check GPFS state, the mount state of file systems, and other things useful during a major outage (see the second sketch below).
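
As context for the account-cleanup tooling mentioned above, the sketch below shows the general shape of a final home-directory archive to HPSS using HTAR. The home-directory root, HPSS destination, and archive naming are hypothetical; only HTAR's basic create syntax is assumed.

    #!/usr/bin/env python
    # Sketch: final archive of a deactivated user's home directory to HPSS.
    # Paths and archive naming are placeholders.
    import subprocess
    import sys

    HOME_ROOT = "/glade/u/home"                     # hypothetical home root
    HPSS_DEST = "/home/dasg/retired_accounts"       # hypothetical HPSS path

    def archive_home(username):
        """Archive one home directory to HPSS; return True on success."""
        archive = "{}/{}.tar".format(HPSS_DEST, username)
        cmd = ["htar", "-cvf", archive, "{}/{}".format(HOME_ROOT, username)]
        return subprocess.call(cmd) == 0

    if __name__ == "__main__":
        for user in sys.argv[1:]:
            if not archive_home(user):
                print("archive failed for", user)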
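
The remote-node health checks follow the same pattern: shell out to the standard GPFS administration commands and flag anything unexpected. A minimal sketch, assuming only mmgetstate and mmlsmount with their usual options; node and filesystem names are placeholders:

    #!/usr/bin/env python
    # Sketch: check GPFS daemon state and filesystem mount state on remote nodes.
    # Node and filesystem names are placeholders.
    import subprocess

    NODES = ["geyser01", "caldera01"]        # hypothetical node names
    FILESYSTEMS = ["glade1", "glade2"]       # hypothetical GPFS device names

    def run(cmd):
        return subprocess.run(cmd, capture_output=True, text=True).stdout

    # GPFS daemon state on each node (expect "active")
    print(run(["mmgetstate", "-N", ",".join(NODES)]))

    # Which nodes have each filesystem mounted
    for fs in FILESYSTEMS:
        print(run(["mmlsmount", fs, "-L"]))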

System Monitoring

  • Continued working on the realtime GPFS monitoring tools. Made enhancements based on initial monitoring use. Added node GPFS state tracking to the reporter so that syslog entries can be made for select events (a state-tracking sketch follows this list). Worked with Joey to get Ganglia-generated graphics.
  • Continued enhancing the realtime GPFS monitoring tools. Added support for erebus and the test environments. Worked out strategies for tracing the source of the problems that cause long-term GPFS waiters, which in turn delay users trying to access data in GLADE. Once the problem node(s) are identified, expelling them from the GPFS server cluster is typically sufficient to clear the blockage, although the expelled node(s) cannot access GPFS again until they are rebooted or repaired.
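
To give a sense of the state-tracking piece, the sketch below polls GPFS node state and logs transitions to syslog. The node names, poll interval, and output parsing are assumptions for illustration; the actual reporter is considerably more involved.

    #!/usr/bin/env python
    # Sketch: track GPFS daemon state per node and emit syslog entries on change.
    # Node names, poll interval, and mmgetstate output format are assumptions.
    import subprocess
    import syslog
    import time

    NODES = ["geyser01", "caldera01"]       # hypothetical node names
    POLL_SECONDS = 60

    def node_states():
        """Return {node: state} parsed from mmgetstate output (format assumed)."""
        out = subprocess.run(["mmgetstate", "-N", ",".join(NODES)],
                             capture_output=True, text=True).stdout
        states = {}
        for line in out.splitlines():
            parts = line.split()
            if len(parts) == 3 and parts[0].isdigit():   # "<num> <node> <state>" rows
                states[parts[1]] = parts[2]
        return states

    if __name__ == "__main__":
        syslog.openlog("gpfsmon")
        last = {}
        while True:
            current = node_states()
            for node, state in current.items():
                if last.get(node) != state:              # log only state transitions
                    syslog.syslog(syslog.LOG_WARNING,
                                  "GPFS state change: {} -> {}".format(node, state))
            last = current
            time.sleep(POLL_SECONDS)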

System Support

ML Data Analysis & Visualization Clusters

  • The decommissioning schedule is moving ahead. The storm visualization cluster was decommissioned on March 1, and we are on target to decommission the mirage cluster on April 1 along with the ML-Glade file systems.

GLADE Storage Cluster

  • Helped Alan test and configure the Open Text NFS Client software for Windows access to Kerberized NFS GLADE exports in support of VAPOR development. The initial problems encountered turned out to be caused by a bad motherboard in his laptop.
  • Made the old Glade filesystems read-only on March 1st.
  • Helped VETS copy the majority of their data to NWSC.
  • Updated HCA firmware on all glade nodes.

Data Services Cluster

  • Transitioned the vis.ucar.edu webserver to the new glade directories.

Experimental Clusters

  • Lynx will lose access to GPFS filesystems on April 1 when the ML-Glade system is decommissioned. Further GPFS access for Lynx depends on a future upgrade of Lynx and on work with Cray to fix the reliability issues experienced with ML-Glade. Picnic will be used to evaluate future functionality.

Test Clusters

  • Worked on picnic cluster configuration and software updates after it was moved to NWSC.
  • Worked to get all picnic nodes upgraded to match production software configuration from glade.
  • Worked to get the picnic GPFS filesystems mounted on jellystone.
  • Built several GPFS GPL-layer RPMs for SSG to use on jellystone.
  • Reinstalled picnicufm in preparation for the OFED updates and UFM work.
  • Worked with Mellanox to get the UFM server updated and ready for HA with jellystone's UFM server.
  • Updated HCA firmware and OFED on all picnic nodes.

Storage Usage Statistics

NWSC GLADE Usage Report
