these notes are intended to be a fairly comprehensive review of the matlab code i used for the 3d wavelet transform, which is self-documenting for the most part.

.  i also have a few scripts for batch processing of data (see generate.m and generate_rando*.m), and generating vdf files.  these are discussed below.

 ***************************************************************************************************************************************************** **********

matlab code for wavelet transforms and denoising

***************************************************************************************************************************************************** **********

most of the transform code is taken from Ivan Selesnick et al at Brooklyn Polytech:  http://taco.poly.edu/WaveletSoftware/dt3D.html

which i tested- it works.   the thresholding scripts are all mine so any errors are mine too (smile)

all of the pertinent m-files are stored in the directory /cxfs/DASG/hleee. (moved to ~hleee/scripts)

***************************************************************************************************************************************************************************

these are utilities for doing fwd and reverse wavelet transforms:

afb3D.m

cshift3D.m

denS2D.m

dwt3D.m

idwt3D.m

sfb3D.m

filters.mat  (youll need to load this before using any of the transforms, pass the appropriate filter from this file by name to dwt3D) From matlab execute ' load filters'

these are functions for testing, retrieving statistics, doing thresholding, etc:

coefzero.m

getincoh.m

getens.m   

getcos.m   

soft.m

univthr.m

univthr_var.m

surethr.m

getpow.m   

testdwt.m

theyre rather slow for 1024x024x1024 data be warned. im sure you could speed things up considerably by pre-allocating arrays, using mex, etc

*******************************************************************************************************************************************************************

the basic procedure to create a thresholded data set is as follows:

first load a .raw file into matlab using the following commands

>> fid1=fopen('wx.raw','r')

fid1 =

     3

>> for i=1:1024
wx(:,:,i)=fread(fid1,[1024,1024],'float32');
end

this reads the data into the 3d array wx one slice at a time (just this loop can take 2+ hours to execute)

then do FWT:

>> load filters
>> d_wx_c12 = dwt3D(wx,7,coif2f); 

Above performs 7 transform passes 

then threshold:

>> d_wx_c12_sure = surethr(d_wx_c12);

 get percentage of coefficients zero'd out at each level/direction and total fraction of zero'd coeffs:

>> [a,b]=coefzero(d_wx_c12_sure)

% primary wavlet directions are columns, levels are the rows from finest to coarsest), entries are the fraction of coefficients set to 0

a =

    0.1225    0.1293    0.9361    0.3883    0.9728    0.9754    1.0000
    0.0069    0.0074    0.0196    0.0143    0.0316    0.0351    0.0475
    0.0002    0.0002    0.0005    0.0004    0.0006    0.0007    0.0011
    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000
    0.0000    0.0000    0.0001    0.0001    0.0002    0.0000    0.0001
    0.0002    0.0002    0.0005    0.0002    0.0002    0.0002    0.0002
    0.0020    0.0020    0.0020    0.0020    0.0020    0.0020    0.0020
    0.0020         0         0         0         0         0         0

% b is fraction of total coefficients killed 

b =

    0.5681

get average wavelet coeff. power at each level:

>> ens = getpow(d_wx_c12)

ens =

   1.0e+06 *

    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000
    0.0010    0.0009    0.0002    0.0003    0.0001    0.0001    0.0000
    0.0127    0.0124    0.0046    0.0035    0.0027    0.0026    0.0015
    0.0739    0.0718    0.0401    0.0197    0.0225    0.0220    0.0168
    0.2442    0.2329    0.1625    0.0595    0.0887    0.0861    0.0795
    0.7475    0.6228    0.4703    0.1391    0.2415    0.2149    0.2253
    1.9839    1.2770    2.1127    0.2834    0.5989    0.4363    0.6435

warning : i wouldnt recommend actually try to run these commands from within a matlab GUI session on the TG data. 

               just loading the data takes 2-3 hours and if anything goes wrong with your session youre SOL.

**********************************************************************************************************************************************************************

matlab code for batch data generation

***************************************************************************************************************************************************** **********

ive written two scripts for batch data generation.  they both start w/ the word generate and are located in /cxfs/DASG/hleee/vdc. (moved to ~hleee/scripts). generate.m performs thresholding on TG data. generate_rando* threshold synthetic noise data sets.  Note, generate.m hard codes output paths - output data needs to be renamed between runs of generate.m

each script has an associated text output file similarly named .out file (e.g. generate_rando.m's output is captured in generate_rando.out) and a

stats.mat matlab data file (e.g. generate.m saves its stat variables to generate_stats.mat).

note that all wavelet filters are stored in the filters.mat file and are loaded by the scripts.

they do the following:

 generate.m computes the wavelet transforms of the TG data using the haar, coif6,and coif12 wavelets (here the number corresponds to the # of taps in the filter). it then denoises the resulting coefficients using the universal threshold and a hybrid SUREshrink threshold (c.f. Ogden, T. : essential wavelets for statistical applications and data analysis p.147) and reconstructs the thresholded data and stores it in a .raw file starting with coh_. it subtracts the coh_ file from the original and stores the result in a incoh_ file. to use it you must specify the variable to decompose (on line 8). resulting data are stored in generate_stats_wx.mat and generate_wx.out files for wx in this case.   this is the main script that i have used to generate the vdf data.

JC: A sample generate*out file that has been annotated with documentation can be found in ~clyne/Projects/SIParCS08/mckinlay/generate_wx.out 

To change variables in generate.m, replaces all instances of 'wx' with appropriate variable name. 

 generate_hier.m implements a level dependent hard threshold on user specified data corrupted by hierarchical gaussian noise.   this technique is fairly rudimentary and theres probably room for improvement.

 to run the scripts in batch mode, go to /cxfs/DASG/hleee/vdc and run the command:      matlab -nosplash -nojvm -r name_of_script > & ! name_of_data_out_file < /dev/null &   (e.g. matlab -nosplash -nojvm -r generate > & ! generate_wx.out < /dev/null &)

to view the progress and errors run:  tail -f  name_of_data_out_file

caution:  before running the scripts i suggest checking to see if there are enough wavelet toolbox licenses available (otherwise youll spend a lot of time loading in data for nothing)

to check run the command:     lmstat -a -c /fs/local/apps/matlab/etc/license.dat | grep -i wavelet

in theory ncar has two liscenses but only one seems to work properly so you really need to check that none are in use, eg:

 Users of Wavelet_Toolbox:  (Total of 2 licenses issued;  Total of 0 licenses in use)

*************************************************************************************************************************************************************************

vdf file generation

*************************************************************************************************************************************************************************

finally the script create_vdf.csh takes the generated .raw files and creates vdf files( e.g.  c12_sure.vdf) and associated data directories (e.g. c12_sure_data/).

once you create these you can view the data in vapor.

*************************************************************************************************************************************************************************

final notes

*************************************************************************************************************************************************************************
 i discovered an inconsistency in my SUREshrink procedure rather late (8/19).  ive corrected it but probably all of the SUREshrink data should be regenerated. 

unfortunately license problems have prohibited me from doing this so far.  so the SUREshrink data in the vdf files should be taken with a grain of salt.

-cm  8/21/08

JC: Changes to the generate.m script to address the SUREshrink problem noted above have included additional statistics. Hence, all data should be reprocessed.
JC: Wx data have been processed with generate.m. They need to be translated to VDC. Remaining variables (wy, wz..vz) still need to be processed.  

  • No labels