OC3150 Lab 5

OC3150 Lab 5

1.0 Objectives:

To compute an energy density spectrum for each of the two data sets assigned to you using a modification of the program you wrote in Lab 4.

To observe the statistical variability of the spectral estimates by varying the number of samples to be ensemble averaged, thereby varying the degrees of freedom.

To observe how the resolution of the spectrum varies for different sample sizes.

To observe the trade-off between resolution and confidence by varying both the sample size and the number of samples so that the entire set is always used.

To test if the data set is ergodic.

2.0 Summary of data runs

Make plots of the Log Spectral Density for one of your data sets with the following NS and N values:

*RUN#*	N	*N (Buoy Data)*	NS	*deg of freedom*
1	256	(128)	4	8
2	256	(128)	16	32
3	128	(64)	16	32
4	128	(64)	32	64
5	128	(64)	16	32

3.0 Programming Hints

Modify your Lab 4 program so it will do the fft on your data. In matlab, before you run your program load your data set by typing load filename.data (e.g., "load c22x.data"). Then assign it a more general name, for example:

data = c22x; % (assuming c22x.data is one of your data sets)

ntot= 4096; (or 2048) % This number should be same as length of your data set.

You'll want to vary N, the sample size, and NS, the number of samples . Use the Matlab function input with a prompt to ask you to type in these values, as well as the appropriate values for the confidence intervals which you will plot with the Log Spectral Density of your data. (Take these values from the plot found in your class notes).

(Don't forget to assign the appropriate value of dt for your data set.)

N=input(`the appropriate prompt');

NS=input(`the appropriate prompt`);

ntotal = N * NS; %now this is the size of the data you're analyzing

ydata=data(1:ntotal);

ydata=reshape(ydata,N,NS); %each column represents an ensemble sample.

You will also need to demean each column at this point (use detrend). Now apply a Hanning window to your data:

H = hanning(N); % this is an N x 1 vector to be applied to
% each column of ydata

Eh = ones(1,NS);     % a row vector such that when we do matrix
                                 % multiplication, H*Eh, we get the desired
                                 % N x NS matrix

H = H * Eh; % each column of H is a Hanning window to be
% applied to each column of ydata.

ydata = H .* ydata; % try plotting ydata to see what it looks like.

Next, apply the fft and obtain the power spectrum:

X = fft(ydata); % the fft is applied to each column

etc.. % (same as in Lab 4, up to the creation of Gxmean column vector)

The Hanning window has cut down your variance so we must apply a window correction (wc) to the spectrum. This factor is 2.7.

Gxmean = Gxmean * wc; % Make sure you've assigned a value to wc.

3.1 Plotting

You'll also need to create vectors for the upper and lower confidence levels . Then you can plot all three curves on one plot with the following commands:

lower = Gxmean * (lower value you got from plot in class notes)
upper = . . . (same idea)
semilogy(freq,Gxmean,'b-',freq,lower,'r--',freq,upper,'r--')
v = [xmin xmax ymin ymax];    % These are the minima and maxima for the
                                                           % x and y-axes; you must specify xmin, xmax, etc.
axis(v)                     % Since you'll want to compare plots,
                                % this will allow you to plot all your
                                % spectra on the same scales.
Be careful how you select the units for freq on the x-axis of these plots. These units will depend on whether your data set had a dt of 0.5 sec or 1 hour.

4.0 Procedure

Run your program and make/print a semilogy plot for each of the six steps below. The first five plots are only for one of your data sets, while the sixth plot should be for the other one of your data sets. Label the axes and put an appropriate title on each plot to indicate the run number and the values of N and NS, as well as the data set being analyzed.

Start out with Run #1 (N=256 (N=128 for buoy data), NS=4). Just for your first data set, calculate the spectral density functions including the confidence intervals on the plot of the log spectral density.

Then holding N=256 (128), change NS=16 (Run #2). Observe the difference.

Now try Run #3 (N=128 (64),NS=16). Since df = 1/T = 1/(Ndt), the resolution will decrease for the smaller sample size. Compare with Run #2.

In practice, we usually have a finite set of data that we must divide up into ensemble samples. The larger the number of sample functions, the larger the degrees of freedom and thus the more confidence we have in the estimate. However, the larger the number of sample functions, the smaller the sample size. Thus, we sacrifice resolution to gain confidence. Observe this trade-off by doing Run #4 ( N=128 (64), NS=32 ), and comparing the result with Run #2.

We will assume that the data is stationary. Check that the data is ergodic by doing Run #5 (N=128 (64), NS=16) but use the second half of your data set. Check to see if the confidence intervals overlap with those from Run #3 (step #3 above) of N=128 (64) and NS=16 that used the first half of your data set. This will indicate whether the variability is due to chance or to the fact that your data is not truly ergodic. It might be useful here to actually plot your Run #3 data on top of your Run #5 data on a separate plot (not required).

To extract the second half of your data set change the statement

ydata=data(1:ntotal); to ydata=data(ntotal+1:ntot);

After observing the trade-off between resolution and confidence, select the best combination of N and NS and use this to compute the spectrum of your second data set. (Remember to change the line from step 5 back to "ydata=data(1:ntotal);" again.)

5.0 Report

Turn in a copy of your matlab program.

Turn in 5 log spectral density plots including confidence intervals (one for each of the above N,NS combinations) for one of your data sets. Also turn in the one log plot for the other data set using what you consider to be the best combination of N and NS.

While holding N constant (Runs # 1 and 2) what changes occurred in the spectral resolution and confidence intervals as a result of modifying the number of degrees of freedom?

While holding NS constant(Runs # 2 and 3) how did the resolution and confidence compare for different sample sizes?

Is your data ergodic? (Runs # 3 and 5). Support your conclusions.

Explain the tradeoff between confidence and resolution (Runs #2 and 4). What combination of N and NS did you choose to plot your second data set (procedure #6), and why?

Use var(data) to find the variance of both of your data sets. List and compare this value with the integrated energy density variances from the results of procedure steps 1-6 above (Your list should include five variance estimates for your first data set but only one variance estimate for your second data set). Be sure to give your variance results the correct units! Think of some reasons why the estimates are different from the values computed by var.