AA: Joint Distribution and Correlation

Assignment Summary

If your last (family) name begins with A-N, please complete the "Appl. Assignment 3 Quiz" on WebCT PRIOR TO STARTING work on this assignment. If your last (family) name begins with O-Z, please complete the "Appl. Assignment 3 Quiz" on WebCT after finishing work on this assignment. The quiz is found in the "Assessments" tab of the WebCT page.

Last Name Begins With    When To Take the Quiz
A-N                      BEFORE STARTING the assignment
O-Z                      After completing the assignment

In this assignment, you will find and analyze the joint distribution of the received signal strength (RSS) at nodes a and b. In particular, your analysis will determine the correlation coefficient between the two values.

  1. Compute the joint pdf for your data. Turn in: Two different plots of your joint pdf.
  2. Compute the covariance matrix and the correlation coefficient for the two-directional measurements. Turn in: the calculated covariance matrix and correlation coefficient.

You will need Matlab, a text file from an RSS measurement experiment, and some way to produce a document (e.g., Word, OpenOffice). Create one PDF containing the outputs of your assignment and the Matlab code you used to generate them. Turn it in to the WebCT assignment drop box by midnight on the due date.

Assignment Description

Load the data as you did in Application Assignment 2. This involves loading, separating, and interpolating the data so that you have data vectors "interpa" and "interpb" in memory. Also, correct for the mean in each vector as you did previously.

0. Join the RSS values in both directions into one matrix

Create a matrix of the interpa and interpb vectors:

values = [interpa; interpb];

Let's call these random variables $X_1$ and $X_2$.
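As a quick sanity check on the stacking step, here is a minimal sketch using made-up numbers rather than real RSS data. The names interpa_demo, interpb_demo, and values_demo are hypothetical stand-ins, and the sketch assumes that your interpa and interpb are 1-by-N row vectors.

```matlab
% Made-up stand-ins for interpa and interpb (assumed to be 1-by-N row vectors).
interpa_demo = [-61 -63 -60 -64 -62];
interpb_demo = [-70 -73 -69 -72 -71];

% Remove the mean of each vector, as in Application Assignment 2.
interpa_demo = interpa_demo - mean(interpa_demo);
interpb_demo = interpb_demo - mean(interpb_demo);

% Stack the two row vectors: row 1 holds X_1, row 2 holds X_2.
values_demo = [interpa_demo; interpb_demo];

disp(size(values_demo))      % 2 rows, N columns
disp(mean(values_demo, 2))   % both row means should be numerically zero
```

If your vectors turn out to be columns instead, the semicolon would stack them end to end into one long column, so transpose them first.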

1. Compute and plot the joint pdf

There is no standard Matlab command to generate a two-dimensional histogram. I wrote this code to compute a 2-D histogram. It uses Matlab's hist command to perform most of the real work, and just repeatedly applies it. The for loop segregates the values of $X_1$ by the $X_1$-value bins. Then for each subset of $X_1$ values, it creates a 1-D histogram of the corresponding $X_2$ values and puts them into the appropriate boxes in the 2-D histogram. It requires you to set bins, which is the number of bins on each axis. I used bins=40.

% Actual estimation and plotting of the 2D pdf
% 1. Use hist on all data to find bins and bin sizes.
[n, x1] = hist(values(1,:),bins);
[n, x2] = hist(values(2,:),bins);
delta_x1 = x1(2)-x1(1);
delta_x2 = x2(2)-x2(1);
% 2. Initialize a 2-D matrix for the 2-D histogram
n2d = zeros(length(x1), length(x2));
% 3. For each row, find the indices of the X_1 values which fall into that row.
% Compute a histogram for the X_2 values, and put it in the 2-D histogram for that row.
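for i = 1:length(x1),
    % Indices of the samples whose X_1 value falls in the ith X_1 bin
    ind = find((values(1,:) > x1(i)-delta_x1/2) & (values(1,:) <= x1(i)+delta_x1/2));
    % 1-D histogram of the matching X_2 values fills row i of the 2-D histogram
    n2d(i,1:length(x2)) = hist(values(2,ind), x2);
end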

Next, as with the 1-D pdf, we need to normalize so that the pdf integrates to one:

pdf = n2d./(sum(sum(n2d))*delta_x1*delta_x2);
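One way to convince yourself that the normalization is right is to sum the estimated pdf over all bins and multiply by the bin area; the result should be one. The sketch below does this on deterministic synthetic data, not RSS measurements. All names ending in _demo, the short names c1, c2, d1, d2, and the choice of 10 bins are assumptions made for illustration only.

```matlab
% Deterministic synthetic data standing in for the 2-by-N matrix "values".
t = 1:500;
values_demo = [sin(0.37*t); 0.5*sin(0.37*t) + cos(0.11*t)];
bins_demo = 10;

% Bin centers and bin widths on each axis.
[n_unused, c1] = hist(values_demo(1,:), bins_demo);
[n_unused, c2] = hist(values_demo(2,:), bins_demo);
d1 = c1(2) - c1(1);
d2 = c2(2) - c2(1);

% Build the 2-D histogram one X_1 bin (row) at a time.
n2d_demo = zeros(length(c1), length(c2));
for i = 1:length(c1)
    ind = find((values_demo(1,:) > c1(i)-d1/2) & (values_demo(1,:) <= c1(i)+d1/2));
    if ~isempty(ind)   % leave the row at zero when no samples fall in this bin
        n2d_demo(i,:) = hist(values_demo(2,ind), c2);
    end
end

% Normalize the counts, then integrate the estimate over all bins.
pdf_demo = n2d_demo ./ (sum(n2d_demo(:)) * d1 * d2);
total_prob = sum(pdf_demo(:)) * d1 * d2;
fprintf('Integral of the estimated pdf: %.6f\n', total_prob);
```

The same check applied to your own pdf, delta_x1, and delta_x2 should also give a value of one, up to rounding.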

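Finally, plot the pdf using one of the following options, so that the pdf is clearly visible. Note the transpose: the rows of pdf correspond to $X_1$ bins, but these plotting commands place the rows of the matrix along the vertical axis, so pdf' is needed to put $X_1$ on the horizontal axis.

  1. Image Plot: h = imagesc(x1,x2, pdf'); colorbar;
  2. Surface Plot: h = surf(x1,x2, pdf'); view(20,30);
  3. Mesh Plot: h = mesh(x1,x2, pdf'); view(20,30);
  4. Contour Plot: h = contour(x1,x2,pdf'); colorbar;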

These are only some of the options Matlab allows; please feel free to try another one. The view and colorbar commands are for my preference, and you can try other things. You do need some formatting regardless to make a decent figure:

colormap(1-gray) % May or may not be the best color scheme; you choose.
xlabel('Value of X_1')
ylabel('Value of X_2')
zlabel('Probability Density f_{X_1,X_2}(x_1, x_2)')

Note that you should probably make a function from these commands; you will use them again in Application Assignment 4.

Turn in two different plots of your 2-D pdf. Pick whichever two you think best show the features of your pdf.

2. Calculate the covariance matrix and correlation coefficient

Each column of values is a vector $\mathbf{X} = [X_1, X_2]^T$ of the bidirectional measurements at a given time.

The covariance matrix is calculated as
$$ C_\mathbf{X} = \frac{1}{N-1} \sum_{i=1}^N \mathbf{X}(i)\mathbf{X}(i)^T $$
where $\mathbf{X}(i)$ is the ith measured realization of $\mathbf{X}$, that is, the ith column of values, and $N$ is the total number of measurements.
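For reference, the general sample covariance subtracts the sample mean $\bar{\mathbf{X}}$ from each realization. The symbol $\bar{\mathbf{X}}$ is introduced here only for this comparison:
$$ C_\mathbf{X} = \frac{1}{N-1} \sum_{i=1}^N \left(\mathbf{X}(i) - \bar{\mathbf{X}}\right)\left(\mathbf{X}(i) - \bar{\mathbf{X}}\right)^T, \qquad \bar{\mathbf{X}} = \frac{1}{N} \sum_{i=1}^N \mathbf{X}(i) $$
Because you already removed the mean from each vector, $\bar{\mathbf{X}} = \mathbf{0}$ and this reduces to the simpler expression above. If the mean had not been removed, the simpler expression would not give the covariance.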

In Matlab, this is quickly calculated as

C_X = (values * values') / (size(values,2)-1)   % size(values,2) is N, the number of columns

Compute C_X and turn it in. Also, print out the result of Matlab's cov(values') command to verify it produces the same result.
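The comparison with cov can be sketched on synthetic data as follows. The data and the names values_demo, N_demo, C_demo, and C_builtin are assumptions for illustration, not part of the assignment data.

```matlab
% Deterministic synthetic data standing in for the 2-by-N matrix "values".
t = 1:500;
values_demo = [sin(0.37*t); 0.5*sin(0.37*t) + cos(0.11*t)];
N_demo = size(values_demo, 2);

% Remove the mean of each row so the matrix-product formula applies.
values_demo = values_demo - mean(values_demo, 2) * ones(1, N_demo);

% Covariance from the matrix product of the zero-mean data.
C_demo = (values_demo * values_demo') / (N_demo - 1);

% cov expects one observation per row, hence the transpose.
C_builtin = cov(values_demo');

% Largest absolute difference between the two results (should be tiny).
max_diff = max(abs(C_demo(:) - C_builtin(:)));
disp(max_diff)
```

The two results agree only because the row means were removed first; cov subtracts the sample mean on its own, while the matrix product does not.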

Remember that the covariance matrix contains both covariance values and variance values. Since your C_X matrix is two by two, it has the following form:
$$C_X = \left[ \begin{array}{cc} \mbox{Var}(X_1) & \mbox{Cov}(X_1,X_2) \\ \mbox{Cov}(X_2,X_1) & \mbox{Var}(X_2) \end{array} \right]$$

Finally, compute the correlation coefficient of $X_1$ and $X_2$. Recall that the correlation coefficient is given by
$$ \rho = \frac{\mbox{Cov}(X_1,X_2)}{\left[\mbox{Var}(X_1)\mbox{Var}(X_2) \right]^{1/2}} $$
Turn in the $\rho$ value for your data. Be sure that its magnitude is no greater than one!
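A minimal sketch of this last step, again on synthetic data with hypothetical _demo names, reads the three needed entries out of the covariance matrix and cross-checks the result against Matlab's corrcoef command:

```matlab
% Deterministic synthetic data standing in for the 2-by-N matrix "values".
t = 1:500;
values_demo = [sin(0.37*t); 0.5*sin(0.37*t) + cos(0.11*t)];

% 2-by-2 covariance matrix: variances on the diagonal, covariance off it.
C_demo = cov(values_demo');

% Correlation coefficient from the formula above.
rho_demo = C_demo(1,2) / sqrt(C_demo(1,1) * C_demo(2,2));

% Cross-check: corrcoef returns the matrix of correlation coefficients.
R_demo = corrcoef(values_demo');
fprintf('rho from C_X: %.6f, rho from corrcoef: %.6f\n', rho_demo, R_demo(1,2));
```

If the two numbers disagree for your own data, a likely cause is a missing transpose, since both cov and corrcoef expect one observation per row.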


Submit your results as described above.

If your last (family) name begins with O-Z, please complete the "Appl. Assignment 3 Quiz" on WebCT now.