AA: Marginal Distribution and Moments

Assignment Summary

If your last (family) name begins with A-N, please complete the "Appl. Assignment 1 Quiz" on WebCT PRIOR TO STARTING work on this assignment. If your last (family) name begins with O-Z, please complete the "Appl. Assignment 1 Quiz" on WebCT after finishing work on this assignment. The quiz is found in the "Assessments" tab of the WebCT page.

In this assignment, you will find and analyze the (marginal) distribution of the received signal strength at one of the receivers. You will first load a file of measured received signal strength (RSS) into Matlab. From these measurements, you will

  1. Plot the probability mass function (pmf). Turn in: a plot.
  2. Plot the cumulative distribution function (CDF). Turn in: a plot.
  3. Compute the mean, second moment, second central moment (variance), standard deviation, and third central moment (a measure of skewness). Turn in: the calculated values.
  4. Compare results to analytical distributions (e.g., Gaussian, Rayleigh). Turn in: a plot.
  5. Design a quantization scheme which makes ones and zeros equally likely. Turn in: a description and the probability of one and the probability of zero.

You will need Matlab, a text file from an RSS measurement experiment, and some way to produce a document (e.g., Word, OpenOffice). Your document will be a mix of Matlab plots and text which reports the values you obtained. Save (print) your document as a PDF and turn it in on WebCT by the deadline.

Assignment Description

Again, if your last (family) name begins with A-N, please complete the "Appl. Assignment 1 Quiz" on WebCT PRIOR TO STARTING work on this assignment.

0. Loading the Data File

Your measured RSS data is contained in a text file. Each row contains one measurement: first the receiver id number, then the raw value of RSS which that receiver measured. The id number is either 1 or 2. The raw RSS value is an integer. It could be converted into dBm, but that is not necessary for our application, so we will use the raw RSS value directly.

In Matlab, load the file into the matrix "data" by using the command:

data = load('-ascii','filename.txt');

where my file was named "filename.txt".

Then, select the data from one of the receivers using:

rxOneInd = find(data(:,1)==1);
rssValues = data(rxOneInd, 2);

The first line finds all of the rows which have receiver ID equal to 1. There was no reason I chose receiver 1; you could choose to use receiver 2. The second line selects the data recorded in those rows.
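
As a small illustration of the selection step, here is a minimal sketch on a made-up matrix (the five rows below are hypothetical, not from the measurement file). It uses a logical mask rather than find, which gives the same result, and adds two quick sanity checks.

```matlab
% Hypothetical stand-in for the loaded file:
% column 1 = receiver id, column 2 = raw RSS value.
data = [1 -12; 2 -30; 1 -15; 2 -28; 1 -12];

% Same selection as above, written with a logical mask instead of find.
isRxOne   = (data(:,1) == 1);
rssValues = data(isRxOne, 2);

% Sanity checks: how many samples were kept, and which ids occur in the file.
numSamples = numel(rssValues);
idsPresent = unique(data(:,1))';
```

If idsPresent contains anything other than 1 and 2, the file probably did not load the way you expected.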

1. Plotting a p.m.f.

The key to this step is to realize the difference between a histogram and a pmf. A histogram counts the number of occurrences of each value, but a pmf must give the probability (between zero and one) of each value, and these probabilities must sum to one. To obtain a valid pmf estimate, you must therefore divide the histogram counts by the total number of samples. Further, Matlab must be told to obtain a count for each possible value (bin).

First, create a list of the possible values for the RSS. This is the set $S_X$ or 'certain event' we discuss in lecture. For example,

possibleValues = -40:1:20;

Make sure that your range includes the highest and lowest RSS you measured (how would you do this automatically?). Then, find a count of occurrences using hist:

n = hist(rssValues, possibleValues);

The pmf estimate is

pmf = n ./ sum(n);
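
As a quick check of the steps so far, the following minimal sketch runs them on a handful of made-up samples (the values in rssValues here are hypothetical). It also shows one possible way to set the range from the data using min and max, and confirms that the estimate sums to one.

```matlab
% Hypothetical samples standing in for the measured RSS values.
rssValues = [-12; -15; -12; -9; -12; -15];

% One way to cover the measured range: every integer from min to max.
possibleValues = min(rssValues):1:max(rssValues);

% Count occurrences at each possible value, then normalize by the sample count.
n   = hist(rssValues, possibleValues);
pmf = n ./ sum(n);

% A valid pmf must sum to one (up to floating-point rounding).
total = sum(pmf);
```

Values which never occur in the data still get a bin, with a count (and probability) of zero.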

To plot a pmf, use the "stem" command in Matlab:

stem(possibleValues, pmf);

You will of course want to label your axes and make the font larger:

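set(gca, 'FontSize', 16);   % larger tick labels; 16 is an arbitrary choice
ylabel('Probability of W = w', 'FontSize', 16);
xlabel('Value of w', 'FontSize', 16);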

Here, I am assuming the random variable is called "W".

It would be a good idea to make this a function which takes in rssValues and produces the plot, so that you can call it quickly for future assignments.
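
One possible shape for such a function is sketched below. The name plotPmf and its two outputs are illustrative choices, not a required interface, and the range is taken from the data rather than hard-coded.

```matlab
function [possibleValues, pmf] = plotPmf(rssValues)
% plotPmf  Estimate and plot the pmf of integer-valued samples.
%   Hypothetical helper: the name and outputs are illustrative only.
%   Assumes rssValues contains at least two distinct values, because
%   hist reads a scalar second argument as a number of bins.

    % Every integer between the smallest and largest sample.
    possibleValues = min(rssValues):1:max(rssValues);

    % Histogram counts, normalized so the estimate sums to one.
    n   = hist(rssValues, possibleValues);
    pmf = n ./ sum(n);

    % Stem plot with labeled axes.
    stem(possibleValues, pmf);
    ylabel('Probability of W = w');
    xlabel('Value of w');
end
```

Saved as plotPmf.m on the Matlab path, it could be called as [vals, pmf] = plotPmf(rssValues);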

You do not need to print out this plot now, since you will put another plot on top of it in part 4 and can print the output then.

2. Creating a Cumulative Distribution Function

A CDF is $F_W(w) = P[W\le w]$. It is effectively a cumulative sum of the probabilities for all values less than or equal to the current value. Matlab provides a great tool for this called "cumsum". Be sure that your CDF vector begins at zero and ends at 1.0.

Since the CDF of a discrete random variable is a staircase, drawing straight lines between sample points would misrepresent it. You should use a plot without a line, like:

plot(possibleValues, CDF,'o')

where "CDF" is your vector of CDF values.
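
A minimal sketch of the cumsum step, using a small made-up pmf (the numbers are hypothetical), might look like this. The first possible value is given probability zero so that the CDF vector starts at zero.

```matlab
% Hypothetical pmf over four possible values; the first bin is empty.
possibleValues = -3:0;
pmf = [0 0.2 0.5 0.3];

% The CDF at each value is the running sum of the pmf up to that value.
CDF = cumsum(pmf);

% Markers only, no connecting line.
plot(possibleValues, CDF, 'o');
ylabel('P[W <= w]');
xlabel('Value of w');
```

If you want to draw the staircase shape explicitly, Matlab's stairs function is an alternative to plotting markers alone.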

3. Calculating Moments

An estimate of any moment of your distribution comes from an average over all samples. For example,
$$ \hat{\mu} = \mbox{ estimate of }E[W] = \frac{1}{N} \sum_{i=1}^N W_i $$
where $W_i$ is the $i$th measured RSS value, and $N$ is the total number of measured RSS values. For any moment, you can do a decent job of estimating it from the average of that function of your data. For example, if the function is $g(W)$, in other words, you are trying to find $E[g(W)]$,
$$ \mbox{estimate of }E[g(W)] = \frac{1}{N} \sum_{i=1}^N g(W_i) $$
For example, the second central moment is
$$ \frac{1}{N} \sum_{i=1}^N (W_i-\hat{\mu})^2 $$
where $\hat{\mu}$ is your estimate for the mean.

For this part, estimate and report:

  1. the mean $\mu = E[W]$,
  2. the second moment $E[W^2]$,
  3. the second central moment $E[(W - \mu)^2]$ (the variance),
  4. the standard deviation (the square root of the variance),
  5. the third central moment $E[(W - \mu)^3]$ (a measure of skewness; the skewness coefficient proper is this value divided by the cube of the standard deviation). This last parameter tells you about the asymmetry of your distribution -- the more negative it is, the more it is skewed to the negative side, and the more positive it is, the more it is skewed towards positive values. If it is approximately zero, you have an approximately symmetric distribution.
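
The sample-average estimates above can be sketched in Matlab as follows. The five samples in W are made up for illustration, and the variable names are assumptions, chosen to match the meanEst and varianceEst used later in this assignment.

```matlab
% Hypothetical samples standing in for the measured RSS values.
W = [-12; -15; -12; -9; -12];
N = numel(W);

meanEst      = sum(W) / N;                    % estimate of E[W]
secondMoment = sum(W.^2) / N;                 % estimate of E[W^2]
varianceEst  = sum((W - meanEst).^2) / N;     % second central moment
stdEst       = sqrt(varianceEst);             % standard deviation
thirdCentral = sum((W - meanEst).^3) / N;     % third central moment

% Consistency check: the variance should equal E[W^2] - (E[W])^2.
check = secondMoment - meanEst^2;
```

Note that Matlab's built-in var normalizes by N-1 by default, so var(W) will differ slightly from the 1/N estimate used here; var(W,1) uses the 1/N normalization.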

4. Comparing to Analytical Distributions

Your data may not look like any analytical distribution we cover in class, but it is important to have tools for comparing the two. One important tool is plotting an analytical distribution on top of the empirical pmf estimate you plotted previously. In this part, you will plot a Gaussian pdf on top of (on the same figure as) your pmf. To calculate a Gaussian pdf, use:

pdfGaussian = 1/sqrt(2*pi*varianceEst) .* exp(-(possibleValues-meanEst).^2 ./(2*varianceEst));
plot(possibleValues, pmf, 'o', possibleValues, pdfGaussian,'-');
ylabel('Probability Mass or Density');
xlabel('Value of w');
legend('Experimental pmf','Gaussian pdf')

where "varianceEst" is your estimate of the variance and "meanEst" is your estimate of the mean (your answers to the Moments problem). You could also have used "hold on" and then plotted only the Gaussian pdf; either way works.
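
For reference, a minimal sketch of the "hold on" variant is given below, using made-up samples (all values are hypothetical). It also checks that the Gaussian pdf, evaluated at integer spacing, sums to roughly one. That is why a pdf and a pmf with bins of width 1 can reasonably share the same vertical axis.

```matlab
% Hypothetical samples and a range wide enough to cover the Gaussian tails.
rssValues      = [-12; -15; -12; -9; -12];
possibleValues = -20:1:-4;

% Empirical pmf and moment estimates (1/N normalization for the variance).
n           = hist(rssValues, possibleValues);
pmf         = n ./ sum(n);
meanEst     = mean(rssValues);
varianceEst = var(rssValues, 1);

pdfGaussian = 1/sqrt(2*pi*varianceEst) .* ...
    exp(-(possibleValues - meanEst).^2 ./ (2*varianceEst));

% Draw the pmf first, then overlay the Gaussian on the same axes.
stem(possibleValues, pmf);
hold on;
plot(possibleValues, pdfGaussian, '-');
hold off;
legend('Experimental pmf', 'Gaussian pdf');

% With unit spacing, the pdf samples should sum to approximately one.
areaCheck = sum(pdfGaussian);
```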

5. Designing a Binary Quantization Scheme

In this final section, you will specify a binary quantization scheme to convert some integers to 1 and the rest to 0. All you need to do is specify one set of values which will be converted to 1, and another set which will be converted to 0. The simplest choice of set is an interval. But be sure that the two sets are disjoint and that their union covers all possible RSS values (the two sets must be a partition).

For a good secret key, we want an equal probability of ones and zeros. Thus the probability that the RSS value falls into either set must be approximately 0.5. (You are not likely to get exactly 0.5; try to get the probabilities for the two sets within the range of 0.47 to 0.53.)

Describe the sets and show with a Matlab command (or two) that the probability of one and probability of zero are both approximately 0.5.
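
As one illustration of how such a check might look, the sketch below uses a single threshold at the sample median, with made-up samples (the values and the choice of the median are assumptions, not the required design). Values above the threshold map to 1 and the rest map to 0, so the two sets are intervals which partition the possible values.

```matlab
% Hypothetical samples standing in for the measured RSS values.
rssValues = [-12; -15; -13; -9; -20; -11; -14; -10];

% Candidate boundary between the two sets: the sample median.
threshold = median(rssValues);

% Quantize: 1 if above the threshold, 0 otherwise.
bits = rssValues > threshold;

% Fraction of samples mapped to each symbol.
probOne  = mean(bits);
probZero = 1 - probOne;
```

With real integer-valued data, many samples may equal the threshold, so the split will usually not be exactly even. You may need to shift the boundary by one and recheck the two probabilities.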


Submit your results as described above.

If your last (family) name begins with O-Z, please complete the "Appl. Assignment 1 Quiz" on WebCT now.