PHY 445/6 / PHY 515/6

Error analysis


We get a very accurate number if we take the average of 2000 measured values of the voltage, but we will not know how accurate it is. Let us explore the situation by taking five voltages at a time and averaging them. We repeat the procedure 2000 times and perform the same analysis (histogram) on the averages as we did on the single data set on the previous page.

Averaging, Central Limit Theorem

Open a new workbook in Excel ("File", "New"...). Highlight the first column and open the file "random5.csv". You will see five columns of 2000 measurements, labeled V1... V5.

Let us take the average of the first five numbers and place it in cell F3. In cell F3, write "=SUM(A3:E3)/5" (this is shorthand for =(A3+B3+C3+D3+E3)/5).


Select cell F3, then click and drag its lower right corner down to row 2002. Now you have 2000 numbers, each one being the average of five voltages. Write "average" in cell F1. Save your file as an .xls document.
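For readers who want to try the same step outside Excel, here is a minimal Python sketch of the row averaging. The file "random5.csv" is not reproduced here, so the sketch assumes stand-in data: uniformly distributed voltages centered at 10.1 V with a width of +/- 0.5 V. The function names, the seed and the use of NumPy are illustrative choices, not part of the original exercise.

```python
import numpy as np


def simulate_voltages(n_rows=2000, n_cols=5, center=10.1, half_width=0.5, seed=0):
    """Stand-in for random5.csv (assumption): uniform voltages around `center`."""
    rng = np.random.default_rng(seed)
    return rng.uniform(center - half_width, center + half_width,
                       size=(n_rows, n_cols))


def row_averages(voltages):
    """Same operation as filling column F with =SUM(A:E)/5 in every row."""
    return voltages.sum(axis=1) / voltages.shape[1]


voltages = simulate_voltages()      # 2000 rows, columns V1... V5
averages = row_averages(voltages)   # one five-point average per row

print("number of averages:", len(averages))
print("first three averages:", averages[:3])
```

Each entry of `averages` plays the role of one cell in column F.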

Next we set up the bins. Expecting less scatter in the points, we prepare them in increments of 0.1 V.


Open "Tools", "Data Analysis..." and make a histogram.

Finally, make a graph of the result, as before. Do not forget to write in the labels and adjust the scale, if necessary.
What is the moral of the story? The distribution is centered at about 10.1 V. The voltage still fluctuates, but the distribution is different: it has the famous "bell shape". Remember, the random voltages we started with followed a uniform distribution, and yet here we have a bell-shaped (also called "normal" or "Gaussian") distribution. This is the essence of the Central Limit Theorem: no matter what distribution you start with (as long as its variance is finite), the sum or average of a sufficiently large number of independent random influences approaches a Gaussian distribution. This is why the Gaussian distribution is so important in statistical calculations.
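The histogram step can also be sketched in Python. As before, the data are simulated under the assumption of uniform voltages between 9.6 V and 10.6 V, because the original file is not available here; the function name and the text-based bar chart are illustrative only. Excel's Histogram tool may treat values that fall exactly on a bin boundary differently from `numpy.histogram`, so individual counts can differ slightly.

```python
import numpy as np


def histogram_of_averages(n_rows=2000, n_points=5, low=9.6, high=10.6,
                          n_bins=10, seed=0):
    """Histogram of n_points-averages of simulated uniform voltages.

    With the default range and n_bins=10 the bins are 0.1 V wide.
    """
    rng = np.random.default_rng(seed)
    voltages = rng.uniform(low, high, size=(n_rows, n_points))
    averages = voltages.mean(axis=1)
    edges = np.linspace(low, high, n_bins + 1)
    counts, edges = np.histogram(averages, bins=edges)
    return counts, edges


counts, edges = histogram_of_averages()

# Crude text plot: one '#' per 10 counts in each 0.1 V bin.
for count, left in zip(counts, edges[:-1]):
    print(f"{left:5.2f} V  {'#' * (int(count) // 10)}")
```

The bars should be tallest near 10.1 V and fall off on both sides, even though every individual voltage was drawn from a flat distribution.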

Notice another change in the distribution: the original width of +/- 0.5 V is reduced. The "full width at half height" (also called full width at half maximum) of the distribution curve is about 0.4 V; taking half of this width as the error gives +/- 0.2 V. The result of our five-point average is 10.1 +/- 0.2 V.

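The general rule is this: assuming statistical fluctuations, in an N-point average the error is reduced by a factor of $\sqrt{N}$, that is, it is multiplied by $1/\sqrt{N} = 1/N^{1/2}$. Starting from the original width of +/- 0.5 V, five points give $0.5/\sqrt{5} \approx 0.2$ V, in agreement with the histogram. Therefore, if we average the original 2000 measurements we get $0.5/\sqrt{2000} \approx 0.01$ V, so the result is 10.10 +/- 0.01 V.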
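The $1/\sqrt{N}$ rule can be checked numerically. The sketch below again assumes simulated uniform voltages between 9.6 V and 10.6 V. It uses the standard deviation as the measure of the width rather than the half width read off a histogram; the two differ by a constant factor, so the scaling with N is the same. The function name and the chosen values of N are illustrative.

```python
import numpy as np


def spread_of_averages(n_points, n_repeats=2000, low=9.6, high=10.6, seed=0):
    """Standard deviation of n_repeats averages, each over n_points voltages."""
    rng = np.random.default_rng(seed)
    voltages = rng.uniform(low, high, size=(n_repeats, n_points))
    return voltages.mean(axis=1).std()


single = spread_of_averages(1)  # width of the original (unaveraged) data

for n in (1, 5, 20, 100):
    measured = spread_of_averages(n)
    predicted = single / np.sqrt(n)
    print(f"N = {n:3d}: measured {measured:.4f} V, predicted {predicted:.4f} V")
```

The measured and predicted widths should agree to within a few percent; the residual difference is itself a statistical fluctuation, since each width is estimated from only 2000 averages.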

In conclusion, by doing more than one measurement we can do several things:

The average of the measured points $y_1, y_2, \ldots, y_N$ is given by the formula $y_{ave} = \frac{1}{N}\sum_{i=1}^{N} y_i$. There is a shortcut for getting an idea of the width of the distribution function: calculate the quantity $s^2 = \frac{1}{N}\sum_{i=1}^{N} (y_i - y_{ave})^2$. The parameter $\sigma$ describing the original distribution function can be approximated by $\sigma \approx s$.
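As a small worked example, the two formulas can be written out directly in Python. The five readings below are hypothetical values chosen for illustration; they are not taken from the data file, and the function name is likewise an assumption.

```python
import math


def mean_and_width(values):
    """Return (y_ave, s) with s^2 = (1/N) * sum((y_i - y_ave)^2)."""
    n = len(values)
    y_ave = sum(values) / n
    s_squared = sum((y - y_ave) ** 2 for y in values) / n
    return y_ave, math.sqrt(s_squared)


readings = [10.3, 9.8, 10.1, 10.4, 9.9]  # hypothetical voltages, in volts
y_ave, s = mean_and_width(readings)
print(f"y_ave = {y_ave:.2f} V, s = {s:.2f} V")
```

Note that this divides by N, as in the formula above. Many statistics routines divide by N - 1 instead, which gives a slightly larger value for small N.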


This page was created by Laszlo Mihaly. Last updated 1/8/98.