Re: Chi-Square Test and Counting Statistics



The Chi-Square test derives from the linear least squares fitting
of a function to data.  The least squares solution is the maximum
likelihood solution for Gaussian data.  However, it is used extensively
for Poisson distributed data.  As has been mentioned, a Poisson
distribution can be approximated by a normal distribution when the
number of counts in a given measurement is large.  What you consider
large depends on how accurate you want your results to be.  Even with
large counts, discrepancies result.
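
To put a number on "large", here is a minimal Python sketch that sums
the Poisson probability mass function directly to get its central
moments.  The function names and the 12-sigma summation cutoff are my
own choices for illustration, not anything from a standard package.
The third central moment comes out equal to the mean, so the skewness
only falls off as 1/sqrt(mean) and never quite reaches the zero of a
normal distribution.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of observing k counts when the mean is mu."""
    # Work in log space so large k and mu do not overflow.
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def poisson_central_moment(mu, order):
    """Central moment of the given order, by direct summation of the pmf."""
    # Assumed cutoff: about 12 standard deviations above the mean, plus padding.
    upper = int(mu + 12 * math.sqrt(mu) + 30)
    return sum((k - mu) ** order * poisson_pmf(k, mu) for k in range(upper + 1))


def poisson_skewness(mu):
    """Third central moment divided by the variance to the 3/2 power."""
    m2 = poisson_central_moment(mu, 2)
    m3 = poisson_central_moment(mu, 3)
    return m3 / m2 ** 1.5


for mu in (4, 100, 10000):
    print(mu, poisson_central_moment(mu, 3), poisson_skewness(mu))
```

Even at a mean of 10000 counts the skewness is still about 0.01, which
is small but not zero.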

The discrepancies are a result of the skewed shape of the Poisson
distribution.  All radioactivity counting results are Poisson
distributed, not normally distributed.  For QC testing of scaler
systems you probably will not notice any problems.  However, the third
central moment of counting data should come out about equal to the
mean count, whereas the third central moment of normally distributed
data is zero.  Because of this, if you take a weighted average of the
counts (each count weighted by the reciprocal of its variance, which
for counting data is the count itself) and apply a Chi-Square test,
the weighted mean times the number of data points will be the sum of
the data points minus Chi-Square.  Obviously, if you take the
arithmetic mean, the sum of the data points minus the mean times the
number of data points is zero.  Therefore, the arithmetic mean minus
the weighted mean is equal to Chi-Square/N.
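
The relation between the two means can be checked numerically.  The
sketch below assumes each count x_i is weighted by 1/x_i (its Poisson
variance estimated from the count itself) and that no count is zero.
The function name and the sample counts are made up for illustration.
With these weights the weighted mean is N / sum(1/x_i), and the
Chi-Square evaluated about that mean is sum(x_i) - N * (weighted mean).

```python
def means_and_chi_square(counts):
    """Return (arithmetic mean, weighted mean, Chi-Square) for a list of counts.

    Assumes every count is > 0 and is weighted by 1/count.
    """
    n = len(counts)
    arithmetic_mean = sum(counts) / n

    # With weights w_i = 1/x_i, sum(w_i * x_i) = n, so the weighted mean
    # reduces to n / sum(1/x_i).
    weighted_mean = n / sum(1.0 / x for x in counts)

    # Chi-Square about the weighted mean, with sigma_i^2 = x_i.
    chi_square = sum((x - weighted_mean) ** 2 / x for x in counts)
    return arithmetic_mean, weighted_mean, chi_square


# Hypothetical repeated counts from a single source.
counts = [95, 102, 110, 98, 87, 105]
arith, weighted, chi2 = means_and_chi_square(counts)
print(arith - weighted, chi2 / len(counts))
```

The two printed numbers agree to rounding error, and the weighted mean
always comes out at or below the arithmetic mean.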

Chi-Squared/(N-1) should be close to one for well-behaved systems,
but there are cases where well-behaved systems will give bad
Chi-Squares.  The example that comes to mind is least squares fitting
of photopeaks in gamma spectroscopy.  Reduced Chi-Squares for fits of
photopeaks will be good (near one) for small to moderate peak areas,
but large peaks will give (apparently) bad fits.  This is a result
of not having a perfect line shape and of the bias between Normal and
Poisson distributed data mentioned above.  At some large number
of counts in a peak, the Chi-Square will begin to grow linearly
in proportion to the number of events in the peak.  This is a
determinate error, however: the area of the peak will be
underestimated by the value of Chi-Squared.

The standard deviation of the data is s = sigma * Chi sub nu, where
sigma is the square root of the number of counts and Chi sub nu is the
square root of the reduced Chi-Square (Chi-Squared/(N-1)).  However,
in the case of large counts, where the bias is causing the Chi-Square
to grow in proportion to the counts, the use of this formula to
calculate the uncertainty is improper.  A test of this is to follow
the decay curves of photopeaks with large counts and do a least
squares fit to the decay data.  Use of the correct uncertainty for the
photopeak data will give a reduced Chi-Square near unity.  If you use
sigma, you get results near unity.  If you multiply sigma by Chi sub nu,
you get too small a Chi-Square, indicating that the inflated
uncertainty is too large.
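
For reference, the s = sigma * Chi sub nu scaling looks like this as a
short Python sketch.  The function name and arguments are hypothetical,
and N-1 is used for the degrees of freedom as above.  As noted, the
scaling should not be applied when the Chi-Square is large only because
of the Normal/Poisson bias at high counts.

```python
import math


def scaled_uncertainty(counts, chi_square, n_points):
    """Return s = sigma * chi_nu for a counting result.

    sigma is sqrt(counts); chi_nu is the square root of the reduced
    Chi-Square, taken here as chi_square / (n_points - 1).
    """
    sigma = math.sqrt(counts)
    chi_nu = math.sqrt(chi_square / (n_points - 1))
    return sigma * chi_nu


# Hypothetical values: 10000 counts, Chi-Square of 36 from 10 data points.
print(scaled_uncertainty(10000, 36.0, 10))
```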

That was the long response.  The short response is: calculate the
reduced Chi-Squared, and if it is reasonably close to one, things
are okay.
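
A minimal sketch of that calculation for repeated counts of the same
source, in Python.  It assumes the expected variance of each count is
the arithmetic mean of the counts, and the function name and sample
data are made up.  How close to one is "reasonably close" is left to
the reader.

```python
def reduced_chi_square(counts):
    """Reduced Chi-Square for N repeated counts of one source.

    Chi-Square = sum((x_i - mean)^2) / mean, divided by N - 1.
    """
    n = len(counts)
    mean = sum(counts) / n
    chi_square = sum((x - mean) ** 2 for x in counts) / mean
    return chi_square / (n - 1)


# Hypothetical QC run of five counts.
print(reduced_chi_square([100, 110, 90, 105, 95]))  # prints 0.625
```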

I recommend "Data Reduction and Error Analysis for the Physical
Sciences" by Bevington as a reference on statistical methods for the
kind of work we do.

Dale