Statistics Notes
What is the procedure used to create the Chisquare and Variance
graphs, and how should they be interpreted?
The Chisquare figures show the cumulative deviation from expectation of
the second-by-second composite scores across the N eggs (N=36 to 38 at
this time). For each second, the Z's for all N eggs are summed and
normalized by sqrt(N); the resulting Z is squared to yield a Chisquare
with 1 df; and finally the values of Chisquare-1 (Chisq=1 is the
expectation) are cumulatively summed to represent the departure from
expectation. More details are available in the exact descriptions of the
GCP methods and procedures and in the extended analysis page.
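The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the GCP's own code: it assumes the per-egg Z-scores have already been computed, and the function names and sample values are hypothetical.

```python
import math
from itertools import accumulate

def stouffer_z(zs):
    """Combine the per-egg Z-scores for one second into a composite Z."""
    return sum(zs) / math.sqrt(len(zs))

def cumulative_chisq_deviation(z_by_second):
    """Cumulative sum of (Z^2 - 1), one term per second.

    z_by_second: list of lists; each inner list holds the Z-scores of the
    eggs reporting in that second (N may differ from second to second).
    """
    deviations = (stouffer_z(zs) ** 2 - 1.0 for zs in z_by_second)
    return list(accumulate(deviations))

# Hypothetical data: three seconds, four eggs each.
example = [
    [1.0, 1.0, 1.0, 1.0],    # eggs agree:  Z = 4/2 = 2, Z^2 - 1 = 3
    [1.0, -1.0, 1.0, -1.0],  # eggs cancel: Z = 0,       Z^2 - 1 = -1
    [0.5, 0.5, 0.5, 0.5],    # Z = 2/2 = 1,              Z^2 - 1 = 0
]
curve = cumulative_chisq_deviation(example)
print(curve)  # [3.0, 2.0, 2.0]
```

Under the chance expectation each term averages zero, so the curve wanders around the horizontal axis; a sustained upward trend indicates Chisquares persistently larger than expected.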
The Variance figures show something similar, but instead of the
compounded Z across eggs, the variance (squared standard deviation) is
computed across the N eggs for each second. The sequence of Variance-50
(Var=50 is the expectation) is then cumulatively summed as before.
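A comparable sketch for the Variance figure follows. It assumes the input is the raw trial score of each egg for each second, and it uses the sample variance (N-1 in the denominator); the text does not say which denominator is used, so that choice, like the names and data here, is an assumption for illustration only.

```python
from itertools import accumulate
from statistics import variance  # sample variance, N-1 denominator (assumed)

EXPECTED_VARIANCE = 50.0  # the expectation stated in the text

def cumulative_variance_deviation(trials_by_second):
    """Cumulative sum of (Var - 50), where Var is taken across the eggs."""
    deviations = (variance(trials) - EXPECTED_VARIANCE
                  for trials in trials_by_second)
    return list(accumulate(deviations))

# Hypothetical trial scores: three seconds, three eggs each.
example_trials = [
    [95.0, 100.0, 105.0],   # Var = 25:  narrower than expected
    [90.0, 100.0, 110.0],   # Var = 100: wider than expected
    [100.0, 100.0, 100.0],  # Var = 0:   no spread at all
]
var_curve = cumulative_variance_deviation(example_trials)
print(var_curve)  # [-25.0, 25.0, -25.0]
```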
The Chisquare figure displays extreme departures, in either direction,
of the trial scores of the eggs from what is expected by chance. The
Variance figure displays the degree of variability among the trial scores
for the eggs. Chisquare addresses movement of the central value of the
distribution; Variance represents changes in the range or width of the
distribution.
What is the difference in the analyses by Roger Nelson and Dean
Radin?
The most important difference is in the treatment of the data at the
finest scale. Neither way is superior, but there is a difference in what
is expected or hypothesized about the behavior of the eggs in the presence
of a possible influence. The two perspectives are complementary, and
though they are not fully independent, using both contributes to our
confidence that the apparent effects are not accidents or mistakes.
For each second, Roger calculates what is called a Stouffer Z across
the eggs as described above. This means that in order to produce a large
deviation, the eggs must be positively correlated, that is, they must tend
to do the same thing. This composite Z is squared, so it does not matter whether
the average value is shifted to the high or low direction, but there must
be some excess deviation and there must be a tendency toward inter-egg
consistency in the direction of deviation. The result is a single squared
Z-score, which is Chi-square distributed, for each second.
Dean calculates a Z-score for each egg separately, and squares these
individual Z-scores. He then sums the squared Z's across the eggs,
producing a single Chi-square for each second. In this case, the eggs
are not expected to show a positive correlation, and a high score requires
only that there is a tendency for excess deviation in either direction; no
inter-egg consistency in the direction of deviation is predicted. Here
the result is a sum of N squared Z-scores, which is Chi-square distributed
with N degrees of freedom, for each second.
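The contrast between the two per-second statistics can be seen in a small example. The sketch below is illustrative only; the function names and Z values are hypothetical and are not taken from either analyst's code.

```python
import math

def stouffer_chisq(zs):
    """Square of the Stouffer Z across eggs (first method described above)."""
    return (sum(zs) / math.sqrt(len(zs))) ** 2

def sum_of_squares_chisq(zs):
    """Sum of the squared per-egg Z's (second method described above)."""
    return sum(z * z for z in zs)

# Hypothetical Z-scores for four eggs in a single second.
aligned = [2.0, 2.0, 2.0, 2.0]    # all eggs deviate in the same direction
opposed = [2.0, -2.0, 2.0, -2.0]  # same magnitudes, directions cancel

print(stouffer_chisq(aligned), sum_of_squares_chisq(aligned))  # 16.0 16.0
print(stouffer_chisq(opposed), sum_of_squares_chisq(opposed))  # 0.0 16.0
```

The sum of squares is the same whether or not the eggs agree in direction, while the squared Stouffer Z collapses to zero when they cancel. The two numbers are not directly comparable in size, because they are judged against Chi-square distributions with different degrees of freedom.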
Dean's method of summing Z² is closely related to Roger's variance
analysis.
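One way to see the relationship is an algebraic identity: the sum of squared Z's equals the squared Stouffer Z plus the sum of squared deviations of the Z's about their own mean, and that second term is (N-1) times the sample variance across eggs. The sketch below checks this numerically. It works in Z units, whereas the Variance figures are described in trial-score units, and the function name and data are hypothetical.

```python
import math
from statistics import mean

def decompose_sum_of_squares(zs):
    """Split sum(z^2) into a mean-shift part and a spread part."""
    n = len(zs)
    zbar = mean(zs)
    stouffer_sq = (sum(zs) / math.sqrt(n)) ** 2   # equals n * zbar^2
    spread = sum((z - zbar) ** 2 for z in zs)     # (n - 1) * sample variance
    return stouffer_sq, spread

# Hypothetical Z-scores for four eggs in a single second.
zs = [1.5, -0.5, 2.0, -1.0]
stouffer_sq, spread = decompose_sum_of_squares(zs)
total = sum(z * z for z in zs)

# The two parts add up to the full sum of squares.
assert math.isclose(stouffer_sq + spread, total)
```

When the composite Z is near its expected size, nearly all of the sum of squares comes from the spread term, which is why the two analyses tend to track each other.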
More
Many other analyses and graphs have been generated, and some show
certain details and perspectives that may interest you. The extended analysis
page has most of the figures shown here, but in the context of the
developing analysis program over the first few days following the tragedy.
A question of particular interest is whether distance makes a difference.
Dean Radin separated the results by location of the eggs in a careful
analysis. Peter Bancel has been looking at the inter-egg correlations
and has provided a draft report. There is also an Interpretations
page in the works, and one that simply presents the flood of messages from
people all over the world who are involved in the GCP/EGG project. For
more details about the project itself, you can go to the GCP home page where
you will find links to all aspects.