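The two-point angular autocorrelation function $w(\theta)$ is defined by the
joint probability
$\delta P$ of finding two sources, one in each of the elements of solid angle
$\delta\Omega_1$ and $\delta\Omega_2$, separated by angle $\theta$:
$$\delta P = N^2 \, [1 + w(\theta)] \, \delta\Omega_1 \, \delta\Omega_2 ,$$
where $N$ is the mean surface density. Many derivations of estimators
of $w(\theta)$ have been given before (see, for example, Peebles 1980).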
For those unfamiliar with the field, we outline a basic derivation
given by Sicotte (1995).
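It follows from the definition that for a catalog of $n$ data points covering a solid angle $\Omega$, the mean number of sources at a distance between $\theta$ and $\theta + \delta\theta$ from a randomly picked data point is $(n-1)[1 + w(\theta)]\langle\delta\Omega\rangle/\Omega$, where $\langle\delta\Omega\rangle$ is the mean solid angle about a randomly chosen data point. The total number of pairs with separations in the interval $\theta$ to $\theta + \delta\theta$ is then given by $DD(\theta) = \frac{1}{2} n(n-1)[1 + w(\theta)]\langle\delta\Omega\rangle/\Omega$. $DD(\theta)$ can be measured directly from the catalog and, when combined with an estimate of $\langle\delta\Omega\rangle$, can give an estimate of $w(\theta)$. For an all-sky catalog, $\langle\delta\Omega\rangle = 2\pi \sin\theta \, \delta\theta$ and $\Omega = 4\pi$. For surveys with complicated geometries, it is more practical to calculate $\langle\delta\Omega\rangle$ using a field of randomly distributed points covering the same area as the survey. Assuming there are the same number of random points as there are data points, the number of pairs of random points with separations between $\theta$ and $\theta + \delta\theta$ is given by $RR(\theta) = \frac{1}{2} n(n-1)\langle\delta\Omega_R\rangle/\Omega$. One can clearly measure this quantity and obtain a value for $\langle\delta\Omega_R\rangle$ which is approximately equal to $\langle\delta\Omega\rangle$. A simple estimator of $w(\theta)$ is thus given by $DD/RR - 1$. Strictly, one needs to have more random points than data points (with the pair counts rescaled to the same total number of pairs) so that the estimate is not limited by statistical errors in the random points.
An improvement on this estimate of $w(\theta)$ can be obtained using the quantity $DR(\theta)$: the number of pairs of points separated by angle $\theta$ where one point is taken from the data field and one is from the random field. Using DR instead of RR enables one to measure the mean solid angle around data points [$\langle\delta\Omega\rangle$] as opposed to the mean solid angle around random points [$\langle\delta\Omega_R\rangle$].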
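The pair-counting step can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the procedure used for the survey: it uses a brute-force count that stores all n(n-1)/2 separations (so it suits only a few thousand points), it takes positions as unit vectors, and the function names `random_points_on_sphere`, `auto_pair_counts` and `simple_estimator` are hypothetical.

```python
import numpy as np

def random_points_on_sphere(n, rng):
    """Draw n points uniformly over the full sky; returns unit vectors (n, 3)."""
    z = rng.uniform(-1.0, 1.0, n)            # uniform in cos(polar angle)
    phi = rng.uniform(0.0, 2.0 * np.pi, n)   # uniform in azimuth
    s = np.sqrt(1.0 - z * z)
    return np.column_stack((s * np.cos(phi), s * np.sin(phi), z))

def auto_pair_counts(xyz, edges):
    """Count unique pairs in each angular-separation bin (edges in radians)."""
    cosang = np.clip(xyz @ xyz.T, -1.0, 1.0)  # guard arccos against round-off
    iu = np.triu_indices(len(xyz), k=1)       # each pair once, no self-pairs
    theta = np.arccos(cosang[iu])
    counts, _ = np.histogram(theta, bins=edges)
    return counts

def simple_estimator(dd, rr):
    """w = DD/RR - 1, assuming equal numbers of data and random points."""
    return np.asarray(dd, dtype=float) / np.asarray(rr, dtype=float) - 1.0

# Usage: an unclustered "data" set compared against an independent random set.
rng = np.random.default_rng(0)
edges = np.linspace(0.1, 1.0, 6)
dd = auto_pair_counts(random_points_on_sphere(500, rng), edges)
rr = auto_pair_counts(random_points_on_sphere(500, rng), edges)
w = simple_estimator(dd, rr)   # expected to scatter around zero
```

For the all-sky case the random counts can be checked against the analytic expectation, since a bin from $\theta_1$ to $\theta_2$ subtends a solid angle $2\pi(\cos\theta_1 - \cos\theta_2)$ out of $4\pi$.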
Sicotte (1995) gives a detailed comparison of the various estimators of $w(\theta)$. We have chosen to quote results using the standard (S) estimator [$w_S = 2DD/DR - 1$] as well as the estimator suggested by Landy and Szalay (1993, hereafter LS) [$w_{LS} = (DD - DR + RR)/RR$]. The LS estimator has been shown to have smaller uncertainties on larger scales, and Sicotte claims it will remove the effects of large-scale fluctuations in density for estimates of $w(\theta)$ at small $\theta$. The method introduced by Hamilton (1993) [$w_H = 4\,DD\,RR/DR^2 - 1$] gives results very similar to the LS estimator.
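As a rough illustration, the three estimators can be written in terms of pair counts normalised by the total number of pairs of each type, which also covers the case of more random points than data points. In that form they read $dd/dr - 1$, $(dd - 2\,dr + rr)/rr$ and $dd\,rr/dr^2 - 1$; for equal, large numbers of data and random points these reduce to $2DD/DR - 1$, $(DD - DR + RR)/RR$ and $4\,DD\,RR/DR^2 - 1$ in raw counts. The function name `estimators` and its arguments are hypothetical.

```python
import numpy as np

def estimators(dd, dr, rr, n_data, n_rand):
    """Standard, Landy-Szalay and Hamilton estimates from raw pair counts.

    dd, rr : unique auto-pair counts per bin (each pair counted once)
    dr     : data-random cross-pair counts per bin
    """
    # Normalise each count by the total number of pairs of that type.
    dd = np.asarray(dd, dtype=float) / (n_data * (n_data - 1) / 2.0)
    dr = np.asarray(dr, dtype=float) / (n_data * n_rand)
    rr = np.asarray(rr, dtype=float) / (n_rand * (n_rand - 1) / 2.0)
    return {
        "S": dd / dr - 1.0,                # standard estimator
        "LS": (dd - 2.0 * dr + rr) / rr,   # Landy and Szalay (1993)
        "H": dd * rr / dr**2 - 1.0,        # Hamilton (1993)
    }
```

Empty bins (zero DR or RR counts) would produce divisions by zero here, so a real analysis would need to mask or widen such bins.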
To determine the uncertainties associated with each estimate, a `bootstrap' analysis of the errors was performed. This method of estimating uncertainties is described in detail in Ling, Frenk and Barrow (1986) and in Fisher et al. (1994). Each `bootstrap' catalog is the same size as the data catalog and is generated using the following procedure. A source is picked at random from the data catalog and inserted into the bootstrap catalog. Sources drawn at random, with replacement, from the data catalog continue to be added until the bootstrap catalog has the same number of sources as the data catalog. Some of the sources in the original data set will appear more than once in the bootstrap catalog; some will not appear at all. A set of these bootstrap catalogs is generated and the correlation function is calculated for each one. The result is that at each $\theta$ we produce a set of normally distributed estimates of the correlation function. The variance around the mean can then be used as an estimate of the uncertainty in the measurement of $w(\theta)$ at each $\theta$. This process established that the S and LS estimators of $w(\theta)$ had similar uncertainties associated with them out to about . The Poissonian estimate of the uncertainties given by is less than the bootstrap estimate by a factor of 2 on small scales () and by more than an order of magnitude on larger scales ().
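The resampling procedure can be sketched as follows. This is a minimal sketch, assuming the catalog is held in a NumPy array indexed by source and that `estimate` is any user-supplied function returning the correlation function in each bin for a given catalog; both names, and the default of 100 resamplings, are hypothetical choices rather than values taken from the analysis.

```python
import numpy as np

def bootstrap_uncertainty(data, estimate, n_boot=100, seed=0):
    """Mean and standard deviation of `estimate` over bootstrap catalogs."""
    rng = np.random.default_rng(seed)
    n = len(data)
    results = []
    for _ in range(n_boot):
        # Draw n sources with replacement: some repeat, some are left out.
        idx = rng.integers(0, n, size=n)
        results.append(estimate(data[idx]))
    results = np.asarray(results, dtype=float)
    # Spread of the resampled estimates in each bin is the quoted uncertainty.
    return results.mean(axis=0), results.std(axis=0, ddof=1)
```

Because sources are drawn with replacement, a bootstrap catalog contains duplicated positions; these form zero-separation pairs, which matter only if the smallest bin extends down to zero separation.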
As explained above, the random field measurements RR and DR correct for the `edge effects'; that is, they estimate the mean solid angle about points in the area of the survey for a given $\theta$. Therefore, the random field should contain all the biases that the data field contains. We consider the following: