Next: 4 Results Up: The Angular Two-Point Correlation Previous: 2 The Catalog

3 Determining the Correlation Function

The two-point angular autocorrelation function $w(\theta)$ is defined by the joint probability $\delta P$ of finding two sources, one in each of the elements of solid angle $\delta\Omega_1$ and $\delta\Omega_2$, separated by angle $\theta$:

$$\delta P = N^2 \left[1 + w(\theta)\right] \delta\Omega_1\, \delta\Omega_2,$$

where $N$ is the mean surface density. Many derivations of estimators of $w(\theta)$ have been given before (see, for example, Peebles 1980). For those unfamiliar with the field, we outline a basic derivation given by Sicotte (1995).

It follows from the definition that for a catalog of $n$ data points covering a solid angle $\Omega$, the mean number of sources at a distance $\theta \pm \delta\theta/2$ from a randomly picked data point is $N\left[1 + w(\theta)\right]\langle\delta\Omega\rangle$, where $N = n/\Omega$ and $\langle\delta\Omega\rangle$ is the mean solid angle about a randomly chosen data point. The total number of pairs with separations in the interval $\theta \pm \delta\theta/2$ is then given by $DD(\theta) = \frac{1}{2}\, n N \left[1 + w(\theta)\right]\langle\delta\Omega\rangle$. $DD$ can be measured directly from the catalog and, when combined with an estimate of $\langle\delta\Omega\rangle$, gives an estimate of $w(\theta)$. For an all-sky catalog, $\Omega = 4\pi$ and $\langle\delta\Omega\rangle = 2\pi \sin\theta\, \delta\theta$. For surveys with complicated geometries, it is more practical to calculate $\langle\delta\Omega\rangle$ using a field of randomly distributed points covering the same area as the survey. Assuming there are the same number of random points as there are data points, the number of pairs of random points with separations in the interval $\theta \pm \delta\theta/2$ is given by $RR(\theta) = \frac{1}{2}\, n N \langle\delta\Omega_R\rangle$. One can measure this quantity directly and obtain a value for $\langle\delta\Omega_R\rangle$, which is approximately equal to $\langle\delta\Omega\rangle$. A simple estimator of $w(\theta)$ is thus given by $DD/RR - 1$. Strictly, one needs more random points than data points so that the estimate is not limited by statistical errors in the random points. An improvement on this estimate of $w(\theta)$ can be obtained using the quantity $DR(\theta)$: the number of pairs of points separated by angle $\theta$ where one point is taken from the data field and one from the random field. Using $DR$ instead of $RR$ enables one to measure the mean solid angle around data points [$\langle\delta\Omega\rangle$] as opposed to the mean solid angle around random points [$\langle\delta\Omega_R\rangle$].
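The pair counting behind the simple estimator can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the survey: the function names, the (RA, Dec) tuple layout in degrees, and the brute-force O(n²) loop are all assumptions made for clarity, and the estimator assumes the data and random fields contain the same number of points.

```python
import math

def angular_sep_deg(p, q):
    """Great-circle separation in degrees between two (ra, dec) points in degrees."""
    ra1, dec1 = math.radians(p[0]), math.radians(p[1])
    ra2, dec2 = math.radians(q[0]), math.radians(q[1])
    # The haversine form stays accurate at the small separations of interest.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def auto_pair_counts(points, edges):
    """Count unordered pairs of one field in each bin [edges[k], edges[k+1])."""
    counts = [0] * (len(edges) - 1)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            sep = angular_sep_deg(points[i], points[j])
            for k in range(len(counts)):
                if edges[k] <= sep < edges[k + 1]:
                    counts[k] += 1
                    break
    return counts

def simple_estimator(dd, rr):
    """w = DD/RR - 1 per bin; assumes equal numbers of data and random points."""
    return [d / r - 1.0 if r > 0 else float("nan") for d, r in zip(dd, rr)]
```

Calling `auto_pair_counts` once on the data field and once on the random field, with the same bin edges, gives the `dd` and `rr` lists that `simple_estimator` expects.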

Sicotte (1995) gives a detailed comparison of the various estimators of $w(\theta)$. We have chosen to quote results using the standard (S) estimator [$w_S = 2DD/DR - 1$] as well as the estimator suggested by Landy and Szalay (1993, hereafter LS) [$w_{LS} = (DD - DR + RR)/RR$]. The LS estimator has been shown to have smaller uncertainties on larger scales, and Sicotte claims it will remove the effects of large-scale fluctuations in density for estimates of $w(\theta)$ at small $\theta$. The method introduced by Hamilton (1993) [$w_H = 4\,DD \cdot RR/DR^2 - 1$] gives results very similar to the LS estimator.
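For comparison, the three estimators can be written as one-line functions of the raw pair counts in a single angular bin. The sketch below assumes equal numbers of data and random points, so that DR contains roughly twice as many pairs as DD or RR; that assumption is where the factors of 2 and 4 come from. With a different ratio of random to data points the counts would need to be renormalized first. The function names are illustrative.

```python
def w_standard(dd, dr):
    """Standard (S) estimator, 2*DD/DR - 1, for raw counts in one bin."""
    return 2.0 * dd / dr - 1.0

def w_landy_szalay(dd, dr, rr):
    """Landy and Szalay (LS) estimator, (DD - DR + RR)/RR, for raw counts."""
    return (dd - dr + rr) / rr

def w_hamilton(dd, dr, rr):
    """Hamilton estimator, 4*DD*RR/DR**2 - 1, for raw counts."""
    return 4.0 * dd * rr / (dr * dr) - 1.0

# An unclustered field with n = 10 points in both catalogs has, in the
# idealized limit, DD = RR = n*n/2 and DR = n*n, so all three return zero.
dd, dr, rr = 50.0, 100.0, 50.0
print(w_standard(dd, dr), w_landy_szalay(dd, dr, rr), w_hamilton(dd, dr, rr))
```

All three agree whenever DR = 2 RR, that is, when the mean solid angle around data points matches that around random points. They differ only in how they respond to departures from that condition.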

To determine the uncertainties associated with each estimate, a `bootstrap' analysis of the errors was performed. This method of estimating uncertainties is described in detail in Ling, Frenk and Barrow (1986) and in Fisher et al. (1994). A set of `bootstrap' catalogs, each the same size as the data catalog, is generated using the following procedure. A source is picked at random from the data catalog and inserted into the first bootstrap catalog. Random sources from the data catalog continue to be added to the bootstrap catalog until it has the same number of sources as the data catalog. Some of the sources in the original data set will appear more than once in the bootstrap catalog; some will not appear at all. The procedure is repeated to build a set of bootstrap catalogs, and the correlation function is calculated for each one. The result is that at each $\theta$ we produce a set of normally distributed estimates of the correlation function. The variance around the mean can then be used as an estimate of the uncertainty in the measurement of $w$ at each $\theta$. This process established that both the S estimator and the LS estimator of $w(\theta)$ had similar uncertainties associated with them out to about . The Poissonian estimate of the uncertainties given by is less than the bootstrap estimate by a factor of 2 on small scales () and by more than an order of magnitude on larger scales ().
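The resampling procedure is short enough to sketch directly. In the Python below, `estimate` is a hypothetical callable that takes a catalog and returns one value of the correlation function per angular bin; the number of bootstrap catalogs (`n_boot`) and the seed are arbitrary illustrative choices. The function returns the sample standard deviation in each bin, that is, the square root of the variance around the mean.

```python
import random
import statistics

def bootstrap_catalog(catalog, rng):
    """Draw len(catalog) sources with replacement from the data catalog."""
    return [rng.choice(catalog) for _ in catalog]

def bootstrap_uncertainty(catalog, estimate, n_boot=100, seed=0):
    """Per-bin standard deviation of `estimate` over n_boot bootstrap catalogs.

    `estimate` is a user-supplied (hypothetical) function mapping a catalog
    to a list with one correlation-function value per angular bin.
    """
    rng = random.Random(seed)
    samples = [estimate(bootstrap_catalog(catalog, rng)) for _ in range(n_boot)]
    n_bins = len(samples[0])
    return [statistics.stdev([s[k] for s in samples]) for k in range(n_bins)]
```

Because sources are drawn with replacement, a bootstrap catalog contains exact duplicates. A pair counter applied to it will see zero-separation pairs, so the smallest bin edge should be chosen with that in mind.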

As explained above, the random field measurements RR and DR correct for the `edge effects'; that is, they estimate the mean solid angle about points in the area of the survey for a given $\theta$. Therefore, the random field should contain all the biases that the data field contains. We consider the following:

In the correlation analysis, we have used a catalog in which all sources within of each other are considered a single source. This constraint is reproduced in the random fields.
The coverage of the data is reproduced in the random fields using the pixel coverage map where each pixel represents a () area. Random points are generated in any region where there is data coverage. The flux density threshold for detection of a source depends on the local rms noise, which varies slightly across the survey area. To check whether any systematic variations in sensitivity could affect the correlation function estimate, we used the coverage map to generate a random field which also `missed' sources in noisy areas even though they had a flux density of $S > 1$ mJy. Random sources were assigned a random flux density according to Windhorst et al.'s (1985) log N-log S curve. If the source flux density was below 5 times the rms noise given in the coverage map then it was discarded. The results of this are noted in
The presence of sidelobe contamination could also be included in the random fields. Instead, we decided to use the correlation function to obtain an independent estimate of the sidelobe contamination.
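The noise-dependent random field described above can be illustrated with a short Python sketch. Several pieces are assumptions rather than details from the text: the coverage map is represented as a dictionary from pixel indices to rms noise in mJy, pixels are treated as flat squares (ignoring the cos(Dec) factor), and the flux densities are drawn from a simple power law as a stand-in for the Windhorst et al. (1985) log N-log S curve, which is not reproduced here. The merging of close sources is not included.

```python
import random

def draw_flux_mjy(rng, s_min=1.0, slope=1.0):
    """Stand-in flux draw with N(>S) proportional to S**(-slope) above s_min.

    This is NOT the Windhorst et al. curve; it is a placeholder power law.
    """
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids division by zero
    return s_min * u ** (-1.0 / slope)

def make_random_field(rms_map, n_points, rng, pix_deg=1.0, snr_cut=5.0):
    """Generate n_points random positions inside covered pixels.

    rms_map maps (ix, iy) pixel indices to the local rms noise in mJy;
    only pixels with data coverage appear as keys (assumed layout).
    """
    pixels = sorted(rms_map)
    field = []
    while len(field) < n_points:
        ix, iy = rng.choice(pixels)
        flux = draw_flux_mjy(rng)
        if flux < snr_cut * rms_map[(ix, iy)]:
            continue  # source would be 'missed' in this noisy pixel
        # Place the point uniformly inside the chosen pixel (flat approximation).
        field.append(((ix + rng.random()) * pix_deg, (iy + rng.random()) * pix_deg))
    return field
```

With a map such as `{(0, 0): 0.15, (2, 0): 10.0}`, nearly all accepted points fall in the quiet pixel, because a source in the noisy pixel must exceed 50 mJy to survive the cut.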

In addition to the angular autocorrelation function for the catalog as a whole and for various source subsamples, we have determined the angular cross-correlation function for the radio sources and Abell clusters. We use the estimator

$$w_{cc}(\theta) = \frac{Dd(\theta)}{Rd(\theta)} - 1,$$

where $Dd$ is the number of radio sources separated by $\theta$ degrees from an Abell cluster center and $Rd$ is the number of random field points $\theta$ degrees from an Abell cluster center.
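A cross-correlation of this kind can be sketched by counting pairs between two different lists. The Python below assumes the estimator has the ratio form Dd/Rd - 1 and that the random field has as many points as the radio catalog; both are assumptions of this sketch, as are the function names and the (RA, Dec) tuple layout in degrees.

```python
import math

def sep_deg(p, q):
    """Great-circle separation in degrees between two (ra, dec) points in degrees."""
    ra1, dec1 = math.radians(p[0]), math.radians(p[1])
    ra2, dec2 = math.radians(q[0]), math.radians(q[1])
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cross_pair_counts(centers, points, edges):
    """Count (center, point) pairs in each separation bin [edges[k], edges[k+1])."""
    counts = [0] * (len(edges) - 1)
    for c in centers:
        for p in points:
            sep = sep_deg(c, p)
            for k in range(len(counts)):
                if edges[k] <= sep < edges[k + 1]:
                    counts[k] += 1
                    break
    return counts

def w_cross(dd_counts, rd_counts):
    """Per-bin Dd/Rd - 1; assumes equal numbers of radio sources and random points."""
    return [d / r - 1.0 if r > 0 else float("nan")
            for d, r in zip(dd_counts, rd_counts)]
```

Calling `cross_pair_counts` with the cluster centers and the radio sources gives Dd, and calling it with the same centers and the random field gives Rd. The same function also yields DR for the autocorrelation estimators when the data field is passed as `centers`.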


Richard L. White, rlw@stsci.edu

FIRST Home Page
1996 Dec 20