The primary condition for assigning an optical counterpart to a radio source has traditionally been positional coincidence; while other factors can enhance or detract from the probability that the association is correct, these are generally second-order effects. If the positional accuracies for both the radio and optical catalogs are very high, positional coincidence alone can afford a high degree of confidence. In BWH, we calculated the expected error rate for associations between the APM and several hypothetical VLA radio surveys with differing angular resolutions. The reliability of associations based on positional coincidence was presented as a function of optical source density on the sky, and as a function of apparent V magnitude at the North Galactic Pole. The results demonstrated the strong dependence of both reliability and completeness on the angular resolution of the radio survey.
In practice, the error rate in radio/optical associations depends in a complex way on the relative astrometric accuracy of the two data sets, the optical morphology of candidate objects, and the radio source morphology. In this section we quantify empirically the reliability of associations between FIRST radio sources and optical counterparts on the POSS-I plates as measured by the APM.
In our analysis of the FIRST-APM match rate, we retain only the closest optical match to each radio source and vice versa. While the optical catalog we present below includes all matches out to , not just the closest one, using only the closest matches when describing the identification statistics makes the background and true match rates easier to understand. We want to know the number of radio sources that have optical counterparts, but we want to avoid double-counting (e.g., a particular optical object may be the closest counterpart for two different radio sources). Also, neither catalog can have pairs of sources with separations of less than a few arcseconds, which means that there is effectively a ``hole'' in the catalog around each radio or optical source. This makes it very difficult to calculate chance coincidence rates, since true matches suppress additional false matches in a complicated manner.
We can avoid both of these problems by using only the closest matches. This also modifies the chance background rate, since once a source has a counterpart there can be no additional counterparts at wider separations. This effect is straightforward to understand and incorporate into the background computation: the area being searched for new matches at a given radius is contributed only by FIRST sources that do not already have a closer optical counterpart. Another complication when using closest matches is that the matching circles for close pairs of radio sources can overlap. An optical source that falls in the overlap region between two radio sources can only be assigned as the counterpart to one of them. This is also a mathematically tractable problem. Both effects are treated as a reduction in the effective sky area that is searched as a function of matching separation.
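The first of these effects, the shrinking search area, can be sketched in a few lines. This is a minimal illustration, assuming a uniform optical surface density and ignoring the overlap correction for close radio pairs; the function name `expected_false_matches` and its arguments are hypothetical and are not taken from the analysis itself.

```python
import math

def expected_false_matches(n_radio, observed_counts, bin_edges, density):
    """Expected chance closest-matches per separation bin.

    n_radio         -- total number of radio sources
    observed_counts -- closest matches actually found in each bin (real + chance)
    bin_edges       -- bin boundaries in arcsec (len = len(observed_counts) + 1)
    density         -- optical surface density in sources per arcsec^2 (assumed uniform)
    """
    expected = []
    n_unmatched = n_radio  # sources still lacking a closer counterpart
    for k, n_obs in enumerate(observed_counts):
        # Annulus searched around each still-unmatched radio source.
        annulus = math.pi * (bin_edges[k + 1] ** 2 - bin_edges[k] ** 2)
        expected.append(density * annulus * n_unmatched)
        # Sources matched in this bin no longer contribute search area.
        n_unmatched -= n_obs
    return expected
```

Each bin's effective search area is the annulus area multiplied by the number of sources that remain unmatched, which is the reduction in effective sky area described above.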
When both of these effects are taken into account, it is possible to predict the number of chance (false) coincidences as a function of the radio-optical separation. A simple model for this prediction that assumes a uniform, random distribution of optical sources on the sky predicts more coincidences at large separations than are actually seen, owing to the variation in optical source density on the sky. In the presence of a varying optical source density, radio sources that remain unmatched at large separations are more likely to be found in low-density areas of the sky; consequently, the effective mean optical source density decreases as the separation increases. If the variation is not too large, this can also be modeled fairly simply. For an ensemble of radio sources, the effective background source density is
Our background model, then, has two parameters: the mean source density and the variance in the density . Values for these quantities were estimated from a spurious match catalog generated by offsetting the coordinates of the radio sources by to the south. We use this procedure instead of simply adopting the match rate between, say, 10" and 20" around the real source positions in order to minimize any enhancement in the false rate resulting from the presence of real radio-optical associations at large separations. Such associations result both from optical counterparts to multiple-component FIRST objects (where the optical position need not match any one of the cataloged FIRST components closely) and from clustering of galaxies around FIRST objects (which will produce uncharacteristically high optical source surface densities in the vicinity of the radio sources). The parameters for the background due to various types of APM objects are given in Table 1.
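To illustrate why the effective density falls with separation, the sketch below weights each local density by the probability that a radio source in that region is still unmatched at separation r, assuming Poisson-distributed chance counterparts. The small-variance approximation, mean minus pi r^2 times variance, follows from expanding that weighted mean to first order; it is offered as an assumption for illustration and is not necessarily the expression adopted in this paper. Both function names are hypothetical.

```python
import math

def effective_density(densities, r):
    """Mean density seen by sources that are still unmatched at separation r.

    densities -- local optical densities (sources per arcsec^2), one per sky region
    r         -- separation in arcsec
    """
    # Probability of no chance counterpart within r is exp(-pi r^2 rho),
    # so low-density regions dominate the unmatched population at large r.
    weights = [math.exp(-math.pi * r * r * rho) for rho in densities]
    return sum(w * rho for w, rho in zip(weights, densities)) / sum(weights)

def effective_density_approx(mean, variance, r):
    """First-order (small-variance) approximation to effective_density."""
    return mean - math.pi * r * r * variance
```

At r = 0 the weighted mean reduces to the ordinary mean density, and for a perfectly uniform sky it stays constant at all separations.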
In Figure 8, we display the result of matching all 382,892 radio sources in our catalog to the astrometrically corrected APM catalog. The plot shows the cumulative excess of matched sources over the background of chance coincidences as a function of the offset between the radio and optical positions. We estimate that 98% of the APM sources within 1" of a FIRST source are physically associated with the radio source; 42,400 sources meet this criterion. Even out to 2", 94.5% of the 59,700 associations are real. Figure 8 indicates that some real matches occur out to " although the reliability decreases steadily as the separation increases. Integrating under the curve of Figure 8 out to 4" implies 61,800 real associations (16% of all radio catalog entries). Inside the 1" radius, 16% of the optical counterparts are classified as stellar on both plates, while 41% are classified as non-stellar on both plates; in the remaining cases, the classifiers disagree or the object is only detected on one plate. Note that, at faint magnitudes, classification becomes difficult owing to the limited number of pixels above threshold; in addition, active nuclei can make a galaxy appear stellar in the blue band, and the bulges of faint ellipticals and S0s are generally unresolved. Thus, galaxies are increasingly classified as stellar as the plate limit is approached.
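The quoted percentages can be turned back into counts with simple arithmetic. The short sketch below uses only the numbers given in the preceding paragraph; the helper names are hypothetical.

```python
def implied_false_matches(n_matched, reliability):
    """Number of chance coincidences implied by a quoted reliability."""
    return n_matched * (1.0 - reliability)

def identified_fraction(n_real, n_catalog):
    """Fraction of catalog entries with a real optical association."""
    return n_real / n_catalog

false_within_1 = implied_false_matches(42_400, 0.98)   # about 850 chance matches
false_within_2 = implied_false_matches(59_700, 0.945)  # about 3,300 chance matches
fraction_4 = identified_fraction(61_800, 382_892)      # about 0.16
```

The chance contamination roughly quadruples between 1" and 2" while the number of matches grows by less than half, which is why the reliability declines with separation.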
In Table 3 we tabulate the completeness and reliability of objects that are only detected on a single red or blue plate. These have been estimated by shifting the radio positions by 3 arcmin in declination and running the matching analysis in an identical manner on this shifted dataset. This shows that 92.7% of the blue-only matches within 1" are real, and 97.5% of the red-only matches within 1" are real. These percentages are lower limits on the fraction of single-band detections that are real objects, since we expect some of the chance associations to be with real celestial objects as well.
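A control run of this kind can be sketched as follows. This is a minimal brute-force illustration, assuming positions given as (RA, Dec) in degrees and a small-angle separation formula; the function names are hypothetical, and a catalog of this size would need a spatial index rather than nested loops.

```python
import math

def sep_arcsec(p, q):
    """Small-angle separation in arcsec between two (ra, dec) positions in degrees."""
    dra = (p[0] - q[0]) * math.cos(math.radians(0.5 * (p[1] + q[1])))
    return 3600.0 * math.hypot(dra, p[1] - q[1])

def mutual_closest(radio, optical, max_sep):
    """Pairs (i, j) where optical j is closest to radio i and vice versa."""
    if not optical:
        return []
    pairs = []
    for i, r in enumerate(radio):
        seps = [sep_arcsec(r, o) for o in optical]
        j = min(range(len(optical)), key=seps.__getitem__)
        if seps[j] > max_sep:
            continue
        # Keep the pair only if radio i is also the closest radio source to optical j.
        back = min(range(len(radio)), key=lambda k: sep_arcsec(radio[k], optical[j]))
        if back == i:
            pairs.append((i, j))
    return pairs

def shifted(radio, offset_arcmin):
    """Copy of the radio positions displaced in declination by offset_arcmin."""
    return [(ra, dec + offset_arcmin / 60.0) for ra, dec in radio]
```

Running `mutual_closest` on both `radio` and `shifted(radio, 3.0)` with the same optical list gives the observed and chance match counts, from which a reliability estimate follows.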
The matching results are more complicated to analyze when we take into account the complex radio sources that get broken into two or more components in the FIRST catalog. We call these ``gregarious'' sources (as distinguished from isolated sources) because they are found in clumps on the sky. Note that some gregarious sources are simply the chance superposition of two unrelated radio emitters at different distances; nonetheless, from our vantage point, they are gregarious, and confuse the matching statistics in a manner similar to that of the real multi-component objects. We define the sociological boundary between gregarious and isolated at ; i.e., any source with no other catalog entry within a radius of is classified as isolated, and all other sources are labeled gregarious. Such a fixed boundary is arbitrary, and, indeed, a small number of very extended objects will be incorrectly classified as isolated; we discuss this matter further in § 5.
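The isolated/gregarious split amounts to a neighbor test against a fixed radius. The sketch below is a brute-force illustration; the boundary is left as the parameter `radius_arcsec`, and the function name and the small-angle separation formula are assumptions made for the example.

```python
import math

def classify_sources(positions, radius_arcsec):
    """Label each (ra, dec) position in degrees as 'isolated' or 'gregarious'."""
    labels = []
    for i, (ra_i, dec_i) in enumerate(positions):
        label = "isolated"
        for j, (ra_j, dec_j) in enumerate(positions):
            if i == j:
                continue
            dra = (ra_i - ra_j) * math.cos(math.radians(0.5 * (dec_i + dec_j)))
            sep = 3600.0 * math.hypot(dra, dec_i - dec_j)
            if sep < radius_arcsec:
                # Any other catalog entry inside the radius makes the source gregarious.
                label = "gregarious"
                break
        labels.append(label)
    return labels
```

Because the test is purely positional, chance superpositions of unrelated emitters are labeled gregarious exactly as physical multi-component sources are.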
In Figure 9, we display the number of isolated matched sources as a function of the offset between the radio and optical positions in 0.1" bins, normalized by the annular area of each bin. Figure 10 displays the ratio of the number of matches to the predicted false rate as a function of separation.
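The normalization used for this kind of plot divides each bin count by the area of the corresponding annulus, so that a uniform background appears flat. A minimal sketch, with the 0.1" bin width taken from the text and the function name and `max_sep` argument chosen for illustration:

```python
import math

def surface_density_histogram(separations, bin_width=0.1, max_sep=1.0):
    """Matches per arcsec^2 in annular bins of width bin_width (arcsec)."""
    n_bins = int(round(max_sep / bin_width))
    counts = [0] * n_bins
    for s in separations:
        k = int(s / bin_width)
        if 0 <= k < n_bins:
            counts[k] += 1
    densities = []
    for k, c in enumerate(counts):
        r_in, r_out = k * bin_width, (k + 1) * bin_width
        annulus = math.pi * (r_out ** 2 - r_in ** 2)  # area of bin k
        densities.append(c / annulus)
    return densities
```

Without this normalization the raw counts of chance coincidences would rise roughly linearly with separation, since the annular area grows with radius.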
Both the background rate and the typical angular separation between FIRST and APM positions depend strongly on the optical classification. Figure 11 shows the distribution of separations for sources classified as stellar on both plates or non-stellar on both plates. The FIRST-APM positions typically differ by more for galaxies, which have less well-determined optical positions. However, it is easier to identify galaxies confidently as FIRST counterparts because the background rate for galaxies is four times smaller than for stars (see Table 1).
We quantify this effect by calculating the angular radius which contains 90% of the real associations as well as the percentage of false matches within the same radius. For example, these numbers are 4.2" and 17% for the distribution of all FIRST/APM sources as shown in Figure 8. If we restrict the match to APM objects that are classified as galaxies on both plates (Fig. 11b), the values are and 2.7%. The 90% radius is quite similar to that for all associations, since galaxies constitute the majority of all identifications; however, the reliability for galaxies is much higher because the background rate for galaxies is smaller (Fig. 11). By comparison, if we restrict ourselves to APM sources that are stellar on both POSS plates (Fig. 11a), the 90% radius and error rate are and 6%. The smaller footprint of stars on the POSS-I leads to more accurate optical positions (hence the smaller 90% radius). These and other cases are summarized in Table 2 and in Figure 12, which displays the completeness and reliability of matches as a function of separation.
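Given binned match counts and a background prediction, the 90% radius and the false-match percentage inside it can be computed as in the sketch below. It assumes non-empty binned inputs and that the total number of real associations is the background-subtracted excess summed over all supplied bins; the function name is hypothetical.

```python
def radius_containing_fraction(bin_edges, observed, expected_false, fraction=0.9):
    """Return (radius, false_fraction) for the smallest bin edge enclosing
    `fraction` of the background-subtracted (real) matches.

    bin_edges      -- bin boundaries in arcsec (len = len(observed) + 1)
    observed       -- matches found in each bin
    expected_false -- predicted chance coincidences in each bin
    """
    real = [o - f for o, f in zip(observed, expected_false)]
    total_real = sum(real)
    cum_real = cum_obs = cum_false = 0.0
    for k in range(len(observed)):
        cum_real += real[k]
        cum_obs += observed[k]
        cum_false += expected_false[k]
        if cum_real >= fraction * total_real:
            return bin_edges[k + 1], cum_false / cum_obs
    return bin_edges[-1], cum_false / cum_obs
```

The radius is resolved only to the nearest bin edge, and noise in the excess at large separations can shift it, so fine bins and a well-determined background help.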
The statistical agreement between FIRST and APM positions is also affected by the morphology of FIRST sources. If we restrict the FIRST sample to point-like radio sources (but include all APM sources), the 90% association radius is 2.0" with a false rate of 5%. Limiting the discussion to sources that are point-like in both the radio and the optical results in 90% of the matches within 1.1" with a 2.4% false rate.
Figure 13 compares the histogram of separations for isolated sources with that for the gregarious sources. It can be seen that gregarious sources contribute most of the matches at large separations; indeed, they show evidence for a statistically significant excess of matches even beyond . Many of the gregarious FIRST sources are components of classical double radio sources, which may have no radio component at the center of the double. Consequently the optical counterpart will typically be the double separation from each component. We now begin to explore counterpart identification strategies for such sources.
The wide variety of radio source morphologies (not to mention the extent to which their appearance depends on the resolution of the survey) makes it difficult to design a robust algorithm that predicts the locations of the optical counterparts to multiple-component sources. We present here an empirical approach to the problem for the simplest class of such sources: isolated doubles.
Figure 14 displays the distribution of isolated pairs of FIRST sources as a function of their separation. The minimum value of is imposed by the source detection algorithm (WBGH). The fall-off in the number of detected doubles beyond reflects a number of factors: 1) the survey beam resolves out very extended features and will therefore not detect sources on the largest scales, 2) the exclusion of triples and other multiple component sources from this subsample will preferentially remove extended sources on larger scales, and 3) there is a real decline in the number of large angular-diameter objects. For comparison with the APM catalog, we have selected objects with separations in the range .
The lower portion of Figure 15 displays the distribution of optical objects in the vicinity of the 21,579 pairs of radio sources meeting this criterion. The x-axis for each pair is defined by the line joining the centroids of the two objects; the scale is normalized to the separation, with the origin chosen as the brighter of the two components. The upper panel shows a histogram for all optical objects within of the line joining the two components; the expected false rate from a uniform distribution of sources with the same mean surface density as the FIRST survey is shown as the dashed line. Several features of the distribution are immediately apparent. There is a large concentration of optical objects coincident with the brighter of the two radio components, suggesting a core-jet morphology. A roughly equal number of identifications is found approximately half way between the two components with a slight bias in the direction of the brighter lobe; the broader spread in the y-direction for these counterparts reflects in part the bending of radio lobes as a consequence of their interaction with the intergalactic medium (e.g., Blanton et al. 2000, 2001). Finally, a much smaller fraction of the identifications is coincident with the weaker of the two lobes. The overall identification rate derived from integrating the upper curve and subtracting the background rate is %, similar to that for the radio sample as a whole.
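The pair-frame coordinates used for this kind of stacked plot can be sketched as a projection onto the axis joining the two components. The example assumes tangent-plane offsets in arcsec, normalizes both axes by the pair separation, and picks an arbitrary sign for the perpendicular coordinate; these choices and the function name are illustrative assumptions.

```python
import math

def pair_frame_coordinates(bright, faint, optical):
    """Position of an optical object in the frame of a radio double.

    bright, faint, optical -- (x, y) tangent-plane offsets in arcsec.
    Returns (u, v): u runs along the line from the brighter (u = 0) to the
    fainter (u = 1) component; v is the perpendicular offset in the same units.
    """
    dx, dy = faint[0] - bright[0], faint[1] - bright[1]
    sep = math.hypot(dx, dy)
    ox, oy = optical[0] - bright[0], optical[1] - bright[1]
    u = (ox * dx + oy * dy) / sep ** 2   # projection onto the pair axis
    v = (dx * oy - dy * ox) / sep ** 2   # signed perpendicular offset
    return u, v
```

In this frame an object at the brighter component has u near 0, one at the midpoint has u near 0.5, and one at the weaker component has u near 1, so pairs of different separations and orientations can be overlaid.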
Dividing the optical counterparts by morphological class yields additional interesting results. The objects classified as galaxies on both plates (Figure 16a) produce the most counterparts midway between the components, while the stellar objects are predominantly associated with the brighter radio component (Figure 16b). Figure 17 shows that many of these core components are small in size (panel a) and are much brighter (panel b) than the other component: the median flux ratio for sources where the optical counterpart matches the brighter radio component is 3.0, while the median for the central-component matches is 1.4. When the brighter of the radio components identifies the optical counterpart, it also tends to be the smaller of the two components (panel c). Likewise, for cases in which the dimmer source has the optical counterpart, it tends to be smaller. In contrast, the sources with identifications that lie between the two radio lobes have very narrow ranges of flux density and size ratios near unity, and are virtually all resolved.
Using these trends in radio source component flux density and size ratios, total extents, and optical morphology (in addition to magnitude and color, perhaps), it would be possible to develop a reasonably reliable identification algorithm for double and multiple FIRST sources, although we regard such an effort as beyond the scope of this paper. What is clear from the foregoing analysis is that, since the identification rate for complex sources is similar to that for single-component objects, the effect on the final fraction of radio emitters identifiable at the POSS-I plate limit simply scales with the number of radio components removed from consideration as a consequence of their association with another catalog entry. Using an algorithm that assigns a probability to component associations based on flux density and separation, we find that, for separations up to there are 32,312 doubles, 12,802 triples, and 5,716 groups of four or more sources with a probability of real association . This reduces our catalog to discrete radio emitters and yields an overall identified fraction at the POSS-I limit of %.