Spurious Sources

[arXiv:0709.2358] Cleaning the USNO-B Catalog through automatic detection of optical artifacts, by Barron et al.

Statistically speaking, “false sources” generally fall in the domain of Type I errors, defined by the probability of detecting a signal where there is none. But what if the signal is clear, yet not real?

In astronomical analysis, sources are generally defined with reference to the existing background, as point-fluctuations that exceed some significance threshold defined by the estimated background “in the vicinity”. The threshold is usually set such that we can tolerate “a few” false positives at borderline significance. But that ignores the effect of systematic deviations that can be caused by various instrumental features. Such things are common in X-ray images — window support structures, chip gaps, bad CCD columns, cosmic-ray hits, etc. Optical data are generally cleaner, but by no means immune to the problem. Barron et al. here describe how they have gone through the USNO-B catalog and have modeled and eliminated artifacts coming from diffraction spikes and telescope reflection halos of bright stars.
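The thresholding described above can be sketched roughly as follows. This is only an illustration, assuming Poisson-distributed counts and a local background expectation that is already known; the function names and the default tail probability (roughly a one-sided 3σ level) are hypothetical choices, not anything taken from Barron et al.

```python
import math

def poisson_sf(k, mu):
    """P(N >= k) for N ~ Poisson(mu), computed as 1 minus the CDF up to k-1."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)  # P(N = 0)
    cdf = term
    for n in range(1, k):
        term *= mu / n    # P(N = n) from P(N = n-1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def is_detection(counts, background, alpha=1.35e-3):
    """Flag a cell as a source if its counts are improbably high for the
    estimated local background (alpha is an assumed threshold)."""
    return poisson_sf(counts, background) < alpha
```

Note that a test like this only asks whether the counts exceed the background. A diffraction spike or a reflection halo with enough counts would pass it just as a real star would.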

The bad news? More than 2.3% of the sources are flagged as spurious. Compare that to the typical statistical significance at which detection thresholds are set (usually >3σ).
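To put numbers on that comparison, here is a small sketch assuming a one-sided Gaussian tail for the 3σ threshold. The two quantities are not strictly the same kind of thing (one is a false-positive probability per test, the other a fraction of catalogued sources), so the ratio is only a rough guide.

```python
import math

def gaussian_tail(n_sigma):
    """One-sided probability that a standard normal deviate exceeds n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

p3 = gaussian_tail(3.0)        # nominal false-positive probability at 3 sigma
flagged_fraction = 0.023       # the "more than 2.3%" quoted in the post
ratio = flagged_fraction / p3  # how far the artifact fraction exceeds the nominal rate
```

Under these assumptions the flagged fraction is more than ten times the nominal 3σ false-positive probability.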

  1. hlee:

    Type II error is claiming no signal when there is a signal (failing to reject the null hypothesis when the alternative is true). Type I error is rejecting the null hypothesis when the null is true, i.e. detecting a signal when there is none. The null hypothesis is a subset of the combined hypotheses (the union of null and alternative). I think no signal should be the null hypothesis and the existence of a signal the alternative. The other way around, with the existence of a signal as the null and no signal as the alternative, is an improper statement for hypothesis testing.

    [Response: Thanks for the catch. That was a pyto. -vlk]

    Setting 3σ or 5σ thresholds becomes important when you study the power of the test, defined as one minus the size of the type II error, once you reject the null hypothesis, that is, say the signal is significant. Among many factors, power depends on the sample size. With the same rejection region of the null hypothesis based on 3σ (smaller sample size) or 5σ (larger sample size) thresholds, the power with the larger sample is greater than the power with the smaller sample; in other words, the type II error is smaller with the larger sample. Setting a high σ threshold helps to reduce type II error, false negatives, or the chance of saying no signal when there is a signal. Unfortunately, other factors also determine the power of the test, so a larger σ threshold is not an optimal choice for a reliable source detection rule, not to mention the cost of collecting a large sample and systematic errors.

    09-20-2007, 12:42 am
  2. hlee:

    I have seen many students and some clients from a consulting class confused about how to set the null and alternative hypotheses and how to define type I and II errors accordingly. Hypothesis testing looks very arbitrary and most likely appears as a method to reject the null hypothesis by collecting data. I was not sure whether it was a pyto of typo or a confusion between null and alternative (or type I and II), which led me to write about it.

    09-21-2007, 1:03 pm