#### The Banff Challenge [Eqn]

With the LHC coming online soon, it is appropriate to highlight the Banff Challenge, which was designed as a way to work out how to place bounds on the mass of the Higgs boson. The equations to be solved are quite general, and this is in fact the first attempt I know of in which calibration data are directly and explicitly included in the analysis.

The observables are counts N, Y, and Z, with

N ~ Pois(ε λS + λB),

Y ~ Pois(ρ λB),

Z ~ Pois(ε υ),

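where λS is the parameter of interest (in this case, the mass of the Higgs boson, but it could equally be the intensity of a source), λB is the parameter that describes the background, ρ is the factor that scales the background in the background-only measurement Y relative to N (presumably known, e.g., a ratio of exposure times or collecting areas), ε is the efficiency, or the effective area, of the detector, and υ is the known intensity of a calibrator source.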
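To make the model concrete, here is a minimal simulation sketch using only the Python standard library. The function names, the Knuth-style Poisson sampler, and the numerical values are illustrative assumptions, not part of the challenge specification; ρ and υ are treated as known constants.

```python
import math
import random

def poisson_draw(mean, rng):
    """Draw one Poisson count with Knuth's method (adequate for small means)."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def simulate(lam_s, lam_b, eps, rho, upsilon, rng):
    """Return one (N, Y, Z) triple from the three-Poisson model."""
    n = poisson_draw(eps * lam_s + lam_b, rng)  # source region: signal plus background
    y = poisson_draw(rho * lam_b, rng)          # background-only measurement
    z = poisson_draw(eps * upsilon, rng)        # calibration measurement
    return n, y, z

# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(0)
print(simulate(lam_s=5.0, lam_b=2.0, eps=0.8, rho=3.0, upsilon=20.0, rng=rng))
```

Only λS is of interest; λB and ε are nuisance parameters that the side measurements Y and Z constrain.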

The challenge was (is) to infer the maximum likelihood estimate of λS, and bounds on it, given the observed data {N, Y, Z}. In other words, to compute

p(λS|N,Y,Z) .

It may look like an easy problem, but it isn’t!
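As a rough illustration of what computing p(λS|N,Y,Z) involves, the sketch below writes down the joint likelihood of the three independent counts and marginalizes λB and ε on a brute-force grid. The flat priors, the grid limits, and the example counts are all assumptions made for this sketch; it is not a solution to the challenge and says nothing about frequentist coverage.

```python
import math

def log_pois(k, mean):
    """Log of the Poisson pmf; mean must be > 0."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)

def log_like(lam_s, lam_b, eps, n, y, z, rho, upsilon):
    """Joint log-likelihood of the independent counts N, Y, Z."""
    return (log_pois(n, eps * lam_s + lam_b)
            + log_pois(y, rho * lam_b)
            + log_pois(z, eps * upsilon))

def grid_posterior(n, y, z, rho, upsilon, s_max, b_max, e_max, steps=40):
    """Posterior density of lam_s on a midpoint grid.

    Assumes flat priors on (lam_s, lam_b, eps), truncated at the grid limits.
    """
    def midpoints(hi):
        return [(i + 0.5) * hi / steps for i in range(steps)]
    s_grid, b_grid, e_grid = midpoints(s_max), midpoints(b_max), midpoints(e_max)
    post = []
    for s in s_grid:
        total = 0.0
        for b in b_grid:          # sum out the background rate
            for e in e_grid:      # sum out the efficiency
                total += math.exp(log_like(s, b, e, n, y, z, rho, upsilon))
        post.append(total)
    norm = sum(post) * (s_max / steps)   # normalise to unit area over lam_s
    return s_grid, [p / norm for p in post]

# Hypothetical data and known constants, for illustration only.
s_grid, density = grid_posterior(n=7, y=6, z=16, rho=3.0, upsilon=20.0,
                                 s_max=30.0, b_max=8.0, e_max=2.0)
mode = s_grid[density.index(max(density))]
print(f"posterior mode of lam_s on the grid: {mode:.2f}")
```

The result depends on the assumed priors and on where the grid is truncated, which is one reason the problem is harder than it looks.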

1. ##### brianISU:

Being completely ignorant of the physics of the problem, I have to ask: what assumptions are reasonable to make about the data? E.g., are N, Y, and Z independent random variables? That might begin to make things easier. Then, with a derived likelihood, one could maximize out the nuisance parameters and be left with just the parameter of interest. Or one could try numerical maximization techniques. This problem also seems to be screaming for a Bayesian approach that looks at the posterior of the parameter of interest, especially since there are experts in this field who could give reliable prior information. Of course, everything I am saying is just a fun thought exercise, and I am not claiming I would really know what to do.
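The idea of maximizing out the nuisance parameters can be sketched as a profile likelihood. The code below is an illustrative sketch, not anyone's published method: it fixes λS and alternates closed-form updates for λB and ε (each score equation is a quadratic with one positive root). It assumes N, Y, and Z are independent, that ρ and υ are known, and that Y > 0 and Z > 0; the function names and example counts are hypothetical.

```python
import math

def log_like(lam_s, lam_b, eps, n, y, z, rho, upsilon):
    """Joint Poisson log-likelihood, dropping the constant log-factorial terms."""
    mu_n, mu_y, mu_z = eps * lam_s + lam_b, rho * lam_b, eps * upsilon
    return (n * math.log(mu_n) - mu_n
            + y * math.log(mu_y) - mu_y
            + z * math.log(mu_z) - mu_z)

def profile_log_like(lam_s, n, y, z, rho, upsilon, iters=500):
    """Maximise over (lam_b, eps) at fixed lam_s by coordinate ascent.

    Assumes y > 0 and z > 0 so that both nuisance estimates stay positive.
    Returns (profile log-likelihood, lam_b estimate, eps estimate).
    """
    lam_b, eps = y / rho, z / upsilon   # start from the side-measurement estimates
    for _ in range(iters):
        # Setting d/d(lam_b) = 0 gives a quadratic in lam_b; take the positive root.
        a = eps * lam_s
        q = (1 + rho) * a - n - y
        lam_b = (-q + math.sqrt(q * q + 4 * (1 + rho) * y * a)) / (2 * (1 + rho))
        # Setting d/d(eps) = 0 gives a quadratic in eps (linear when lam_s == 0).
        if lam_s > 0:
            qa = (lam_s + upsilon) * lam_s
            qb = (lam_s + upsilon) * lam_b - (n + z) * lam_s
            eps = (-qb + math.sqrt(qb * qb + 4 * qa * z * lam_b)) / (2 * qa)
        else:
            eps = z / upsilon
    return log_like(lam_s, lam_b, eps, n, y, z, rho, upsilon), lam_b, eps

# Scan lam_s on a coarse grid (hypothetical counts and constants).
scan = [0.25 * i for i in range(81)]
best = max(scan, key=lambda s: profile_log_like(s, 7, 6, 16, 3.0, 20.0)[0])
print("lam_s maximising the profile on the scan:", best)
```

Locating the maximum this way is the easy part; turning the profile into bounds that cover at the stated rate is a separate question.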

07-27-2008, 9:18 pm
2. ##### vlk:

Yeah, N, Y, and Z are independent. With just the first two equations, the problem is very simple, and has been solved analytically numerous times. But including the third one throws it for a loop. The hardest part is to make sure that the frequency coverage on lambda_S is correct at the low counts (N~few) level. See these two for example for how complicated it can become:
Edlefsen, P. (JSM2007) – A Dempster-Shafer Bayesian Solution to the Banff A1 Challenge
Baines, P. (JSM2007) – Upper Limits for Source Detection in the Three-Poisson Model
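Checking coverage at low counts is usually done by simulation: fix the true parameters, generate many data sets, and count how often the reported interval contains the true λS. The sketch below shows that loop with a deliberately naive plug-in interval standing in for a real method. The interval rule, the parameter values, and all function names are assumptions for illustration and are not taken from the talks listed above.

```python
import math
import random

def poisson_draw(mean, rng):
    """Draw one Poisson count with Knuth's method (adequate for small means)."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def simulate(lam_s, lam_b, eps, rho, upsilon, rng):
    """Return one (N, Y, Z) triple from the three-Poisson model."""
    return (poisson_draw(eps * lam_s + lam_b, rng),
            poisson_draw(rho * lam_b, rng),
            poisson_draw(eps * upsilon, rng))

def naive_interval(n, y, z, rho, upsilon):
    """Plug-in estimate +/- 1.96 sigma from first-order error propagation.

    A deliberately naive stand-in used only to exercise the coverage loop.
    """
    if z == 0:
        return 0.0, math.inf               # no calibration counts: no constraint
    eps_hat = z / upsilon
    s_hat = (n - y / rho) / eps_hat
    var = (n + y / rho**2) / eps_hat**2 + s_hat**2 / z
    half = 1.96 * math.sqrt(var)
    return max(0.0, s_hat - half), max(0.0, s_hat + half)

def coverage(interval_fn, lam_s, lam_b, eps, rho, upsilon, trials, rng):
    """Fraction of simulated data sets whose interval contains the true lam_s."""
    hits = 0
    for _ in range(trials):
        n, y, z = simulate(lam_s, lam_b, eps, rho, upsilon, rng)
        lo, hi = interval_fn(n, y, z, rho, upsilon)
        hits += lo <= lam_s <= hi
    return hits / trials

# Hypothetical true values; compare the empirical rate with the nominal 0.95.
rng = random.Random(42)
for true_s in (1.0, 5.0, 20.0):
    frac = coverage(naive_interval, true_s, 2.0, 0.8, 3.0, 20.0, 5000, rng)
    print(f"lam_s = {true_s:5.1f}: empirical coverage {frac:.3f} (nominal 0.95)")
```

Any candidate interval procedure with the same signature can be passed in place of `naive_interval`, and repeating the loop over a grid of true λS, λB, and ε values shows where a method over- or under-covers.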

07-28-2008, 5:16 pm
3. ##### Paul B:

Hmmm, indeed this is a tricky problem! N, Y, and Z are conditionally independent given the parameters, but the real problem is that epsilon is a multiplicative factor on the parameter of interest. That makes inference very sensitive to the efficiency of the detector, and it is the main reason why no method works perfectly across a wide range of source and efficiency values.
Finding the various types of MLE (profile, modified profile, etc.) is not too bad; the real problem is providing reliable estimates of the uncertainty. Finding a 95% confidence interval that actually provides 95% confidence is far easier said than done (see the references for some explanations). This is also the primary criterion for many of the physicists involved, so it leaves plenty of work to be done.
It still bothers me that there isn’t a decent Bayesian solution yet (that I’ve seen anyway), but I’m sure one is out there somewhere…

07-28-2008, 6:17 pm