Dec 27th, 2009| 10:13 pm | Posted by hlee

I am often irked when I see a function normalized over a feasible parameter space and then used as a probability density function (pdf) for further statistical inference. To be a suitable pdf, a function has to be normalized over a measurable space, not over a feasible space. Such practice often yields biased best fits (biased estimators) and improper error bars. On the other hand, validating a measurable space under physics seems complicated. To be precise, we often get lost in translation. Continue reading ‘A short note on Probability for astronomers’ »
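To illustrate how the choice of normalization domain can bias a best fit, here is a minimal Python sketch. It is a toy case assumed for illustration and not taken from the post: exponential data are recorded only inside a window [0, t_max], and the rate is fitted once with the density normalized over [0, ∞) and once with it normalized over the window where data can actually occur. All function names and numbers are hypothetical.

```python
import math
import random


def rate_full_support(xs):
    """MLE of the rate when the density is normalized over [0, inf)."""
    return len(xs) / sum(xs)


def rate_windowed(xs, t_max, lo=1e-6, hi=50.0, iters=100):
    """MLE of the rate when the density is normalized over [0, t_max].

    Solves E[x | rate] = mean(xs) by bisection, where for the truncated
    exponential E[x | rate] = 1/rate - t_max / (exp(rate * t_max) - 1).
    """
    mean = sum(xs) / len(xs)

    def score(rate):
        return 1.0 / rate - t_max / math.expm1(rate * t_max) - mean

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # The model mean decreases with the rate, so a positive score
        # means the root lies at a larger rate.
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def demo(seed=0, n=50000, true_rate=1.0, t_max=2.0):
    """Simulate windowed data and return (full-support fit, windowed fit)."""
    rng = random.Random(seed)
    draws = (rng.expovariate(true_rate) for _ in range(n))
    xs = [x for x in draws if x <= t_max]  # only the window is observed
    return rate_full_support(xs), rate_windowed(xs, t_max)


if __name__ == "__main__":
    naive, windowed = demo()
    print(f"normalized over [0, inf): {naive:.3f}")
    print(f"normalized over window:   {windowed:.3f}")
```

With a true rate of 1, the fit normalized over the wrong domain should come out noticeably too high, while the windowed fit should land near the true value. This is only one reading of the point about normalization; the full post may argue it differently.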

Tags: axiom, curriculum, education, google university, hope, measurable, probability
Category: Algorithms, arXiv, Cross-Cultural, Jargon, Methods, Quotes, Stat, Uncertainty

Dec 3rd, 2008| 12:31 am | Posted by hlee

Almost two years of scrutinizing publications by astronomers have given me the impression that astronomers live in the Gaussian world. You are likely to object to this statement by saying that astronomers know and use Poisson, binomial, Pareto (power laws), Weibull, exponential, Laplace (Cauchy), Gamma, and some other distributions.^{[1]} This is true. I see these distributions referred to in many publications; however, when it comes to obtaining “BEST FIT estimates for the parameters of interest” and “their ERROR (BARS)”, suddenly everything goes back to the Gaussian world.^{[2]}

Borel-Cantelli Lemma (from Planet Math): I link to it because of the mathematical symbols, but any probability textbook gives the lemma with proofs and descriptions.
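For readers without a textbook at hand, the standard statement of the lemma is sketched below in LaTeX. The notation (events $A_n$ on a probability space with measure $P$) is the usual textbook one and is not copied from the linked page.

```latex
% Setup: (A_n)_{n \ge 1} is a sequence of events, and
% "A_n infinitely often" is the event
\limsup_{n\to\infty} A_n \;=\; \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k .

% First Borel-Cantelli lemma (no independence needed):
\sum_{n=1}^{\infty} P(A_n) < \infty
\quad\Longrightarrow\quad
P\Bigl(\limsup_{n\to\infty} A_n\Bigr) = 0 .

% Second Borel-Cantelli lemma (the A_n are assumed independent):
\sum_{n=1}^{\infty} P(A_n) = \infty
\quad\Longrightarrow\quad
P\Bigl(\limsup_{n\to\infty} A_n\Bigr) = 1 .
```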

Continue reading ‘Borel Cantelli Lemma for the Gaussian World’ »

Tags: Borel Cantelli Lemma, CLT, families of distributions, gaussian, grand challenge, measure, non-Gaussian, probability, statisticians
Category: arXiv, Astro, Bad AstroStat, Cross-Cultural, Frequentist, Jargon, News, Quotes, Stat, Uncertainty

Aug 27th, 2008| 02:35 pm | Posted by hlee

I didn’t realize this post had been sitting for a month, during which I almost neglected the slog. Just as there are great books about probability and information theory for statisticians and engineers, I believe there are great statistical physics books for physicists. On the other hand, relatively few introduce one subject to the other audience. In this regard, I thought these lecture notes could be useful.

**[arxiv:physics.data-an:0808.0012]**

Lectures on Probability, Entropy, and Statistical Physics by Ariel Caticha

**Abstract:** Continue reading ‘A lecture note of great utility’ »

Tags: Bayes Theorem, Boltzmann, Carnot, Entropy, Gibbs paradox, Information, laws of thermodynamics, lecture note, maximum likelihood, probability, Shannon, statistical physics, Tchebyshev inequality, thermodynamics
Category: arXiv, Bayesian, Cross-Cultural, Data Processing, Fitting, Physics, Stat

Apr 27th, 2008| 11:29 am | Posted by hlee

The last paper in the list discusses MCMC for time series analysis, applied to sunspot data. There are six additional papers about statistics and data analysis from the week. Continue reading ‘[ArXiv] 4th week, Apr. 2008’ »

Tags: clusters, CMB, GALEX, gravitational waves, lensing, LF, LMC, machine learning, maximum likelihood, priors, probability, SDSS, stellar populations, sunspot, time series
Category: arXiv, MCMC

Mar 30th, 2008| 11:16 pm | Posted by hlee

I began to study statistics with the notion that statistics is the study of information (retrieval), and that a part of information is uncertainty, which is taken for granted in our random world. Probably it is the other way around: information is a part of uncertainty. Could this be the difference between Bayesians and frequentists?

__The statistician’s task is to articulate the scientist’s uncertainties in the language of probability, and then to compute with the numbers found__: cited from Continue reading ‘Statistics is the study of uncertainty’ »

Aug 19th, 2007| 12:31 am | Posted by vlk

I think of Markov-Chain Monte Carlo (MCMC) as a kind of directed staggering about, a random walk with a goal. (Sort of like driving in Boston.) It is conceptually simple to grasp as a way to explore the posterior probability distribution of the parameters of interest by sampling only where it is worth sampling from. That makes it a major saving over brute-force Monte Carlo, and far more robust than downhill fitting programs. It also gives you the error bar on the parameter for free. What could be better? Continue reading ‘An alternative to MCMC?’ »
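For the curious, the "random walk with a goal" can be sketched in a few lines of Python. This is a minimal random-walk Metropolis sampler under assumed toy conditions that are not from the post: eight made-up measurements, a Gaussian likelihood with known sigma = 1, and a flat prior on the mean. All names, data, and tuning values are hypothetical.

```python
import math
import random
import statistics

# Hypothetical toy data: eight measurements with known noise sigma = 1.
DATA = [4.8, 5.1, 5.3, 4.9, 5.4, 5.0, 4.7, 5.2]


def log_posterior(mu, data=DATA, sigma=1.0):
    """Log posterior of the mean (flat prior, Gaussian likelihood), up to a constant."""
    return -0.5 * sum((x - mu) ** 2 for x in data) / sigma ** 2


def metropolis(log_target, start, step, n_steps, rng):
    """Random-walk Metropolis: propose a Gaussian jump, accept with probability min(1, ratio)."""
    chain = []
    current, current_lp = start, log_target(start)
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        diff = proposal_lp - current_lp
        # Uphill moves are always taken; downhill moves only sometimes.
        if diff >= 0 or rng.random() < math.exp(diff):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


def demo(seed=1, n_steps=20000, burn_in=2000):
    """Run the sampler and return (posterior mean, posterior standard deviation)."""
    rng = random.Random(seed)
    chain = metropolis(log_posterior, start=0.0, step=0.5, n_steps=n_steps, rng=rng)
    kept = chain[burn_in:]  # discard the initial stagger toward the peak
    return statistics.mean(kept), statistics.stdev(kept)


if __name__ == "__main__":
    best, err = demo()
    print(f"mu = {best:.2f} +/- {err:.2f}")
```

The scatter of the retained samples is the error bar that comes "for free". In this toy case it can be checked against the analytic posterior, a Gaussian centered on the sample mean with standard deviation sigma / sqrt(n).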

Mar 2nd, 2007| 11:21 am | Posted by hlee

Leo Breiman (1928-2005) was one of the most dominant statisticians of the 20th century. He was well known for his textbook on probability theory as well as for his contributions to machine learning, such as CART (Classification and Regression Trees), bagging (bootstrap aggregation), and Random Forests. He was the founding father of statistical machine learning. His works can be found at http://www.stat.berkeley.edu/~breiman/

An excerpt from “A Conversation with Leo Breiman,” by Richard Olshen, Statistical Science (2001), 16(2), pp. 184–198, prompts second thoughts about the direction of statistical research:

Continue reading ‘An excerpt from “A Conversation with Leo Breiman”’ »