Last Updated: 20240508

International CHASC AstroStatistics Centre

Topics in Astrostatistics

AY 2023-2024


Schedule Wednesdays Noon - 1:30pm Eastern Time
Location SC-706 + Zoom

Cecilia Garraffo (CfA)
Sep 06
Noon EDT
AstroAI: Integrating Artificial Intelligence into Astrophysics
Abstract: AstroAI, launched at the Center for Astrophysics | Harvard & Smithsonian (CfA) in November 2022, is a novel initiative focused on developing machine learning (ML) and artificial intelligence (AI) algorithms to further astrophysical research. Its inception was driven by the recognized need, both within the CfA and the broader scientific community, for dependable and interpretable models in astrophysics research. At its core, AstroAI aims to create AI and ML models designed for astrophysical discovery, emphasizing a multidisciplinary approach and collaboration among a diverse group of researchers. This talk will outline the progress and growth of AstroAI since its beginning and showcase some of the key projects undertaken by our team, along with their transformative potential in astrophysical research.
Presentation video [!yt]
Mengyang Gu (UC Santa Barbara)
Sep 13
Noon EDT
Calibration of imperfect geophysical models by multiple satellite interferograms with measurement bias
Abstract: Model calibration consists of using experimental or field data to estimate the unknown parameters of a mathematical model. The presence of model discrepancy and measurement bias in the data complicates this task. Satellite interferograms, for instance, are widely used for calibrating geophysical models in geological hazard quantification. In this work, we used satellite interferograms to relate ground deformation observations to the properties of the magma chamber at Kilauea Volcano in Hawai`i. We derived closed-form marginal likelihoods and implemented posterior sampling procedures that simultaneously estimate the model discrepancy of physical models, and the measurement bias from the atmospheric error in satellite interferograms. We found that model calibration by aggregating multiple interferograms and downsampling the pixels in the interferograms can reduce the computational complexity compared to calibration approaches based on multiple data sets. The conditions that lead to no loss of information from data aggregation and downsampling are studied. Simulation illustrates that both discrepancy and measurement bias can be estimated, and real applications demonstrate that modeling both effects helps obtain reliable estimates of a physical model's unobserved parameters and enhances its predictive accuracy. We implement the computational tools in the RobustCalibration package available on CRAN.
Gu, M., & Wang, L. (2018). Scaled Gaussian stochastic process for computer model calibration and prediction. SIAM/ASA Journal on Uncertainty Quantification, 6(4), 1555-1583
Gu, M., Xie, F., & Wang, L. (2022). A Theoretical Framework of the Scaled Gaussian Stochastic Process in Prediction and Calibration. SIAM/ASA Journal on Uncertainty Quantification, 10(4), 1435-1460.
Gu, M., Anderson, K., & McPhillips, E. (2023). Calibration of imperfect geophysical models by multiple satellite interferograms with measurement bias. Technometrics, in press, arXiv:1810.11664 [!arXiv]
Gu, M., He, Y., Liu, X., & Luo, Y. (2023). Ab initio uncertainty quantification in scattering analysis of microscopy. arXiv:2309.02468 [!arXiv]
Presentation slides [.pdf]
Presentation video [!yt]
Ashley Villar & Rafael Martinez-Galarza (CfA)
Oct 04, 2023
Noon EDT
Project: A Variational Autoencoder-inspired Mixture of Poissons to classify X-ray photon lists
In the low-count limit, astrophysical phenomena follow Poisson distributions across a distribution of energies and time. Learning meaningful representations of these events remains a challenging endeavor; however, such representations can aid in a number of downstream scientific tasks: classification, anomaly detection and potentially inference. Here, we present a project pitch to build a probabilistic (Poisson-based) neural network (inspired by a variational autoencoder) to find meaningful representations of astronomical light curves.
Aneta Siemiginowska (CfA)
Oct 11, 2023
Noon EDT
Why time-delays?
Time-delays are often encountered in astronomical measurements. They provide otherwise unresolved intrinsic scales of a variable source or, in the case of gravitational lensing, constraints on the cosmological parameters. I will present an astronomer's view on the time-delay applications, discuss our recent model for time-delays due to gravitational lensing, future directions, and open projects.
Presentation slides [.pdf]
Presentation video [!yt]
See also: Tak et al. 2015, AoAS 11, 1309; Meyer et al. 2023, ApJ 950, 37
Pavlos Protopapas (SEAS)
Oct 18, 2023
Noon EDT
Residual-Based Error Bound for Physics-Informed Neural Networks
Abstract: Neural networks are universal approximators and are studied for their use in solving differential equations. However, a major criticism is the lack of error bounds for obtained solutions. In this talk I will describe a technique to rigorously evaluate the error bound of Physics-Informed Neural Networks (PINNs) on most linear ordinary differential equations (ODEs), certain nonlinear ODEs, and first-order linear partial differential equations (PDEs).
The error bound is based purely on equation structure and residual information and does not depend on assumptions of how well the networks are trained. We propose algorithms that bound the error efficiently.
Liu et al. 2023, arXiv:2306.03786 [!arXiv]
Presentation video [!yt]
Herman Marshall (MIT), Subramania Athiray (UAlabama), & Vinay Kashyap (CfA)
Nov 8
Noon EST
SciCen 706
Deconvolving dispersed gratings spectra from extended sources
Abstract: We will present the mostly unsolved problem of deconvolving high-resolution grating dispersed spectra of extended sources. We will show examples of the data from Chandra, and some examples of how solar physicists are modeling data from the dispersed Sun in the high counts regime when there are strong line features in the spectrum. Can this be extended to smoother spectra in the Poisson regime?
See also: Winebarger et al. 2019, ApJ 882, 12, Unfolding Overlapped Slitless Imaging Spectrometer Data for Extended Sources [!ads]
Herman Marshall [.key]
Vinay Kashyap [.key]
Subramania Athiray [.pptx]
Adel Daoud (Linkoping/Chalmers)
24 Jan 2024
Noon EST
Are You Devising an Observatory of Extraterrestrial Life? Lessons learned from Observatory of Poverty-Measuring Living Conditions on Planet Earth with AI and Earth Observations
Abstract: The question, "Is there other especially intelligent life in the Universe," is one of the most intriguing questions in the sciences and beyond. If there is indeed life on other planets and the only means of observing it is through high-resolution satellite images, a follow-up question would be, "How may we use those images to measure extraterrestrial activities on the surface of their planets?" This talk gives some pointers to addressing that follow-up question by showing how we, at the AI and Global Development Lab, are measuring health and living conditions on Earth by using satellite images and deep learning. The Lab is currently measuring the historical and geographical development trajectories from satellite images from the 1990s to the present, focusing on the African continent. These measurements are our data product, capturing living conditions at unprecedented temporal and spatial granularity. This talk will discuss key scientific challenges and research prospects.
Presentation video [!yt]
Ana-Sofia Uzsoy (Harvard)
7 Feb 2024
Noon EST
Variational Inference for Acceleration of SN Ia Photometric Distance Estimation with BayeSN
Abstract: We use variational inference (VI) to fit the light curves of Type Ia supernovae (SN Ia) using the BayeSN hierarchical Bayesian model for SN Ia spectral energy distributions. We fit both simulated light curves and data from the Foundation Supernova Survey with two different forms of surrogate posterior - a multivariate normal and a custom multivariate zero-lower-truncated normal distribution - and compare them with baseline MCMC fits and the Laplace Approximation. To evaluate the accuracy of our variational approximation, we calculate the Pareto-smoothed importance sampling (PSIS) diagnostic, and perform variational simulation-based calibration (VSBC). The VI approximation achieves similar results to MCMC but with significantly reduced runtime. Overall, we show that VI is a promising method for scalable parameter inference as we enter the era of "big data".
Presentation slides [.pptx]
Presentation video [!yt]
Axel Donath (CfA)
14 Feb 2024
Noon EST
Joint Likelihood Deconvolution of Astronomical Images in the Presence of Poisson Noise
Abstract: I will present a new method for Joint Likelihood Deconvolution (Jolideco) of astronomical images in the presence of Poisson noise. The method reconstructs a single flux image from a set of observations of the same sky region by optimizing the a posteriori joint Poisson likelihood of all observations under a patch-based image prior. Simulations demonstrate that both the combination of multiple observations and the patch-based prior lead to a much improved reconstruction quality, compared to alternative methods like the Richardson-Lucy method. I will showcase some results using example data from the Chandra observatory and conclude with an overview of open questions, most importantly the question of uncertainties on reconstructed flux images.
Presentation slides [.pdf]
Presentation video [!yt]
Xiangyu Zhang (Minnesota)
Feb 21, 2024
11am CST
On smooth tests of goodness-of-fit for astrophysical searches under high background
Abstract: Smooth tests were first introduced by Neyman (1937) as a comprehensive approach to the goodness-of-fit (GOF). Compared to classical GOF tests, such as Kolmogorov-Smirnov or Cramér-von Mises, smooth tests use an alternative model that incorporates the null through a series of orthonormal basis functions (e.g., Shifted Legendre Polynomial or Cosine bases). As a result, they concentrate their power on a limited number of directions. A particularly appealing feature of smooth tests is that, when the null model is rejected, they naturally provide a correction for it. This aspect will be illustrated in the context of detecting line emissions under a high background. New methodological developments on the construction of distribution-free smooth tests that are unaffected by post-selection inference problems will also be discussed.
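As a rough illustration of the basic idea only (not the speaker's distribution-free construction), the classical Neyman smooth statistic for a fully specified null can be sketched as below. The sketch assumes the data have already been mapped to [0, 1] through the null CDF and uses the orthonormal shifted Legendre basis; the function names and the default order are hypothetical choices made for this example.

```python
import math

def shifted_legendre(j, u):
    """Orthonormal shifted Legendre polynomial of degree j, evaluated at u in [0, 1]."""
    if j == 0:
        return 1.0
    x = 2.0 * u - 1.0            # map [0, 1] onto [-1, 1]
    p_prev, p = 1.0, x           # P_0 and P_1
    for k in range(1, j):        # Bonnet recurrence for P_{k+1}
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return math.sqrt(2 * j + 1) * p   # rescale to unit norm on [0, 1]

def neyman_smooth_statistic(u_values, order=4):
    """Sum of squared normalized basis scores for degrees 1..order.

    u_values are assumed to be F0(x_i), i.e. uniform on [0, 1] under the null.
    For a fully specified null the statistic is asymptotically chi-squared
    with `order` degrees of freedom.
    """
    n = len(u_values)
    stat = 0.0
    for j in range(1, order + 1):
        score = sum(shifted_legendre(j, u) for u in u_values) / math.sqrt(n)
        stat += score ** 2
    return stat

# Hypothetical usage: an evenly spread sample versus a clustered one.
even = [(i + 0.5) / 100 for i in range(100)]
clustered = [0.9] * 50
print(neyman_smooth_statistic(even, order=2))
print(neyman_smooth_statistic(clustered, order=2))
```

Each squared score measures departure from the null along one basis direction, which is why a rejected null comes with an indication of how to correct it.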
Presentation slides [.pdf]
Presentation video [!yt]
Yang Chen (Michigan) & Max Bonamente (UAH)
Feb 28, 2024
Noon EST/11am CST
Yang Chen: Comparison of Goodness-of-fit Assessment Methods with C statistics in Astronomy
Abstract: In astrophysics, the C statistic, which is a likelihood ratio statistic, has been widely adopted for model fitting and goodness-of-fit assessments for Poisson-count data with heterogeneous rates. It is well known that when the sample size is very large, the C statistics enjoy convenient theoretical properties, especially in the large-mean limit. However, in many astronomy and high-energy physics applications, the observations are very sparse, making the theoretical properties of C statistics questionable. We comprehensively study the properties of C statistics and evaluate various algorithms for goodness-of-fit assessment using C statistics, emphasizing low-count scenarios.
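For readers unfamiliar with the quantity under discussion, a minimal sketch of the C statistic in its Poisson deviance form, C = 2 * sum(m_i - n_i + n_i ln(n_i / m_i)), is given below. The function name and the example numbers are hypothetical and are not taken from the talk.

```python
import math

def cash_statistic(counts, model):
    """Poisson deviance form of the C statistic for observed counts n_i
    and model-predicted means m_i: 2 * sum(m - n + n * ln(n / m))."""
    if len(counts) != len(model):
        raise ValueError("counts and model must have the same length")
    total = 0.0
    for n, m in zip(counts, model):
        if m <= 0:
            raise ValueError("model means must be positive")
        total += m - n
        if n > 0:                # the n * ln(n / m) term vanishes as n -> 0
            total += n * math.log(n / m)
    return 2.0 * total

# Hypothetical sparse data set with mostly 0 or 1 counts per bin.
counts = [0, 1, 0, 2, 1]
model = [0.5, 0.8, 0.6, 1.5, 1.0]
print(cash_statistic(counts, model))
```

In the large-mean limit this quantity is approximately chi-squared distributed; at low counts such as those in the example that approximation is exactly what is in question.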
Presentation slides [.pdf]
Max Bonamente: Systematic errors and Poisson regression
Abstract: A new statistical method is proposed that includes systematic errors in the analysis of Poisson data, especially for the purpose of regression analysis and subsequent hypothesis testing. The method is based on the introduction of an intrinsic model variance, which is enforced after the usual maximum-likelihood regression is performed. With this method, the usual goodness-of-fit statistic -- the Poisson deviance also known as the Cash statistic -- becomes distributed like a newly-introduced overdispersed chi-squared distribution under the null hypothesis, at least in the large-mean limit. This new distribution defaults to the usual chi-squared when systematic errors are negligible, and continues to be normally-distributed for extensive data. The method also offers the opportunity to estimate systematic errors, if they cannot be estimated a priori. It is hoped that this model, which is simple to use for most applications, offers an answer to the quest for a simple and statistically-motivated means of handling systematic errors in count data.
Presentation slides [.pdf]
Meeting chat [.txt]
Presentation video [!yt]
Alexandre Bayle (Harvard)
Apr 3, 2024
Noon EDT
How Good is my Learning Algorithm? Building Cross-Validation Confidence Intervals for Test Error
Abstract: How good is my learning algorithm? Is algorithm A actually better than algorithm B? Cross-validation is a de facto standard for addressing these questions by providing an estimate of the test error of prediction rules. However, for high-stakes applications in which the uncertainty of an error estimate impacts decision-making, properly quantifying the uncertainty of the cross-validation estimate is crucial and requires a valid treatment of the dependence that comes with this sample-splitting scheme. In this work, we present our method to achieve this objective and we prove its theoretical validity. We developed central limit theorems for cross-validation and consistent estimators of its asymptotic variance under weak stability conditions on the learning algorithm. Together, these results provide practical, asymptotically-exact confidence intervals for k-fold test error and valid, powerful hypothesis tests of whether one learning algorithm has smaller k-fold test error than another. These results are also the first of their kind for the popular choice of leave-one-out cross-validation. In our real-data experiments with diverse learning algorithms, the resulting confidence intervals and tests outperform the most popular alternative methods from the literature (we will cover these methods in the presentation).
Bayle et al. 2020, Cross-validation Confidence Intervals for Test Error arXiv:2007.12671 [.pdf]
Presentation slides [.pdf]
Souhardya Sengupta (Harvard)
Apr 17, 2024
Noon EDT
SciCen 706
A tutorial on Causal Inference and its relevance in Astrophysics
Abstract: This talk will provide a basic introduction to causation and statistical methodologies that aim for such inference. We will start with an introduction to the potential outcomes framework and build on that to discuss population estimands that help us draw causal conclusions from an experiment, along with various techniques for its inference. The majority of this talk will focus on observational studies, where the scientist has no control over the treatment mechanism. In this part, we will discuss the concepts of confounding and various relevant estimators in the presence of such confounders, followed by an introduction to sensitivity analysis that establishes how sensitive our results are to the presence of any unmeasured confounders. Finally, if time permits, I will talk about structural causal models and their applications in astrophysics.
Presentation slides [.pdf]
Presentation video [!yt]
Jason Siyang Li (Imperial)
Apr 24, 2024
Noon EDT
SciCen 706
Estimating the Luminosity Function in the presence of "Dark" sources (with a new method for statistical marginalisation)
Abstract: Studies on populations of X-ray sources are strongly affected by detectability. We have developed a method to bypass limitations in X-ray source detection algorithms and model luminosity functions using catalogues available from other wavelengths. We propose a hierarchical model that allows estimation of individual source intensities simultaneously with parameters that describe the population of sources. It allows sources to be X-ray-dark by using zero-inflated distributions on the source intensity parameters. This hierarchical model is typical of statistical models in high-energy astrophysics, in that it contains numerous parameters and latent variables.
This accounts for the complexities in the instruments, a large number of X-ray sources in the population, and characteristics in the population.
However, posterior sampling, such as MCMC and nested sampling, can be inefficient in large parameter spaces, making it hard to obtain posterior samples from the hierarchical model. A well-known method is to deploy the posterior sampler on a lower-dimensional marginal distribution of the posterior distribution, which we call "statistical marginalisation" of the posterior distribution.
To obtain such a statistical marginalisation of the posterior distribution, we introduce a new method to integrate over the population of source intensity parameters using moment generating functions. We present the link between the integral over a population of parameters and marginal likelihood computation. As a natural extension, we can show that the moment generating function method is also useful for exact computations of marginal likelihoods under certain assumptions.
Presentation slides [.pdf]
Presentation video [!yt]
Siddharth Vishwanath (UCSD)
May 1, 2024
Noon EDT
Repelling-Attracting Hamiltonian Monte Carlo
Abstract: We propose a variant of Hamiltonian Monte Carlo (HMC), called the Repelling-Attracting Hamiltonian Monte Carlo (RAHMC), for sampling from multimodal distributions. The key idea that underpins RAHMC is to depart from the conservative dynamics of Hamiltonian systems, which form the basis of traditional HMC, and turn instead to the dissipative dynamics of conformal Hamiltonian systems. In particular, RAHMC involves two stages: a mode-repelling stage to encourage the sampler to move away from regions of high probability density; and a mode-attracting stage, which helps the sampler find and settle near alternative modes. We achieve this by introducing just one additional tuning parameter -- the coefficient of friction. The proposed method adapts to the geometry of the target distribution, e.g., modes and density ridges, and can generate proposals that cross low-probability barriers with little to no computational overhead in comparison to traditional HMC. Notably, RAHMC requires no additional information about the target distribution or memory of previously visited modes. We establish the theoretical basis for RAHMC, and we discuss repelling-attracting extensions to several variants of HMC in the literature. Finally, we provide a tuning-free implementation via dual-averaging, and we demonstrate its effectiveness in sampling from both multimodal and unimodal distributions in high dimensions.
Presentation slides: [!] ; [.pdf]
Presentation video [!yt]
Giovanni Motta (Columbia)
May 8, 2024
Noon EDT
Detecting stellar flares using conditional volatility
Abstract: For more than forty years now, discrete-time models have been developed to reflect the so-called stylized features of financial time series. These properties, which include tail heaviness, asymmetry, volatility clustering and serial dependence without correlation, cannot be captured with traditional linear ARMA time series models. Continuous-time ARMA (CARMA) models are the continuous-time version of the well-known ARMA models, and they are convenient for modeling astronomical data, which are often unequally spaced in time. In this talk we will review ARMA and CARMA models and their application in astrophysics. We then present a novel and powerful method to analyze time series to detect flares in TESS light curves. First, we remove the trend using a time-varying deterministic harmonic fit so as to capture changes in the deterministic amplitude of the light curve. Then we highlight the analogy between the stochastic part of the light curves and GARCH processes. We demonstrate that flares can be detected as significantly large deviations from the baseline. We apply the method to exemplar light curves from two flaring stars, and discuss some of the diagnostics that become amenable to measurement.
Presentation slides [.pdf]
Presentation video [!yt]
Ann Lee (CMU)
Oct 2024

Fall/Winter 2004-2005
Siemiginowska, A. / Connors, A. / Kashyap, V. / Zezas, A. / Devor, J. / Drake, J. / Kolaczyk, E. / Izem, R. / Kang, H. / Yu, Y. / van Dyk, D.
Fall/Winter 2005-2006
van Dyk, D. / Ratner, M. / Jin, J. / Park, T. / CCW / Zezas, A. / Hong, J. / Siemiginowska, A. & Kashyap, V. / Meng, X.-L.
Fall/Winter 2006-2007
Lee, H. / Connors, A. / Protopapas, P. / McDowell, J. / Izem, R. / Blondin, S. / Lee, H. / Zezas, A., & Lee, H. / Liu, J.C. / van Dyk, D. / Rice, J.
Fall/Winter 2007-2008
Connors, A., & Protopapas, P. / Steiner, J. / Baines, P. / Zezas, A. / Aldcroft, T.
Fall/Winter 2008-2009
H. Lee / A. Connors, B. Kelly, & P. Protopapas / P. Baines / A. Blocker / J. Hong / H. Chernoff / Z. Li / L. Zhu (Feb) / A. Connors (Pt.1) / A. Connors (Pt.2) / L. Zhu (Mar) / E. Kolaczyk / V. Liublinska / N. Stein
Fall/Winter 2009-2010
A.Connors / B.Kelly / N.Stein, P.Baines / D.Stenning / J. Xu / A.Blocker / P.Baines, Y.Yu / V.Liublinska, J.Xu, J.Liu / Meng X.L., et al. / A. Blocker, et al. / A. Siemiginowska / D. Richard / A. Blocker / Xie X. / Xu J. / V. Liublinska / L. Jing
AcadYr 2010-2011
Astrostat Haiku / P. Protopapas / A. Zezas & V. Kashyap / A. Siemiginowska / K. Mandel / N. Stein / A. Mahabal / Hong J.S. / D. Stenning / A. Diaferio / Xu J. / B. Kelly / P. Baines & I. Udaltsova / M. Weber
AcadYr 2011-2012
A. Blocker / Astro for Stat / B. Kelly / R. D'Abrusco / E. Turner / Xu J. / T. Loredo / A. Blocker / P. Baines / A. Zezas et al. / Min S. & Xu J. / O. Papaspiliopoulos / Wang L. / T. Laskar
AcadYr 2012-2013
N. Stein / A. Siemiginowska / D. Cervone / R. Dawson / P. Protopapas / K. Reeves / Xu J. / J. Scargle / Min S. / Wang L. & D. Jones / J. Steiner / B. Kelly / K. McKeough
AcadYr 2013-2014
Meng X.-L. / Meng X.-L., K. Mandel / A. Siemiginowska / S. Vrtilek & L. Bornn / Lazhi W. / D. Jones / R. Wong / Xu J. / van Dyk D. / Feigelson E. / Gopalan G. / Min S. / Smith R. / Zezas A. / van Dyk D. / Hyungsuk T. / Czerny, B. / Jones D. / Liu K. / Zezas A.
AcadYr 2014-2015
Vegetabile, B. & Aldcroft, T. / H. Jae Sub / Siemiginowska, A. & Kashyap, V. / Pankratius, V. / Tak, H. / Brenneman, L. / Johnson, J. / Lynch, R.C. / Fan, M.J. / Meng, X.-L. / Gopalan, G. / Jiao, X. / Si, S. / Udaltsova, I. & Zezas, A. / Wang, L. / Tak, H. / Eadie, G. / Czekala, I. / Stenning, D. / Stampoulis, V. / Aitkin, M. / Algeri, S. / Barnacka, A.
AcadYr 2015-2016
DePasquale, J. / Tak, H. / Meng, X.-L. / Jones, D. / Huang, J. / Blanchard, P. / Chen, Y. & Wang, X. / Tak, H. / Mandel, K. / Jiao, X. / Wang, X. & Chen, Y. / IACHEC WG / Si, S. / Drake, J. / Stampoulis, V. / Algeri, S. / Stein, N. / Chunzhe, Z. / Andrews, J. / Vrtilek, S. / Udaltsova, I. & Stampoulis, V.
AcadYr 2016-2017
Wang, X. & Chen, Y. / Kashyap, V., Siemiginowska, A., & Zezas, A. / Stampoulis, V. / Portillo, S. / Zhang, K. / Mandel, K. / DiStefano, R. / Finkbeiner, D. & Meade, B. / Gong, R. / Shihao Y. / Zhirui, H. / Xufei, W. / Campos, L. / Tak, H. / Xufei, W. / Jones, D. / Algeri, S. / Speagle, J. / Czekala, I.
AcadYr 2017-2018
AstroStat Day / Speagle, J. / Collin, G. / McKeough, K. & Yang, S. / McKeough, K. & Campos, L. / M. Ntampaka / H. Marshall / D. Huppenkothen / X. Yu / R. DiStefano / J. Yee / H. Tak / A. Avelino
AcadYr 2018-2019
Stenning, D. / Dvorkin, C. / Sottosanti, A. / Yu, X. / Chen, Y. / Jones, D. / Lee, T.C.-M. / Tak, H. / Kashyap, V., McKeough, K., Campos, L., et al. / Baines, P. / Collin, G. / Muthukrishna, D. / Zhang, D. / Algeri, S. / Janson, L. / Ward, S. / de Beurs, Z.
AcadYr 2019-2020
McKeough, K. / Astudillo, J. & Protopapas, P. / Zezas, A. / Speagle, J. / Meng, X.-L., Siemiginowska, A., & Kashyap, V. / Bonfini, P. / Liu, C. / Guenther, H. / Castrillon, J. / McKeough, K. / Broekgaarden, F. / Autenrieth, M. / Motta, G. / Zucker, C. / Tak, H. / Kashyap, V. & Wang, X. / Wang, J. / Wang, X. & Ingram, J.
AcadYr 2020-2021
Diaz Rivero, A. / Marshall, H. & Chen, Y. / McKeough, K. / Chen, Y. / Patil, A. / Jerius, D. / Wang, X. / Siemiginowska, A. / Xu, C. / Picquenot, A. / Jacovich, T. / Geringer-Sameth, A. / Toulis, P. / Donath, A. / Ergin, T. / Phillipson, R. / Sun, H. / Autenrieth, M.
AcadYr 2021-2022
Makinen, T.L. / Siemiginowska, A. / Fox-Fortino, W. / Reddy, K. / Primini, F. / Mishra-Sharma, S. / Meyer, A. / Janson, L. / Group
AcadYr 2022-2023
Saydjari, A. / Rau, M.M. / McKimm, H. / Sairam, L. / Meyer, A. / SCMA8 / Kochanski, N. & Chen, Y. / Jones, G. / ISI WSC / Li, D.D.
AcadYr 2023-2024
Garraffo, C. / Gu, M. / Villar, A. & Martinez-Galarza, J.R. / Siemiginowska, A. / Protopapas, P. / Marshall, H., Athiray, S., & Kashyap, V.L. / Daoud, A. / Uzsoy, A.-S. / Donath, A. / Zhang, X. / Chen, Y. & Bonamente, M. / Bayle, A. / Sengupta, S. / Li, J.S. / Vishwanath, S. / Motta, G. / Lee, A.