Last Updated: 20230720
Statistics for Astronomy
Wednesday, 19 July 2023
Room 211, 10am - Noon EDT
The field of Astrostatistics has grown rapidly over the past decade, with advanced statistical methods entering the standard techniques broadly used by astronomers. Several astrostatistics associations have been established over this period with the primary goal of fostering interdisciplinary collaborations. Such collaborations have advanced both the analysis of astronomical data and statistical methodology. Astronomical data pose unique challenges to statisticians: analysis requires understanding the systematics in the data collection process, contaminating factors, and the behavior of instruments; an ability to handle signals with large dynamic range and complex hierarchical models; and a recognition that data acquisition is often not a repeatable or reproducible process. These challenges have led to the development of new statistical methods, e.g., improvements in MCMC sampling strategies in response to complex astrophysical parameter spaces, establishing the magnitudes of corrections to instrument calibration even in the absence of an absolute reference, developing multistage strategies to account for systematic uncertainties present in atomic data and instrument calibration, computing change points in 4-dimensional data cubes, spatial segmentations in unbinned data, disambiguating photons from overlapping sources, etc. The goals of the session are to highlight new results originating from ongoing collaborative research, increase awareness of astronomical data problems among statisticians, and foster collaborations between astronomers and statisticians. The speakers are from diverse geographical locations and include statisticians and astronomers who have been involved in active astrostatistics collaborations. They will provide their perspectives on astrostatistics research and present case studies of several astronomical data problems and how they developed working solutions.
This session is organized by the Council of the International Astrostatistics Association (IAA) and CHASC Astrostatistics, and is supported by the
ISI Astrostatistics Special Interest Group.
The session was recorded on Zoom. The video is accessible via YouTube.
 10:05am - 10:25am EDT
Astrostatistics: Review of the emerging cross-disciplinary field
Jogesh Babu (Pennsylvania State University)
 Many of the concepts of statistics have roots in astronomy. In the 17th century, Galileo's analysis of telescopic data contained the rudiments of a theory for parametric modeling using the sum of absolute deviations as a fitting criterion. Scientists like Gauss and Laplace, wrestling with problems in celestial mechanics, played a central role in the development of the theory of errors and least squares in the 19th century. Despite centuries of close association, the intimate connection between the two fields has weakened in the last 100 years. Modern observational astronomy has been characterized by an enormous growth in data acquisition, stimulated by the advent of new technologies in telescopes, detectors, and computation. The complexity of data has also increased, giving rise to innumerable statistical problems. The growth of enormous methodological problems in astronomy motivated the first cross-disciplinary conference, "Statistical Challenges in Modern Astronomy", in 1991 at Penn State, and the publication of a book entitled "Astrostatistics" in 1996. A brief review of the emerging field of Astrostatistics will be presented.
 Presentation Slides [.pdf]

 10:25am - 10:45am EDT
Statistical Issues in Instrument Calibrations and Goodness-of-fit in Astrophysics
Yang Chen (University of Michigan)
 In the first half of the talk, a statistical framework for obtaining proper concordance among different instruments measuring the same set of astronomical sources will be presented. Calibration data are often obtained by observing several well-understood objects simultaneously with multiple instruments, such as satellites for measuring astronomical sources. Analyzing such data and obtaining proper concordance among the instruments is challenging when the physical source models are not well understood, when there are uncertainties in "known" physical quantities, or when data quality varies in ways that cannot be fully quantified. We propose a log-Normal model and a more general log-t model that respect the multiplicative nature of the mean signals via a half-variance adjustment, yet permit imperfections in the mean modeling to be absorbed by residual variances. In the second half of the talk, a formal statistical analysis of both the theoretical properties of and computational algorithms for cstat (also known as the Cash statistic), a quantity for model fitting and goodness-of-fit assessment widely adopted by astronomers, will be presented. Recommendations for practical procedures will be given based on numerical experiments.
 Presentation Slides [.pdf]
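
 As a rough illustration of the Cash statistic mentioned in the abstract above, the Python sketch below evaluates the form commonly used for Poisson count data, C = 2 * sum(m_i - n_i + n_i * ln(n_i / m_i)), where n_i are observed counts and m_i are model-predicted counts. The function name `cash_statistic` and the toy numbers are assumptions made for illustration only; the definitions and algorithms analyzed in the talk may differ.

```python
import math


def cash_statistic(counts, model):
    """Cash statistic for Poisson counts (illustrative sketch).

    counts: observed integer counts n_i per bin
    model:  model-predicted (positive) expected counts m_i per bin
    The n_i * ln(n_i / m_i) term is taken to be 0 when n_i == 0.
    """
    if len(counts) != len(model):
        raise ValueError("counts and model must have the same length")
    total = 0.0
    for n, m in zip(counts, model):
        if m <= 0:
            raise ValueError("model-predicted counts must be positive")
        total += m - n
        if n > 0:
            total += n * math.log(n / m)
    return 2.0 * total


# Toy example (made-up numbers): a model that matches the data exactly
# gives C = 0, and any mismatch increases C.
observed = [1, 3, 5, 2]
exact_model = [1.0, 3.0, 5.0, 2.0]
flat_model = [2.75, 2.75, 2.75, 2.75]
print(cash_statistic(observed, exact_model))
print(cash_statistic(observed, flat_model))
```

 Unlike chi-square, this statistic remains well defined in bins with zero observed counts, which is one reason it is favored for low-count data.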

 10:45am - 11:05am EDT
Multistage Modeling of the Multipeak Behavior of the Solar Cycle
David Stenning (Simon Fraser University)
 with Youwei Yan (SFU), Vinay Kashyap (CfA), Derek Bingham (SFU), Yaming Yu (UCI), Dibyendu Nandi (IISERK)
 We develop a data-driven approach to describe the multipeak behavior of the solar activity cycle. The method builds upon a multilevel Bayesian model for a single-peaked solar cycle. While the latter uses only monthly mean sunspot numbers as a proxy for solar activity, our approach incorporates additional physical data and uses Gaussian process regression to capture complex features of the solar cycle that are missed by the single-peak, single-proxy model. We demonstrate the capabilities of our methodology using hindcasts of previous cycle morphologies, and we make a prediction for the timing and characteristics of the upcoming solar cycle maximum.

 11:05am - 11:25am EDT
A Hurdle Model for Old Star Clusters and their Host Galaxies
Gwendolyn Eadie (University of Toronto)
 Almost every galaxy in the universe has a population of old star clusters, called Globular Clusters (GCs). Previous works show that the stellar mass of GCs in a galaxy is linearly correlated (in log space) with the host galaxy's mass. However, this empirical relation breaks down for small, dwarf galaxies; in fact, some dwarf galaxies do not have GCs at all. Moreover, these "zeros" in the data (galaxies without GCs) have traditionally been ignored when fitting the linear relationship. This is unfortunate, as the transition region in mass over which galaxies go from having GCs to not having them could have important implications for GC and galaxy formation theories. Our research addresses this problem through a Hierarchical, errors-in-variables hurdle ("HERBAL") model that accounts for galaxies with and without GC populations. In this talk, I will discuss the statistical details of our empirical model (which includes measurement uncertainties), the results when applied to data from the very nearby universe, and implications for physical theories of GCs and galaxies.
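
 To make the idea of a hurdle model concrete, the Python sketch below writes down a simplified log-likelihood with two parts: a logistic "hurdle" for whether a galaxy hosts any GCs, and a linear relation in log space for the GC mass of galaxies that do. This is only a toy under stated assumptions: the logistic form, the Gaussian scatter, the function names, and the parameters `a`, `b`, `alpha`, `beta`, and `sigma` are all hypothetical, and the sketch omits the hierarchical and errors-in-variables components of the model described in the abstract.

```python
import math


def log_sigmoid(z):
    """Numerically stable log of the logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def hurdle_loglik(log_mstar, log_mgc, a, b, alpha, beta, sigma):
    """Simplified hurdle log-likelihood (illustrative assumptions only).

    log_mstar: log10 stellar mass of each galaxy
    log_mgc:   log10 GC system mass, or None for a galaxy with no GCs
    a, b:      hypothetical logistic parameters for P(galaxy has GCs)
    alpha, beta, sigma: hypothetical intercept, slope, and Gaussian
                        scatter of the linear relation in log space
    """
    total = 0.0
    for x, y in zip(log_mstar, log_mgc):
        eta = a + b * x
        if y is None:
            # Galaxy did not clear the hurdle: contributes log(1 - p).
            total += log_sigmoid(-eta)
        else:
            # Galaxy cleared the hurdle: log(p) plus the Gaussian
            # log-density of its GC mass about the linear relation.
            resid = y - (alpha + beta * x)
            total += (log_sigmoid(eta)
                      - 0.5 * math.log(2.0 * math.pi * sigma ** 2)
                      - 0.5 * (resid / sigma) ** 2)
    return total


# Toy data (made-up values): two galaxies with GCs and one without.
masses = [9.5, 8.0, 6.5]
gc_masses = [6.0, 4.4, None]
print(hurdle_loglik(masses, gc_masses, a=-14.0, b=2.0,
                    alpha=-3.5, beta=1.0, sigma=0.3))
```

 Because galaxies without GCs contribute through the hurdle term, they inform the fit instead of being dropped from the sample.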

 11:25am - 11:45am EDT
A functional data flood: Preparing for the Legacy Survey of Space and Time
Thomas Loredo (Cornell University)
 In early 2025, the Vera Rubin Observatory will begin the Legacy Survey of Space and Time (LSST), a ten-year survey of the southern night sky, observing the whole visible sky repeatedly and deeply every few days. A flood of data will ensue: 10 million nightly alerts of variable and moving objects, and yearly catalogs with multivariate and functional time series data for billions of stars, galaxies, and asteroids. I will describe associated statistical challenges, and preparatory data challenges.

 11:45am - Noon EDT
Discussion
 Future directions of Astrostatistics; ways to establish a permanent place in statistical societies, building upon this session to make a more permanent presence for Astrostatistics at ISI events; areas of overlap with other ISI associations; how to leverage this session to enhance collaborations (keeping in mind also that we will be able to follow up during JSM)
Organizer: Aneta Siemiginowska (asiemiginowska @ cfa . harvard . edu)
Session Chair: Vinay Kashyap (vkashyap @ cfa . harvard . edu)
 2023jul06: started page
 2023jul07: small edits; made public
 2023jul17: more points for discussion
 2023jul20: session video and slides from JB and YC added
CfA / CHASC