space weather

Among the billions of objects in our Galaxy, apart from the Earth, our Sun draws the most attention from astronomers. These astronomers go by the name of solar physicists, and they enjoy the most abundant data, including 400 years of sunspot counts. Their joy comes not only from the fascinating, active, and unpredictable character of the Sun but also from its influence on our daily lives. In connection with the latter, studying the conditions on the Sun is sometimes called space weather forecasting.

With my limited knowledge, I cannot lay out all the important aspects of solar physics, of climate changes due to solar activity (not limited to our lower atmosphere but covering the space between the Sun and the Earth), or of the most important recent issues in space weather. I can only emphasize that, compared to Earth climate/atmosphere science or meteorology, contributions from statisticians to space weather are almost nonexistent. I’ve frequently witnessed crude eyeballing used in place of statistics for analyzing data and quantifying images in solar physics. Luckily, I found a few articles that discuss statistics, so my discussion focuses on these papers, while leaving room for solar physicists to chip in on how space weather is handled statistically and how they might collaborate with statisticians.

By the way, I have no intention of degrading “eyeballing” in data analysis by astronomers. The statistical methods under EDA (exploratory data analysis, whose counterpart is CDA, confirmatory data analysis, or statistical inference) are basically “eyeballing” with technical jargon and some basics from probability theory. EDA is important for questioning every step in astronomers’ chi-square methods. Without such diagnostics and visualization, choosing the right statistical strategy is almost impossible with real data sets. I used “crude” because, instead of using “edge detection” algorithms, edges are drawn by hand via eyeballing. My other disclaimer is that there are brilliant image processing/computer vision strategies developed by astronomers, which I’m not going to present. I’m focusing on a small area of statistics related to space weather and its forecasting.

Statistical Assessment of Photospheric Magnetic Features in Imminent Solar Flare Predictions by Song et al. (2009) SoPh. v. 254, p.101.

Their forte is “logistic regression,” a statistical model that is not often used in astronomy. It is used to model a binary response (or a categorical response such as heads or tails; agree, neutral, or disagree) with a bunch of predictors, i.e., classification with multiple features or variables (astronomers might like to replace these terms with “parameters”). The issue of variable selection is also discussed, for example L_{gnl} being the most powerful predictor. Their training set was carefully discussed from the solar physics perspective. Contrary to their claim that they used “logistic regression” to predict solar flares for the first time, there was another paper a few years earlier that used “logistic regression” to predict geomagnetic storms or coronal mass ejections (CMEs). My objection may not hold if flares and CMEs are treated as exclusive events.
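For readers who have not met logistic regression, here is a minimal sketch of the idea in plain Python. It is not the method or the data of Song et al.; the two predictors, the synthetic 0/1 “flare” labels, and the function names `fit_logistic` and `predict_proba` are all my own assumptions for illustration. The model is fitted by simple gradient ascent on the log-likelihood.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Fit an intercept plus one weight per predictor by batch gradient
    ascent on the average log-likelihood (a hypothetical helper)."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            prob = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - prob  # observed label minus fitted probability
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    # Fitted probability that the binary response equals 1.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Synthetic data (assumption): x1 drives the event, x2 is pure noise.
random.seed(0)
X, y = [], []
for _ in range(200):
    x1 = random.gauss(0, 1)
    x2 = random.gauss(0, 1)
    p_event = sigmoid(-0.5 + 2.0 * x1)
    X.append([x1, x2])
    y.append(1 if random.random() < p_event else 0)

w = fit_logistic(X, y)
print("intercept and weights:", [round(v, 2) for v in w])
print("P(event | x1=2, x2=0):", round(predict_proba(w, [2.0, 0.0]), 2))
```

In this toy setting, the fitted weight on the informative predictor should dominate the weight on the noise predictor, which is the simplest picture of what “the most powerful predictor” means.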

The Challenge of Predicting the Occurrence of Intense Storms by Srivastava (2006) J.Astrophys. Astr. v.27, pp.237-242

The probability of storm occurrence is the response in the logistic regression model, whose predictors are CME-related variables, including the latitude and longitude of the CME origin, and interplanetary inputs such as shock speed, ram pressure, and solar-wind-related measures. Cross-validation was performed. The paper comments that the initial speed of a CME might be the most reliable predictor, but it gives no extensive discussion of variable selection/model selection.
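To make the cross-validation step concrete, here is a rough sketch of k-fold cross-validation for a logistic regression classifier in plain Python. It does not reproduce Srivastava's data or procedure; the two standardized predictors (called `speed` and `lat` here), the synthetic storm labels, the 0.5 decision threshold, and all function names are assumptions of mine.

```python
import math
import random

def sigmoid(z):
    # Clip z so that math.exp never overflows.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def fit(X, y, lr=0.5, n_iter=500):
    # Gradient ascent on the average log-likelihood; w[0] is the intercept.
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            row = [1.0] + xi
            err = yi - sigmoid(sum(a * b for a, b in zip(w, row)))
            for j, r in enumerate(row):
                grad[j] += err * r
        w = [a + lr * g / len(X) for a, g in zip(w, grad)]
    return w

def accuracy(w, X, y):
    # Fraction of cases where thresholding the probability at 0.5 is correct.
    hits = 0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(a * b for a, b in zip(w, [1.0] + xi)))
        hits += int((p >= 0.5) == (yi == 1))
    return hits / len(X)

def k_fold_accuracy(X, y, k=5, seed=0):
    # Shuffle the indices, split them into k folds, and score each fold
    # with a model trained on the remaining k - 1 folds.
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(accuracy(w, [X[i] for i in held_out],
                               [y[i] for i in held_out]))
    return scores

# Synthetic data (assumption): only `speed` carries signal.
rng = random.Random(1)
X, y = [], []
for _ in range(150):
    speed, lat = rng.gauss(0, 1), rng.gauss(0, 1)
    X.append([speed, lat])
    y.append(1 if rng.random() < sigmoid(2.0 * speed) else 0)

scores = k_fold_accuracy(X, y)
mean_score = sum(scores) / len(scores)
print("per-fold accuracy:", [round(s, 2) for s in scores])
print("mean accuracy:", round(mean_score, 2))
```

The spread of the per-fold scores gives a crude sense of how stable the classifier is. Comparing mean scores across models with different predictor subsets is one simple route toward the variable selection discussion that the paper leaves out.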

Personally speaking, both publications[1] could be more statistically rigorous, discussing the various challenges of logistic regression from the statistical learning/classification perspective and from the model/variable selection perspective, in order to define better-behaved and more statistically rigorous classifiers.

We often plan our days according to the weather forecast (although we grumble that forecasts are not right, almost everyone relies on the numbers and predictions from weather people). Although they may not be 100% reliable, those forecasts make our lives easier, and more reliable models are under development. On the other hand, forecasting space weather with the help of statistics is as yet unthinkable. Still, scientists and engineers understand that reliable space weather models help in planning space missions and in switching satellites into safe mode. What I do know, at least, is that with flare or CME forecasting models in place, fewer scientists/engineers would need to wake up in the middle of the night because of otherwise unforeseen storms from the Sun.

  1. I thought I had collected more papers under “statistics” and “space weather,” not just these two. A few more are probably buried somewhere. It’s hard to believe such a rich field has not been touched by statisticians. I’d very much appreciate your kindly forwarding any relevant papers. I’ll gradually add them.