Description
We implemented several data analysis and machine learning algorithms to extract information about solar active regions that can be used in the future for CME and flare forecasting.
We use the data set produced by Angryk et al. (Sci. Data, 2020), which contains 51 magnetic field parameters and the associated X-ray activity of solar active regions from 2010 to 2018. We first reduced the data by eliminating redundant parameters in each entry, using a combination of Common Factor Analysis (CFA) and Principal Component Analysis (PCA). This reduced the amount of data to 5 parameters. To increase the separability of different types of active regions, we projected all the points onto a 7-dimensional manifold using Sparse Autoencoders (SAE). We then performed supervised and unsupervised classification of the reduced parameters of each active region. We demonstrate that it is possible to differentiate flaring from non-flaring active regions. We also show that there are tenuous differences between active regions with different flaring activity. However, the current data is not sparse enough to allow a clear differentiation between the different levels of flare activity.
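The PCA part of the reduction step can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins (1000 hypothetical regions with 51 correlated parameters generated from 5 latent factors), the CFA step is omitted, and the function name `pca_reduce` is an assumption made for illustration.

```python
import numpy as np

def pca_reduce(X, n_components=5):
    """Project the rows of X onto the leading principal components."""
    # Standardize each parameter; guard against constant columns.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - mean) / std
    # SVD of the standardized data: the rows of Vt are the principal axes.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    # Fraction of total variance carried by each retained component.
    explained = (S ** 2) / np.sum(S ** 2)
    return Z @ components.T, explained[:n_components]

# Synthetic stand-in for the real catalogue (assumption, not the actual data):
# 51 correlated parameters driven by 5 hidden factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 5))
mixing = rng.normal(size=(5, 51))
X = latent @ mixing + 0.1 * rng.normal(size=(1000, 51))

reduced, explained = pca_reduce(X, n_components=5)
print(reduced.shape)      # one row per region, 5 columns
print(explained.sum())    # share of variance kept by the 5 components
```

When the 51 parameters are strongly redundant, as in this synthetic case, a handful of components retains almost all of the variance, which is the motivation for the reduction described above.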
We propose an alternative parametrization of solar active regions. Using Disentangled Variational Autoencoders (beta-VAE), we produce a different representation of the active regions that can be used for classification. We show that this method can also be used to generate new, previously unseen artificial active regions, either to study the evolution of flaring activity or to address the problem of data imbalance.
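For readers unfamiliar with beta-VAE, the objective it minimizes can be sketched as follows. This is a generic illustration of the standard formulation (a reconstruction term plus a KL term weighted by beta), not the model used in this work. The squared-error reconstruction term, the value `beta=4.0`, and the function names are assumptions.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Batch-averaged beta-VAE objective for a diagonal Gaussian encoder."""
    # Reconstruction term: squared error summed over the input parameters.
    recon = np.sum((x - x_recon) ** 2, axis=1)
    # KL divergence between N(mu, exp(log_var)) and the unit Gaussian prior.
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)
    # beta > 1 weights the KL term more heavily, encouraging disentanglement.
    return np.mean(recon + beta * kl), np.mean(recon), np.mean(kl)

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Toy batch: 8 samples, 5 input parameters, 3 latent dimensions (all hypothetical).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))
mu = np.zeros((8, 3))
log_var = np.zeros((8, 3))
z = sample_latent(mu, log_var, rng)
total, recon, kl = beta_vae_loss(x, x, mu, log_var)
print(z.shape, total, recon, kl)
```

Generating artificial samples, as mentioned above, would correspond to drawing `z` from the unit Gaussian prior and passing it through a trained decoder, which is not shown here.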