Description
Participants will learn how to approach, visualize, and understand complex datasets using dimensionality reduction techniques such as PCA and t-SNE. The afternoon is dedicated to the key steps of data preprocessing: cleaning, normalization, transformation, and strategies for handling missing data (imputation).
-
Dr Simone Riggi (INAF - Osservatorio Astrofisico di Catania), 14/04/2026, 14:30
- Introduction
- Big data projects in astronomy
- What is ML and how do we use it in astronomy?
- Regression & highlights
- Classification & highlights
- Anomaly detection & highlights
- Object detection & highlights
- Forecasting & highlights
- Data preprocessing & generation & highlights
- Outlier detection
- Simple methods (e.g. IQR-based)
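
As an illustration of the simple IQR-based approach listed above, here is a minimal sketch using only the Python standard library. The function name `iqr_outliers`, the multiplier `k=1.5` (the common Tukey convention), and the sample `fluxes` values are assumptions made for this example and are not taken from the lecture material.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles() with n=4 returns the three quartile cut points (Q1, median, Q3).
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements with one obviously discrepant value.
fluxes = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0]
print(iqr_outliers(fluxes))
```

The choice of `k` trades sensitivity against false positives; larger values flag only more extreme points.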
-
Farida Farsian (Istituto Nazionale di Astrofisica (INAF)), 14/04/2026, 15:00
- Data properties (5Vs, format, modality, dimensionality, etc.)
- Tabular data formats most used in Astrophysics
- ASCII/CSV
- ROOT
- FITS
- HDF5/NetCDF
- VO tables
- Parquet
- Relational DB
- Data I/O considerations
-
Dr Simone Riggi (INAF - Osservatorio Astrofisico di Catania), 14/04/2026, 16:10
- Data consistency checks
- Data visualization
- Examples of 1D data visualization: pie/bar/graph/histogram
- Examples of 2D visualization: scatter plots, 2D histograms (lego, contour, color maps)
- Examples of 3D visualization: volume renderings, isosurfaces, slicing planes
- Examples of ND data visualization: correlation, scatter plots
- Dimensionality reduction: curse of...
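
To make the dimensionality reduction topic concrete, the sketch below shows one common way to compute PCA (mentioned in the description) with NumPy, via a singular value decomposition of the centred data. The function name `pca`, the synthetic three-feature dataset, and the random seed are assumptions chosen for illustration; this is not necessarily how the lecture implements it.

```python
import numpy as np


def pca(X, n_components=2):
    """Project the rows of X onto the leading principal components using SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each feature on zero
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained_ratio = (S**2) / np.sum(S**2)
    scores = Xc @ components.T  # coordinates in the reduced space
    return scores, components, explained_ratio[:n_components]


# Synthetic data: two strongly correlated features plus one low-variance feature.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2.0 * t + 0.01 * rng.normal(size=200),
    0.01 * rng.normal(size=200),
])
scores, components, ratio = pca(X, n_components=1)
print(scores.shape, ratio)
```

Because the first two features are almost perfectly correlated, a single component captures nearly all of the variance in this toy dataset.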
-
Farida Farsian (Istituto Nazionale di Astrofisica (INAF)), 14/04/2026, 17:10
- Tabular data pre-processing
- Linear transforms: min-max normalization, standardization, scaling
- Non-linear transforms: power/log, Box-Cox, Yeo-Johnson, quantile
- Transforming categorical data
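
The transforms listed above can be sketched in a few lines of standard-library Python. The helper names (`minmax`, `standardize`, `log_transform`, `one_hot`) and the sample values are hypothetical, introduced only for this example; the sketch assumes non-constant input for the linear transforms and strictly positive input for the logarithm.

```python
import math
from statistics import mean, pstdev


def minmax(values):
    """Linearly rescale values to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def log_transform(values):
    """Base-10 logarithm, a simple non-linear transform for skewed positive data."""
    return [math.log10(v) for v in values]


def one_hot(labels):
    """Encode categorical labels as 0/1 indicator rows, one column per category."""
    categories = sorted(set(labels))
    rows = [[int(label == c) for c in categories] for label in labels]
    return categories, rows


print(minmax([2.0, 4.0, 6.0]))
print(one_hot(["star", "galaxy", "star"]))
```

In practice the scaling parameters (minimum, maximum, mean, standard deviation) are usually estimated on the training set only and then reused for any other data.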
-
Farida Farsian (Istituto Nazionale di Astrofisica (INAF)), Dr Simone Riggi (INAF - Osservatorio Astrofisico di Catania), 14/04/2026, 17:40