Giovanni Lacopo
(Istituto Nazionale di Astrofisica (INAF))
The effective exploitation of modern architectures is a key factor in achieving the best performance in terms of both energy efficiency and run-time reduction.
We present a specific example: the W-stacking gridder, an algorithm for radio imaging on massively parallel systems. Its performance is limited by the all-to-all data reduction needed to pass from a time-domain decomposition to a space-domain decomposition.
To overcome this limitation, we have implemented a customized reduce operation that is explicitly NUMA-aware.
Within each computing node, we have found an increase in both performance and energy efficiency by a factor of 4 to 7 on different architectures.
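The reduction pattern described above can be illustrated with a small model. What follows is a minimal sketch in plain Python, not the authors' implementation. It uses no MPI and no shared memory, and it assumes that each rank holds a full-size partial grid (time-domain decomposition) and that the final grid is split into contiguous slabs, one per rank (space-domain decomposition). The names `slab_bounds`, `flat_reduce_scatter`, `numa_aware_reduce_scatter` and `numa_of_rank` are hypothetical. The two-stage variant first sums the grids of ranks that share a NUMA domain, so that only one partial grid per domain has to cross a domain boundary.

```python
from typing import Dict, List, Tuple

Grid = List[float]


def slab_bounds(n_cells: int, n_ranks: int, rank: int) -> Tuple[int, int]:
    """Half-open cell range [lo, hi) of the slab owned by `rank`."""
    base, extra = divmod(n_cells, n_ranks)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


def flat_reduce_scatter(grids: List[Grid]) -> List[Grid]:
    """Reference version: every owner sums its slab over all ranks directly."""
    n_ranks, n_cells = len(grids), len(grids[0])
    slabs = []
    for owner in range(n_ranks):
        lo, hi = slab_bounds(n_cells, n_ranks, owner)
        slabs.append([sum(g[c] for g in grids) for c in range(lo, hi)])
    return slabs


def numa_aware_reduce_scatter(grids: List[Grid],
                              numa_of_rank: List[int]) -> List[Grid]:
    """Two-stage version: reduce inside each NUMA domain, then across domains."""
    n_ranks, n_cells = len(grids), len(grids[0])

    # Stage 1: ranks in the same NUMA domain accumulate into one partial grid,
    # modelling a sum that only touches memory local to that domain.
    partial: Dict[int, Grid] = {}
    for rank, grid in enumerate(grids):
        acc = partial.setdefault(numa_of_rank[rank], [0.0] * n_cells)
        for c, value in enumerate(grid):
            acc[c] += value

    # Stage 2: each owner combines one partial grid per domain for its slab,
    # so cross-domain traffic scales with the number of domains, not ranks.
    slabs = []
    for owner in range(n_ranks):
        lo, hi = slab_bounds(n_cells, n_ranks, owner)
        slabs.append([sum(p[c] for p in partial.values())
                      for c in range(lo, hi)])
    return slabs


# Toy input: 4 ranks, 8 grid cells, 2 ranks per (assumed) NUMA domain.
grids = [[float(rank + cell) for cell in range(8)] for rank in range(4)]
numa_of_rank = [0, 0, 1, 1]

flat = flat_reduce_scatter(grids)
hier = numa_aware_reduce_scatter(grids, numa_of_rank)
```

In this toy case both versions produce the same slabs; the difference is only in how many full-size grids are read across a domain boundary. In a real code the rank-to-domain mapping would have to come from the system topology rather than from a hard-coded list.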
Primary authors
Giovanni Lacopo
(Istituto Nazionale di Astrofisica (INAF))
Giuliano Taffoni
Luca Tornatore