Alexey Lastovetsky
"Recent Advances in Matrix Partitioning for Parallel Computing on Heterogeneous Platforms",
IEEE Transactions on Parallel and Distributed Systems, vol. 30, issue 1: IEEE, pp. 218-229, 01/2019.
Download: recent-advances-matrix.pdf (2.77 MB)
"A Novel Data-Partitioning Algorithm for Performance Optimization of Data-Parallel Applications on Heterogeneous HPC Platforms",
IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 10: IEEE, pp. 2176-2190, 10/2018.
Download: paper_r2.pdf (1.49 MB); tpds2018hpoptasuppl.pdf (3.4 MB)
"Hierarchical Multicore Thread Mapping via Estimation of Remote Communication",
The Journal of Supercomputing, vol. 74, issue 3: Springer, pp. 1321-1340, 03/2018.
"Bi-Objective Optimization of Data-Parallel Applications on Homogeneous Multicore Clusters for Performance and Energy",
IEEE Transactions on Computers, vol. 67, issue 2: IEEE, pp. 160-177, 02/2018.
Download: paperfinal.pdf (1.16 MB)
"Out-of-core Implementation for Accelerator Kernels on Heterogeneous Clouds",
The Journal of Supercomputing, vol. 74, issue 2, pp. 551-568, 2018.
Download: paper.pdf (762.34 KB)
"Additivity: A Selection Criterion for Performance Events for Reliable Energy Predictive Modeling",
Supercomputing Frontiers and Innovations, vol. 4, issue 4, pp. 50-65, 12/2017.
Download: 153-992-1-PB.pdf (666.73 KB)
"Model-Based Estimation of the Communication Cost of Hybrid Data-Parallel Applications on Heterogeneous Clusters",
IEEE Transactions on Parallel and Distributed Systems, vol. 28, issue 11: IEEE, pp. 3215-3228, 11/2017.
Download: model-based-estimation-tpds-2017.pdf (1.65 MB); model-based-estimation-tpds-2017-supplement.pdf (871.33 KB)
"A Survey of Power and Energy Predictive Models in HPC Systems and Applications",
ACM Computing Surveys, vol. 50, issue 3: ACM, 10/2017.
Download: surveypowerenergymodelshpc.pdf (578.85 KB)
"New Model-based Methods and Algorithms for Performance and Energy Optimization of Data Parallel Applications on Homogeneous Multicore Clusters",
IEEE Transactions on Parallel and Distributed Systems, vol. 28, issue 4: IEEE, pp. 1119-1133, 04/2017.
Download: performance-energy-homo-multicore-clusters.pdf (1.27 MB)
"Model-based optimization of EULAG kernel on Intel Xeon Phi through load imbalancing",
IEEE Transactions on Parallel and Distributed Systems, vol. 28, issue 3: IEEE, pp. 787-797, 03/2017.
Download: TPDS_15.pdf (812.34 KB)
"Hierarchical redesign of classic MPI reduction algorithms",
The Journal of Supercomputing, vol. 73, issue 2: Springer, pp. 713-725, 02/2017.
Download: TJS-Hasanov-2016.pdf (593.41 KB)
"Automatic tuning to performance modelling of matrix polynomials on multicore and multi-GPU systems",
The Journal of Supercomputing, vol. 73, issue 1, pp. 227-239, 01/2017.
Download: JoS-2016-Murilo.pdf (530.26 KB)
"Network-Aware Optimization of MPDATA on Homogeneous Multi-core Clusters with Heterogeneous Network",
ICA3PP 2016 Workshops, Granada, Spain, Lecture Notes in Computer Science 10049, Springer, pp. 30-42, 14-16 Dec 2016.
Download: tapems2016.pdf (357.28 KB)
"Extending τ-Lop to model concurrent MPI communications in multicore clusters",
Future Generation Computer Systems, vol. 61: Elsevier, pp. 66-82, 08/2016.
Download: fgcs2016.pdf (985.73 KB)
"Network-aware optimization of communications for parallel matrix multiplication on hierarchical HPC platforms",
Concurrency and Computation: Practice and Experience, vol. 28, issue 3: Wiley, pp. 802-821, 03/2016.
"Hierarchical Optimization of MPI Reduce Algorithms",
13th International Conference on Parallel Computing Technologies (PaCT-2015), Petrozavodsk, Russia, Lecture Notes in Computer Science 9251, Springer, pp. 21-34, 31 Aug - 4 Sept, 2015.
Download: pact2015reduce.pdf (812.37 KB)
"Towards Application Energy Measurement and Modelling Tool Support",
13th International Conference on Parallel Computing Technologies (PaCT-2015), Petrozavodsk, Russia, Lecture Notes in Computer Science 9251, Springer, pp. 91-101, 31 Aug - 4 Sept, 2015.
Download: pact2015energy.pdf (383.55 KB)
"Asymmetric communication models for resource-constrained hierarchical Ethernet networks",
Concurrency and Computation: Practice and Experience, vol. 27, issue 6: Wiley, pp. 1575-1590, 04/2015.
Download: cpe3343.pdf (1.69 MB)
"Hierarchical Approach to Optimization of Parallel Matrix Multiplication on Large-Scale Platforms",
The Journal of Supercomputing, vol. 71, issue 11: Springer, pp. 3991-4014, 11/2015.
Download: JoS 2014 hierarchical matrix multiplication.pdf (1.3 MB)
"Topology-Oblivious Optimization of MPI Broadcast Algorithms on Extreme-Scale Platforms",
Simulation Modelling Practice and Theory, vol. 58: Elsevier, pp. 30-39, 11/2015.
Download: simpat2015.pdf (1.63 MB)
"Hierarchical Approach to Optimization of MPI Collective Communication Algorithms",
School of Computer Science, Dublin, University College Dublin, pp. 152, 10/2015.
Download: khalid-thesis-oct-2015.pdf (1.1 MB)
"Data Partitioning on Multicore and Multi-GPU Platforms Using Functional Performance Models",
IEEE Transactions on Computers, vol. 64, issue 9: IEEE, pp. 2506-2518, 09/2015.
Download: 06975085.pdf (2.08 MB)
"Exascale Machines Require New Programming Paradigms and Runtimes",
Supercomputing Frontiers and Innovations, vol. 2, issue 2, pp. 6-27, 09/2015.
Download: 44-301-3-PB.pdf (308.72 KB)
"Acceleration of MPI Mechanisms for Sustainable HPC Applications",
Supercomputing Frontiers and Innovations, vol. 2, issue 2, pp. 28-45, 2015.
Download: 35-302-3-PB.pdf (464.88 KB)
"Optimizations to enhance sustainability of MPI applications",
EuroMPI/ASIA '14, Kyoto, Japan, ACM, 9-12 September, 2014.
Download: p145-carretero.pdf (351.18 KB)