"Design and implementation of self-adaptable parallel algorithms for scientific computing on highly heterogeneous HPC platforms",
arXiv.org, no. arXiv:1109.3074, 09/2011.
Download: 1109.3074.pdf (1.04 MB)
"A Non-Intrusive and Incremental Approach to Enabling Direct Communications in RPC-based Grid Programming Systems",
Technical Report UCD-CSI-2005-2, pp. 15, 2006.
Download: ucd-csi-2005-2.pdf (295.56 KB)
"An Efficient Procedure for Building the Functional Performance Model of a Processor",
2005.
Download: Cluster2005_perf_model.pdf (268.4 KB)
"MPIBlib: Benchmarking MPI Communications for Parallel Computing on Homogeneous and Heterogeneous Clusters",
15th European PVM/MPI User's Group Meeting, vol. 5205, Dublin, Ireland, Springer-Verlag Berlin Heidelberg, pp. 227-238, September 7-10, 2008.
Download: 52050227.pdf (341.9 KB)
"A Variable Group Block Distribution Strategy for Dense Factorizations on Networks of Heterogeneous Computers",
Proceedings of the 6th International Conference on Parallel Processing and Applied Mathematics (PPAM 2005), vol. 3911, Poznan, Poland, Springer, 11-14 Sept 2005.
Download: PPAM_HPC_Hetero_LU_2005.pdf (79.43 KB)
"A Language and Programming Environment for High-Performance Parallel Computing on Heterogeneous Networks",
Programming and Computer Software, vol. 26, issue 4: Kluwer, pp. 216-236, 2000.
Download: PCS2000.pdf (1.87 MB)
"Optimal Matrix Partitioning for Data Parallel Computing on Hybrid Heterogeneous Platforms",
19th International Symposium on Parallel and Distributed Computing (ISPDC), Warsaw, Poland, IEEE, 5-8 July, 2020.
Download: ispdc2020.pdf (367.16 KB)
"Topology-aware Optimization of Communication Cost of Parallel Applications in Heterogeneous HPC Systems",
School of Computer Science, Dublin, University College Dublin, pp. 106, 09/2016.
Download: thesis.pdf (1 MB)
"Network-aware optimization of communications for parallel matrix multiplication on hierarchical HPC platforms",
Concurrency and Computation: Practice and Experience, vol. 28, issue 3: Wiley, pp. 802-821, 03/2016.
"Network-Aware Optimization of MPDATA on Homogeneous Multi-core Clusters with Heterogeneous Network",
ICA3PP 2016 Workshops, Granada, Spain, Lecture Notes in Computer Science 10049, Springer, pp. 30-42, 14-16 Dec 2016.
Download: tapems2016.pdf (357.28 KB)
"Topology-aware Optimization of Communications for Parallel Matrix Multiplication on Hierarchical Heterogeneous HPC Platforms",
23rd International Heterogeneity in Computing Workshop (HCW 2014), Phoenix, Arizona, USA, IEEE Computer Society, 19 May, 2014.
Download: HCW-2014-09.pdf (294.81 KB)
"Towards Optimal Matrix Partitioning for Data Parallel Computing on a Hybrid Heterogeneous Server",
IEEE Access, vol. 9: IEEE, pp. 17229-17244, 02/2021.
Download: IEEE-Access-09328411.pdf (3.76 MB)
"Acceleration of Bi-Objective Optimization of Data-Parallel Applications for Performance and Energy on Heterogeneous Hybrid Platforms",
IEEE Access, vol. 11: IEEE, pp. 27226-27245, 03/2023.
Download: Access-2023-acceleration.pdf (1.28 MB)
"On Energy Nonproportionality of CPUs and GPUs",
31st Heterogeneity in Computing Workshop (HCW 2022), Lyon, France, IEEE, pp. 34-44, 30/05/2022.
Download: On_Energy_Nonproportionality_of_CPUs_and_GPUs.pdf (1.03 MB)
"Accurate and Reliable Energy Measurement and Modelling of Data Transfer Between CPU and GPU in Parallel Applications on Heterogeneous Hybrid Platforms",
IEEE Transactions on Computers, vol. 74, issue 3, pp. 1011-1024, 03/2025.
Download: Accurate_and_Reliable_Energy_Measurement_and_Modelling_of_Data_Transfer_Between_CPU_and_GPU_in_Parallel_Applications_on_Heterogeneous_Hybrid_Platforms.pdf (853.36 KB); supplemental_r2.pdf (2.05 MB)
"SUARA: A scalable universal allreduce communication algorithm for acceleration of parallel deep learning applications",
Journal of Parallel and Distributed Computing, vol. 183, pp. 15, 01/2024.
Download: jpdc-suara.pdf (2.46 MB)
"Model-based selection of optimal MPI broadcast algorithms for multi-core clusters",
Journal of Parallel and Distributed Computing, vol. 165: Elsevier, pp. 1-16, 07/2022.
Download: 1-s2.0-S0743731522000697-main.pdf (988.38 KB)
"Efficient and accurate selection of optimal MPI collective algorithms using analytical performance modelling",
School of Computer Science, Dublin, University College Dublin, pp. 130, 06/2021.
Download: thesis.pdf (2.21 MB)
"Efficient and Accurate Selection of Optimal Collective Communication Algorithms Using Analytical Performance Modeling",
IEEE Access, vol. 9: IEEE, pp. 109355-109373, 07/2021.
Download: Efficient_and_Accurate_Selection_of_Optimal_Collective_Communication_Algorithms_Using_Analytical_Performance_Modeling.pdf (6.95 MB)
"A New Model-Based Approach to Performance Comparison of MPI Collective Algorithms",
16th International Conference on Parallel Computing Technologies (PaCT 2021), Kaliningrad, Russia, Lecture Notes in Computer Science 12942, Springer, pp. 11-25, 09/2021.
Download: Nuriyev-Lastovetsky2021_Chapter_ANewModel-BasedApproachToPerfo.pdf (623.78 KB)
"Application level energy measurements and models for hybrid platform with accelerators",
School of Computer Science, Dublin, University College Dublin, pp. 165, 05/2018.
Download: thesis.pdf (1.61 MB)
"Towards Application Energy Measurement and Modelling Tool Support",
13th International Conference on Parallel Computing Technologies (PaCT-2015), Petrozavodsk, Russia, Lecture Notes in Computer Science 9251, Springer, pp. 91-101, 31 Aug - 4 Sept, 2015.
Download: pact2015energy.pdf (383.55 KB)
"A Survey of Power and Energy Predictive Models in HPC Systems and Applications",
ACM Computing Surveys, vol. 50, issue 3: ACM, 10/2017.
Download: surveypowerenergymodelshpc.pdf (578.85 KB)
"Communication Performance Models for Heterogeneous Computational Clusters",
School of Computer Science and Informatics, Dublin, University College Dublin, pp. 115, 06/2009.
Download: moflynn-ethesis.pdf (925.45 KB)
"Energy aware ultrascale systems",
Ultrascale computing systems: IET, 03/2019.
Download: chap5.pdf (3.2 MB)