"A Comparative Study of Techniques for Energy Predictive Modeling Using Performance Monitoring Counters on Modern Multicore CPUs",
IEEE Access, vol. 8: IEEE, pp. 143306-143332, 08/2020.
Download: IEEE-Access-09154439.pdf (2.37 MB)
"Compilation of Vector Statements of C[] Language for Architectures with Multilevel Memory Hierarchy",
Programming and Computer Software, vol. 27, issue 3, pp. 111-122, 2001.
Download: CompilOfVectorExpres_2001.pdf (87.7 KB)
"Concurrent and Orthogonal Software Power Meters for Accurate Runtime Energy Profiling of Parallel Hybrid Programs on Heterogeneous Hybrid Servers",
IEEE Transactions on Parallel and Distributed Systems, vol. 37, issue 2: IEEE, pp. 322-339, 02/2026.
Download: Concurrent_and_Orthogonal_Software_Power_Meters_for_Accurate_Runtime_Energy_Profiling_of_Parallel_Hybrid_Programs_on_Heterogeneous_Hybrid_Servers.pdf (3.44 MB)
"Data distribution for dense factorization on computers with memory heterogeneity",
Parallel Computing, vol. 33, issue 12, pp. 757-779, 12/2007.
Download: sdarticle.pdf (714.34 KB)
"Data Partitioning for Multiprocessors with Memory Heterogeneity and Memory Constraints",
Scientific Programming, vol. 13, issue 2: IOS Press, pp. 93-112, 2005.
Download: JSP_data_partitioning_2005.pdf (204.98 KB)
"Data Partitioning on Multicore and Multi-GPU Platforms Using Functional Performance Models",
IEEE Transactions on Computers, vol. 64, issue 9: IEEE, pp. 2506-2518, 09/2015.
Download: 06975085.pdf (2.08 MB)
"Data Partitioning with a Functional Performance Model of Heterogeneous Processors",
International Journal of High Performance Computing Applications, vol. 21, issue 1: Sage, pp. 76-90, 2007.
Download: 76.pdf (497.14 KB)
"Design of self-adaptable data parallel applications on multicore clusters automatically optimized for performance and energy through load distribution",
Concurrency and Computation: Practice and Experience, vol. 31, issue 4: Wiley, 02/2019.
Download: ccpe2018ravi.pdf (1.67 MB)
"Dynamic Load Balancing of Parallel Computational Iterative Routines on Highly Heterogeneous HPC Platforms",
Parallel Processing Letters, vol. 21, issue 2: World Scientific, pp. 195-217, 06/2011.
Download: DLB_PCIR_HHHP-16.pdf (797.9 KB)
"Effective Solving Scientific Problems on Heterogeneous Networks of Computers with mpC",
Journal of Computational Methods in Science and Engineering, vol. 2, issue 1-2: IOS Press, pp. 135-140, 2002.
"Efficient and Accurate Selection of Optimal Collective Communication Algorithms Using Analytical Performance Modeling",
IEEE Access, vol. 9: IEEE, pp. 109355-109373, 07/2021.
Download: Efficient_and_Accurate_Selection_of_Optimal_Collective_Communication_Algorithms_Using_Analytical_Performance_Modeling.pdf (6.95 MB)
"Efficient and reliable network tomography in heterogeneous networks using BitTorrent broadcasts and clustering algorithms",
Scientific Programming, vol. 21, issue 3-4: IOS Press, pp. 79-92, 12/2013.
Download: sci-pro-2013.pdf (702.01 KB)
"Efficient exact algorithms for continuous bi-objective performance-energy optimization of applications with linear energy and monotonically increasing performance profiles on heterogeneous high performance computing platforms",
Concurrency and Computation: Practice and Experience, vol. 35, issue 20: Wiley, pp. 1-19, 09/2023.
Download: Concurrency and Computation - 2022 - Khaleghzadeh - Efficient exact algorithms for continuous bi‐objective.pdf (1.58 MB)
"Energy Predictive Models of Computing: Theory, Practical Implications and Experimental Analysis on Multicore Processors",
IEEE Access, vol. 9: IEEE, pp. 63149-63172, 04/2021.
Download: IEEE_Access_2021_Energy_theory.pdf (2.11 MB)
"Energy-Efficient Parallel Computing: Challenges to Scaling",
Information, vol. 14, issue 4, pp. 1-29, 04/2023.
Download: information-14-00248.pdf (1.53 MB)
"Exascale Machines Require New Programming Paradigms and Runtimes",
Supercomputing Frontiers and Innovations, vol. 2, issue 2, pp. 6-27, 09/2015.
Download: 44-301-3-PB.pdf (308.72 KB)
"Extending τ -Lop to model concurrent MPI communications in multicore clusters",
Future Generation Computer Systems, vol. 61: Elsevier, pp. 66-82, 08/2016.
Download: fgcs2016.pdf (985.73 KB)
"Extension of ANSI C for vector and superscalar computers",
Programming and Computer Software, vol. 21, issue 1: Kluwer, pp. 17-25, 1995.
"FuPerMod: a software tool for the optimization of data-parallel applications on heterogeneous platforms",
The Journal of Supercomputing, vol. 69, issue 1: Springer US, pp. 61-69, 2014.
Download: fupermod-jos-2014.pdf (276.83 KB)
"Heterogeneity in parallel and distributed computing",
Journal of Parallel and Distributed Computing, vol. 73, issue 12, pp. 1523-1524, 2013.
Download: jpdc-2013.pdf (152.05 KB)
"Heterogeneous Computing",
Parallel Computing, vol. 31, issue 7: Elsevier, pp. 649-812, 2005.
Download: HC_2005.pdf (61 KB)
"Heterogeneous Distribution of Computations Solving Linear Algebra Problems on Networks of Heterogeneous Computers",
Journal of Parallel and Distributed Computing, vol. 61, issue 4: Academic Press, pp. 520-535, 2001.
Download: SolvinLinearAlgebra_2001.pdf (229.46 KB)
"Heterogeneous Parallel Computing: from Clusters of Workstations to Hierarchical Hybrid Platforms",
Supercomputing Frontiers and Innovations, vol. 1, issue 3, pp. 70-87, 12/2014.
Download: 32-140-2-PB.pdf (747.18 KB)
"HeteroMPI: Towards a Message-Passing Library for Heterogeneous Networks of Computers",
Journal of Parallel and Distributed Computing, vol. 66, issue 2: Elsevier, pp. 197-220, 2006.
Download: JPDC_HMPI_2006.pdf (349.02 KB)
"HeteroPBLAS: A Set of Parallel Basic Linear Algebra Subprograms Optimized for Heterogeneous Computational Clusters",
Scalable Computing: Practice and Experience, vol. 10, issue 2, pp. 201-216, 06/2009.
Download: SCPE_10_2_06.pdf (248.74 KB)


