"Compilation of Vector Statements of C[] Language for Architectures with Multilevel Memory Hierarchy",
Programming and Computer Software, vol. 27, issue 3, pp. 111-122, 2001.
Download: CompilOfVectorExpres_2001.pdf (87.7 KB)
"Data distribution for dense factorization on computers with memory heterogeneity",
Parallel Computing, vol. 33, issue 12, pp. 757-779, 12/2007.
Download: sdarticle.pdf (714.34 KB)
"Data Partitioning for Multiprocessors with Memory Heterogeneity and Memory Constraints",
Scientific Programming, vol. 13, issue 2: IOS Press, pp. 93-112, 2005.
Download: JSP_data_partitioning_2005.pdf (204.98 KB)
"Data Partitioning on Multicore and Multi-GPU Platforms Using Functional Performance Models",
IEEE Transactions on Computers, vol. 64, issue 9: IEEE, pp. 2506-2518, 09/2015.
Download: 06975085.pdf (2.08 MB)
"Data Partitioning with a Functional Performance Model of Heterogeneous Processors",
International Journal of High Performance Computing Applications, vol. 21, issue 1: Sage, pp. 76-90, 2007.
Download: 76.pdf (497.14 KB)
"Design of self-adaptable data parallel applications on multicore clusters automatically optimized for performance and energy through load distribution",
Concurrency and Computation: Practice and Experience, vol. 31, issue 4: Wiley, 02/2019.
Download: ccpe2018ravi.pdf (1.67 MB)
"Dynamic Load Balancing of Parallel Computational Iterative Routines on Highly Heterogeneous HPC Platforms",
Parallel Processing Letters, vol. 21, issue 2: World Scientific, pp. 195-217, 06/2011.
Download: DLB_PCIR_HHHP-16.pdf (797.9 KB)
"Effective Solving Scientific Problems on Heterogeneous Networks of Computers with mpC",
Journal of Computational Methods in Science and Engineering, vol. 2, issue 1-2: IOS Press, pp. 135-140, 2002.
"Efficient and Accurate Selection of Optimal Collective Communication Algorithms Using Analytical Performance Modeling",
IEEE Access, vol. 9: IEEE, pp. 109355-109373, 07/2021.
Download: Efficient_and_Accurate_Selection_of_Optimal_Collective_Communication_Algorithms_Using_Analytical_Performance_Modeling.pdf (6.95 MB)
"Efficient and reliable network tomography in heterogeneous networks using BitTorrent broadcasts and clustering algorithms",
Scientific Programming, vol. 21, issue 3-4: IOS Press, pp. 79-92, 12/2013.
Download: sci-pro-2013.pdf (702.01 KB)
"Efficient exact algorithms for continuous bi-objective performance-energy optimization of applications with linear energy and monotonically increasing performance profiles on heterogeneous high performance computing platforms",
Concurrency and Computation: Practice and Experience, vol. 35, issue 20: Wiley, pp. 1-19, 09/2023.
Download: Concurrency and Computation - 2022 - Khaleghzadeh - Efficient exact algorithms for continuous bi‐objective.pdf (1.58 MB)
"Energy Predictive Models of Computing: Theory, Practical Implications and Experimental Analysis on Multicore Processors",
IEEE Access, vol. 9: IEEE, pp. 63149-63172, 04/2021.
Download: IEEE_Access_2021_Energy_theory.pdf (2.11 MB)
"Energy-Efficient Parallel Computing: Challenges to Scaling",
Information, vol. 14, issue 4, pp. 1-29, 04/2023.
Download: information-14-00248.pdf (1.53 MB)
"Exascale Machines Require New Programming Paradigms and Runtimes",
Supercomputing Frontiers and Innovations, vol. 2, issue 2, pp. 6-27, 09/2015.
Download: 44-301-3-PB.pdf (308.72 KB)
"Extending τ-Lop to model concurrent MPI communications in multicore clusters",
Future Generation Computer Systems, vol. 61: Elsevier, pp. 66-82, 08/2016.
Download: fgcs2016.pdf (985.73 KB)
"Extension of ANSI C for vector and superscalar computers",
Programming and Computer Software, vol. 21, issue 1: Kluwer, pp. 17-25, 1995.
"FuPerMod: a software tool for the optimization of data-parallel applications on heterogeneous platforms",
The Journal of Supercomputing, vol. 69, issue 1: Springer US, pp. 61-69, 2014.
Download: fupermod-jos-2014.pdf (276.83 KB)
"Heterogeneity in parallel and distributed computing",
Journal of Parallel and Distributed Computing, vol. 73, issue 12, pp. 1523-1524, 2013.
Download: jpdc-2013.pdf (152.05 KB)
"Heterogeneous Computing",
Parallel Computing, vol. 31, issue 7: Elsevier, pp. 649-812, 2005.
Download: HC_2005.pdf (61 KB)
"Heterogeneous Distribution of Computations Solving Linear Algebra Problems on Networks of Heterogeneous Computers",
Journal of Parallel and Distributed Computing, vol. 61, issue 4: Academic Press, pp. 520-535, 2001.
Download: SolvinLinearAlgebra_2001.pdf (229.46 KB)
"Heterogeneous Parallel Computing: from Clusters of Workstations to Hierarchical Hybrid Platforms",
Supercomputing Frontiers and Innovations, vol. 1, issue 3, pp. 70-87, 12/2014.
Download: 32-140-2-PB.pdf (747.18 KB)
"HeteroMPI: Towards a Message-Passing Library for Heterogeneous Networks of Computers",
Journal of Parallel and Distributed Computing, vol. 66, issue 2: Elsevier, pp. 197-220, 2006.
Download: JPDC_HMPI_2006.pdf (349.02 KB)
"HeteroPBLAS: A Set of Parallel Basic Linear Algebra Subprograms Optimized for Heterogeneous Computational Clusters",
Scalable Computing: Practice and Experience, vol. 10, issue 2, pp. 201-216, 06/2009.
Download: SCPE_10_2_06.pdf (248.74 KB)
"Hierarchical Approach to Optimization of Parallel Matrix Multiplication on Large-Scale Platforms",
The Journal of Supercomputing, vol. 71, issue 11: Springer, pp. 3991-4014, 11/2015.
Download: JoS 2014 hierarchical matrix multiplication.pdf (1.3 MB)
"A Hierarchical Data-Partitioning Algorithm for Performance Optimization of Data-Parallel Applications on Heterogeneous Multi-Accelerator NUMA Nodes",
IEEE Access, vol. 8: IEEE, pp. 7861-7876, 01/2020.
Download: 08933138.pdf (3.4 MB)