UCX: An Open Source Framework for HPC Network APIs and Beyond P Shamis, MG Venkata, MG Lopez, MB Baker, O Hernandez, Y Itigin, ... High-Performance Interconnects (HOTI), 2015 IEEE 23rd Annual Symposium on, 40-43, 2015 | 213 | 2015 |
MVAPICH2-GPU: optimized GPU to GPU communication for InfiniBand clusters H Wang, S Potluri, M Luo, AK Singh, S Sur, DK Panda Computer Science-Research and Development 26 (3), 257-266, 2011 | 195 | 2011 |
Efficient Inter-node MPI Communication using GPUDirect RDMA for InfiniBand Clusters with NVIDIA GPUs S Potluri, K Hamidouche, A Venkatesh, D Bureddy, DK Panda Parallel Processing (ICPP), 2013 42nd International Conference on, 80-89, 2013 | 188 | 2013 |
GPU-aware MPI on RDMA-enabled clusters: Design, implementation and evaluation H Wang, S Potluri, D Bureddy, C Rosales, DK Panda Parallel and Distributed Systems, IEEE Transactions on 25 (10), 2595-2605, 2014 | 124 | 2014 |
Optimizing MPI communication on multi-GPU systems using CUDA inter-process communication S Potluri, H Wang, D Bureddy, AK Singh, C Rosales, DK Panda 2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012 | 114 | 2012 |
Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes H Subramoni, S Potluri, K Kandalla, B Barth, J Vienne, J Keasler, ... SC'12: Proceedings of the International Conference on High Performance …, 2012 | 85 | 2012 |
Optimized non-contiguous MPI datatype communication for GPU clusters: Design, implementation and evaluation with MVAPICH2 H Wang, S Potluri, M Luo, AK Singh, X Ouyang, S Sur, DK Panda Cluster Computing (CLUSTER), 2011 IEEE International Conference on, 308-316, 2011 | 80 | 2011 |
OMB-GPU: a micro-benchmark suite for evaluating MPI libraries on GPU clusters D Bureddy, H Wang, A Venkatesh, S Potluri, DK Panda Recent Advances in the Message Passing Interface, 110-120, 2012 | 70 | 2012 |
Designing efficient small message transfer mechanism for inter-node MPI communication on InfiniBand GPU clusters R Shi, S Potluri, K Hamidouche, J Perkins, M Li, D Rossetti, DK Panda 2014 21st International Conference on High Performance Computing (HiPC), 1-10, 2014 | 58 | 2014 |
MVAPICH-PRISM: A proxy-based communication framework using InfiniBand and SCIF for Intel MIC clusters S Potluri, D Bureddy, K Hamidouche, A Venkatesh, K Kandalla, ... Proceedings of the International Conference on High Performance Computing …, 2013 | 52 | 2013 |
Efficient intra-node communication on Intel-MIC clusters S Potluri, A Venkatesh, D Bureddy, K Kandalla, DK Panda 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid …, 2013 | 49 | 2013 |
Designing Scalable Graph500 Benchmark with Hybrid MPI+OpenSHMEM Programming Models J Jose, S Potluri, K Tomko, DK Panda Supercomputing, 109-124, 2013 | 49 | 2013 |
Extending OpenSHMEM for GPU computing S Potluri, D Bureddy, H Wang, H Subramoni, DK Panda 2013 IEEE 27th International Symposium on Parallel and Distributed …, 2013 | 44 | 2013 |
GPU-centric communication on NVIDIA GPU clusters with InfiniBand: A case study with OpenSHMEM S Potluri, A Goswami, D Rossetti, CJ Newburn, MG Venkata, N Imam 2017 IEEE 24th International Conference on High Performance Computing (HiPC …, 2017 | 38 | 2017 |
Scaling advanced message queuing protocol (AMQP) architecture with broker federation and infiniband G Marsh, AP Sampat, S Potluri, DK Panda Ohio State University, Tech. Rep. OSU-CISRC-5/09-TR17 38, 2008 | 38 | 2008 |
HAND: A Hybrid Approach to Accelerate Non-contiguous Data Movement Using MPI Datatypes on GPU Clusters R Shi, X Lu, S Potluri, K Hamidouche, J Zhang, DK Panda Parallel Processing (ICPP), 2014 43rd International Conference on, 221-230, 2014 | 34 | 2014 |
Designing optimized MPI broadcast and allreduce for Many Integrated Core (MIC) InfiniBand clusters K Kandalla, A Venkatesh, K Hamidouche, S Potluri, D Bureddy, DK Panda 2013 IEEE 21st Annual Symposium on High-Performance Interconnects, 63-70, 2013 | 33 | 2013 |
Quantifying performance benefits of overlap using MPI-2 in a seismic modeling application S Potluri, P Lai, K Tomko, S Sur, Y Cui, M Tatineni, KW Schulz, WL Barth, ... Proceedings of the 24th ACM International Conference on Supercomputing, 17-25, 2010 | 33 | 2010 |
MPI alltoall personalized exchange on GPGPU clusters: Design alternatives and benefit AK Singh, S Potluri, H Wang, K Kandalla, S Sur, DK Panda Cluster Computing (CLUSTER), 2011 IEEE International Conference on, 420-427, 2011 | 31 | 2011 |
A scalable and portable approach to accelerate hybrid HPL on heterogeneous CPU-GPU clusters R Shi, S Potluri, K Hamidouche, X Lu, K Tomko, DK Panda Cluster Computing (CLUSTER), 2013 IEEE International Conference on, 1-8, 2013 | 29 | 2013 |