Kiran Kumar Matam

Cited by

	All	Since 2019
Citations	767	577
h-index	12	10
i10-index	14	10

Citations per year
Year	Citations
2012	9
2013	19
2014	32
2015	33
2016	26
2017	25
2018	27
2019	38
2020	62
2021	87
2022	125
2023	161
2024	103

Public access (based on funding mandates)
3 articles available
0 articles not available

Co-authors

Murali Annavaram	USC	Verified email at usc.edu
Kishore Kothapalli	IIIT Hyderabad	Verified email at iiit.ac.in
Gunjae Koo	Associate Professor, Korea University	Verified email at korea.ac.kr
Hung-Wei Tseng	University of California, Riverside	Verified email at ucr.edu
Viktor K. Prasanna	University of Southern California	Verified email at usc.edu
Krishna Giri Narra	PhD, Google	Verified email at google.com
P J Narayanan	Professor, IIIT, Hyderabad	Verified email at iiit.ac.in
Da Tong	University of Southern California	Verified email at usc.edu
Jyothish Soman	Relation Therapeutics, University of Cambridge, IBM Research, IIIT	Verified email at cam.ac.uk

Kiran Kumar Matam

Research Scientist, Facebook

Verified email at usc.edu

Computer architecture, Parallel computing


Title	Authors	Venue	Cited by	Year
Summarizer: trading communication with computing near storage	G Koo, KK Matam, I Te, HVKG Narra, J Li, HW Tseng, S Swanson, ...	2017 50th Annual IEEE/ACM International Symposium on Microarchitecture …, 2017	162	2017
Sparse matrix-matrix multiplication on modern architectures	K Matam, SRKB Indarapu, K Kothapalli	2012 19th International Conference on High Performance Computing, 1-10, 2012	93*	2012
Software-hardware co-design for fast and scalable training of deep learning recommendation models	D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...	Proceedings of the 49th Annual International Symposium on Computer …, 2022	84	2022
Accelerating sparse matrix vector multiplication in iterative methods using GPU	KK Matam, K Kothapalli	2011 International Conference on Parallel Processing, 612-621, 2011	68	2011
GraphSSD: graph semantics aware SSD	KK Matam, G Koo, H Zha, HW Tseng, M Annavaram	Proceedings of the 46th international symposium on computer architecture …, 2019	67	2019
Check-N-Run: A checkpointing system for training deep learning recommendation models	A Eisenman, KK Matam, S Ingram, D Mudigere, R Krishnamoorthi, K Nair, ...	19th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2022	49	2022
High throughput and programmable online traffic classifier on FPGA	D Tong, L Sun, K Matam, V Prasanna	Proceedings of the ACM/SIGDA international symposium on Field programmable …, 2013	43	2013
M. khorashadi, P	D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...	Bhattacharya, P. Lapukhov, M. Naumov, L. Qiao, M. Smelyanskiy, B. Jia, and V …, 2021	39	2021
High-performance, distributed training of large-scale deep learning recommendation models	D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ...	arXiv preprint arXiv:2104.05158, 2021	33	2021
First-generation inference accelerator deployment at facebook	M Anderson, B Chen, S Chen, S Deng, J Fix, M Gschwind, A Kalaiah, ...	arXiv preprint arXiv:2107.04140, 2021	32	2021
CPU and/or GPU: Revisiting the GPU vs. CPU myth	K Kothapalli, DS Banerjee, PJ Narayanan, S Sood, AK Bahl, S Sharma, ...	arXiv preprint arXiv:1303.2171, 2013	16	2013
GPU accelerated Lanczos algorithm with applications	KK Matam, K Kothapalli	2011 IEEE Workshops of International Conference on Advanced Information …, 2011	16	2011
Energy-efficient large-scale matrix multiplication on FPGAs	KK Matam, VK Prasanna	2013 International Conference on Reconfigurable Computing and FPGAs …, 2013	11	2013
Efficient Discrete Range Searching primitives on the GPU with applications	J Soman, MK Kumar, K Kothapalli, PJ Narayanan	High Performance Computing (HiPC), 2010 International Conference on, 1-10, 2010	11	2010
Evaluating energy efficiency of floating point matrix multiplication on FPGAs	KK Matam, H Le, VK Prasanna	2013 IEEE High Performance Extreme Computing Conference (HPEC), 1-6, 2013	8	2013
T. I, HKG Narra, J. Li, H	G Koo, KK Matam	W. Tseng, S. Swanson, and M. Annavaram,“Summarizer: Trading communication …, 2017	7	2017
Check-n-run: A checkpointing system for training recommendation models	A Eisenman, KK Matam, S Ingram, D Mudigere, R Krishnamoorthi, ...	arXiv preprint arXiv:2010.08679 5, 2020	6	2020
Efficient automatic parallelization of a single GPU program for a multiple GPU system	MK Kumar, MR Abdel-Majeed, M Annavaram	Integration 66, 35-43, 2019	6	2019
Energy efficient architecture for matrix multiplication on fpgas	KK Matam, H Le, VK Prasanna	2013 23rd International Conference on Field programmable Logic and …, 2013	5	2013
Multilogvc: efficient out-of-core graph processing framework for flash storage	KK Matam, H Hashemi, M Annavaram	2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2021	4	2021
