System to profile and optimize user software in a managed run-time environment CJ Newburn, R Knight, R Geva, D Rodgers, X Zou, H Wang, BE Bigbee, ... US Patent 8,301,868, 2012 | 283 | 2012 |
Profiling using a user-level control mechanism R Knight, C Newburn, A Chernoff, H Wang, X Zou, R Geva US Patent App. 11/240,703, 2007 | 237 | 2007 |
Intel's Array Building Blocks: A retargetable, dynamic compiler and embedded language CJ Newburn, B So, Z Liu, M McCool, A Ghuloum, S Du Toit, ZG Wang, ... International Symposium on Code Generation and Optimization (CGO 2011), 224-235, 2011 | 132 | 2011 |
Trends in data locality abstractions for HPC systems D Unat, A Dubey, T Hoefler, J Shalf, M Abraham, M Bianco, ... IEEE Transactions on Parallel and Distributed Systems 28 (10), 3007-3020, 2017 | 126 | 2017 |
Offload compiler runtime for the Intel® Xeon Phi coprocessor CJ Newburn, S Dmitriev, R Narayanaswamy, J Wiegert, R Murty, ... 2013 IEEE International Symposium on Parallel & Distributed Processing …, 2013 | 93 | 2013 |
Using interaction costs for microarchitectural bottleneck analysis BA Fields, R Bodik, MD Hill, CJ Newburn Proceedings. 36th Annual IEEE/ACM International Symposium on …, 2003 | 91 | 2003 |
Gathering and scattering multiple data elements CJ Hughes, YK Chen, M Bomb, JW Brandt, MJ Buxton, MJ Charney, ... US Patent 8,447,962, 2013 | 83 | 2013 |
Stack value file: Custom microarchitecture for the stack HHS Lee, M Smelyanskiy, CJ Newburn, GS Tyson Proceedings HPCA Seventh International Symposium on High-Performance …, 2001 | 78 | 2001 |
Thread-data affinity optimization using compiler X Tian, M Girkar, DC Sehr, R Grove, W Li, H Wang, C Newburn, P Wang, ... US Patent 8,037,465, 2011 | 69 | 2011 |
Enhancements to performance monitoring architecture for critical path-based analysis C Newburn US Patent App. 11/143,425, 2005 | 51 | 2005 |
Multi-processor computing system that employs compressed cache lines' worth of information and processor capable of use in said system CJ Newburn, R Huggahalli, HHJ Hum, AR Adl-Tabatabai, AM Ghuloum US Patent 7,257,693, 2007 | 47 | 2007 |
Interaction cost and shotgun profiling BA Fields, R Bodik, MD Hill, CJ Newburn ACM Transactions on Architecture and Code Optimization (TACO) 1 (3), 272-304, 2004 | 45 | 2004 |
Workflows are the new applications: Challenges in performance, portability, and productivity T Ben-Nun, T Gamblin, DS Hollman, H Krishnan, CJ Newburn 2020 IEEE/ACM International Workshop on Performance, Portability and …, 2020 | 44 | 2020 |
Programmable event driven yield mechanism which may activate service threads X Zou, H Wang, SD Rodgers, DD Boggs, B Bigbee, S Kaushik, ... US Patent 7,849,465, 2010 | 43 | 2010 |
LASER: Light, accurate sharing detection and repair L Luo, A Sriraman, B Fugate, S Hu, G Pokam, CJ Newburn, J Devietti 2016 IEEE International Symposium on High Performance Computer Architecture …, 2016 | 41 | 2016 |
Context state management for processor feature sets DA Van Dyke, M Mishaeli, I Anati, BV Patel, W Deutsch, R Shah, G Neiger, ... US Patent 8,631,261, 2014 | 41 | 2014 |
Processor and memory controller capable of use in computing system that employs compressed cache lines' worth of information CJ Newburn, R Huggahalli, HHJ Hum, AR Adl-Tabatabai, AM Ghuloum US Patent 7,512,750, 2009 | 41 | 2009 |
GPU-centric communication on NVIDIA GPU clusters with InfiniBand: A case study with OpenSHMEM S Potluri, A Goswami, D Rossetti, CJ Newburn, MG Venkata, N Imam 2017 IEEE 24th International Conference on High Performance Computing (HiPC …, 2017 | 38 | 2017 |
Programming abstractions for data locality A Tate, A Kamil, A Dubey, A Groblinger, B Chamberlain, B Goglin, ... Office of Scientific and Technical Information (OSTI), 2014 | 33 | 2014 |
Heterogeneous streaming CJ Newburn, G Bansal, M Wood, L Crivelli, J Planas, A Duran, P Souza, ... 2016 IEEE International Parallel and Distributed Processing Symposium …, 2016 | 26 | 2016 |