Parallel processing of matrix multiplication in a CPU and GPU heterogeneous environment | S Ohshima, K Kise, T Katagiri, T Yuba | High Performance Computing for Computational Science-VECPAR 2006: 7th …, 2007 | 84 | 2007 |
OMPCUDA: OpenMP execution framework for CUDA based on omni OpenMP compiler | S Ohshima, S Hirasawa, H Honda | Beyond Loop Level Parallelism in OpenMP: Accelerators, Tasking and More: 6th …, 2010 | 34 | 2010 |
Auto-tuning on NUMA and many-core environments with an FDM code | T Katagiri, S Ohshima, M Matsumoto | 2017 IEEE International Parallel and Distributed Processing Symposium …, 2017 | 26 | 2017 |
ppOpen-HPC: open source infrastructure for development and execution of large-scale scientific applications on post-peta-scale supercomputers with automatic tuning (AT) | K Nakajima, M Satoh, T Furumura, H Okuda, T Iwashita, H Sakaguchi, ... | Optimization in the Real World: Toward Solving Real-World Optimization …, 2016 | 25 | 2016 |
Directive-based auto-tuning for the finite difference method on the Xeon Phi | T Katagiri, S Ohshima, M Matsumoto | 2015 IEEE International Parallel and Distributed Processing Symposium …, 2015 | 25 | 2015 |
Auto-tuning of computation kernels from an FDM Code with ppOpen-AT | T Katagiri, S Ohshima, M Matsumoto | 2014 IEEE 8th International Symposium on Embedded Multicore/Manycore SoCs, 91-98, 2014 | 25 | 2014 |
Auto-tuning of hybrid MPI/OpenMP execution with code selection by ppOpen-AT | T Katagiri, M Matsumoto, S Ohshima | 2016 IEEE International Parallel and Distributed Processing Symposium …, 2016 | 16 | 2016 |
Optimization of hierarchical matrix computation on GPU | S Ohshima, I Yamazaki, A Ida, R Yokota | Asian Conference on Supercomputing Frontiers, 274-292, 2018 | 13 | 2018 |
Performance optimization of SpMV using CRS format by considering OpenMP scheduling on CPUs and MIC | S Ohshima, T Katagiri, M Matsumoto | 2014 IEEE 8th International Symposium on Embedded Multicore/Manycore SoCs …, 2014 | 13 | 2014 |
A sparse matrix library with automatic selection of iterative solvers and preconditioners | T Sakurai, T Katagiri, H Kuroda, K Naono, M Igai, S Ohshima | Procedia Computer Science 18, 1332-1341, 2013 | 13 | 2013 |
Control Formats for Unsymmetric and Symmetric Sparse Matrix–Vector Multiplications on OpenMP implementations | T Katagiri, T Sakurai, M Igai, S Ohshima, H Kuroda, K Naono, K Nakajima | High Performance Computing for Computational Science-VECPAR 2012: 10th …, 2013 | 8 | 2013 |
Implementation and evaluation of 3D finite element method application for CUDA | S Ohshima, M Hayashi, T Katagiri, K Nakajima | High Performance Computing for Computational Science-VECPAR 2012: 10th …, 2013 | 8 | 2013 |
Optimization of numerous small dense-matrix–vector multiplications in H-matrix arithmetic on GPU | S Ohshima, I Yamazaki, A Ida, R Yokota | 2019 IEEE 13th International Symposium on Embedded Multicore/Many-core …, 2019 | 7 | 2019 |
Implementation of FEM Application on GPU with StarPU | S Ohshima, S Katagiri, K Nakajima, S Thibault, R Namyst | SIAM CSE13-SIAM Conference on Computational Science and Engineering 2013, 2013 | 6 | 2013 |
A thread-level parallelization of pairwise additive potential and force calculations suitable for current many-core architectures | Y Andoh, S Suzuki, S Ohshima, T Sakashita, M Ogino, T Katagiri, N Yoshii, ... | The Journal of Supercomputing 74, 2449-2469, 2018 | 5 | 2018 |
Performance of hierarchical-matrix BiCGStab solver on GPU clusters | I Yamazaki, A Abdelfattah, A Ida, S Ohshima, S Tomov, R Yokota, ... | 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2018 | 5 | 2018 |
Scalable direct-iterative hybrid solver for sparse matrices on multi-core and vector architectures | K Ono, T Kato, S Ohshima, T Nanri | Proceedings of the International Conference on High Performance Computing in …, 2020 | 4 | 2020 |