End-to-end waveform utterance enhancement for direct evaluation metrics optimization by fully convolutional neural networks SW Fu, TW Wang, Y Tsao, X Lu, H Kawai IEEE/ACM Transactions on Audio, Speech, and Language Processing 26 (9), 1570 …, 2018 | 344 | 2018 |
Raw waveform-based speech enhancement by fully convolutional networks SW Fu, Y Tsao, X Lu, H Kawai 2017 Asia-Pacific Signal and Information Processing Association Annual …, 2017 | 271 | 2017 |
The ATR multilingual speech-to-speech translation system S Nakamura, K Markov, H Nakaiwa, G Kikui, H Kawai, T Jitsuhiro, ... IEEE Transactions on Audio, Speech, and Language Processing 14 (2), 365-376, 2006 | 201 | 2006 |
XIMERA: A new TTS from ATR based on corpus-based technologies H Kawai, T Toda, J Ni, M Tsuzaki, K Tokuda Fifth ISCA Workshop on Speech Synthesis, 2004 | 183 | 2004 |
Realization of linguistic information in the voice fundamental frequency contour of the spoken Japanese H Fujisaki, H Kawai ICASSP-88., International Conference on Acoustics, Speech, and Signal …, 1988 | 74 | 1988 |
An investigation of a knowledge distillation method for CTC acoustic models R Takashima, S Li, H Kawai 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 69 | 2018 |
Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation. S Li, R Dabre, X Lu, P Shen, T Kawahara, H Kawai Interspeech, 4400-4404, 2019 | 59 | 2019 |
Maximum a posteriori Based Decoding for CTC Acoustic Models. N Kanda, X Lu, H Kawai Interspeech, 1868-1872, 2016 | 57 | 2016 |
An evaluation of automatic phone segmentation for concatenative speech synthesis H Kawai, T Toda 2004 IEEE International Conference on Acoustics, Speech, and Signal …, 2004 | 55 | 2004 |
An investigation of noise shaping with perceptual weighting for WaveNet-based speech generation K Tachibana, T Toda, Y Shiga, H Kawai 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 51 | 2018 |
Unit selection algorithm for Japanese speech synthesis based on both phoneme unit and diphone unit T Toda, H Kawai, M Tsuzaki, K Shikano 2002 IEEE International Conference on Acoustics, Speech, and Signal …, 2002 | 46 | 2002 |
Conditional Generative Adversarial Nets Classifier for Spoken Language Identification. P Shen, X Lu, S Li, H Kawai Interspeech, 2814-2818, 2017 | 43 | 2017 |
A design method of speech corpus for text-to-speech synthesis taking account of prosody H Kawai, S Yamamoto, N Higuchi, T Shimizu Sixth International Conference on Spoken Language Processing, 2000 | 43 | 2000 |
Understanding natural language instructions for fetching daily objects using GAN-based multimodal target–source classification A Magassouba, K Sugiura, AT Quoc, H Kawai IEEE Robotics and Automation Letters 4 (4), 3884-3891, 2019 | 40 | 2019 |
Multilingual speech-to-speech translation system: Voicetra S Matsuda, X Hu, Y Shiga, H Kashioka, C Hori, K Yasuda, H Okuma, ... 2013 IEEE 14th International Conference on Mobile Data Management 2, 229-233, 2013 | 40 | 2013 |
XIMERA: A concatenative speech synthesis system with large scale corpora H Kawai, T Toda, J Yamagishi, T Hirai, J Ni, N Nishizawa, M Tsuzaki, ... IEICE Transactions D (Japanese Edition) 89 (12), 2688-2698, 2006 | 40 | 2006 |
Optimizing sub-cost functions for segment selection based on perceptual evaluations in concatenative speech synthesis T Toda, H Kawai, M Tsuzaki 2004 IEEE International Conference on Acoustics, Speech, and Signal …, 2004 | 40 | 2004 |
Investigation of sequence-level knowledge distillation methods for CTC acoustic models R Takashima, L Sheng, H Kawai ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 38 | 2019 |
Phone duration modeling using gradient tree boosting J Yamagishi, H Kawai, T Kobayashi Speech Communication 50 (5), 405-415, 2008 | 37 | 2008 |
Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders. T Okamoto, T Toda, Y Shiga, H Kawai INTERSPEECH, 1308-1312, 2019 | 36 | 2019 |