Yuki Saito

Cited by

	All	Since 2019
Citations	962	867
h-index	14	13
i10-index	19	19

Citations per year: 2017 (19), 2018 (75), 2019 (140), 2020 (157), 2021 (189), 2022 (173), 2023 (157), 2024 (51)

Public access

2 articles available, 0 articles not available (based on funding mandates)

Co-authors

Hiroshi Saruwatari	Professor, The University of Tokyo	Verified email at ipc.i.u-tokyo.ac.jp
Shinnosuke Takamichi	Keio University	Verified email at keio.jp
Yusuke Ijima	NTT Corporation	Verified email at lab.ntt.co.jp
Detai Xin	The University of Tokyo	Verified email at ipc.i.u-tokyo.ac.jp
Kyosuke Nishida	NTT Human Informatics Laboratories, NTT Corporation	Verified email at lab.ntt.co.jp
Daichi Kitamura	National Institute of Technology, Kagawa College	Verified email at ieee.org
Kentaro Mitsui	rinna Co., Ltd.	Verified email at rinna.co.jp
Takaaki Saeki	Google	Verified email at google.com
Hiroyuki Miyoshi	University of Tokyo	Verified email at imperial.ac.uk
Wataru Nakata	The University of Tokyo	Verified email at g.ecc.u-tokyo.ac.jp
Ryo Masumura	Distinguished Research Scientist, NTT Computer and Data Science Laboratories, NTT Corporation	Verified email at lab.ntt.co.jp
Taiki Nakamura	1st year M.S. student, The University of Tokyo	Verified email at g.ecc.u-tokyo.ac.jp
Yukino Baba	Associate Professor, The University of Tokyo	Verified email at g.ecc.u-tokyo.ac.jp
Yuto Nishimura	The University of Tokyo	Verified email at g.ecc.u-tokyo.ac.jp
Aya Watanabe	The University of Tokyo	Verified email at g.ecc.u-tokyo.ac.jp
Yuki Yamashita	The University of Tokyo	Verified email at nlab.ci.i.u-tokyo.ac.jp
Xuan Luo	The University of Tokyo	Verified email at g.ecc.u-tokyo.ac.jp
Kazuki Yamauchi	The University of Tokyo	Verified email at g.ecc.u-tokyo.ac.jp
Ryoichi Miyazaki	National Institute of Technology, Tokuyama College	Verified email at miyazaki-lab.org
Kei Akuzawa	The University of Tokyo	Verified email at weblab.t.u-tokyo.ac.jp

Yuki Saito

Lecturer, The University of Tokyo

Verified email at ipc.i.u-tokyo.ac.jp

Speech synthesis, voice conversion, machine learning


Title	Cited by	Year
Statistical parametric speech synthesis incorporating generative adversarial networks. Y Saito, S Takamichi, H Saruwatari. IEEE/ACM Transactions on Audio, Speech, and Language Processing 26 (1), 84-96, 2017	261	2017
Non-parallel voice conversion using variational autoencoders conditioned by phonetic posteriorgrams and d-vectors. Y Saito, Y Ijima, K Nishida, S Takamichi. 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018	140	2018
Voice conversion using sequence-to-sequence learning of context posterior probabilities. H Miyoshi, Y Saito, S Takamichi, H Saruwatari. arXiv preprint arXiv:1704.02360, 2017	66	2017
JVS corpus: free Japanese multi-speaker voice corpus. S Takamichi, K Mitsui, Y Saito, T Koriyama, N Tanji, H Saruwatari. arXiv preprint arXiv:1908.06248, 2019	63	2019
JSUT and JVS: Free Japanese voice corpora for accelerating speech synthesis research. S Takamichi, R Sonobe, K Mitsui, Y Saito, T Koriyama, N Tanji, ... Acoustical Science and Technology 41 (5), 761-768, 2020	52	2020
Phase reconstruction from amplitude spectrograms based on von-Mises-distribution deep neural network. S Takamichi, Y Saito, N Takamune, D Kitamura, H Saruwatari. 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC …, 2018	46	2018
Training algorithm to deceive anti-spoofing verification for DNN-based speech synthesis. Y Saito, S Takamichi, H Saruwatari. 2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017	37	2017
Voice conversion using input-to-output highway networks. Y Saito, S Takamichi, H Saruwatari. IEICE Transactions on Information and Systems 100 (8), 1925-1928, 2017	32	2017
Text-to-speech synthesis using STFT spectra based on low-/multi-resolution generative adversarial networks. Y Saito, S Takamichi, H Saruwatari. 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018	29	2018
Phase reconstruction from amplitude spectrograms based on directional-statistics deep neural networks. S Takamichi, Y Saito, N Takamune, D Kitamura, H Saruwatari. Signal Processing 169, 107368, 2020	25	2020
Cross-Lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space. D Xin, Y Saito, S Takamichi, T Koriyama, H Saruwatari. Interspeech, 2947-2951, 2020	20	2020
Face2Speech: Towards Multi-Speaker Text-to-Speech Synthesis Using an Embedding Vector Predicted from a Face Image. S Goto, K Onishi, Y Saito, K Tachibana, K Mori. INTERSPEECH, 1321-1325, 2020	19	2020
Real-Time, Full-Band, Online DNN-Based Voice Conversion System Using a Single CPU. T Saeki, Y Saito, S Takamichi, H Saruwatari. INTERSPEECH, 1021-1022, 2020	14	2020
HumanGAN: generative adversarial network with human-based discriminator and its evaluation in speech perception modeling. K Fujii, Y Saito, S Takamichi, Y Baba, H Saruwatari. ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020	14	2020
Vocoder-free text-to-speech synthesis incorporating generative adversarial networks using low-/multi-frequency STFT amplitude spectra. Y Saito, S Takamichi, H Saruwatari. Computer Speech & Language 58, 347-363, 2019	12	2019
DNN-based speaker embedding using subjective inter-speaker similarity for multi-speaker modeling in speech synthesis. Y Saito, S Takamichi, H Saruwatari. arXiv preprint arXiv:1907.08294, 2019	12	2019
Perceptual-similarity-aware deep speaker representation learning for multi-speaker generative modeling. Y Saito, S Takamichi, H Saruwatari. IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 1033-1048, 2021	11	2021
Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis. D Xin, Y Saito, S Takamichi, T Koriyama, H Saruwatari. Interspeech, 1614-1618, 2021	11	2021
Generative moment matching network-based random modulation post-filter for DNN-based singing voice synthesis and neural double-tracking. H Tamaru, Y Saito, S Takamichi, T Koriyama, H Saruwatari. ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019	10	2019
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech. D Yang, T Koriyama, Y Saito, T Saeki, D Xin, H Saruwatari. ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	9	2023
