Xiao Wang - Google Scholar

Cited by

	All	Since 2019
Citations	2126	2126
h-index	7	7
i10-index	7	7

Citations per year: 2020: 39, 2021: 73, 2022: 186, 2023: 866, 2024: 957

Co-authors

Xiaohua Zhai - Research Scientist, Google DeepMind - Verified email at google.com
Lucas Beyer - Google DeepMind, Google Brain, RWTH Aachen - Verified email at google.com
Andreas Peter Steiner - Software engineer, Google Research - Verified email at google.com
Daniel Keysers - Google - Verified email at google.com
Alexander Kolesnikov - Research Scientist, Google DeepMind - Verified email at google.com
Xi Chen - Google DeepMind - Verified email at google.com

Xiao Wang

Google DeepMind

Verified email at google.com - Homepage

Vision-Language, Multimodal, Computer Vision


Title	Cited by	Year
PaLI: A jointly-scaled multilingual language-image model - X Chen, X Wang, S Changpinyo, AJ Piergiovanni, P Padlewski, D Salz, ... - ICLR 2023 (Oral), 2022	487	2022
LiT: Zero-Shot Transfer with Locked-image Text Tuning - X Zhai, X Wang, B Mustafa, A Steiner, D Keysers, A Kolesnikov, L Beyer - CVPR 2022, 2021	454	2021
Measuring compositional generalization: A comprehensive method on realistic data - D Keysers, N Schärli, N Scales, H Buisman, D Furrer, S Kashubin, ... - ICLR 2020, 2019	353	2019
Scaling vision transformers to 22 billion parameters - M Dehghani, J Djolonga, B Mustafa, P Padlewski, J Heek, J Gilmer, ... - ICML 2023 (Oral), 2023	341	2023
Simple Open-Vocabulary Object Detection with Vision Transformers - M Minderer, A Gritsenko, A Stone, M Neumann, D Weissenborn, ... - ECCV 2022, 2022	332*	2022
PaLI-X: On scaling up a multilingual vision and language model - X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ... - CVPR 2024, 2023	103	2023
PaLI-3 vision language models: Smaller, faster, stronger - X Chen, X Wang, L Beyer, A Kolesnikov, J Wu, P Voigtlaender, B Mustafa, ... - arXiv preprint arXiv:2310.09199, 2023	41	2023
Three Towers: Flexible Contrastive Learning with Pretrained Image Models - J Kossen, M Collier, B Mustafa, X Wang, X Zhai, L Beyer, A Steiner, ... - NeurIPS 2023, 2023	6	2023
A study of autoregressive decoders for multi-tasking in computer vision - L Beyer, B Wan, G Madan, F Pavetic, A Steiner, A Kolesnikov, AS Pinto, ... - arXiv preprint arXiv:2303.17376, 2023	5	2023
LocCa: Visual Pretraining with Location-aware Captioners - B Wan, M Tschannen, Y Xian, F Pavetic, I Alabdulmohsin, X Wang, ... - arXiv preprint arXiv:2403.19596, 2024	2	2024
CLIP the Bias: How Useful is Balancing Data in Multimodal Learning? - I Alabdulmohsin, X Wang, A Steiner, P Goyal, A D'Amour, X Zhai - ICLR 2024, 2024	2	2024
Locked-Model Multimodal Contrastive Tuning - D Keysers, X Zhai, X Wang, L Beyer, B Mustafa, A Steiner, A Kolesnikov - US Patent App. 18/051,106, 2024		2024
