Longtian Qiu
SPHINX: The joint mixing of weights, tasks, and visual embeddings for multi-modal large language models
Z Lin*, C Liu*, R Zhang*, P Gao*, L Qiu*, H Xiao, H Qiu, C Lin, W Shao, ...
arXiv preprint arXiv:2311.07575, 2023
CALIP: Zero-shot enhancement of CLIP with parameter-free attention
Z Guo*, R Zhang*, L Qiu*, X Ma, X Miao, X He, B Cui
Proceedings of the AAAI Conference on Artificial Intelligence 37 (1), 746-754, 2023
VT-CLIP: Enhancing vision-language models with visual-guided texts
L Qiu, R Zhang, Z Guo, Z Zeng, Z Guo, Y Li, G Zhang
arXiv preprint arXiv:2112.02399, 2021
Joint-MAE: 2D-3D joint masked autoencoders for 3D point cloud pre-training
Z Guo, R Zhang, L Qiu, X Li, PA Heng
arXiv preprint arXiv:2302.14007, 2023
HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
S Ning*, L Qiu*, Y Liu, X He
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
A challenger to GPT-4V? Early explorations of Gemini in visual expertise
C Fu, R Zhang, H Lin, Z Wang, T Gao, Y Luo, Y Huang, Z Zhang, L Qiu, ...
arXiv preprint arXiv:2312.12436, 2023
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
P Gao*, R Zhang*, C Liu*, L Qiu*, S Huang*, W Lin*, S Zhao, S Geng, ...
ICML 24, 2024
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
P Gao, L Zhuo, Z Lin, C Liu, J Chen, R Du, E Xie, X Luo, L Qiu, Y Zhang, ...
arXiv preprint arXiv:2405.05945, 2024
Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training
L Qiu*, S Ning*, X He
AAAI 24, 2024