TDN: Temporal Difference Networks for Efficient Action Recognition | L Wang, Z Tong, B Ji, G Wu | IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 1895-1904, 2021 | 189 | 2021 |
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | Z Tong, Y Song, J Wang, L Wang | 36th Conference on Neural Information Processing Systems (NeurIPS), 2022 | 134 | 2022 |
Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations | Y Liang, C Ge, Z Tong, Y Song, J Wang, P Xie | International Conference on Learning Representations (ICLR), 2022 | 63 | 2022 |
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition | S Chen, C Ge, Z Tong, J Wang, Y Song, J Wang, P Luo | 36th Conference on Neural Information Processing Systems (NeurIPS), 2022 | 39 | 2022 |
MGSampler: An Explainable Sampling Strategy for Video Action Recognition | Y Zhi, Z Tong, L Wang, G Wu | IEEE/CVF International Conference on Computer Vision (ICCV), 1513-1522, 2021 | 22 | 2021 |
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking | L Wang, B Huang, Z Zhao, Z Tong, Y He, Y Wang, Y Wang, Y Qiao | IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023 | 1 | 2023 |
Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning | C Ge, J Wang, Z Tong, S Chen, Y Song, P Luo | International Conference on Learning Representations (ICLR), 2023 | | 2023 |
CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection | L Chen, Z Tong, Y Song, G Wu, L Wang | arXiv preprint arXiv:2303.16118, 2023 | | 2023 |