Yanbin Hao Homepage

Dr. Yanbin Hao (郝艳宾, 郝艷賓)

Professor

Ph.D. in Signal and Information Processing (combined master's-doctoral program)

Google Scholar
School of Computer Science and Information Engineering (计算机与信息学院)
Hefei University of Technology

Email: haoyanbin AT hfut.edu.cn

Short Bio

Yanbin Hao is currently a Professor in the School of Computer Science and Information Engineering at Hefei University of Technology (HFUT), China. His research interests revolve around intelligent processing and applications of multimedia data, with a focus on feature extraction, content understanding, and multimodal information fusion for decision-making and intelligent applications. He aims to develop efficient deep learning models for image and video content understanding, enabling artificial intelligence to address complex tasks such as large-scale video retrieval and cross-modal understanding and generation.

Before joining HFUT, he was an Associate Professor at the Lab for Data Science (LDS), led by Prof. Xiangnan He, at the University of Science and Technology of China (USTC) from 2021 to 2024. He was a Post-Doctoral Fellow at the VIREO Laboratory, led by Prof. Chong-Wah Ngo, at the City University of Hong Kong (CityU) from 2018 to 2020. He received his B.E. (ranked 1st of 87) and Ph.D. (under the supervision of Prof. Jianguo Jiang, Prof. Richang Hong and Prof. Meng Wang) degrees from Hefei University of Technology, China, in 2012 and 2017, respectively. During his Ph.D., he was a Visiting Student under the supervision of Dr. Tingting Mu and Prof. Yannis Goulermas at the Department of Electrical Engineering and Electronics, University of Liverpool (UOL), U.K., from 2015 to 2017.

Advertisements: Recruiting Students for the 2026 Graduate Entrance Examination!

Selected Publications

Conference

2024

Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective code
Fangzhou Song, Bin Zhu, Yanbin Hao*, Shuo Wang, ECCV, 2024.
Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model code
Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian, CVPR, 2024.
Boosting Few-Shot Learning via Attentive Feature Regularization code
Xingyu Zhu, Shuo Wang, Jinda Lu, Yanbin Hao, Haifeng Liu, Xiangnan He, AAAI, 2024.
PointTFA: Training-Free Clustering Adaption for Large 3D Point Cloud Models code
Jinmeng Wu, Chong Cao, Hao Zhang, Basura Fernando, Yanbin Hao, Hanyu Hong, IJCAI, 2024.
Masked Collaborative Contrast for Weakly Supervised Semantic Segmentation code
Fangwen Wu, Jingxuan He, Yufei Yin, Yanbin Hao, Gang Huang, Lechao Cheng, WACV, 2024.

2023

Bi-directional Distribution Alignment for Transductive Zero-Shot Learning code
Zhicai Wang, Yanbin Hao*, Tingting Mu, Ouxiang Li, Shuo Wang, Xiangnan He*, CVPR, 2023.
CgT-GAN: CLIP-guided Text GAN for Image Captioning code
Jiarui Yu, Haoran Li, Yanbin Hao*, Bin Zhu, Tong Xu, Xiangnan He*, ACM MM, 2023.
3D Human Pose Estimation with Spatio-Temporal Criss-cross Attention code
Zhenhua Tang, Zhaofan Qiu, Yanbin Hao, Richang Hong, Ting Yao, CVPR, 2023.
How Can Contrastive Pre-training Benefit Audio-Visual Segmentation? A Study from Supervised and Zero-shot Perspectives code
Jiarui Yu, Haoran Li, Yanbin Hao*, Jinmeng Wu, Tong Xu, Shuo Wang, Xiangnan He, BMVC, 2023.
Semantic-based Selection, Synthesis, and Supervision for Few-shot Learning code
Jinda Lu, Shuo Wang, Xinyu Zhang, Yanbin Hao, Xiangnan He, ACM MM, 2023.

2022

Group Contextualization for Video Recognition code
Yanbin Hao, Hao Zhang*, Chong-Wah Ngo, Xiangnan He, CVPR, Poster, 2022.
MF-GAN: Multi-conditional Fusion Generative Adversarial Network for Text-to-Image Synthesis
Yuyan Yang, Xin Ni, Yanbin Hao, Chenyu Liu, Wenshan Wang, Yifeng Liu and Haiyong Xie, MMM, Oral (Honorable Mention Award), 2022.
Unsupervised Video Hashing with Multi-granularity Contextualization and Multi-structure Preservation code
Yanbin Hao, Jingru Duan, Hao Zhang*, Bin Zhu, Pengyuan Zhou and Xiangnan He, ACM MM, poster, 2022.
Long-term Leap Attention, Short-term Periodic Shift for Video Classification code
Hao Zhang, Lechao Cheng, Yanbin Hao* and Chong-Wah Ngo, ACM MM, oral, 2022.
Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLP code
Zhicai Wang, Yanbin Hao*, Xingyu Gao*, Hao Zhang, Shuo Wang, Tingting Mu and Xiangnan He, ACM MM, poster, 2022.
Hierarchical Hourglass Convolutional Network for Efficient Video Classification code
Yi Tan, Yanbin Hao*, Hao Zhang, Shuo Wang and Xiangnan He*, ACM MM, poster, 2022.
Multi-directional Knowledge Transfer for Few-Shot Learning
Shuo Wang, Xinyu Zhang, Yanbin Hao, Chengbing Wang and Xiangnan He*, ACM MM, poster, 2022.

2021

Token Shift Transformer for Video Classification code
Hao Zhang, Yanbin Hao*, Chong-Wah Ngo, ACM Multimedia (MM), Poster, 2021.
Selective Dependency Aggregation for Action Classification code
Yi Tan, Yanbin Hao*, Xiangnan He, Yinwei Wei, Xun Yang, ACM Multimedia (MM), Poster, 2021.
Motion Prediction Using Trajectory Cues
Zhenguang Liu, Pengxiang Su, Shuang Wu, Xuanjing Shen, Haipeng Chen, Yanbin Hao, Meng Wang, ICCV, 2021.
NASTER: Non-local Attentional Scene Text Recognizer
Lei Wu, Xueliang Liu, Yanbin Hao, Yunjie Ma, Richang Hong, ICMR, 2021.
Aggregated Multi-GANs for Controlled 3D Human Motion Prediction
Z. Liu, K. Lyu, S. Wu, H. Chen, Y. Hao, S. Ji, Association for the Advancement of Artificial Intelligence (AAAI), 2021.

2020 & Before

Compact Bilinear Augmented Query Structured Attention for Sport Highlights Classification
Y. Hao, H. Zhang*, C.-W. Ngo, Q. Liu, X. Hu, ACM Multimedia (MM), Oral, 2020.
Person-level Action Recognition in Complex Events via TSD-TSM networks
Y. Hao, Z.-N. Liu, H. Zhang*, B. Zhu, J. Chen, Y. Jiang, C.-W. Ngo, ACM Multimedia Workshop (MMW), 2020.
Ranked 3rd in the HiEve2020 Grand Challenge (Track-4)
R2GAN: Cross-modal recipe retrieval with generative adversarial network
B. Zhu, C.-W. Ngo, J. Chen, and Y. Hao, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
Cross-sentence Pre-trained model for Interactive QA matching
J. Wu and Y. Hao, Proceedings of The 12th Language Resources and Evaluation Conference, 2020.

Journal

2024

PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition code
Yanbin Hao, Diansong Zhou, Zhicai Wang, Chong-Wah Ngo, Meng Wang, International Journal of Computer Vision (IJCV), 2024.
Efficient Unsupervised Video Hashing with Contextual Modeling and Structural Controlling code
Jingru Duan, Yanbin Hao*, Bin Zhu, Lechao Cheng, Pengyuan Zhou, Xiang Wang, IEEE Transactions on Multimedia (TMM), 2024.
Feature Mixture on Pre-Trained Model for Few-Shot Learning code
Shuo Wang, Jinda Lu, Haiyang Xu, Yanbin Hao, Xiangnan He, IEEE Transactions on Image Processing (TIP), 2024.
Two-Step Discrete Hashing for Cross-Modal Retrieval code
Junfeng Tu, Xueliang Liu, Yanbin Hao*, Richang Hong, Meng Wang, IEEE Transactions on Multimedia (TMM), 2024.

2023

FTCM: Frequency-Temporal Collaborative Module for Efficient 3D Human Pose Estimation in Video
Zhenhua Tang, Yanbin Hao, Jia Li, Richang Hong, IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2023.
MLP-JCG: Multi-Layer Perceptron with Joint-Coordinate Gating for Efficient 3D Human Pose Estimation
Zhenhua Tang, Jia Li, Yanbin Hao, Richang Hong, IEEE Transactions on Multimedia (TMM), 2023.
Boosting Hyperspectral Image Classification with Dual Hierarchical Learning
Shuo Wang, Huixia Ben, Yanbin Hao, Xiangnan He, Meng Wang, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2023.

2022

Spatio-Temporal Collaborative Module for Efficient Action Recognition
Yanbin Hao, Shuo Wang, Yi Tan, Xiangnan He, Zhenguang Liu, Meng Wang, IEEE Transactions on Image Processing (TIP), 2022.
Attention in Attention: Modeling Context Correlation for Efficient Video Classification
Yanbin Hao, Shuo Wang, Pei Cao, Xinjian Gao, Tong Xu, Jinmeng Wu, Xiangnan He, IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2022.

2021

Learning to Match Anchor-Target Video Pairs with Dual Attentional Holographic Networks
Yanbin Hao*, Chong-Wah Ngo, Bin Zhu, IEEE Trans on Image Processing (TIP), 2021.
Auxiliary Diagnosis for COVID-19 with Deep Transfer Learning
Hongtao Chen, Shuanshuan Guo, Yanbin Hao*, Yijie Fang, Zhaoxiong Fang, Wenhao Wu, Zhigang Liu, Shaolin Li*, Journal of Digital Imaging, 2021.
Space-Time Separate Modeling for Efficient Video Classification
Pei Cao, Shuo Wang*, Jinmeng Wu, Yanbin Hao, Journal of Physics: Conference Series, 2021.

2020 & Before

Cross-domain sentiment encoding through stochastic word embedding
Y. Hao, T. Mu, R. Hong, M. Wang and J. Y. Goulermas, IEEE Trans on Knowledge and Data Engineering (TKDE), 2020.
Neighbourhood structure preserving cross-modal embedding for video hyperlinking code
Y. Hao, C.-W. Ngo and B. Huet, IEEE Trans on Multimedia (TMM), 2019.
Unsupervised t-distributed video hashing and its deep hashing extension
Y. Hao, T. Mu, J. Y. Goulermas, J. Jiang, R. Hong and M. Wang, IEEE Trans on Image Processing (TIP), 2017.
Stochastic multiview hashing for large-scale near-duplicate video retrieval code
Y. Hao, T. Mu, R. Hong, M. Wang, N. An and J. Y. Goulermas, IEEE Trans on Multimedia (TMM), 2016.
3D human pose estimation via human structure-aware fully connected network
X. Zhang, Z. Tang, J. Hou and Y. Hao, Pattern Recognition Letters, 2019.

Research Projects as Principal Investigator

My Students

Master's students are recruited on an ongoing basis; recommendations for doctoral study are available.

Applications are welcome.