Dr. Yanbin Hao (郝艳宾, 郝艷賓)

Research Associate Professor (特任副研究员)
Ph.D. in Signal and Information Processing (combined master's-doctoral program)
Google Scholar
Lab for Data Science (LDS)
Research Center for Data to Cyberspace (RCDC)
School of Information Science and Technology (信息科学技术学院(6系))
University of Science and Technology of China


Email: haoyanbin AT hotmail.com

Short Bio

Yanbin Hao received the B.E. (ranked 1st out of 87 students) and Ph.D. degrees from the Hefei University of Technology (HFUT), Hefei, China, in 2012 and 2017, respectively. He is currently a Research Associate Professor at the School of Information Science and Technology, University of Science and Technology of China (USTC), China. His research interests center on the intelligent processing and application of multimedia data, with a focus on feature extraction, content understanding, and multimodal information fusion for decision-making and intelligent applications. He aims to develop efficient deep learning models for image and video content understanding, enabling artificial intelligence to address complex tasks such as large-scale video retrieval and cross-modal understanding and generation.

Before joining USTC, he was a PhD student at HFUT under the supervision of Prof. Jianguo Jiang, Prof. Richang Hong and Prof. Meng Wang. During his PhD, he was also a visiting PhD student (2015–2017) under the supervision of Dr. Tingting Mu and Prof. Yannis Goulermas at the Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool, U.K. From 2018 to 2020, he was a Postdoctoral Fellow in the research group of Prof. Chong-Wah Ngo in the Department of Computer Science, City University of Hong Kong (CityU), working on the projects "Video Hyperlinking" and "Sport Video Analysis and Retrieval (體育視頻分析與檢索)".


Selected Publications

Conference
    2024
  • Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective code
    Fangzhou Song, Bin Zhu, Yanbin Hao*, Shuo Wang, ECCV, 2024.

  • Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model code
    Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian, CVPR, 2024.

  • Boosting Few-Shot Learning via Attentive Feature Regularization code
    Xingyu Zhu, Shuo Wang, Jinda Lu, Yanbin Hao, Haifeng Liu, Xiangnan He, AAAI, 2024.

  • PointTFA: Training-Free Clustering Adaption for Large 3D Point Cloud Models code
    Jinmeng Wu, Chong Cao, Hao Zhang, Basura Fernando, Yanbin Hao, Hanyu Hong, IJCAI, 2024.

  • Masked Collaborative Contrast for Weakly Supervised Semantic Segmentation code
    Fangwen Wu, Jingxuan He, Yufei Yin, Yanbin Hao, Gang Huang, Lechao Cheng, WACV, 2024.

    2023
  • Bi-directional Distribution Alignment for Transductive Zero-Shot Learning code
    Zhicai Wang, Yanbin Hao*, Tingting Mu, Ouxiang Li, Shuo Wang, Xiangnan He*, CVPR, 2023.

  • CgT-GAN: CLIP-guided Text GAN for Image Captioning code
    Jiarui Yu, Haoran Li, Yanbin Hao*, Bin Zhu, Tong Xu, Xiangnan He*, ACM MM, 2023.

  • 3D Human Pose Estimation with Spatio-Temporal Criss-cross Attention code
    Zhenhua Tang, Zhaofan Qiu, Yanbin Hao, Richang Hong, Ting Yao, CVPR, 2023.

  • How Can Contrastive Pre-training Benefit Audio-Visual Segmentation? A Study from Supervised and Zero-shot Perspectives code
    Jiarui Yu, Haoran Li, Yanbin Hao*, Jinmeng Wu, Tong Xu, Shuo Wang, Xiangnan He, BMVC, 2023.

  • Semantic-based Selection, Synthesis, and Supervision for Few-shot Learning code
    Jinda Lu, Shuo Wang, Xinyu Zhang, Yanbin Hao, Xiangnan He, ACM MM, 2023.

    2022
  • Group Contextualization for Video Recognition code
    Yanbin Hao, Hao Zhang*, Chong-Wah Ngo, Xiangnan He, CVPR, Poster, 2022.

  • MF-GAN: Multi-conditional Fusion Generative Adversarial Network for Text-to-Image Synthesis
    Yuyan Yang, Xin Ni, Yanbin Hao, Chenyu Liu, Wenshan Wang, Yifeng Liu and Haiyong Xie, MMM, Oral (Honorable Mention Award), 2022.

  • Unsupervised Video Hashing with Multi-granularity Contextualization and Multi-structure Preservation code
    Yanbin Hao, Jingru Duan, Hao Zhang*, Bin Zhu, Pengyuan Zhou and Xiangnan He, ACM MM, poster, 2022.

  • Long-term Leap Attention, Short-term Periodic Shift for Video Classification code
    Hao Zhang, Lechao Cheng, Yanbin Hao* and Chong-Wah Ngo, ACM MM, oral, 2022.

  • Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLP code
    Zhicai Wang, Yanbin Hao*, Xingyu Gao*, Hao Zhang, Shuo Wang, Tingting Mu and Xiangnan He, ACM MM, poster, 2022.

  • Hierarchical Hourglass Convolutional Network for Efficient Video Classification code
    Yi Tan, Yanbin Hao*, Hao Zhang, Shuo Wang and Xiangnan He*, ACM MM, poster, 2022.

  • Multi-directional Knowledge Transfer for Few-Shot Learning
    Shuo Wang, Xinyu Zhang, Yanbin Hao, Chengbing Wang and Xiangnan He*, ACM MM, poster, 2022.

    2021
  • Token Shift Transformer for Video Classification code
    Hao Zhang, Yanbin Hao*, Chong-Wah Ngo, ACM Multimedia (MM), Poster, 2021.

  • Selective Dependency Aggregation for Action Classification code
    Yi Tan, Yanbin Hao*, Xiangnan He, Yinwei Wei, Xun Yang, ACM Multimedia (MM), Poster, 2021.

  • Motion Prediction Using Trajectory Cues
    Zhenguang Liu, Pengxiang Su, Shuang Wu, Xuanjing Shen, Haipeng Chen, Yanbin Hao, Meng Wang, ICCV, 2021.

  • NASTER: Non-local Attentional Scene Text Recognizer
    Lei Wu, Xueliang Liu, Yanbin Hao, Yunjie Ma, Richang Hong, ICMR, 2021.

  • Aggregated Multi-GANs for Controlled 3D Human Motion Prediction
    Z. Liu, K. Lyu, S. Wu, H. Chen, Y. Hao, S. Ji, AAAI Conference on Artificial Intelligence (AAAI), 2021.

    2020 and before
  • Compact Bilinear Augmented Query Structured Attention for Sport Highlights Classification
    Y. Hao, H. Zhang*, C.-W. Ngo, Q. Liu, X. Hu, ACM Multimedia (MM), Oral, 2020.

  • Person-level Action Recognition in Complex Events via TSD-TSM networks
    Y. Hao, Z.-N. Liu, H. Zhang*, B. Zhu, J. Chen, Y. Jiang, C.-W. Ngo, ACM Multimedia Workshop (MMW), 2020.
    Ranked 3rd in the HiEve2020 Grand Challenge (Track-4)

  • R2GAN: Cross-modal recipe retrieval with generative adversarial network
    B. Zhu, C.-W. Ngo, J. Chen, and Y. Hao, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

  • Cross-sentence Pre-trained model for Interactive QA matching
    J. Wu and Y. Hao, Proceedings of The 12th Language Resources and Evaluation Conference, 2020.

Journal
    2024
  • PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition code
    Yanbin Hao, Diansong Zhou, Zhicai Wang, Chong-Wah Ngo, Meng Wang, International Journal of Computer Vision (IJCV), 2024.

  • Efficient Unsupervised Video Hashing with Contextual Modeling and Structural Controlling code
    Jingru Duan, Yanbin Hao*, Bin Zhu, Lechao Cheng, Pengyuan Zhou, Xiang Wang, IEEE Transactions on Multimedia (TMM), 2024.

  • Feature Mixture on Pre-Trained Model for Few-Shot Learning code
    Shuo Wang, Jinda Lu, Haiyang Xu, Yanbin Hao, Xiangnan He, IEEE Transactions on Image Processing (TIP), 2024.

  • Two-Step Discrete Hashing for Cross-Modal Retrieval code
    Junfeng Tu, Xueliang Liu, Yanbin Hao*, Richang Hong, Meng Wang, IEEE Transactions on Multimedia (TMM), 2024.

    2023
  • FTCM: Frequency-Temporal Collaborative Module for Efficient 3D Human Pose Estimation in Video
    Zhenhua Tang, Yanbin Hao, Jia Li, Richang Hong, IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2023.

  • MLP-JCG: Multi-Layer Perceptron with Joint-Coordinate Gating for Efficient 3D Human Pose Estimation
    Zhenhua Tang, Jia Li, Yanbin Hao, Richang Hong, IEEE Transactions on Multimedia (TMM), 2023.

  • Boosting Hyperspectral Image Classification with Dual Hierarchical Learning
    Shuo Wang, Huixia Ben, Yanbin Hao, Xiangnan He, Meng Wang, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2023.

    2022
  • Spatio-Temporal Collaborative Module for Efficient Action Recognition
    Yanbin Hao, Shuo Wang, Yi Tan, Xiangnan He, Zhenguang Liu, Meng Wang, IEEE Transactions on Image Processing (TIP), 2022.

  • Attention in Attention: Modeling Context Correlation for Efficient Video Classification
    Yanbin Hao, Shuo Wang, Pei Cao, Xinjian Gao, Tong Xu, Jinmeng Wu, Xiangnan He, IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2022.

    2021
  • Learning to Match Anchor-Target Video Pairs with Dual Attentional Holographic Networks
    Yanbin Hao*, Chong-Wah Ngo, Bin Zhu, IEEE Transactions on Image Processing (TIP), 2021.

  • Auxiliary Diagnosis for COVID-19 with Deep Transfer Learning
    Hongtao Chen, Shuanshuan Guo, Yanbin Hao*, Yijie Fang, Zhaoxiong Fang, Wenhao Wu, Zhigang Liu, Shaolin Li*, Journal of Digital Imaging, 2021.

  • Space-Time Separate Modeling for Efficient Video Classification
    Pei Cao, Shuo Wang*, Jinmeng Wu, Yanbin Hao, Journal of Physics: Conference Series, 2021.

    2020 and before
  • Cross-domain sentiment encoding through stochastic word embedding
    Y. Hao, T. Mu, R. Hong, M. Wang and J. Y. Goulermas, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2020.

  • Neighbourhood structure preserving cross-modal embedding for video hyperlinking code
    Y. Hao, C.-W. Ngo and B. Huet, IEEE Transactions on Multimedia (TMM), 2019.

  • Unsupervised t-distributed video hashing and its deep hashing extension
    Y. Hao, T. Mu, J. Y. Goulermas, J. Jiang, R. Hong and M. Wang, IEEE Transactions on Image Processing (TIP), 2017.

  • Stochastic multiview hashing for large-scale near-duplicate video retrieval code
    Y. Hao, T. Mu, R. Hong, M. Wang, N. An and J. Y. Goulermas, IEEE Transactions on Multimedia (TMM), 2016.

  • 3D human pose estimation via human structure-aware fully connected network
    X. Zhang, Z. Tang, J. Hou and Y. Hao, Pattern Recognition Letters, 2019.


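Research Projects (Principal Investigator)

  • Research on Efficient and Compact Video Representation Methods for Large-Scale Event Retrieval, National Natural Science Foundation of China, Young Scientists Fund, 2021-2024.

  • Research on Key Technologies for User Behavior Modeling Based on Causal and Cognitive Reasoning, National Natural Science Foundation of China, Joint Fund Key Project, 2021-2025.

  • Human-Machine Collaborative Manuscript Quality Evaluation System, Key R&D Program of the Ministry of Science and Technology, 2020-2023.

  • Knowledge Graph Construction for Trending Social Media Events, Anhui Provincial University Collaborative Innovation Project, 2021-2023.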


My Students

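  • 王志才, Ph.D. student (combined master's-doctoral program), research areas: Zero-Shot Recognition, Visual Generation.

  • 段敬儒, Ph.D. student (combined master's-doctoral program), research areas: Video Hashing, Cross-modal Retrieval.

  • 裴茗, Ph.D. student (combined master's-doctoral program), research areas: Video Relation Detection, Video Semantic Understanding.

  • 汪远, Ph.D. student (combined master's-doctoral program), research areas: Cooking Procedural Image Generation, Visual Generation Unlearning.

  • 李讴翔, Ph.D. student (combined master's-doctoral program), research areas: Model Inversion Attacks, Unlearning.

  • 宋方舟, Master's student, research area: Cross-modal Food-Recipe Retrieval.

  • 周殿松, Master's student, research area: Video Recognition.

  • 李浩然, Master's student, research areas: Image Captioning, Audio-Visual Segmentation.

  • 郑善乐, Master's student, research area: AI for News Generation.

  • 盛苑, Master's student, research area: Multi-modal Large Language Models.

  • 李晨旭, Master's student, research area: Multi-modal Large Language Models.

  • 谭懿, Ph.D. graduate (2024), research area: Video Recognition.

  • 于佳睿, Master's graduate (2024), research areas: Image Captioning, Audio-Visual Segmentation.

  • 曹培, Master's graduate (2022), research area: Video Recognition.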


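  • Master's students are recruited on an ongoing basis, with the possibility of recommendation to a Ph.D. program.

  • Applications are welcome.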