Introduction

Human vision lets people interpret their surroundings and is one of our most important senses: many experts estimate that 80% of what we perceive comes through vision. Put simply, Computer Vision is the sub-discipline of Artificial Intelligence that teaches machines to “see like a human”. More precisely, it consists of dedicated hardware and/or software algorithms that give computers the ability to capture, process and interpret images, videos or other signals acquired by a camera or other sensors.

Researchers started working on Computer Vision in the 1960s and have made steady progress since. In the 2010s, Deep Learning, a branch of Machine Learning, revolutionized the field. Among other breakthroughs, Deep Learning-based algorithms surpassed humans at recognizing human faces in 2014. Since then, Computer Vision has been one of the hottest topics in the broad field of Artificial Intelligence. It is now applied in most aspects of daily life: medicine, manufacturing, biometrics, autonomous vehicles, digitization of paper documents and books for electronic access, military and law enforcement, household waste recycling, and other environmental applications using aerial or satellite images.

Our research group focuses on the design and development of high-speed, lightweight and effective algorithms for analyzing and understanding different types of images and videos: natural images and videos (taken with regular cameras), medical images, remote sensing images and document images. See the slides here for more detail.

Contact: Dr. Nguyen Thi Oanh, Email: oanhnt@soict.hust.edu.vn

Research Directions

We are especially interested in the tasks of object detection, classification, semantic segmentation and tracking.

Keywords describing our research directions include:

  • Multimodality
    • Spatio-temporal information
    • Raw data (or text) associated with the images
  • Domain adaptation
    • Transferring the model learned from one set of images to a different set of images
  • Limited-resource constraints (linked to embedded systems)
    • Definition of lightweight models
  • User interaction

We use both traditional Image Processing methods and Machine Learning methods, especially Deep Learning (often with Convolutional Neural Networks and Recurrent Neural Networks).

Research Problems

Our research problems include, but are not limited to: 

  • Medical imaging:
    • Segmenting colon polyps and identifying lesions at high risk of malignancy (cancer) during endoscopy
    • Detecting brain degeneration in Alzheimer’s patients from 3D MRI images and clinical data
  • Traffic monitoring and autonomous vehicles
    • Vehicle and pedestrian tracking in videos, including embedding the proposed algorithms in edge devices
    • Semantic segmentation for intelligent vehicles 
  • Remote sensing – satellite image processing and analysis:
    • Adjusting Geostationary (GEO) satellite images with Low-Earth-Orbit (LEO) images
    • Study of Urban Heat Islands and their impact on the environment and humans 
  • Gesture recognition from videos:
    • Human Action Recognition
    • Hand Gesture Recognition 
  • Document analysis and understanding:
    • Incremental multimodal classification from streams of documents
    • Understanding ancient Vietnamese text (Han-Nom characters)
  • Biometric access control: face verification and anti-spoofing

Team Members

Assoc. Prof. Muriel VISANI
Team Leader

Dr. Dinh Viet Sang
Member

Dr. Nguyen Thi Oanh
Member

Dr. Tran Nguyen Ngoc
Member

Dr. Dang Tuan Linh
Member

Dr. Ngo Thanh Trung
Member

Projects and Solutions

Collaborations

National partners (in Vietnam)

  • USTH: ICTLab & Space departments
  • MICA (HUST)
  • HUS-VNU
  • VNU-UET (FIMO)
  • VNUA (FIT)
  • HCMUS
  • Can Tho University
  • IRD: Institut de Recherche pour le Développement (Vietnam branch)

International partners

  • Asia-Pacific:
    • Australia: University of Technology Sydney, Bureau of Meteorology, CSIRO, Griffith University, The University of Queensland
    • China: Lanzhou University
    • Japan: University of Tsukuba, Kochi University of Technology
    • South Korea: Chosun University
  • America:
    • USA: University of Hawaii
    • Brazil: University of Sao Paulo
  • Russia: Tula State University
  • Africa: Tunisia – Sfax University
  • Europe:
    • France: La Rochelle University, Poitiers University, Bordeaux University, INSA Lyon, Nancy University
    • Switzerland: Fribourg University
    • Spain: Universitat Autonoma de Barcelona

Latest Publications

Publications in 2025

  1. Quang Duc Nguyen, Tung Nguyen, Duc Anh Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen. GloCOM: A Short Text Neural Topic Model via Global Clustering Context. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies. 1109–1124. Albuquerque, New Mexico. 28/04/2025
  2. Tung Nguyen, Tue Le, Hoang Tran Vuong, Quang Duc Nguyen, Duc Anh Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen. Sharpness-Aware Minimization for Topic Models with High-Quality Document Representations. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies. 4507–4524. Albuquerque, New Mexico. 28/04/2025
  3. Dang Tuan Linh, Hoang Minh Hoang, Ngo Viet Anh, Duong Minh Quan, Ha Hoang Hiep, Nguyen The An, Le Hoang. Real-time person re-identification and tracking on edge devices with distributed optimization. Pattern Analysis and Applications. 1-22. 14/05/2025
  4. Cui Wei, Ullah Ismat, Lin Weiming, Zhang Jupei, Chen Zhaowei, Yang Shuyi, Peng Wei, Zhuang Yin, Chen Wenjin, Cao Yi, Zhang Shujun, Jin Shengyang, Yang Liang. Multifunctional Sr2+/Zn2+ Co‐Doped Mesoporous Silica Nanoparticles in Injectable Hydrogel for Ameliorating Osteoporotic Osseointegration. Advanced Healthcare Materials. 15/06/2025
  5. Tran Phuong Dong, Nguyen Hieu, Nguyen Thi-Oanh. Enhancing Unsupervised Domain Adaptation in Semantic Segmentation Through Selective Consensus and Gaussian Mixture Model-Based Pseudo-Labeling. 2025 IEEE International Conference on Image Processing (ICIP). 2832-2837. Anchorage, AK, USA. 14/09/2025
  6. Tue Le, Hoang Tran Vuong, Tung Nguyen, Linh Ngo Van, Sang Dinh, Trung Le, Thien Huu Nguyen. Multi-Surrogate-Objective Optimization for Neural Topic Models. Findings of the Association for Computational Linguistics: EMNLP 2025. 135–151. 04/11/2025
  7. Chau Nguyen Minh, Sang Dinh Viet. A Diffusion Model for Personalized Text-to-Image Generation. Communications in Computer and Information Science. 418-431. 13/12/2024
  8. Binh An Nguyen, Minh Bao Kha, Duc Manh Dao, Huu Kien Nguyen, My Duyen Nguyen, The Vu Nguyen, Namal Rathnayake, Yukinobu Hoshino, Tuan Linh Dang. UFR-GAN: A lightweight multi-degradation image restoration model. Pattern Recognition Letters. 282–287. 17/08/2025
  9. Son Pham Tien, Hieu Nguyen Doan, An Nguyen Dai, Sang Dinh Viet. Improving Vietnamese Legal Document Retrieval Using Synthetic Data. Communications in Computer and Information Science. 378-393. 13/12/2024
  10. Tu Vu, Manh Do, Tung Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen. Topic Modeling for Short Texts via Optimal Transport-Based Clustering. Findings of the Association for Computational Linguistics: ACL 2025. 7666–7680. 27/07/2025
  11. Nguyen Tung, Pham Duy-Tung, Nguyen Quang Duc, Ngo Van Linh, Nguyen Duc Anh, Dinh Viet Sang. TopiCOT: Neural topic model aligning with pre-trained clustering and optimal transport. Neurocomputing. 131268. 13/08/2025
  12. Le Huong, Luu Ngoc, Nguyen Thanh, Dao Tuan, Dinh Sang. Optimizing Answer Generator in Vietnamese Legal Question Answering Systems Using Language Models. ACM Transactions on Asian and Low-Resource Language Information Processing. 1-17. 12/02/2025
  13. Yukinobu Hoshino, Keigo Yoshimi, Tuan Linh Dang, Namal Rathnayake. Controlling Heterogeneous Multi-Agent Systems Under Uncertainty Using Fuzzy Inference and Evolutionary Search. Information. 2-23. 06/08/2025
  14. Hoshino Yukinobu, Rathnayake Namal, Dang Tuan Linh, Rathnayake Upaka. Flow Velocity Analysis of Rivers Using Farneback Optical Flow and STIV Techniques With Drone Data. SOICT 2024 (proceedings published in the Communications in Computer and Information Science book series). 17-26. Danang, Vietnam. 13/12/2024
  15. Dang Tuan Linh, Nguyen Trong Nghia, Vu Tuan Minh. Carixray: a periapical X-ray dataset for machine vision-based dental caries recognition. Machine Vision and Applications. 1-18. 01/12/2025
  16. Tran Anh Vu Ho, Le Duc Quan, Nguyen Thi-Oanh. Efficient Sign Language Recognition with Skeleton Data: A Study of Keypoint Selection, Pose Estimators, and GCN Models. 2025 International Conference on Multimedia Analysis and Pattern Recognition (MAPR). 1-6. Khanh Hoa, Vietnam. 14/08/2025
  17. Hoang Tran Vuong, Tue Le, Tu Vu, Tung Nguyen, Linh Van Ngo, Sang Dinh, Thien Huu Nguyen. HiCOT: Improving Neural Topic Models via Optimal Transport and Contrastive Learning. Findings of the Association for Computational Linguistics: ACL 2025. 13894–13920. 27/07/2025
  18. Kim Hương Trang, Trịnh Quốc Dũng, Trần Hà Tiến Thịnh, Vũ Tuấn Thái, Hoàng Minh Tuấn, Phan Thị Ngọc Linh, Trần Nguyên Ngọc, Bùi Thị Mai Anh, Vũ Văn Đức. Khai thác hình ảnh vệ tinh, khám phá sự bất bình đẳng kinh tế các địa phương và hoạt động kinh doanh bền vững của doanh nghiệp [Exploiting satellite imagery to explore economic inequality across localities and the sustainable business activities of enterprises]. Tạp chí tài chính. 84-87. 11/02/2025
  19. Anh Duc Le, Nam Le Hai, Thanh Xuan Nguyen, Linh Ngo Van, Nguyen Thi Ngoc Diep, Sang Dinh, Thien Huu Nguyen. Enhancing Discriminative Representation in Similar Relation Clusters for Few-Shot Continual Relation Extraction. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies. 2450–2467. Albuquerque, New Mexico. 28/04/2025
  20. Li Chenhao, Ngo Trung Thanh, Nagahara Hajime. Simultaneous acquisition of geometry and material for translucent objects. Image and Vision Computing. 105793. 24/10/2025
  21. Dao Manh, Dang Tuan Linh. Enhancing continual semantic segmentation with visual explanations and model adaptations. Neurocomputing. 131637. 20/09/2025
  22. Bui Tien Dung, Pham Tuan Tai, Dang Tuan Linh. CoNet: a lightweight color classification architecture using residual connection and MBConv. Neural Computing and Applications. 9705-9720. 19/02/2025
  23. Tien Phat Nguyen, Vu Minh Ngo, Tung Nguyen, Linh Van Ngo, Duc Anh Nguyen, Sang Dinh, Trung Le. XTRA: Cross-Lingual Topic Modeling with Topic and Representation Alignments. Findings of the Association for Computational Linguistics: EMNLP 2025. 5561–5575. 04/11/2025
  24. Toan Ngoc Nguyen, Nam Le Hai, Nguyen Doan Hieu, Dai An Nguyen, Linh Ngo Van, Thien Huu Nguyen, Sang Dinh. Improving Vietnamese-English Cross-Lingual Retrieval for Legal and General Domains. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies. 142–153. Albuquerque, New Mexico. 28/04/2025
  25. Pham Thanh Duc, Hai Nam Le, Van Linh Ngo, Diep Nguyen Thi Ngoc, Dinh Sang, Nguyen Thien Huu. Mitigating Non-Representative Prototypes and Representation Bias in Few-Shot Continual Relation Extraction. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 10791-10809. Vienna, Austria. 27/07/2025

Publications in 2024

  1. Ren Zhiyao, Dinh Viet Sang, Wong Pooi-Mun, Chng Chin-Boon, Too Joan Jue-Ying, Foong Theng-Wai, Loh Will Ne-Hooi, Chui Chee-Kong. G2LCPS: End-to-end semi-supervised landmark prediction with global-to-local cross pseudo supervision for airway difficulty assessment. Computers in Biology and Medicine. 109246. 02/10/2024
  2. T. K. Lai, and I. L. Ngo. An investigation on the thermo-electrohydraulic performance of novel ECF micro-pump. International Journal of Heat and Mass Transfer. 29/09/2024
  3. Sikandar Ali Qalati, MengMeng Jiang, Samuel Gyedu, and Emmanuel Kwaku Manu. Do Strong Innovation Capability and Environmental Turbulence Influence the Nexus Between Customer Relationship Management and Business Performance? Business Strategy and the Environment. 02/07/2024
  4. Randika K. Makumbura; Lakindu Mampitiya; Namal Rathnayake; D.P.P. Meddage; Shagufta Henna; Tuan Linh Dang; Yukinobu Hoshino; Upaka Rathnayake. Advancing water quality assessment and prediction using machine learning models, coupled with explainable artificial intelligence (XAI) techniques like shapley additive explanations (SHAP) for interpreting the black-box nature. Results in Engineering. 1-14. 01/09/2024
  5. Tuan Linh Dang, Thuy Ha Hoang, Minh Hoang Cu, Duc Quang Nguyen, Huu Phuc Hoang. Semi-supervised Learning for Image Quality Assessment Problem. International Journal of Computer Applications. 9-13. 21/02/2024
  6. JYE Tin, WW Tan, AA Bakar, MS Mahali, FF Lothai, NF Mohammad, SSA Hassan & KF Chin. A Conceptual Design of Sustainable Solar Photovoltaic (PV) Powered Corridor Lighting System with IoT Application. ICREEM 2022. 09/03/2024
  7. Trinh Thi Ha, Nguyen Trung Dung, Nguyen Thanh Huong, Tran Trong An, Pham Van Tuan, Vu Ngoc Hung, Chu Manh Hoang. Investigating the coupling length of two triangle hybrid gap plasmonic waveguides. The International Conference on Advanced Materials and Technology (ICAMT 2024). 10-13. Hanoi. 09/10/2024
  8. Nguyen Quoc-Viet, Nguyen Thi-Oanh. SCA-DS: Face Anti-spoofing Leveraging Enhanced Spatial and Channel-Wise Attention and Depth Supervision. Communications in Computer and Information Science. 257-272. Da Nang, Vietnam. 13/12/2024
  9. Tuan Linh Dang, Dinh Minh Vu, Ngoc Dung Pham, The Vu Nguyen, Dinh Phu Mac, Nguyen Minh Nhat Hoang, Huy Hoang Pham. Enhance Massive Open Online Courses Integrity: AI for Exam Proctoring. Journal of Science and Technology: Smart Systems and Devices. 1-8. 25/06/2024
  10. Quang Minh Dang, Minh Tuyen Truong, Tuan Linh Dang. A lightweight approach for image quality assessment. Signal, Image and Video Processing. 1-8. 01/06/2024
  11. Sikandar Ali Qalati, Domitilla Magni, and Faiza Siddiqui. Senior Management's Sustainability Commitment and Environmental Performance: Revealing the Role of Green Human Resource Management Practices. Business Strategy and the Environment. 02/08/2024
  12. T. K. Lai, and I. L. Ngo. A new design and optimization of VD-ECF micro-pump: Advancements in electrohydraulic performance. Physics of Fluids. 29/07/2024
  13. Tuan Linh Dang, Trung Hieu Pham, Duc Loc Le, Xuan Tung Tran, Hoang Nam Le, Khanh Hung Nguyen, Tran Tuan Nghia Trinh. Person re-identification on lightweight devices: end-to-end approach. Multimedia Tools and Applications. 1-14. 27/03/2024
  14. Vu Ho Tran Anh, Thi-Oanh Nguyen. Enhanced Topology Representation Learning for Skeleton-Based Human Action Recognition. Procedia Computer Science. 3093-3102. 28/05/2024
  15. Tuan Linh Dang, Hoang Vu Nguyen, Nguyen Minh Nhat Hoang, Quang Minh Dang, The Vu Nguyen, Quang Hai Tran, Huy Hoang Pham. Auto-proctoring using computer vision in MOOCs system. Multimedia Tools and Applications. 1-27. 07/08/2024
  16. Huu Thang Nguyen, Anh Chung Hoang, Manh Cuong Bui, Tuan Linh Dang. IMBALANCE PROBLEM IN IMAGE QUALITY ASSESSMENT. ICIC Express Letters. 1145-1152. 01/04/2024
  17. T. K. Lai, and I. L. Ngo. An investigation on the electrohydraulic performance of novel ECF micro-pump with NACA-shaped electrodes. Theoretical and Computational Fluid Dynamics. 29/02/2024
  18. Yukinobu Hoshino, Masahiro Shimasaki, Namal Rathnayake, Tuan Linh Dang. Performance verification and latency time evaluation of hardware image processing module for appearance inspection systems using FPGA. Journal of Real-Time Image Processing. 1-16. 26/11/2023
  19. Yukinobu Hoshino, Yuka Nishiyama, Toshimi Yamamoto, Yuki Shinomiya, Namal Rathnayake, Tuan Linh Dang. Human-inspired similarity control system: Enhancing line-following robot perception. Applied Soft Computing Journal. 1-15. 14/04/2024
  20. Nguyen Binh An, Dao Duc Manh, Nguyen Khanh Hung, Dang Tuan Linh. NIGHT VISION: ENHANCE OBJECT DETECTION IN LOW-LIGHT CONDITION BASELINE. The National Conference on Fundamental and Applied IT Research (FAIR’2024). 609-615. Hanoi. 08/08/2024
  21. Chenhao Li, Trung Thanh Ngo, and Hajime Nagahara. Deep Polarization Cues for Single-Shot Shape and Subsurface Scattering Estimation. Lecture Notes in Computer Science. 55-73. Milan, Italy. 29/09/2024
  22. Yukinobu Hoshino; Namal Rathnayake; Tuan Linh Dang; Upaka Rathnayake. Empirical Research on 3D Analysis and Flow Prediction of Upstream Rivers Using Drones. 2024 Joint 13th International Conference on Soft Computing and Intelligent Systems and 25th International Symposium on Advanced Intelligent Systems (SCIS&ISIS). 1-6. Himeji, Japan. 09/11/2024
  23. Tuan Linh Dang, Trung Hieu Pham, Duc Manh Dao, Hoang Vu Nguyen, Quang Minh Dang, Ba Tuan Nguyen, Nicolas Monet. DATE: a video dataset and benchmark for dynamic hand gesture recognition. Neural Computing and Applications. 1-15. 09/05/2024