Introduction

Human vision lets people interpret their surroundings and is one of our most important senses: many experts estimate that 80% of what we perceive comes through vision. Put simply, Computer Vision is the sub-discipline of Artificial Intelligence that teaches machines to “see like a human”. More precisely, it combines dedicated hardware and/or software algorithms that give computers the ability to capture, process and interpret images, videos or other signals from cameras and other sensors.

Researchers started working on Computer Vision in the 1960s, and the field has progressed steadily ever since. In the 2010s, Deep Learning, a branch of Machine Learning, revolutionized Computer Vision. Among other breakthroughs, Deep Learning-based algorithms surpassed humans at recognizing human faces in 2014. Since then, Computer Vision has been one of the hottest topics in the broad field of Artificial Intelligence. It is now applied in most aspects of daily life: medicine, manufacturing, biometrics, autonomous vehicles, digitization of paper documents and books for electronic access, military and law enforcement, household waste recycling, environmental applications using aerial or satellite images, etc.

Our research group focuses on the design and development of high-speed, lightweight and effective algorithms for analyzing and understanding different types of images and videos: natural images and videos (taken with regular cameras), medical images, remote sensing images and document images. See the slides here for more detail.

Contact: Dr. Nguyen Thi Oanh, Email: oanhnt@soict.hust.edu.vn

Research Directions

We are especially interested in the tasks of object detection, classification, semantic segmentation and tracking.

Some keywords about our research directions include:

  • Multimodality
    • Spatio-temporal information
    • Raw data (or text) associated with the images
  • Domain adaptation
    • Transferring the model learned from one set of images to a different set of images
  • Limited-resource constraints (linked to embedded systems)
    • Definition of lightweight models
  • User interaction

The methods we use include both traditional Image Processing methods and Machine Learning methods, especially Deep Learning (often with Convolutional Neural Networks and Recurrent Neural Networks).

Research Problems

Our research problems include, but are not limited to: 

  • Medical imaging:
    • Segmenting colon polyps and identifying lesions at high risk of malignancy (cancer) during endoscopy
    • Detecting brain degeneration in Alzheimer’s patients from 3D MRI images and clinical data
  • Traffic monitoring and autonomous vehicles
    • Vehicle and pedestrian tracking in videos, including embedding the proposed algorithms in edge devices
    • Semantic segmentation for intelligent vehicles 
  • Remote sensing – satellite image processing and analysis:
    • Adjusting Geostationary (GEO) satellite images with Low-Earth-Orbit (LEO) images
    • Study of Urban Heat Islands and their impact on the environment and humans 
  • Gesture recognition from videos:
    • Human Action Recognition
    • Hand Gesture Recognition 
  • Document analysis and understanding:
    • Incremental multimodal classification from streams of documents
    • Understanding ancient Vietnamese text (Han-Nom characters)
  • Biometric access control: face verification and anti-spoofing

Team Members

Assoc. Prof. Muriel VISANI
Team Leader

Dr. Dinh Viet Sang
Member

Dr. Nguyen Thi Oanh
Member

Dr. Tran Nguyen Ngoc
Member

Dr. Dang Tuan Linh
Member

Dr. Ngo Thanh Trung
Member

Projects and Solutions

Collaborations

National partners (in Vietnam)

  • USTH: ICTLab & Space departments
  • MICA (HUST)
  • HUS-VNU
  • VNU-UET (FIMO)
  • VNUA (FIT)
  • HCMUS
  • Can Tho University
  • IRD: Institut de Recherche pour le Développement (Vietnam branch)

International partners

  • Asia-Pacific:
    • Australia: University of Technology Sydney, Bureau of Meteorology, CSIRO, Griffith University, The University of Queensland
    • China: Lanzhou University
    • Japan: University of Tsukuba, Kochi University of Technology
    • South Korea: Chosun University
  • Americas:
    • USA: University of Hawaii
    • Brazil: University of Sao Paulo
  • Russia: Tula State University
  • Africa: Tunisia – Sfax University
  • Europe:
    • France: La Rochelle University, Poitiers University, Bordeaux University, INSA Lyon, Nancy University
    • Switzerland: Fribourg University
    • Spain: Universitat Autonoma de Barcelona

Latest Publications

Publications in 2025

  1. Quang Duc Nguyen, Tung Nguyen, Duc Anh Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen. GloCOM: A Short Text Neural Topic Model via Global Clustering Context. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies. 1109–1124. Albuquerque, New Mexico. 28/04/2025
  2. Tung Nguyen, Tue Le, Hoang Tran Vuong, Quang Duc Nguyen, Duc Anh Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen. Sharpness-Aware Minimization for Topic Models with High-Quality Document Representations. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies. 4507–4524. Albuquerque, New Mexico. 28/04/2025
  3. Dang Tuan Linh, Hoang Minh Hoang, Ngo Viet Anh, Duong Minh Quan, Ha Hoang Hiep, Nguyen The An, Le Hoang. Real-time person re-identification and tracking on edge devices with distributed optimization. Pattern Analysis and Applications. 1-22. 14/05/2025
  4. Cui Wei, Ullah Ismat, Lin Weiming, Zhang Jupei, Chen Zhaowei, Yang Shuyi, Peng Wei, Zhuang Yin, Chen Wenjin, Cao Yi, Zhang Shujun, Jin Shengyang, Yang Liang. Multifunctional Sr2+/Zn2+ Co‐Doped Mesoporous Silica Nanoparticles in Injectable Hydrogel for Ameliorating Osteoporotic Osseointegration. Advanced Healthcare Materials. 15/06/2025
  5. Chau Nguyen Minh, Sang Dinh Viet. A Diffusion Model for Personalized Text-to-Image Generation. Communications in Computer and Information Science. 418-431. 13/12/2024
  6. Son Pham Tien, Hieu Nguyen Doan, An Nguyen Dai, Sang Dinh Viet. Improving Vietnamese Legal Document Retrieval Using Synthetic Data. Communications in Computer and Information Science. 378-393. 13/12/2024
  7. Tu Vu, Manh Do, Tung Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen. Topic Modeling for Short Texts via Optimal Transport-Based Clustering. Findings of the Association for Computational Linguistics: ACL 2025. 7666–7680. 27/07/2025
  8. Le Huong, Luu Ngoc, Nguyen Thanh, Dao Tuan, Dinh Sang. Optimizing Answer Generator in Vietnamese Legal Question Answering Systems Using Language Models. ACM Transactions on Asian and Low-Resource Language Information Processing. 1-17. 12/02/2025
  9. Hoshino Yukinobu, Rathnayake Namal, Dang Tuan Linh, Rathnayake Upaka. Flow Velocity Analysis of Rivers Using Farneback Optical Flow and STIV Techniques With Drone Data. SOICT 2024 (proceedings published in the Communications in Computer and Information Science book series). 17-26. Danang, Vietnam. 13/12/2024
  10. Hoang Tran Vuong, Tue Le, Tu Vu, Tung Nguyen, Linh Van Ngo, Sang Dinh, Thien Huu Nguyen. HiCOT: Improving Neural Topic Models via Optimal Transport and Contrastive Learning. Findings of the Association for Computational Linguistics: ACL 2025. 13894–13920. 27/07/2025
  11. Kim Hương Trang, Trịnh Quốc Dũng, Trần Hà Tiến Thịnh, Vũ Tuấn Thái, Hoàng Minh Tuấn, Phan Thị Ngọc Linh, Trần Nguyên Ngọc, Bùi Thị Mai Anh, Vũ Văn Đức. Khai thác hình ảnh vệ tinh, khám phá sự bất bình đẳng kinh tế các địa phương và hoạt động kinh doanh bền vững của doanh nghiệp [Exploiting satellite imagery to explore local economic inequality and the sustainable business activities of enterprises]. Tạp chí tài chính [Finance Journal]. 84-87. 11/02/2025
  12. Anh Duc Le, Nam Le Hai, Thanh Xuan Nguyen, Linh Ngo Van, Nguyen Thi Ngoc Diep, Sang Dinh, Thien Huu Nguyen. Enhancing Discriminative Representation in Similar Relation Clusters for Few-Shot Continual Relation Extraction. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies. 2450–2467. Albuquerque, New Mexico. 28/04/2025
  13. Li Chenhao, Ngo Trung Thanh, Nagahara Hajime. Simultaneous acquisition of geometry and material for translucent objects. Image and Vision Computing. 105793. 24/10/2025
  14. Bui Tien Dung, Pham Tuan Tai, Dang Tuan Linh. CoNet: a lightweight color classification architecture using residual connection and MBConv. Neural Computing and Applications. 9705-9720. 19/02/2025
  15. Toan Ngoc Nguyen, Nam Le Hai, Nguyen Doan Hieu, Dai An Nguyen, Linh Ngo Van, Thien Huu Nguyen, Sang Dinh. Improving Vietnamese-English Cross-Lingual Retrieval for Legal and General Domains. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies. 142–153. Albuquerque, New Mexico. 28/04/2025

Publications in 2024

  1. Ren Zhiyao, Dinh Viet Sang, Wong Pooi-Mun, Chng Chin-Boon, Too Joan Jue-Ying, Foong Theng-Wai, Loh Will Ne-Hooi, Chui Chee-Kong. G2LCPS: End-to-end semi-supervised landmark prediction with global-to-local cross pseudo supervision for airway difficulty assessment. Computers in Biology and Medicine. 109246. 02/10/2024
  2. T. K. Lai, and I. L. Ngo. An investigation on the thermo-electrohydraulic performance of novel ECF micro-pump. International Journal of Heat and Mass Transfer. 29/09/2024
  3. Sikandar Ali Qalati, MengMeng Jiang, Samuel Gyedu, and Emmanuel Kwaku Manu. Do Strong Innovation Capability and Environmental Turbulence Influence the Nexus Between Customer Relationship Management and Business Performance? Business Strategy and the Environment. 02/07/2024
  4. Randika K. Makumbura; Lakindu Mampitiya; Namal Rathnayake; D.P.P. Meddage; Shagufta Henna; Tuan Linh Dang; Yukinobu Hoshino; Upaka Rathnayake. Advancing water quality assessment and prediction using machine learning models, coupled with explainable artificial intelligence (XAI) techniques like shapley additive explanations (SHAP) for interpreting the black-box nature. Results in Engineering. 1-14. 01/09/2024
  5. Tuan Linh Dang, Thuy Ha Hoang, Minh Hoang Cu, Duc Quang Nguyen, Huu Phuc Hoang. Semi-supervised Learning for Image Quality Assessment Problem. International Journal of Computer Applications. 9-13. 21/02/2024
  6. JYE Tin, WW Tan, AA Bakar, MS Mahali, FF Lothai, NF Mohammad, SSA Hassan & KF Chin. A Conceptual Design of Sustainable Solar Photovoltaic (PV) Powered Corridor Lighting System with IoT Application. ICREEM 2022. 09/03/2024
  7. Trinh Thi Ha, Nguyen Trung Dung, Nguyen Thanh Huong, Tran Trong An, Pham Van Tuan, Vu Ngoc Hung, Chu Manh Hoang. Investigating the coupling length of two triangle hybrid gap plasmonic waveguides. The International Conference on Advanced Materials and Technology (ICAMT 2024). 10-13. Hanoi. 09/10/2024
  8. Nguyen Quoc-Viet, Nguyen Thi-Oanh. SCA-DS: Face Anti-spoofing Leveraging Enhanced Spatial and Channel-Wise Attention and Depth Supervision. Communications in Computer and Information Science. 257-272. Da Nang, Vietnam. 13/12/2024
  9. Tuan Linh Dang, Dinh Minh Vu, Ngoc Dung Pham, The Vu Nguyen, Dinh Phu Mac, Nguyen Minh Nhat Hoang, Huy Hoang Pham. Enhance Massive Open Online Courses Integrity: AI for Exam Proctoring. Journal of Science and Technology: Smart Systems and Devices. 1-8. 25/06/2024
  10. Quang Minh Dang, Minh Tuyen Truong, Tuan Linh Dang. A lightweight approach for image quality assessment. Signal, Image and Video Processing. 1-8. 01/06/2024
  11. Sikandar Ali Qalati, Domitilla Magni, and Faiza Siddiqui. Senior Management's Sustainability Commitment and Environmental Performance: Revealing the Role of Green Human Resource Management Practices. Business Strategy and the Environment. 02/08/2024
  12. T. K. Lai, and I. L. Ngo. A new design and optimization of VD-ECF micro-pump: Advancements in electrohydraulic performance. Physics of Fluids. 29/07/2024
  13. Tuan Linh Dang, Trung Hieu Pham, Duc Loc Le, Xuan Tung Tran, Hoang Nam Le, Khanh Hung Nguyen, Tran Tuan Nghia Trinh. Person re-identification on lightweight devices: end-to-end approach. Multimedia Tools and Applications. 1-14. 27/03/2024
  14. Vu Ho Tran Anh, Thi-Oanh Nguyen. Enhanced Topology Representation Learning for Skeleton-Based Human Action Recognition. Procedia Computer Science. 3093-3102. 28/05/2024
  15. Tuan Linh Dang, Hoang Vu Nguyen, Nguyen Minh Nhat Hoang, Quang Minh Dang, The Vu Nguyen, Quang Hai Tran, Huy Hoang Pham. Auto-proctoring using computer vision in MOOCs system. Multimedia Tools and Applications. 1-27. 07/08/2024
  16. Huu Thang Nguyen, Anh Chung Hoang, Manh Cuong Bui, Tuan Linh Dang. Imbalance Problem in Image Quality Assessment. ICIC Express Letters. 1145-1152. 01/04/2024
  17. T. K. Lai, and I. L. Ngo. An investigation on the electrohydraulic performance of novel ECF micro-pump with NACA-shaped electrodes. Theoretical and Computational Fluid Dynamics. 29/02/2024
  18. Yukinobu Hoshino, Masahiro Shimasaki, Namal Rathnayake, Tuan Linh Dang. Performance verification and latency time evaluation of hardware image processing module for appearance inspection systems using FPGA. Journal of Real-Time Image Processing. 1-16. 26/11/2023
  19. Yukinobu Hoshino, Yuka Nishiyama, Toshimi Yamamoto, Yuki Shinomiya, Namal Rathnayake, Tuan Linh Dang. Human-inspired similarity control system: Enhancing line-following robot perception. Applied Soft Computing Journal. 1-15. 14/04/2024
  20. Nguyen Binh An, Dao Duc Manh, Nguyen Khanh Hung, Dang Tuan Linh. Night Vision: Enhance Object Detection in Low-Light Condition Baseline. The National Conference on Fundamental and Applied IT Research (FAIR’2024). 609-615. Hanoi. 08/08/2024
  21. Chenhao Li, Trung Thanh Ngo, and Hajime Nagahara. Deep Polarization Cues for Single-Shot Shape and Subsurface Scattering Estimation. Lecture Notes in Computer Science. 55-73. Milan, Italy. 29/09/2024
  22. Yukinobu Hoshino; Namal Rathnayake; Tuan Linh Dang; Upaka Rathnayake. Empirical Research on 3D Analysis and Flow Prediction of Upstream Rivers Using Drones. 2024 Joint 13th International Conference on Soft Computing and Intelligent Systems and 25th International Symposium on Advanced Intelligent Systems (SCIS&ISIS). 1-6. Himeji, Japan. 09/11/2024
  23. Tuan Linh Dang, Trung Hieu Pham, Duc Manh Dao, Hoang Vu Nguyen, Quang Minh Dang, Ba Tuan Nguyen, Nicolas Monet. DATE: a video dataset and benchmark for dynamic hand gesture recognition. Neural Computing and Applications. 1-15. 09/05/2024

Publications in 2023

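  1. Nguyễn Đức Ca, Phan Thị Thu, Hoàng Thị Minh Anh, Phạm Ngọc Dương, Nguyễn Hoàng Giang, Nguyễn Lệ Hằng. Nâng cao hiệu quả quản trị đại học trong bối cảnh đổi mới giáo dục tại Việt Nam [Improving the effectiveness of university governance in the context of educational reform in Vietnam]. Tạp chí khoa học giáo dục Việt Nam [Vietnam Journal of Educational Sciences]. 14/03/2023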
  2. Nguyen Quang Duc, Tran Khanh Luong, Le Hong Duc, Nguyen Huy Hoan, Trinh Anh Phuc, Dinh Viet Sang. Improving Single Positive Multi-label Classification via Knowledge-based Label-weighted Large Loss Rejection. The 12th International Symposium on Information and Communication Technology. 429-434. 07/12/2023
  3. Thuong Nguyen Canh; Trung Thanh Ngo; Hajime Nagahara. Human-Imperceptible Identification with Learnable Lensless Imaging. IEEE Access. 95724-95733. 18/08/2023
  4. Chenhao Li, Trung Thanh Ngo, Hajime Nagahara. Inverse Rendering of Translucent Objects using Physical and Neural Renderers. The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023. 12510-12520. Vancouver Convention Center, Canada. 18/06/2023
  5. Huy-Hoang Nguyen, Thi-Oanh Nguyen. HRSeg: Leveraging High-Resolution Images to Enhance Polyp Segmentation Quality. 2023 15th International Conference on Knowledge and Systems Engineering (KSE). 1-4. 18/10/2023
  6. Ren-Jun Soon, Dinh Viet Sang, Chin-Boon Chng, Chee-Kong Chui. Explainable AI for CPS-Based Manufacturing Workcell. 2023 International Conference on System Science and Engineering (ICSSE). 332-337. 27/07/2023
  7. Ngo-Kien Duong, Viet-Sang Dinh, Thi-Oanh Nguyen. MCLDA: Multi-level Contrastive Learning for Domain Adaptive Semantic Segmentation. SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology. 343–350. Ho Chi Minh, Viet Nam. 07/12/2023
  8. Sikandar Ali Qalati, Belem Barbosa, and Blend Ibrahim. Factors influencing employees’ eco-friendly innovation capabilities and behavior: the role of green culture and employees’ motivations. Environment, Development and Sustainability. 02/10/2023
  9. Tung Nguyen Quang, Thi-Oanh Nguyen. Language Knowledge-Assisted in Topology Construction for Skeleton-Based Action Recognition. SOICT 2023: The 12th International Symposium on Information and Communication Technology. 443-449. Ho Chi Minh, Vietnam. 07/12/2023
  10. Tuan Linh Dang, Gia Tuyen Nguyen, Thang Cao. Real-Time Image Processing Using Edge AI Devices. International Journal of Computer Applications. 1-7. 09/11/2023
  11. Nguyen Minh Chau, Nguyen Ngoc Toan, Le Dinh Tuyen, Dinh Viet Sang, Pooi-Mun Wong, Chin-Boon Chng and Chee-Kong Chui. Boosting Facial Landmark Detection via Self-supervised and Semi-supervised Learning. The 12th International Symposium on Information and Communication Technology. 485-492. 07/12/2023
  12. Nguyen Van Giang, Nguyen Minh Son, Kieu Anh Van, Tran Cat Khanh, Pham Ngoc Minh, Dinh Viet Sang. One-stage Robotic Grasp Detection. International Conference on Knowledge and Systems Engineering (KSE). 18/10/2023
  13. Namal Rathnayake, Tuan Linh Dang, Akira Miyazaki, and Yukinobu Hoshino. An Efficient Approach for Age-Wise Rice Seeds Classification using SURF-BOF with Modified Cascaded-ANFIS algorithm. Fifteenth International Conference on Machine Vision (ICMV 2022). 1-9. Rome, Italy. 18/11/2022
  14. Namal Rathnayake, Akira Miyazaki, Tuan Linh Dang, Yukinobu Hoshino. Age Classification of Rice Seeds in Japan Using Gradient-Boosting and ANFIS Algorithms. Sensors. 1-18. 03/03/2023
  15. Namal Rathnayake, Upaka Rathnayake, Tuan Linh Dang, Yukinobu Hoshino. Water level prediction using soft computing techniques: A case study in the Malwathu Oya, Sri Lanka. PLoS ONE. 1-21. 22/02/2023
  16. Sikandar Ali Qalati, Sonia Kumari, Kayhan Tajeddini, Namarta Kumari Bajaj, and Rajib Ali. Innocent devils: The varying impacts of trade, renewable energy and financial development on environmental damage: Nonlinearly exploring the disparity between developed and developing nations. Journal of Cleaner Production. 02/02/2023
  17. Wu Qinqin, Sikandar Ali Qalati, Rana Yassir Hussain, Hira Irshad, Kayhan Tajeddini, Faiza Siddique, Thilini Chathurika Gamage. The effects of enterprises' attention to digital economy on innovation and cost control: Evidence from A-stock market of China. Journal of Innovation & Knowledge. 02/12/2023
  18. Vu Quoc Hung, Tran Le Phuong Thao, Trinh Xuan Minh, Dinh Viet Sang. LSegDiff: A Latent Diffusion Model for Medical Image Segmentation. The 12th International Symposium on Information and Communication Technology. 456-462. 07/12/2023
  19. Tuan Linh Dang, Duc Loc Le, Trung Hieu Pham, Xuan Tung Tran. Lightweight Models’ Performances on a Resource-Constrained Device for Traffic Application. The Fourth International Conference on Artificial Intelligence and Computational Intelligence (AICI 2023) (proceedings published in Deep Learning and Other Soft Computing Techniques, Studies in Computational Intelligence 1097). 1-14. Hanoi. 13/01/2023
  20. Nguyen Viet Hoai, Pham Vu Hung, Dinh Viet Sang. Memory-Driven Region Contrast for Enhanced Polyp Semantic Segmentation. 2023 International Conference on Multimedia Analysis and Pattern Recognition, MAPR 2023 - Proceedings. 05/10/2023
  21. Nguyen Hong Son, Nguyen Thanh Huyen, Dinh Viet Sang. Semi-Supervised Learning with Dense Target Producer for End-to-End Lightweight Polyp Detection. International Conference on Knowledge and Systems Engineering (KSE). 18/10/2023
  22. Pham Van Toan, Dinh Viet Sang. M3C-Polyp: Mixed Momentum Model Committee for Improved Semi-Supervised Learning in Polyp Segmentation. World Symposium on Software Engineering. 274-279. 22/09/2023
  23. Sikandar Ali Qalati, Belem Barbosa & Shuja Iqbal. The effect of firms’ environmentally sustainable practices on economic performance. Economic Research-Ekonomska Istraživanja. 02/06/2023
  24. Tuan Linh Dang, Trung Hieu Pham, Quang Minh Dang, Nicolas Monet. A lightweight architecture for hand gesture recognition. Multimedia Tools and Applications. 28569–28587. 31/01/2023
  25. Toan Pham Van, Sang Dinh Viet, Linh Bao Doan, Thanh Tung Nguyen, Quang Hung Nguyen, Duc Trung Tran. Improve polyp semi-supervised segmentation with prioritizing the reliability of unlabeled images. ICSIE. 35-40. 21/10/2022
  26. Namal Rathnayake, Upaka Rathnayake, Imiya Chathuranika, Tuan Linh Dang, Yukinobu Hoshino. Projected Water Levels and Identified Future Floods: A Comparative Analysis for Mahaweli River, Sri Lanka. IEEE Access. 8920-8937. 17/01/2023
  27. Namal Rathnayake, Upaka Rathnayake, Imiya Chathuranika, Tuan Linh Dang, Yukinobu Hoshino. Cascaded-ANFIS to simulate nonlinear rainfall–runoff relationship. Applied Soft Computing. 1-14. 26/07/2023
  28. Pham Van Toan, Dinh Viet Sang. ESSL-Polyp: A Robust Framework of Ensemble Semi-supervised Learning in Polyp Segmentation. Computing Conference 2023. 39-52. 20/08/2023
  29. Yong Cheng, Wei Wang, Wenjie Zhang, Ling Yang, Jun Wang, Huan Ni, Tingzhao Guan, Jiaxin He, Yakang Gu and Ngoc Nguyen Tran. A Multi-Feature Fusion and Attention Network for Multi-Scale Object Detection in Remote Sensing Images. Remote Sensing. 1-19. 12/04/2023