Human vision lets people interpret their surroundings and is one of our most important senses: many experts consider that about 80% of what we perceive comes through vision. Put simply, Computer Vision is the sub-discipline of Artificial Intelligence that teaches machines to “see like a human”. More precisely, it consists of specific hardware and/or software algorithms that give computers the ability to capture, process and interpret images, videos or signals taken from a camera or other sensors.

Researchers started to work on Computer Vision in the 1960s, and the field has progressed steadily ever since. In the 2010s, Deep Learning, a branch of Machine Learning, revolutionized Computer Vision. Among other breakthroughs, Deep Learning-based algorithms surpassed humans in their ability to recognize human faces in 2014. Since then, Computer Vision has been one of the hottest topics in the broad field of Artificial Intelligence. Computer Vision is nowadays applied in many aspects of our daily life: medicine, manufacturing, biometrics, autonomous vehicles, digitization of paper documents and books for electronic access, military and law enforcement, recycling of household waste, environmental applications using aerial/satellite images, etc.

Our research group focuses on the design and development of high-speed, lightweight and effective algorithms for analyzing and understanding different types of images/videos: natural images/videos (taken with regular cameras), medical images, remote sensing images and document images. See the slides here for more details.

Contact: Assoc. Prof. Muriel Visani, Email:

Research Directions

We are especially interested in the tasks of object detection, classification, semantic segmentation and tracking.

Keywords describing our research directions include:

  • Multimodality
    • Spatio-temporal information
    • Raw data (or text) associated with the images
  • Domain adaptation
    • Transferring the model learned from one set of images to a different set of images
  • Limited-resource constraints (linked to embedded systems)
    • Design of lightweight models
  • User interaction

The methods we use include both traditional Image Processing methods and Machine Learning methods, especially Deep Learning (often with Convolutional Neural Networks and Recurrent Neural Networks).

Research Problems

Our research problems include, but are not limited to: 

  • Medical imaging:
    • Segmenting colon polyps and identifying lesions at high risk of malignancy (cancer) during endoscopy
    • Detecting brain degeneration in Alzheimer’s patients from 3D MRI scans and clinical data
  • Traffic monitoring and autonomous vehicles
    • Vehicle and pedestrian tracking in videos, including embedding the proposed algorithms in edge devices
    • Semantic segmentation for intelligent vehicles 
  • Remote sensing – satellite image processing and analysis:
    • Adjusting Geostationary (GEO) satellite images with Low-Earth-Orbit (LEO) images
    • Study of Urban Heat Islands and their impact on the environment and humans 
  • Gesture recognition from videos:
    • Human Action Recognition
    • Hand Gesture Recognition 
  • Document analysis and understanding:
    • Incremental multimodal classification from streams of documents
    • Understanding ancient Vietnamese text (Han-Nom characters)
  • Biometric access control: face verification and anti-spoofing

Team Members

Assoc. Prof. Muriel VISANI
Team Leader

Dr. Dinh Viet Sang

Dr. Nguyen Thi Oanh

Dr. Tran Nguyen Ngoc

Dr. Dang Tuan Linh

Projects and Solutions


National partners (in Vietnam)

  • USTH: ICTLab & Space departments
  • VNUA (FIT)
  • Can Tho University
  • IRD: Institut de Recherche pour le Développement (Vietnam branch)

International partners

  • Asia-Pacific:
    • Australia: University of Technology Sydney, Bureau of Meteorology, CSIRO, Griffith University, The University of Queensland
    • China: Lanzhou University
    • Japan: University of Tsukuba, Kochi University of Technology
    • South Korea: Chosun University
  • Americas:
    • USA: University of Hawaii
    • Brazil: University of Sao Paulo
  • Russia: Tula State University
  • Africa: Tunisia – Sfax University
  • Europe:
    • France: La Rochelle University, Poitiers University, Bordeaux University, INSA Lyon, Nancy University
    • Switzerland: Fribourg University
    • Spain: Universitat Autonoma de Barcelona

Latest publications

  • Z. Ming, M. Visani, M.M. Luqman, J.C. Burie. A Survey on Anti-Spoofing Methods for Facial Recognition with RGB Cameras of Generic Consumer Devices. Journal of Imaging. 6(12):139, 56 pages, 2020.
  • C. Ostertag, M. Beurton-Aimar, M. Visani, T. Urruty, K. Bertet. Predicting Brain Degeneration with a Multimodal Siamese Neural Network. International Conference on Image Processing Theory, Tools and Applications (IPTA), IEEE, pages 1-6, November 2020.