A.D (David) Le

Viettel AI.

I am an AI researcher and engineer from Vietnam, currently working at the intersection of generative modeling, multimodal AI, and controllable visual generation.

My research journey began with visual text intelligence, especially OCR, handwriting recognition, and mathematical expression recognition. These problems taught me how difficult it is for AI systems to understand fine-grained visual patterns, spatial structure, and content constraints. More recently, my work has shifted toward generative modeling. In my latest research, I study one-shot handwriting generation with diffusion models, focusing on how to capture complex writer styles from a single reference image while preserving textual content and local visual details.

Beyond academic research, I have spent several years building AI systems in industry. At Viettel AI, I have worked on OCR, eKYC, document processing, handwriting-related problems, information extraction, and generative data synthesis for real-world applications. This experience has shaped my research style: I care not only about proposing new models, but also about building systems that are robust, scalable, and useful in practice.

I am now interested in broader questions in compositional and controllable generative AI. How can generative models compose multiple constraints at inference time? How can style, content, structure, and realism be represented modularly? How can diffusion models, energy-based models, and multimodal representations be combined to build more flexible generative systems?

My long-term goal is to contribute to the next generation of generative AI systems: models that are not only high-quality, but also controllable, interpretable, efficient, and useful across real-world domains.

news

Apr 29, 2026 Invited to submit an extended version of our WACV paper to the journal Machine Vision and Applications.
Jan 22, 2026 Our WACV 2026 paper CONSTANT was selected for an oral presentation.
Nov 11, 2025 One paper accepted at WACV 2026!

selected publications

  1. Thesis
    A System for Extracting Mathematical Expressions from Document Images (Master's Thesis)
    Anh Duy Le
    May 2026
  2. WACV
    CONSTANT: Towards High-Quality One-Shot Handwriting Generation with Patch Contrastive Enhancement and Style-Aware Quantization (Oral, Award Finalist)
    Anh-Duy Le, Van-Linh Pham, Thanh-Nam Vo, and 2 more authors
    In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2026
  3. ICDAR
    Formerge: Recover Spanning Cells in Complex Table Structure Using Transformer Network (Poster)
    Nam Quan Nguyen, Anh Duy Le, Anh Khoa Lu, and 2 more authors
    In International Conference on Document Analysis and Recognition (ICDAR), 2023
  4. DICTA
    A Hybrid Vision Transformer Approach for Mathematical Expression Recognition (Oral)
    Anh Duy Le, Van Linh Pham, Vinh Loi Ly, and 3 more authors
    In International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2022