Research Areas

My recent work focuses on multimodal generation, large language models, AI agents, and recommendation systems. I am especially interested in turning foundation models into reliable production systems for advertising, content understanding, and creative generation.

Multimodal Generation

Video and image generation with better controllability, consistency, and production quality.
Efficient training and inference for large-scale creative generation systems.
Editing and generation pipelines for advertisement videos and product-driven content.

Large Language Models and AI Agents

Domain-adapted LLM systems for advertising, customer service, diagnosis, and workflow automation.
Agentic systems that combine planning, retrieval, tool use, and evaluation.
Long-context understanding, structured outputs, and deployment in real-world business scenarios.

Recommendation and Advertising Intelligence

Generative recommendation for large-scale advertising platforms.
Multimodal understanding of products, landing pages, and creative materials.
Systems that connect model quality with measurable business outcomes, such as production efficiency and conversion.

Earlier Research

Computer vision, cross-modal retrieval, and visual-text matching.
Medical AI and multimodal representation learning.
Video understanding, retrieval, and structured content analysis.