Research Areas
My recent work focuses on multimodal generation, large language models, AI agents, and recommendation systems. I am especially interested in turning foundation models into reliable production systems for advertising, content understanding, and creative generation.
Multimodal Generation
- Video and image generation with better controllability, consistency, and production quality.
- Efficient training and inference for large-scale creative generation systems.
- Editing and generation pipelines for advertising videos and product-driven content.
Large Language Models and AI Agents
- Domain-adapted LLM systems for advertising, customer service, diagnosis, and workflow automation.
- Agentic systems that combine planning, retrieval, tool use, and evaluation.
- Long-context understanding, structured outputs, and real-world deployment for business scenarios.
Recommendation and Advertising Intelligence
- Generative recommendation for large-scale advertising platforms.
- Multimodal understanding of products, landing pages, and creative materials.
- Systems that connect model quality to measurable business outcomes, such as production efficiency and conversion.
Earlier Research
- Computer vision, cross-modal retrieval, and visual-text matching.
- Medical AI and multimodal representation learning.
- Video understanding, retrieval, and structured content analysis.
