
Hello there, I'm Dengming Zhang, and I received my master's degree from Zhejiang University in March 2026.
My primary research focuses on Multimodal Large Models under low-data and low-compute constraints. On the low-data side, I study how to merge multiple domain-specialized, fine-tuned expert LLMs into one generalist model using only 1–5 samples while retaining SOTA-level performance (ICLR 2026). On the low-compute side, I explore how to equip vision foundation models with audio capabilities using a single RTX 4090, and how to bring audio-visual affective understanding to a SOTA level. I am also interested in Generative AI (Image/Music), Affective Computing, Meta-learning, and HCI.
I am also good at combining scientific research with engineering implementation, and I have extensive experience in front-end development, back-end development, and cluster DevOps. Some of the open-source projects that I lead or participate in can be found on my GitHub.
Seeking PhD opportunities for Fall 2026
Grouped by research direction.

Unsupervised expert alignment and importance-guided layer chunking merge multiple fine-tuned experts into one generalist model.

Teaches vision-language models to align sight and sound for audio-visual understanding of artistic emotion.

Buffering-based spatial sparsity improves centrifugal token pruning efficiency in vision-language models under aggressive pruning rates.

Dual-scale attention meta-learning for personalized, dynamic music emotion recognition.

Discriminant space optimization improves few-shot bearing fault diagnosis with meta-learning.

Style-strength control and evaluation to improve style alignment in image creation.

Controllable text rendering with typography and style controls.

Generates music from video with multiple time-varying conditioning signals.

Decomposes spatial and temporal cues to improve controllable video-to-music generation.

Research internship on Model Merging (Expert Merging)[1].

Work on Game Character Material Generation with animation.
