One Model For All: Partial Diffusion for Unified Try-On and Try-Off in Any Pose

Jinxi Liu1, Zijian He1, Guangrun Wang1,2,3, Guanbin Li1,2, Liang Lin1,2,3
1Sun Yat-sen University, 2X-Era AI Lab, 3Guangdong Key Laboratory of Big Data Analysis and Processing

Flexibility: OMFA supports person-to-person try-on and cross-identity garment swapping.

Versatility: OMFA handles multi-pose try-on while maintaining identity consistency.

Abstract

Recent diffusion-based approaches have made significant advances in image-based virtual try-on, enabling more realistic and end-to-end garment synthesis. However, most existing methods remain constrained by their reliance on exhibition garments and segmentation masks, as well as their limited ability to handle flexible pose variations. These limitations reduce their practicality in real-world scenarios—for instance, users cannot easily transfer garments worn by one person onto another, and the generated try-on results are typically restricted to the same pose as the reference image. In this paper, we introduce OMFA (One Model For All), a unified diffusion framework for both virtual try-on and try-off that operates without the need for exhibition garments and supports arbitrary poses. For example, OMFA enables removing garments from a source person (try-off) and transferring them onto a target person (try-on), while also allowing the generated target to appear in novel poses—even without access to multi-pose images of that person. OMFA is built upon a novel partial diffusion strategy that selectively applies noise and denoising to individual components of the joint input—such as the garment, the person image, or the face—enabling dynamic subtask control and efficient bidirectional garment-person transformation. The framework is entirely mask-free and requires only a single portrait and a target pose as input, making it well-suited for real-world applications. Additionally, by leveraging SMPL-X–based pose conditioning, OMFA supports multi-view and arbitrary-pose try-on from just one image. Extensive experiments demonstrate that OMFA achieves state-of-the-art results on both try-on and try-off tasks, providing a practical and generalizable solution for virtual garment synthesis.
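To make the partial diffusion strategy concrete, the sketch below shows the core idea in a few lines: noise is added only to the component currently being generated (the person latent for try-on, the garment latent for try-off), while the remaining components of the joint input stay clean and act as conditions. This is a minimal illustration, not the authors' implementation; the tensor names, the channel-wise concatenation, and the use of a diffusers-style scheduler are all assumptions.

```python
import torch
from diffusers import DDPMScheduler

def partial_noising(person_lat, garment_lat, pose_lat, face_lat,
                    timesteps, scheduler, mode="try_on"):
    """Noise only the component being generated; keep the rest clean.

    All tensor names and the channel-wise concatenation are illustrative
    assumptions, not the paper's actual code.
    """
    if mode == "try_on":
        # Try-on stream: the person image is the generation target,
        # so only its latent receives noise.
        noise = torch.randn_like(person_lat)
        person_lat = scheduler.add_noise(person_lat, noise, timesteps)
    else:
        # Try-off stream: the garment latent is the target instead.
        noise = torch.randn_like(garment_lat)
        garment_lat = scheduler.add_noise(garment_lat, noise, timesteps)
    # The joint input concatenates every component; which stream is
    # noised determines the subtask, so one model covers both directions.
    joint = torch.cat([person_lat, garment_lat, pose_lat, face_lat], dim=1)
    return joint, noise

if __name__ == "__main__":
    scheduler = DDPMScheduler(num_train_timesteps=1000)
    shp = (2, 4, 64, 64)  # a batch of 2 hypothetical latents
    t = torch.randint(0, 1000, (2,))
    joint, noise = partial_noising(
        torch.randn(shp), torch.randn(shp), torch.randn(shp), torch.randn(shp),
        t, scheduler, mode="try_on")
    print(joint.shape)  # torch.Size([2, 16, 64, 64])
```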

Breaking Down the Process

Fig. (a) illustrates the person-to-person try-on pipeline, which performs both try-off and try-on within a single model. Fig. (b) shows the model architecture based on the proposed partial diffusion: the model takes multiple concatenated conditions as input, and noise is added to the person image (try-on stream) or to the garment image (try-off stream). A training sketch of these two streams is given below. Fig. (c) presents the multi-pose try-on capability of our framework.
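The following is a hedged sketch of how such a two-stream training step might look: each step samples a subtask, noises only that stream's latent, and supervises the denoiser only on the injected noise so the clean streams act purely as conditions. The `denoiser` callable, the batch keys, and the pose-only conditioning (the face component is omitted for brevity) are hypothetical stand-ins, not the paper's code.

```python
import torch
import torch.nn.functional as F

def training_step(denoiser, scheduler, batch, num_train_timesteps=1000):
    """One training step alternating between try-on and try-off streams.

    `denoiser`, the batch keys, and the concatenation layout are
    assumptions made for illustration only.
    """
    person, garment, pose = batch["person"], batch["garment"], batch["pose"]
    # Randomly pick the subtask so a single model learns both directions.
    mode = "try_on" if torch.rand(()).item() < 0.5 else "try_off"
    target = person if mode == "try_on" else garment
    t = torch.randint(0, num_train_timesteps, (target.shape[0],),
                      device=target.device)
    noise = torch.randn_like(target)
    noisy = scheduler.add_noise(target, noise, t)
    if mode == "try_on":
        joint = torch.cat([noisy, garment, pose], dim=1)  # person stream noised
    else:
        joint = torch.cat([person, noisy, pose], dim=1)   # garment stream noised
    # Supervise only on the noise injected into the selected stream;
    # the un-noised components serve purely as conditioning signals.
    pred = denoiser(joint, t)
    return F.mse_loss(pred, noise)
```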

Comparisons with Try-off Methods

Qualitative comparisons with state-of-the-art virtual try-off methods.

More Try-off Results

Qualitative virtual try-off results on the DressCode dataset.

Qualitative virtual try-off results on the DeepFashion-MultiModal dataset.

Comparisons with Try-on Methods

Qualitative comparisons with state-of-the-art virtual try-on methods.

More Virtual Try-on Results

Person-to-Person Setting

Multi-pose Setting