Inversion-Free Image Editing with Natural Language

¹University of Michigan  ²University of California, Berkeley

* Equal contribution    Correspondence

CVPR 2024

We present an inversion-free editing (InfEdit) method that allows for consistent editing at both the semantic and spatial levels, handling intricate modifications without compromising the image's integrity and without requiring explicit inversion. In extensive experiments, InfEdit performs strongly on complex editing tasks while keeping the workflow fast (less than 3 seconds on a single A40 GPU), demonstrating its potential for real-time applications.


A painting of a waterfall [+and angels] in the mountains

A woman in a coat [+and dress] is dancing

[+Oil painting of] a lake with mountains in the background

A woman in a [white → red] dress sitting on a chair with flowers

A man in a white shirt standing in front of [trees → mountains]

A light brown bear [sitting → standing] on the ground

[Muffin → Chihuahua]

A football with [OSU → UMich] logo

A [blue droplet → red fire] emoji with a [smiling → angry] face with yellow dot

Experiments

InfEdit in various complex image editing tasks:


Comparison

Comparison with inversion-based methods:

Performance in image editing: DDCM matches or exceeds other algorithms, with LCM and UAC bringing further improvement. Notably, it runs about an order of magnitude faster.

Qualitative examples: InfEdit vs prior methods. InfEdit attains editing goals with the best consistency with source images.

Comparison with existing methods:


More Results

Method

We aim to eliminate the inversion process and introduce the Denoising Diffusion Consistent Model (DDCM), a sampling strategy that enables virtual inversion. DDCM leverages a diffusion process that significantly enhances consistency throughout the image generation phases, ensuring fidelity and speed in transforming and refining visual content.
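
To make the idea of avoiding explicit inversion more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard DDIM update written in terms of cumulative noise-schedule products (`alpha_t`, `alpha_prev`), and it assumes that the variance is chosen as `sigma = sqrt(1 - alpha_prev)`. Under that assumed choice, the next latent depends only on the predicted clean latent and fresh noise, so a noisy latent can be built directly from a known source latent rather than recovered by an inversion pass. All function and variable names are hypothetical.

```python
import numpy as np

def forward_noise(z0, alpha_t, noise):
    """Noise a clean latent z0 to level alpha_t (cumulative schedule product)."""
    return np.sqrt(alpha_t) * z0 + np.sqrt(1.0 - alpha_t) * noise

def predict_z0(z_t, eps, alpha_t):
    """Clean-latent estimate implied by a noise prediction eps."""
    return (z_t - np.sqrt(1.0 - alpha_t) * eps) / np.sqrt(alpha_t)

def ddim_step(z_t, eps, alpha_t, alpha_prev, sigma, noise):
    """Generic DDIM update with an explicit variance parameter sigma."""
    z0 = predict_z0(z_t, eps, alpha_t)
    # Clamp guards against tiny negative values from floating-point rounding.
    coeff = np.sqrt(max(1.0 - alpha_prev - sigma**2, 0.0))
    return np.sqrt(alpha_prev) * z0 + coeff * eps + sigma * noise

rng = np.random.default_rng(0)
z0_src = rng.standard_normal((4, 4))   # stand-in for a known source latent
alpha_t, alpha_prev = 0.5, 0.8         # illustrative schedule values (assumed)

# Build z_t directly from the source latent with random noise (no inversion pass).
noise_t = rng.standard_normal(z0_src.shape)
z_t = forward_noise(z0_src, alpha_t, noise_t)

# The noise that exactly explains z_t given the known source latent.
eps_cons = (z_t - np.sqrt(alpha_t) * z0_src) / np.sqrt(1.0 - alpha_t)
z0_rec = predict_z0(z_t, eps_cons, alpha_t)  # recovers z0_src up to rounding

# Assumed variance choice: the eps-dependent term of the update vanishes.
noise_prev = rng.standard_normal(z0_src.shape)
sigma = np.sqrt(1.0 - alpha_prev)
z_prev = ddim_step(z_t, eps_cons, alpha_t, alpha_prev, sigma, noise_prev)
```

In this sketch `z_prev` coincides with re-noising the source latent at the next level, which is why no stored inversion trajectory is needed. A real editing pipeline would replace `eps_cons` with quantities derived from a trained noise predictor.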

We also present Unified Attention Control (UAC), which integrates cross-attention and self-attention control within a unified framework for tuning-free image editing through natural language.
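
As a rough illustration of what controlling cross-attention and self-attention can mean, here is a generic NumPy sketch. It is not the UAC algorithm itself. It assumes the source and target prompts are token-aligned, that cross-attention control copies the source attention columns for shared tokens and then renormalizes each row, and that self-attention control lets target queries attend to source keys and values. These rules, and every name below, are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_probs(q, k):
    """Scaled dot-product attention probabilities, one row per query."""
    return softmax(q @ k.T / np.sqrt(q.shape[-1]))

def control_cross_attention(src_probs, tgt_probs, shared_tokens):
    """Reuse source attention for tokens present in both prompts (assumed rule)."""
    mixed = tgt_probs.copy()
    mixed[:, shared_tokens] = src_probs[:, shared_tokens]
    # Renormalize so each pixel's attention over tokens sums to 1 (assumption).
    return mixed / mixed.sum(axis=-1, keepdims=True)

def control_self_attention(q_tgt, k_src, v_src):
    """Target queries attend to source keys and values (assumed rule)."""
    return attention_probs(q_tgt, k_src) @ v_src

rng = np.random.default_rng(0)
n_pixels, n_tokens, dim = 6, 4, 8
q_src = rng.standard_normal((n_pixels, dim))
q_tgt = rng.standard_normal((n_pixels, dim))
k_src_text = rng.standard_normal((n_tokens, dim))
k_tgt_text = rng.standard_normal((n_tokens, dim))

# Cross-attention: tokens 0, 1 and 3 are shared; token 2 is the edited word.
src_probs = attention_probs(q_src, k_src_text)
tgt_probs = attention_probs(q_tgt, k_tgt_text)
mixed = control_cross_attention(src_probs, tgt_probs, [0, 1, 3])

# Self-attention: keys and values come from the source branch's own pixels.
k_src_img = rng.standard_normal((n_pixels, dim))
v_src_img = rng.standard_normal((n_pixels, dim))
out = control_self_attention(q_tgt, k_src_img, v_src_img)
```

Routing both kinds of control through the same pair of source and target branches is what makes a shared framework plausible; how the branches are actually combined in UAC is specified in the paper, not here.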

Detail

BibTeX

@inproceedings{xu2023infedit,
  title={Inversion-Free Image Editing with Natural Language},
  author={Sihan Xu and Yidong Huang and Jiayi Pan and Ziqiao Ma and Joyce Chai},
  booktitle={Conference on Computer Vision and Pattern Recognition 2024},
  year={2024}
}