Add latent upscale option to img2img

The txt2img highres fix recently gained a latent upscale option. It scales the latent sample of the image and then runs a second pass of img2img. In that version of highres fix, however, neither the image nor the parameters can be changed between the first and second pass. We might want to fix up the image in img2img before the second pass, or run the second pass at a different resolution. This change adds an option (resize_mode 3) for img2img to perform its upscale in latent space rather than image space, which gives results very similar to highres fix with latent upscale. The results are not identical, because img2img adds a latent -> decoder -> image -> encoder -> latent round trip that does not happen in highres fix, but the loss from this conversion is relatively small.
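
As a rough illustration of what the latent-space resize does, here is a minimal pure-Python sketch. It is not the code from this commit, which calls torch.nn.functional.interpolate on a 4D tensor. The helpers latent_size and bilinear_resize are hypothetical names, the default opt_f of 8 is an assumed downscale factor between image and latent space, and the resize works on a single 2D channel using half-pixel-centre sampling.

```python
def latent_size(width, height, opt_f=8):
    """Latent (rows, cols) for a target image size.

    opt_f is an assumed image-to-latent downscale factor.
    """
    return height // opt_f, width // opt_f


def bilinear_resize(grid, out_h, out_w):
    """Resize one 2D channel (list of lists of floats) bilinearly.

    Output pixel centres are mapped back into input coordinates
    (half-pixel convention), and neighbours are clamped at the edges.
    """
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for y in range(out_h):
        sy = max((y + 0.5) * in_h / out_h - 0.5, 0.0)
        y0 = min(int(sy), in_h - 1)
        y1 = min(y0 + 1, in_h - 1)
        wy = sy - y0
        row = []
        for x in range(out_w):
            sx = max((x + 0.5) * in_w / out_w - 0.5, 0.0)
            x0 = min(int(sx), in_w - 1)
            x1 = min(x0 + 1, in_w - 1)
            wx = sx - x0
            # Blend horizontally on both rows, then vertically.
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# A 2x2 channel upscaled to 4x4; each row ramps from 0 to 1.
channel = [[0.0, 1.0], [0.0, 1.0]]
resized = bilinear_resize(channel, 4, 4)
```

Under these assumptions, a 1024x768 target gives latent_size(1024, 768) == (96, 128), so the encoded latent of the unresized input image would be interpolated to 96 rows by 128 columns before the denoising pass.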
author: Andrew Ryan <andrewryanchama@gmail.com> 2022-12-08 07:09:09 +0000
committer: Andrew Ryan <andrewryanchama@gmail.com> 2022-12-08 07:09:09 +0000
commit: 358a8628f6abb4ca1e1bfddf122687c6fb13be0c (patch)
tree: 665cb5030ef0a8d1d1800e4f44c28806876c1cf4 /modules/processing.py
parent: 44c46f0ed395967cd3830dd481a2db759fda5b3b (diff)
1 files changed, 5 insertions, 1 deletions
diff --git a/modules/processing.py b/modules/processing.py
index 3d2c4dc9..ab5a34d0 100644
--- a/modules/processing.py
+++ b/modules/processing.py
@@ -795,7 +795,7 @@ class StableDiffusionProcessingImg2Img(StableDiffusionProcessing):
         for img in self.init_images:
             image = img.convert("RGB")
 
-            if crop_region is None:
+            if crop_region is None and self.resize_mode != 3:
                 image = images.resize_image(self.resize_mode, image, self.width, self.height)
 
             if image_mask is not None:
@@ -804,6 +804,7 @@ class StableDiffusionProcessingImg2Img(StableDiffusionProcessing):
 
                 self.overlay_images.append(image_masked.convert('RGBA'))
 
+            # crop_region is not none iif we are doing inpaint full res
             if crop_region is not None:
                 image = image.crop(crop_region)
                 image = images.resize_image(2, image, self.width, self.height)
@@ -840,6 +841,9 @@ class StableDiffusionProcessingImg2Img(StableDiffusionProcessing):
 
         self.init_latent = self.sd_model.get_first_stage_encoding(self.sd_model.encode_first_stage(image))
 
+        if self.resize_mode == 3:
+            self.init_latent = torch.nn.functional.interpolate(self.init_latent, size=(self.height // opt_f, self.width // opt_f), mode="bilinear")
+
         if image_mask is not None:
             init_mask = latent_mask
             latmask = init_mask.convert('RGB').resize((self.init_latent.shape[3], self.init_latent.shape[2]))