I have been struggling to get a decent image from SD in fewer iterations.

I have played around with different sampling methods, CFG values, and step counts, but I have been unable to find a consistent configuration that gives me decent images.

Simple prompts that I am struggling with:

  1. a photo of a puppy, intricately detailed, realistic
  2. drawing of a bowl of fruits, manga style

If I am unable to get good output for simple prompts, I am afraid the output for more complex or abstract prompts will be completely unusable.

Are there any tricks that can reduce the number of iterations needed to get decent images? Any guidance would be really appreciated.

Thanks!

  • AusatKeyboardPremi@lemmy.world (OP)
    1 year ago

    Okay. I did not play with anything beyond the default models. Any suggestions for non-default models?

    As for tweaking prompts, yes, I am already doing it. But I still made the post to be sure I am not missing anything.

    Thanks for your answer and suggestions. :-)

    • Windex007@lemmy.world
      1 year ago

      I’d check out https://civitai.com/ and see if any seem to align with what your intentions are.

      Basically, people will augment the training of existing models through various means.

      One of your prompts said something about anime or manga and I am aware that there are many models trained specifically to be good at that.

          • AusatKeyboardPremi@lemmy.world (OP)
            1 year ago

            I actually was able to generate much better images and learn about LoRA, hypernetworks, etc. thanks to the website you shared. :-)

            Having said that, Stable Diffusion is a bit too cumbersome and tedious when compared to MidJourney. But it is FOSS, and easier to get started with thanks to tools from Automatic1111.

            Looking forward to SDXL, hopefully it alleviates some of the pain points.

            • Windex007@lemmy.world
              1 year ago

              I haven’t tried Midjourney, so I don’t really have a frame of reference. After some practice and getting a good flow going, I was surprised how quickly I could get things to a point where I liked them. Does Midjourney do inpainting? I’ve been using SD for creating game assets. I’m a trash artist, but if I scribble a shit version and img2img with a batch size of like 20… I almost always get something REALLY close to what I need. My use cases are super dependent on inpainting, so it’s kinda a must-have.

              There are a few other tools that leverage SD. If you’re curious, look into Retro Diffusion for Aseprite; there’s a super neat workflow there.

              • AusatKeyboardPremi@lemmy.world (OP)
                1 year ago

                I don’t think MJ does inpainting yet, or at least in an accessible way like SD.

                I haven’t used Aseprite but Retro Diffusion looks really cool and useful.

                I was initially trying to generate retro/pixel art with the help of prompts, but it was mostly hit or miss. I then found a few webui extensions, like sd-webui-pixelart, that got me closer to the goal.

                • Windex007@lemmy.world
                  1 year ago

                  OK well if that’s what you’re looking for, I can at least tell you about what I had luck with:

                  For backgrounds, I would usually start with a prompt and I would generate like 30 or 40 in a batch. Then I skim them to see if any are kinda in the zone. Sometimes you can have a good prompt but just not a great seed, so blasting a big pile of them out per prompt is a way to really establish how in-line your prompt is.
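
The "blast out a pile of seeds per prompt" step can be scripted. Below is a minimal sketch, assuming the AUTOMATIC1111 web UI is running locally with its API enabled (the `--api` launch flag) and that the `/sdapi/v1/txt2img` endpoint accepts the field names shown; the helper name `build_txt2img_batch` and the default values are made up for illustration. It only builds the request body, so it runs without a GPU.

```python
import json


def build_txt2img_batch(prompt, count=30, batch_size=6, steps=20,
                        cfg_scale=7.0, sampler_name="Euler a"):
    """Build a txt2img request body that asks for at least `count` images.

    Hypothetical helper: the field names follow the AUTOMATIC1111 web UI
    API as commonly documented; check them against your installed version.
    """
    # Ceiling division: how many batches are needed to reach `count` images.
    n_iter = -(-count // batch_size)
    return {
        "prompt": prompt,
        "seed": -1,                # -1 asks the web UI to pick a random seed
        "batch_size": batch_size,  # images generated in parallel per batch
        "n_iter": n_iter,          # number of batches run one after another
        "steps": steps,
        "cfg_scale": cfg_scale,
        "sampler_name": sampler_name,
    }


if __name__ == "__main__":
    payload = build_txt2img_batch("drawing of a bowl of fruits, manga style")
    # The body would be POSTed to http://127.0.0.1:7860/sdapi/v1/txt2img
    # (address assumed; adjust to wherever your web UI is listening).
    print(json.dumps(payload, indent=2))
```

Splitting the total into `batch_size` and `n_iter` keeps VRAM use bounded: a small parallel batch repeated several times produces the same pile of candidates as one huge batch.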

                  Then, if I find one, or some, that look along the lines of what I’m looking for, I usually want to make some more direct changes and get a lot more hands-on.

                  I fire up my image editor (Gimp) and I do like the SHITTIEST hack job (not an artist) of drawing in how I want it to be different. Laughably bad drawings. Barely better than stick men. Think “if I squinted hard I could maybe imagine this blob to be what I want.”

                  Then I take that massacred image back to img2img for inpainting. Mask the parts where I want it to try again. Again, I’ll order up like 20 in a batch. Find the one that most closely aligns with what’s in my head, and then maybe iterate off of that version.
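
The inpainting step described above can be sketched the same way. This assumes the AUTOMATIC1111 web UI API again (`/sdapi/v1/img2img`), which takes the edited image and the mask as base64 strings; the helper names, the denoising strength, and the other defaults are illustrative assumptions, not recommended settings.

```python
import base64


def to_base64(data: bytes) -> str:
    """Encode raw image bytes as the base64 text the API expects."""
    return base64.b64encode(data).decode("ascii")


def build_inpaint_payload(prompt, image_bytes, mask_bytes,
                          batch_size=20, denoising_strength=0.6):
    """Build an img2img request body for masked regeneration.

    Hypothetical helper: field names follow the AUTOMATIC1111 web UI API
    as commonly documented; verify them against your installed version.
    """
    return {
        "prompt": prompt,
        "init_images": [to_base64(image_bytes)],   # the hand-edited image
        "mask": to_base64(mask_bytes),             # white = area to redo
        "denoising_strength": denoising_strength,  # how far to stray from the scribble
        "inpainting_fill": 1,                      # 1 = start from the original pixels
        "batch_size": batch_size,
        "seed": -1,                                # random seed for each run
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for real PNG files exported from the editor.
    payload = build_inpaint_payload("forest background, pixel art",
                                    b"edited-image-bytes", b"mask-bytes")
    print(sorted(payload.keys()))
```

A batch of 20 in one go may not fit in memory on smaller GPUs; in that case a smaller `batch_size` combined with an `n_iter` field is the usual workaround.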

                  IMHO, obsessing over prompts is overrated. Broad strokes and inpainting… Taking kind of a “genetic algorithm” approach to zeroing in on what you actually want is a far superior workflow.
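
The "genetic algorithm" idea boils down to a generate, pick, repeat loop. Here is a toy sketch of that loop with the image generation swapped out for plain numbers so it can run anywhere; `refine`, `toy_variants`, and the target value are all hypothetical stand-ins. Strictly speaking this is a select-and-mutate loop (no crossover), and it keeps the current candidate in the pool as a fallback, which is an added assumption rather than something stated in the thread.

```python
import random


def refine(start, generate_variants, pick_best, rounds=3, batch_size=20):
    """Repeatedly generate a batch of variants and keep the preferred one.

    `generate_variants(current, n)` stands in for an img2img/inpainting
    batch; `pick_best(candidates)` stands in for a human choosing by eye.
    """
    current = start
    history = [start]
    for _ in range(rounds):
        # Keep the current candidate in the pool so a bad batch cannot
        # make things worse.
        candidates = [current] + generate_variants(current, batch_size)
        current = pick_best(candidates)
        history.append(current)
    return current, history


def toy_variants(current, n):
    """Toy 'generator': random nudges around the current value."""
    return [current + random.gauss(0.0, 2.0) for _ in range(n)]


if __name__ == "__main__":
    target = 10.0  # stands in for "the picture in my head"
    best, history = refine(
        0.0,
        toy_variants,
        lambda cands: min(cands, key=lambda c: abs(c - target)),
        rounds=5,
    )
    print("distance to target after each round:",
          [round(abs(h - target), 2) for h in history])
```

In the real workflow `pick_best` is a person skimming a grid of 20 images, which is why large batches and a few rounds tend to be a better use of time than many prompt rewrites.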

                  • AusatKeyboardPremi@lemmy.world (OP)
                    1 year ago

                    Thanks a lot for the inputs. I will emulate your workflow and share the results.

                    Also, I have come to realise this myself, and I agree with you that prompt engineering can only take one so far.