Getting started with AI

Moscow1578

New Member
Jan 20, 2023
13
5
Sup.

I've recently gotten really interested in AI-generated art and decided I wanted to try it myself.
First off, I tried to run Stable Diffusion on my AMD GPU. It barely worked, and the results were meh. So I started searching for free alternatives and came across a couple of different methods.
For one, there's Google Colab, but I struggled to find a notebook that worked properly without crashing.
I also found some websites where you can generate images, but you have to expect long queue times.

I also came across multiple guides on here, like this one, but I couldn't really get them to work properly.

TL;DR
Can you recommend any AI image generation guides that worked for you? Or is this unrealistic unless you pay for a service or host it yourself?

Any help or reply is much appreciated. Thanks!
 

borramejm

New Member
Jan 2, 2023
5
1
I'd probably recommend trying GLIDE, since it has pre-trained models available, but I don't know how easy it will be to get running. How familiar are you with Python and ML?
 

tanstaafl

Active Member
Oct 29, 2018
961
1,387
borramejm said:
I'd probably recommend trying GLIDE, since it has pre-trained models available, but I don't know how easy it will be to get running. How familiar are you with Python and ML?
Here's what ChatGPT has to say about GLIDE. Seems like quite a bit of extra work, but the results may end up closer to the desired end result.

ChatGPT said:
GLIDE, which stands for Generative Latent-InDuced Encoder, is a technique used in the context of pretrained AI models to enable fine-grained control over the generated output. It is specifically designed for generative models that operate on latent spaces, such as variational autoencoders (VAEs) or generative adversarial networks (GANs).

In traditional generative models, generating a specific output involves modifying the latent vector representation directly. However, this approach can often lead to unpredictable changes and unintended modifications in the generated output. GLIDE addresses this limitation by introducing a more controlled and interpretable way to manipulate the latent space.

GLIDE achieves this by first training an additional neural network called an encoder network, which takes the generated output and maps it back to the corresponding latent vector representation. This encoder network essentially learns an inverse mapping of the generative model. By using this encoder network, GLIDE allows users to modify the latent vectors indirectly through the generated output.

The process typically involves the following steps:

  1. Generate an initial output using the pretrained generative model.
  2. Pass the generated output through the encoder network to obtain the corresponding latent vector.
  3. Modify the latent vector based on desired changes.
  4. Decode the modified latent vector using the generative model to obtain the updated output.
By manipulating the latent vector representation, users can control various attributes of the generated output, such as its appearance, style, or content. GLIDE provides a more intuitive and controllable way to generate outputs with specific characteristics, making it a valuable tool in the field of pretrained AI models.
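
If I'm reading that right, the loop it describes boils down to something like the sketch below. This is just a toy illustration with made-up stand-in models (not GLIDE's actual API), so treat the names as placeholders:

Code:
import torch
import torch.nn as nn

# Toy stand-ins for a pretrained generator plus the inverse-mapping encoder
# the quote describes. Everything here is illustrative only, not a real GLIDE API.
latent_dim, image_dim = 16, 3 * 32 * 32
decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, image_dim))
encoder = nn.Sequential(nn.Linear(image_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))

# 1. Generate an initial output from a random latent vector.
z = torch.randn(1, latent_dim)
image = decoder(z)

# 2. Pass the output through the encoder to recover its latent representation.
z_recovered = encoder(image)

# 3. Modify the latent based on the desired change (a random direction here,
#    standing in for a learned attribute direction such as "style").
direction = torch.randn(1, latent_dim)
z_edited = z_recovered + 0.5 * direction

# 4. Decode the modified latent to obtain the updated output.
edited_image = decoder(z_edited)
print(edited_image.shape)  # torch.Size([1, 3072])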
 

Moscow1578

New Member
Jan 20, 2023
13
5
borramejm said:
I'd probably recommend trying GLIDE, since it has pre-trained models available, but I don't know how easy it will be to get running. How familiar are you with Python and ML?
I'm fairly familiar with Python. Will make sure to try it out, thanks!
 

Moscow1578

New Member
Jan 20, 2023
13
5
Update: I tried some other Colab notebooks and came across a couple of good ones. I'm just gonna use those until I figure out more.
 

InfiniTales

Newbie
Aug 11, 2021
38
24
I started with it a few days ago as well, and I haven't had many issues running it on an 8-year-old (high end at the time) Nvidia card. The main problem I have is the occasional VRAM "out of memory" crash, but that's gotten better since I started using the --medvram or --lowvram startup parameter. It's pretty slow, but I can live with that.

"the results were meh"... That means you did get it to work. If the results are not at all what you expected, maybe you need to give yourself some time to learn about the settings and prompts?

I've been at it for a few days now, and I'm slowly starting to get decent results.
 

Moscow1578

New Member
Jan 20, 2023
13
5
InfiniTales said:
I started with it a few days ago as well, and I haven't had many issues running it on an 8-year-old (high end at the time) Nvidia card. The main problem I have is the occasional VRAM "out of memory" crash, but that's gotten better since I started using the --medvram or --lowvram startup parameter. It's pretty slow, but I can live with that.

"the results were meh"... That means you did get it to work. If the results are not at all what you expected, maybe you need to give yourself some time to learn about the settings and prompts?

I've been at it for a few days now, and I'm slowly starting to get decent results.
I've got an AMD card without a lot of VRAM. The results I got were just weird, glitchy textures. I may give it another try, but for now I'm looking into getting it to work smoothly with Colab, where I'm getting great results. I'd need to find a proper guide for AMD GPUs first.
 

Pamphlet

Member
Aug 5, 2020
318
577
Moscow1578 said:
First off, I tried to run Stable Diffusion on my AMD GPU. It barely worked, and the results were meh.
I have it running decently with an AMD GPU. I ended up going for a dual-boot setup to run it under Ubuntu, since DirectML on Windows is dog slow. I also ended up switching to ROCm 5.5, building magma for it, and building torch 2 to replace the older version Automatic1111 installs by default. Feel free to pick my brains if you want to try a similar route.
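
If you go that route, a quick sanity check that the torch build actually sees the card (ROCm builds expose the GPU through the regular torch.cuda API) looks something like this:

Code:
import torch

# On a ROCm build of PyTorch the GPU shows up through the torch.cuda API,
# so these calls work the same way they would on an Nvidia card.
print(torch.__version__)                   # ROCm pip wheels typically tag the ROCm version here
print(torch.cuda.is_available())           # True if the AMD card is visible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # should print your Radeon card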

One thing that should probably be in huge neon letters in any discussion of Stable Diffusion in Automatic1111 with AMD: Head over to the settings, go to Optimizations, and switch Cross attention optimization from Automatic to Doggettx. That really helps with VRAM usage.
 

Moscow1578

New Member
Jan 20, 2023
13
5
Pamphlet said:
I have it running decently with an AMD GPU. I ended up going for a dual-boot setup to run it under Ubuntu, since DirectML on Windows is dog slow. I also ended up switching to ROCm 5.5, building magma for it, and building torch 2 to replace the older version Automatic1111 installs by default. Feel free to pick my brains if you want to try a similar route.

One thing that should probably be in huge neon letters in any discussion of Stable Diffusion in Automatic1111 with AMD: Head over to the settings, go to Optimizations, and switch Cross attention optimization from Automatic to Doggettx. That really helps with VRAM usage.
Yeah, I was thinking about dual-booting too, but I don't think it makes sense for me. I'm running an AMD RX 570 with 4 GB of VRAM, and I think that's simply not enough to get viable results that don't take hours to generate. Correct me if I'm wrong.
 

Pamphlet

Member
Aug 5, 2020
318
577
Don't be too discouraged; there's a lot you can do to keep the generation time down. Dropping your image size to something like 320x480 for the initial generation and using the x/y/z grid script to test batches of different settings helps with identifying good candidates to re-run with the hires fix or feed through controlnet+ultimate-sd-upscale.
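
If you'd rather script that low-res first pass than click through the UI, Automatic1111 also exposes an HTTP API when you launch it with --api. Roughly along these lines (field names from memory, so double-check them against the /docs page of your local instance):

Code:
import base64
import requests

# Assumes a local Automatic1111 instance started with the --api flag.
url = "http://127.0.0.1:7860/sdapi/v1/txt2img"

payload = {
    "prompt": "a lighthouse on a cliff at sunset, detailed, dramatic lighting",
    "negative_prompt": "blurry, deformed",
    "width": 320,     # small initial size keeps generation time down
    "height": 480,
    "steps": 20,
    "cfg_scale": 7,
    "seed": -1,       # -1 = random seed
}

response = requests.post(url, json=payload, timeout=600)
response.raise_for_status()

# The API returns the images as base64-encoded strings.
for i, img_b64 in enumerate(response.json()["images"]):
    with open(f"candidate_{i}.png", "wb") as f:
        f.write(base64.b64decode(img_b64))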

If you still have the Windows version installed, have a little play around with the above and see if the Doggettx optimization makes things seem a bit less hopeless for your hardware.
 

Moscow1578

New Member
Jan 20, 2023
13
5
Pamphlet said:
Don't be too discouraged; there's a lot you can do to keep the generation time down. Dropping your image size to something like 320x480 for the initial generation and using the x/y/z grid script to test batches of different settings helps with identifying good candidates to re-run with the hires fix or feed through controlnet+ultimate-sd-upscale.

If you still have the Windows version installed, have a little play around with the above and see if the Doggettx optimization makes things seem a bit less hopeless for your hardware.
Yeah, true. Colab isn't a permanent solution. I'll have a look around and decide whether I'll run it on Windows or set up a dual boot with Ubuntu or something. Big thanks for the tips <3
 

InfiniTales

Newbie
Aug 11, 2021
38
24
So... do AMD cards generate completely different images than Nvidia cards? That sounds surprising. I'm asking because I still think that getting weird textures and glitches may be more due to prompting than hardware.

I mean, I get those as well, especially when I was just starting out. As I've gotten more used to putting a prompt together, it happens less and less. I'm now at the point where, if I get them, I instantly change up the prompt instead of trying to fix them by tweaking generation parameters.

Also: I've been experimenting with some models that are highly specialized in a specific style. They easily give me great results, until I try adding something that the model clearly has limited data points on. Then those artefacts pop up, and I end up fighting the model to force it through anyway.
 

Pamphlet

Member
Aug 5, 2020
318
577
InfiniTales said:
So... do AMD cards generate completely different images than Nvidia cards?
Sort of - from what I gather, the GPU RNG gives different results between AMD and Nvidia, while the CPU RNG should be consistent across hardware.

For shits and giggles, CPU on the left, GPU on the right. All other parameters unchanged.
[Attachments: 00056-1455293057.png, 00055-1455293057.png]
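
If anyone wants to convince themselves, here's a quick torch snippet: the CPU and GPU generators use different algorithms, so the same seed produces different starting noise, and that's where the different images come from (the GPU half needs a CUDA or ROCm build):

Code:
import torch

seed = 1455293057  # the seed from the two example images above

# CPU RNG: consistent across machines.
g_cpu = torch.Generator(device="cpu").manual_seed(seed)
noise_cpu = torch.randn((1, 4, 64, 64), generator=g_cpu, device="cpu")

# GPU RNG: a different algorithm, so the same seed gives different noise.
if torch.cuda.is_available():
    g_gpu = torch.Generator(device="cuda").manual_seed(seed)
    noise_gpu = torch.randn((1, 4, 64, 64), generator=g_gpu, device="cuda")
    # Same seed, different starting noise -> different final images.
    print(torch.allclose(noise_cpu, noise_gpu.cpu()))  # typically False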
 

Moscow1578

New Member
Jan 20, 2023
13
5
Well, I mean, the results still look great. I looked at others testing with similar GPUs, and from what I found it's going to be something like 2-3 minutes per image. While I could imagine getting it to work somewhat properly, I'm not keen on waiting that long. ¯\_(ツ)_/¯
 

InfiniTales

Newbie
Aug 11, 2021
38
24
Moscow1578 said:
Well, I mean, the results still look great. I looked at others testing with similar GPUs, and from what I found it's going to be something like 2-3 minutes per image. While I could imagine getting it to work somewhat properly, I'm not keen on waiting that long. ¯\_(ツ)_/¯
Yeah... on my old GPU it easily goes to 1-2 minutes per image as well once I redo a higher-quality version of a seed I like. I'm still having loads of fun, tho! (started about a week ago)
[Attachment: example.jpg]