NSFW AI chatbots

It took fucking hours, but I managed to get some semblance of a SillyTavern instance going, more or less following your suggestions from the two posts (using Nemomix because I have 10GB of VRAM, but plenty of regular RAM). I've chosen to forgo Text to Speech since the voice stuff isn't quite there yet, and it tends to forget what's narration and what's speech. Image generation with Stable Diffusion 1.5 alone is absolute garbage, but I figure if I add some LoRAs and other confusing technology, I can fine-tune that. I'm getting pretty slow response times, though: almost a full minute or more for some responses. That's probably another thing to tune endlessly. The whole setup also takes several GB of disk space, so folks need to make sure they can spare that.
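For reference, here's roughly what stacking a LoRA on the base 1.5 checkpoint looks like with the diffusers library. A minimal sketch, assuming the runwayml/stable-diffusion-v1-5 weights; the LoRA filename is a placeholder:

```python
import torch
from diffusers import StableDiffusionPipeline

# Base SD 1.5 checkpoint; fp16 keeps it comfortably within a 10GB card.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Stack a style LoRA on top of the base weights.
# "my_style_lora.safetensors" is a placeholder filename, not a real file.
pipe.load_lora_weights(".", weight_name="my_style_lora.safetensors")

image = pipe(
    "portrait of a tavern keeper, detailed, soft lighting",
    num_inference_steps=25,
    guidance_scale=7.0,
).images[0]
image.save("out.png")
```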

Regardless, I've got a somewhat bare-bones version of this thing running after half a day, and the most important part is that the text generation is the best I've used so far. It's really good at what I have it set to, which is NSFW Chat Roleplay. I haven't even tried the other stuff yet (Story, Adventure, etc.). Aside from the monstrous setup requirements and the overuse of my luddite brain, this has been a worthwhile endeavor.

If anyone has a 30-series NVIDIA GPU or better/similar, and enough system RAM to take on the extra load that the VRAM can't, this is a viable option. I recommend just getting SillyTavern running with KoboldCPP to start with. That's hard enough for anyone who doesn't mess with command lines or GitHub repositories often. Here's the link. Knock yourselves out, I almost did :KEK:
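For anyone wondering what's actually happening when SillyTavern talks to KoboldCPP: KoboldCPP serves a KoboldAI-compatible HTTP API on localhost (port 5001 by default), and SillyTavern just posts prompts to it. A minimal sketch of that request in Python, assuming the default port and endpoint; the prompt and sampler values are arbitrary:

```python
import requests

# KoboldCPP's KoboldAI-compatible generation endpoint (default port 5001).
payload = {
    "prompt": "You are a narrator in a fantasy tavern.\nUser: Hello!\n",
    "max_length": 120,           # tokens to generate
    "max_context_length": 4096,  # context window to use
    "temperature": 0.8,
    "top_p": 0.9,
}

resp = requests.post("http://localhost:5001/api/v1/generate", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["results"][0]["text"])
```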

Glad I could point you in the right direction! If you're having issues with generation speed while running Nemomix at Q4M with 10GB of VRAM, you should have more than enough VRAM to run it much faster. When you open koboldcpp, try raising the "GPU Layers" setting to the max it will go (the side of the box will say something like 21/46; in that case, change the -1 in the box to 46 to run the whole model on the GPU). See if that helps with the speed. I always run my models with max layers for ~10 t/s or so, depending on which model I'm using.
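If you'd rather skip the launcher GUI, the same knob is exposed as a command-line flag. A rough sketch, assuming a local koboldcpp.py checkout; the .gguf filename is a placeholder:

```python
import subprocess

# The "GPU Layers" box in the koboldcpp launcher maps to the --gpulayers
# flag. Setting it to the model's full layer count (46 in the example
# above) offloads the whole model to VRAM.
subprocess.run([
    "python", "koboldcpp.py",
    "--model", "Nemomix-Q4_K_M.gguf",  # placeholder filename; use your own quant
    "--gpulayers", "46",               # offload all 46 layers to the GPU
    "--contextsize", "8192",           # context window size
])
```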

And if you want to use TTS and have it understand narration, go to Extensions > TTS and scroll down to the AllTalk settings, where you can adjust the "Text Not Inside * or "" setting and set it to character or narrator. Some character cards use white text instead of grey text to describe actions, so you might have to adjust that often. But yeah, no TTS option right now matches ElevenLabs, but after training, AllTalk is much better than you'd expect.
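If you're curious what that setting actually does, the rule is simple: text in asterisks is an action, text in quotes is speech, and everything left over is narration. Not SillyTavern's actual code, just a toy sketch of the split:

```python
import re

# Match either "quoted speech" or *asterisked actions*.
PATTERN = re.compile(r'"([^"]*)"|\*([^*]*)\*')

def classify(message: str):
    """Split a chat message into (kind, text) segments."""
    segments = []
    pos = 0
    for m in PATTERN.finditer(message):
        # Anything between matches is plain narration.
        plain = message[pos:m.start()].strip()
        if plain:
            segments.append(("narration", plain))
        if m.group(1) is not None:
            segments.append(("speech", m.group(1).strip()))
        else:
            segments.append(("action", m.group(2).strip()))
        pos = m.end()
    tail = message[pos:].strip()
    if tail:
        segments.append(("narration", tail))
    return segments

print(classify('*She smiles.* "Hello there," she says warmly.'))
# [('action', 'She smiles.'), ('speech', 'Hello there,'), ('narration', 'she says warmly.')]
```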
 

abyss50055

If VRAM is an in-demand feature of graphics cards for AI, then what about the AMD 7900XTX? Significantly cheaper than a 4090, and it has 24GB of VRAM.
Due to NVIDIA's stranglehold on AI, their cards are always the first that come to my mind, but AMD GPUs are a good suggestion. More affordable than NVIDIA, but still expensive, since multiple of these high-end GPUs are needed to run large models at decent speeds. For now I'll keep using APIs, but I'll look into buying some GPU(s) next year.
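For a rough sense of why multiple cards come up at all, here's a back-of-envelope VRAM estimate. The bytes-per-weight and overhead numbers below are my own approximations, not measured values:

```python
# ~0.56 bytes/weight approximates a 4-bit quant; the 1.2 factor is a
# rough allowance for KV cache and runtime overhead. Both are guesses.
def vram_gb(params_billion: float, bytes_per_weight: float = 0.56, overhead: float = 1.2) -> float:
    return params_billion * bytes_per_weight * overhead

print(f"12B at Q4: ~{vram_gb(12):.1f} GB")  # fits a 10GB card with partial offload
print(f"70B at Q4: ~{vram_gb(70):.1f} GB")  # more than a single 24GB card holds
```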
 
If VRAM is an in-demand feature of graphics cards for AI, then what about the AMD 7900XTX? Significantly cheaper than a 4090, and it has 24GB of VRAM.
VRAM is the most important factor, but it'd be better to just go for a 3090 instead. Same 24GB of VRAM, but it only costs about $500 used or refurbished.
 