- Oct 12, 2020
Glad I could point you in the right direction! If you're having issues with generation speed and you're running Nemomix at Q4M with 10GB of VRAM, you should have more than enough VRAM to run it much faster. When you open KoboldCPP, try setting "GPU Layers" to the max it'll go. The side of the box will say something like 21/46, so in that case just change the -1 in the box to 46 to run the whole model on the GPU. See if that helps with the speed at all. I always run my models at max and get ~10 t/s or so, depending on the model.
It took fucking hours, but I managed to get some semblance of a SillyTavern instance going, more or less with your suggestions from the two posts (using Nemomix because I have 10GB VRAM, but plenty of regular RAM). I've chosen to forgo the Text to Speech, as the voice stuff isn't quite there yet and it tends to forget what's narration and what's speech. Image generation with Stable Diffusion 1.5 alone is absolute garbage, but I figure if I add some LoRAs and other confusing technology, I can fine-tune that. I'm getting pretty slow response times, though: almost a full minute or more for some responses. That's probably another thing to tune endlessly. It will take several GB of space, so folks need to make sure they can spare that.
Regardless, I've got a somewhat bare-bones version of this thing running after half a day, and the most important part is that the text generation is the best I've used so far. It's really good at what I have it set to, which is NSFW Chat Roleplay. I haven't even tried the other stuff yet (Story, Adventure, etc.). Aside from the monstrous setup requirements and the overuse of my luddite brain, this has been a worthwhile endeavor.
If you have a 30-series NVIDIA GPU or better/similar, and enough RAM in your system to take on the extra load that the VRAM can't, this is a viable option. I recommend just getting SillyTavern running with KoboldCPP to start with. That's hard enough for anyone who doesn't mess with command lines or GitHub repositories often. Here's the [link removed]. Knock yourselves out, I almost did.
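For anyone wondering how the VRAM/RAM split works out, here's a rough Python sketch of the arithmetic behind the "GPU Layers" number. It's back-of-the-envelope only: it assumes every layer is the same size, that the model file is about 7.5 GB (my guess for a 12B model at a 4-bit quant, so check your actual file), and that you leave about 1.5 GB of VRAM free for context and everything else. The 46-layer count is just the example figure from this thread, and `nemomix.gguf` is a made-up file name. The `--model` and `--gpulayers` flags are, as far as I know, what KoboldCPP accepts on the command line, but double check against `--help` on your version.

```python
import math
from typing import List


def layers_that_fit(model_size_gb: float, total_layers: int,
                    vram_gb: float, reserve_gb: float = 1.5) -> int:
    """Rough count of layers that fit in VRAM, assuming equal-sized layers."""
    if model_size_gb <= 0 or total_layers <= 0:
        raise ValueError("model size and layer count must be positive")
    per_layer_gb = model_size_gb / total_layers
    # Keep some VRAM back for the context cache and the desktop (assumed value).
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(total_layers, math.floor(usable_gb / per_layer_gb))


def koboldcpp_command(model_path: str, gpu_layers: int) -> List[str]:
    """Build a launch command; whatever isn't offloaded stays in system RAM."""
    return ["python", "koboldcpp.py",
            "--model", model_path,
            "--gpulayers", str(gpu_layers)]


if __name__ == "__main__":
    # Assumed numbers: 7.5 GB model file, 46 layers, 10 GB card.
    n = layers_that_fit(7.5, 46, 10.0)
    print(n)  # 46, so the whole model fits on the GPU
    print(" ".join(koboldcpp_command("nemomix.gguf", n)))
    # Same model on a hypothetical 6 GB card: only part of it fits.
    print(layers_that_fit(7.5, 46, 6.0))  # 27
```

If the number comes out lower than the total layer count, the rest runs from system RAM, which is where the slowdown comes from.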
And if you want to use TTS and have it understand narration, go to Extensions > TTS and scroll down to the AllTalk settings. There you can change the 'Text Not Inside * or " is' setting to either character or narrator. Some character cards use white text instead of grey text to describe actions, so you might have to adjust that often. But yeah, the TTS options right now are no ElevenLabs, but after training, AllTalk is much better than you'd expect.
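To show what that setting is deciding, here's a small Python sketch of the idea. It is not AllTalk's or SillyTavern's actual code, just my illustration: text inside asterisks is treated as narration, text inside double quotes as character speech, and anything left over goes to whichever voice you pick. The function name `split_for_tts` and the `unmarked_voice` parameter are made up for this example.

```python
import re

# *asterisk spans* count as narration, "quoted spans" as character speech.
SEGMENT = re.compile(r'\*([^*]+)\*|"([^"]+)"')


def split_for_tts(message: str, unmarked_voice: str = "narrator"):
    """Split a chat message into (voice, text) pairs in reading order."""
    segments = []
    pos = 0
    for match in SEGMENT.finditer(message):
        # Text between marked spans is the "not inside * or quotes" case.
        loose = message[pos:match.start()].strip()
        if loose:
            segments.append((unmarked_voice, loose))
        if match.group(1) is not None:
            segments.append(("narrator", match.group(1).strip()))
        else:
            segments.append(("character", match.group(2).strip()))
        pos = match.end()
    tail = message[pos:].strip()
    if tail:
        segments.append((unmarked_voice, tail))
    return segments


if __name__ == "__main__":
    msg = '*She smiles.* "Hello there." She waves.'
    for voice, text in split_for_tts(msg, unmarked_voice="narrator"):
        print(voice, "->", text)
```

A card that writes actions as plain unmarked text wants the leftover text set to narrator, while a card that writes dialogue without quotes wants it set to character, which is why you end up flipping the setting between cards.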