I'm a total AI noob, but for the last few days I've been experimenting with KoboldAI running Erebus 13B on a 24GB 3090, with wildly varying levels of success.
I have a beefy CPU too, so I split the layers between GPU and CPU. I even had some very limited success with the 20B model, but tinkering with settings I don't yet understand broke things to the point where I did a complete reinstall so I could take more careful note of the defaults.
The 6.7B parameter variant is quick and stable - seemingly with little sensitivity towards settings or demand - but the output quality difference between 6.7B and 13B has, admittedly totally subjectively and unscientifically, been noticeable.
Erebus on KoboldAI definitely seems like the way forward for local story generation at the moment, but it's still hugely unapproachable even for the fairly technical among us.
I don't want to put anyone off - the download and installation process has actually been made very slick and friendly - great work by the project maintainers.
But it's once you're up and running that things rapidly become more under-documented and opaque.
I would, for example, really appreciate a decent guide/tutorial on what settings to use (with higher end hardware in mind). All the talk of generating soft prompts seems to assume a great deal of existing domain knowledge, and there are precious few examples of prompts, settings or soft prompts out there that seem to be suitable.
Most of the stuff that is out there seems to assume that you're using NovelAI, Kobold on Colab or perhaps even something from the legendary, seemingly mythical era of an AI Dungeon golden age. Some of us missed that boat entirely.
So if you do have any tips on settings for Erebus 13B or indeed where to find soft prompts or more helpful soft prompt creation guides, please do share; there's a real drought of compatible fresh advice - particularly about the settings.
As an aside, InvokeAI is a not dissimilar image generation rabbit hole I stumbled down while researching this topic. Funnily enough, the image generation tool seems much less resource hungry than the writing tool.
I fear I have scratched the surface of something way bigger than I can handle. So for now, noob tips would be great: WTF to do with the Repetition sampler in my sampler order, as an example.
In my most recent prompt I mistyped widow as window. It got weird real quick.