You're the only one I've seen post an example of a model that's actually been used. I tried this model Moistral-11B-v3-f16 and it works very slowly, 2-3 words per minute. Does it need a very powerful computer, or what? Everything else I tried with this Kobold didn't work at all. Which models are meant to be used for this, and how? Nothing is clear.
That particular quant is very big; it does need either a powerful GPU or CPU, and plenty of RAM. I'd guess it's falling back to CPU. As a reference, with a similar 11B model I have here on a smaller quant (Fimbulvetr-11B-v2.1-16K.i1-Q3_K_S.gguf), I get the first Laura reply after ~75s, and ~1 word per second during generation, and that's on a five year old AMD iGPU (Vulkan backend).
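To put a number on "very slowly", here is a minimal Python sketch that times a single generation against KoboldCpp's local HTTP API. It assumes the default port 5001 and the KoboldAI-style /api/v1/generate endpoint; the prompt and sampler values are placeholders:

```python
# Rough words/sec check against a local KoboldCpp instance.
# Assumes the default port (5001); adjust if you launched with --port.
import time
import requests

URL = "http://localhost:5001/api/v1/generate"
payload = {
    "prompt": "Laura looked at the horizon and said,",
    "max_length": 80,     # tokens to generate
    "temperature": 0.7,
}

start = time.time()
resp = requests.post(URL, json=payload, timeout=600)
resp.raise_for_status()
elapsed = time.time() - start

text = resp.json()["results"][0]["text"]
words = len(text.split())
print(f"{words} words in {elapsed:.1f}s -> {words / elapsed:.2f} words/s")
```

Anything far below ~1 word per second on an 11B quant is a strong hint the model is running CPU-only or swapping.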
What's the best free API to use with this?

gemini-1.5-pro works best, but for me it stops working after a few minutes. In that case I switch either to gemini-1.5-flash or to a local model.
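Here is a minimal sketch of that pro-to-flash fallback, using Google's google-generativeai Python package. The broad exception handling is my assumption; free-tier limits usually surface as an API error you can catch:

```python
# Try gemini-1.5-pro first, fall back to gemini-1.5-flash on failure.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

def generate(prompt: str) -> str:
    for name in ("gemini-1.5-pro", "gemini-1.5-flash"):
        try:
            model = genai.GenerativeModel(name)
            return model.generate_content(prompt).text
        except Exception as exc:  # e.g. quota / rate-limit errors
            print(f"{name} failed ({exc}); trying next model")
    raise RuntimeError("all Gemini models failed")

print(generate("Write one sentence of flavour text."))
```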
How do you hook up KoboldCpp? I tried and I can't get it to work.

Have you got KoboldCpp itself working on [link] as described in [link]?
I have used KoboldCpp and it still never worked. It would install and I could start Kobold, but localhost would not load, even when I used a GGUF that was around 1 gigabyte.
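When localhost won't load, the usual cause is that the server never finished loading the model. A quick probe like this sketch (assuming KoboldCpp's default port 5001 and its KoboldAI-style /api/v1/model endpoint) tells you whether anything is listening at all:

```python
# Distinguish "server never came up" from "browser problem".
import requests

try:
    r = requests.get("http://localhost:5001/api/v1/model", timeout=5)
    r.raise_for_status()
    print("Server is up; loaded model:", r.json().get("result"))
except requests.RequestException as exc:
    print("No usable server on :5001 ->", exc)
    print("KoboldCpp probably failed while loading the model;")
    print("check its console window for errors before retrying localhost.")
```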
For anyone who is struggling to make local models work: you need a GGUF small enough to leave some free space for the context and general video usage (there's an online calculator at [link]). For instance, I have 8 GB of VRAM, so I only try GGUF files under 6 GB, and most under 5 GB.

A few local models I've tried: [link] (Q3_K_S) worked very well in story mode, but tended to be a bit liberal with the emojis in sandbox mode; [link] (Q4_K_S) and variants seemed to have better overall consistency; [link] (Q4_K_M) had some of the best writing on a few scenes, but it pretty much avoided using most emojis.
I suggest first trying to make KoboldCpp itself work with a very small model; [link] is one of the smallest I know that I can still have fun roleplaying with. Load that one in KoboldCpp, open the web interface, click Scenarios, and chat a little bit with Tiff or Nail. CPU usage should be fairly low.
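The sizing rule in the post above boils down to simple arithmetic. A toy sketch follows; the 1.5 GB headroom figure is my assumption, and the linked calculator is the proper way to do this:

```python
# Back-of-the-envelope check: file size plus headroom for the context
# (KV cache) and general desktop video use must fit in VRAM.
def fits_in_vram(gguf_size_gb: float, vram_gb: float,
                 headroom_gb: float = 1.5) -> bool:
    return gguf_size_gb + headroom_gb <= vram_gb

for size in (4.5, 5.8, 7.2):
    print(f"{size} GB GGUF on 8 GB VRAM:", fits_in_vram(size, 8.0))
# 4.5 -> True, 5.8 -> True, 7.2 -> False
```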
Does this game log our choices and send them to someone over the world wide web?

Almost certainly. It is based on AI; your input will be sent to a supercomputer somewhere.
At the beginning of day 1 a nuclear explosion happens (after I refuse the training, or right after the training with Laura) and the game starts all over again.

Play a bit longer after the boom; somewhere before you get to the Fiona bath scene you can choose to play day 0 again or go on to the next chapter.
Please, can you add OpenRouter API support?

Even better, a generic OpenAI-compatible API. Many local and remote providers could be supported that way.
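That is exactly why the generic route would cover the OpenRouter request too: the same OpenAI client can point at OpenRouter or at a local server just by changing base_url. A sketch; the OpenRouter URL is its documented endpoint, the local URL assumes KoboldCpp's default port and its OpenAI-compatible /v1 route, and the model id is only an example:

```python
# One client, any OpenAI-compatible provider: only base_url and key change.
from openai import OpenAI

# Hosted: OpenRouter's OpenAI-compatible endpoint.
client = OpenAI(base_url="https://openrouter.ai/api/v1",
                api_key="YOUR_OPENROUTER_KEY")  # placeholder key

# Local alternative (assumes KoboldCpp's default port and /v1 route):
# client = OpenAI(base_url="http://localhost:5001/v1", api_key="none")

resp = client.chat.completions.create(
    model="google/gemini-flash-1.5",  # example model id on OpenRouter
    messages=[{"role": "user", "content": "Say hello in five words."}],
)
print(resp.choices[0].message.content)
```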