> I managed to install Ollama and make it work (including in game), but how do I make it load the model recommended by the dev for playing?

Ollama is a pain in the ass and can't use quant files like GGUFs straight from Hugging Face (you either have to create what are called Modelfiles from them, and tutorials abound for this stupid requirement, or pull the models directly from Ollama's library). KoboldCPP is a much superior option, in my opinion.
> Thanks! I got the Q8, but it seems like my 8GB of VRAM is struggling lol. Do you have recommended settings for more of a spicy output, or do I gotta lead them in that direction? Never used a local AI like this before...

Q8s are overkill. Generally Q6 is considered near-perfect. I only use Q8 when I have VRAM to waste (I have 24GB, so I use Q8s for 12Bs).
> Q8s are overkill. Generally Q6 is considered near-perfect. I only use Q8 when I have VRAM to waste (I have 24GB, so I use Q8s for 12Bs).

I see, thanks. I'll try switching to Q6, and then Q4 if it's still slow.
Don't be afraid of quants down to, say, Q4 with an 8-12B, but lower than that and it's going to get noticeable. I use Q4_K_S IIRC for 32Bs, which can just fit in 24GB. The larger the parameter count (B = billions of parameters) of the model, the better it typically handles quantization.
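The sizing intuition above can be sketched as a back-of-the-envelope calculation: weight footprint is roughly parameter count times bits per weight. The bits-per-weight figures below are approximate averages for GGUF quants (real files vary slightly), so treat the numbers as rough guides, not exact sizes.

```python
# Rough footprint of a quantized model's weights:
#   params (billions) * bits_per_weight / 8  ->  size in GB.
# Add headroom on top for context/KV cache and runtime overhead.
# Bits-per-weight values are approximate averages for GGUF quants.
BITS_PER_WEIGHT = {"Q4_K_S": 4.5, "Q6_K": 6.6, "Q8_0": 8.5}

def approx_size_gb(params_billions: float, quant: str) -> float:
    """Approximate on-disk / in-VRAM size of the weights alone, in GB."""
    return params_billions * BITS_PER_WEIGHT[quant] / 8

for quant in BITS_PER_WEIGHT:
    print(f"12B at {quant}: ~{approx_size_gb(12, quant):.1f} GB")
```

This matches the advice in the thread: a 12B at Q8 comes out around 12-13 GB (too big for an 8GB card, comfortable in 24GB), while a 32B at Q4_K_S lands around 18 GB, which "just fits" in 24GB once you add context.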
> Ollama is a pain in the ass and can't use quant files like GGUFs straight from Hugging Face (you either have to create what are called Modelfiles from them, and tutorials abound for this stupid requirement, or pull the models directly from Ollama's library). KoboldCPP is a much superior option, in my opinion.

I tried Kobold, but it has a problem similar to the one I'm running into right now: I can't figure out how to save models somewhere else, since it installs on the C: drive and mine doesn't have that much space left. Trying to figure out how to change directories atm.
> I tried Kobold, but it has a problem similar to the one I'm running into right now: I can't figure out how to save models somewhere else, since it installs on the C: drive and mine doesn't have that much space left. Trying to figure out how to change directories atm.

KoboldCPP is just a single executable with no installer. You can put it anywhere you want and point it at a GGUF kept anywhere. Be sure you aren't using KoboldAI, which is a related project but not the one you want.
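Since KoboldCPP is a single executable, launching it is just a command line pointing at wherever you keep the GGUF. A minimal sketch of building that command (the paths are hypothetical examples; `--model`, `--gpulayers`, and `--contextsize` are KoboldCPP flags, but check `--help` for your version):

```python
from pathlib import Path

# Hypothetical locations: KoboldCPP doesn't care where either file lives,
# so both can sit on a drive other than C:.
exe = Path(r"D:\tools\koboldcpp.exe")               # the single executable
model = Path(r"D:\models\some-12b-model.Q6_K.gguf")  # a GGUF on any drive

cmd = [
    str(exe),
    "--model", str(model),
    "--gpulayers", "999",      # 999 = offload as many layers as possible
    "--contextsize", "8192",
]
print(" ".join(cmd))
# To actually launch: import subprocess; subprocess.run(cmd)
```

The point is simply that there is no install directory to fight with; the model path is an argument, not a fixed folder.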
> KoboldCPP is just a single executable with no installer. You can put it anywhere you want and point it at a GGUF kept anywhere. Be sure you aren't using KoboldAI, which is a related project but not the one you want.

No, I didn't know that. I'm very glad you mentioned it; I was using the wrong one, as you guessed. Currently testing the CPP one, and I think I got it right this time. Thank you for sticking around and helping people troubleshoot stuff, users like you are the real MVP.
> You will be able to blackmail somebody, but never force it. You know... the rules, Patreon... I don't wanna touch it...

So to sum up, the game will never have non-consensual content and will always remain a carebear dating sim. A shame, for a game that could have been a great one...
> Apart from that, I don't know, maybe I've missed something, but during the 'construction' of the characters... E.g.: I'm a guy and I'm married. When I add people, I add mum, dad and a sister, and they all seem to live under the same roof as my MC. Then I want to add my wife's family, so I say yes to 'add customization family' and create my wife's family, but I can't link them in 'relationship' if they don't live with us? I mean... crap! If I want to add friends, will they also have to live under the same roof?

I haven't played the latest version, but you can kind of get around that by cheating maximum dominance between you and the person you want to leave the house, then telling them to leave. They'll move out into their own apartment with a relationship hit, but you can cheat that back. I'm not sure what happens when the members of a couple leave individually; likely they'd move into separate apartments. As far as friends go, apparently in this newest version you can create custom, separate households, so you could create whatever friends you want and cheat an initial relationship after character creation.
> It's taking a long time to load dialogue responses, like 2 minutes. I tried the Q6 and Q8 models. Is there anything I can do in Kobold to make it faster? Otherwise I might have to resort to the online one, assuming it's faster.

First thing to do is make sure the model and context fit into your VRAM. If you're on Windows, check Task Manager -> Performance -> GPU. When you load the model in KCPP, you'll see your dedicated GPU memory usage spike; make sure it doesn't fill up enough to start using shared GPU memory. If it does, you need a smaller model or quant. You can also play with the layers value: if you absolutely cannot fit everything in VRAM, you can split layers between GPU and CPU. That's still slow, but not as slow as spilling into shared memory.
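The fit check above can be sketched numerically. This is a back-of-the-envelope estimate, not KoboldCPP's actual allocation logic: the model size, KV-cache size, OS headroom, and the assumption that layers are roughly equal in size are all approximations you'd tune to your setup.

```python
def fits_in_vram(model_gb: float, context_gb: float, vram_gb: float,
                 headroom_gb: float = 1.0) -> bool:
    """True if weights + KV cache should fit in dedicated VRAM with some
    headroom left for the OS/desktop, i.e. without spilling into shared
    GPU memory (which is what makes responses crawl)."""
    return model_gb + context_gb + headroom_gb <= vram_gb

def suggested_gpu_layers(total_layers: int, model_gb: float, context_gb: float,
                         vram_gb: float, headroom_gb: float = 1.0) -> int:
    """Crude GPU/CPU split when the whole model doesn't fit: offload a
    proportional number of layers, assuming layers are roughly equal in size."""
    if fits_in_vram(model_gb, context_gb, vram_gb, headroom_gb):
        return total_layers
    usable = max(vram_gb - context_gb - headroom_gb, 0.0)
    return int(total_layers * usable / model_gb)

# A ~12.75 GB Q8 12B on an 8GB card, with ~1.5 GB of KV cache:
print(fits_in_vram(12.75, 1.5, 8.0))              # -> False: it will spill
print(suggested_gpu_layers(40, 12.75, 1.5, 8.0))  # partial offload instead
```

In practice you'd start near the suggested layer count, watch dedicated vs. shared GPU memory in Task Manager, and nudge the value down until nothing spills.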