I started using SillyTavern. A pretty good 7B LLM was recently released that makes it possible to have pretty interesting NSFW adventures with it.
1. Install Oobabooga (text-generation-webui) and make sure to start it with the --api command-line flag.
2. Use the web UI to download the TheBloke/Silicon-Maid-7B-GGUF model (I use the silicon-maid-7b.Q4_K_M.gguf version).
- Load the model with llama.cpp and put as many layers on the GPU as possible (max 32 n-gpu-layers; a 12 GB VRAM GPU should be fine, I think. I have a 4090, but this model uses less than half of its VRAM.)
3. Install SillyTavern
4. Create a world with some world info (keep it to around 500 tokens at most).
5. Create a whole bunch of characters. Keep each character as low on tokens as possible, and don't exceed a total of 4k tokens for all of them combined.
6. Set response tokens to 512 and context tokens to 8192. (You can play around with the text completion preset; I tend to use the NovelAI ones.)
7. Make a group with your characters, link them to the world, and have fun. Mute and unmute characters depending on the situation and who is present.
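To see why the limits in steps 4 to 6 matter, here is a rough sketch of the token budget. It assumes the worst case, where the world info and every character card are sent with each prompt and the response length is reserved out of the context window. It also treats "4k" as 4096 tokens. The function name is made up for illustration and is not part of SillyTavern or Oobabooga.

```python
def remaining_chat_tokens(context=8192, response=512, world_info=500, characters=4096):
    """Hypothetical helper: tokens left for chat history after the fixed parts.

    Assumes world info and all character cards are included in every prompt
    (worst case) and that the response length is reserved from the context.
    """
    fixed = response + world_info + characters
    if fixed > context:
        raise ValueError("fixed prompt parts do not fit in the context window")
    return context - fixed


# Settings from the steps above: 8192 context, 512 response, 500 world info, 4k characters
print(remaining_chat_tokens())  # 3084

# A 4k-context model with roughly 1000 tokens of characters
print(remaining_chat_tokens(context=4096, characters=1000))  # 2084
```

With the numbers from the steps, roughly 3k tokens remain for the actual chat history, which is why keeping characters small matters in group chats.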
For example, yesterday I made a world called Frontier Town with 8 futanari characters and added some locations to try it out. I was surprised by how well it worked, despite the characters being very simple.
I've already gotten a lot of ideas for future worlds/characters/teams.
For a 7B model, this one is quite good for RP. It's fast, doesn't require a lot of VRAM, and has 8k context. (More context is better for group roleplay chats. You can easily get away with a 4k model for 1-on-1 roleplay.)