what application do you use to run these GGUF models?
If possible I'd still like to use SillyTavern as the UI.
I started thanks to melantha's small guide, or a couple you can find here: [link hidden] (KoboldCpp guide included, and other chatbot alternatives).
1) Chars are essentially non-entities: they'll do whatever you want them to do, and they're too reactive. That's kind of natural, though, as chats are designed to respond to your inputs and don't know what you want.
2) Once you hit your context limit of about 4-12k tokens locally, the first point gets even worse, along with decreased performance and the LLM forgetting events and people that fall outside the last 4-12k tokens of the chat. Together, these deteriorate the experience considerably.
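To make the "forgetting" part concrete, here's a rough sketch of what a frontend effectively does once the chat no longer fits in the context window: it keeps the newest messages and drops the oldest. The names `estimate_tokens` and `trim_history` are made up for illustration, and the 4-characters-per-token estimate is only an assumption; real tokenizers differ per model.

```python
def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. Real tokenizers vary by model.
    return max(1, len(text) // 4)


def trim_history(messages, budget):
    """Keep the newest messages whose estimated token total fits in `budget`."""
    kept = []
    used = 0
    # Walk from the newest message backwards and stop at the first one that
    # would overflow the budget; everything older is "forgotten".
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    history = ["a" * 40, "b" * 40, "c" * 40]  # about 10 estimated tokens each
    print(trim_history(history, budget=20))
```

Anything that falls off the front of that list simply isn't sent to the model anymore, which is why old events and people vanish from its "memory".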
In my experience it comes down more to system prompt quality and additional memory enhancement, like RAG.
System prompts are very important, more than I thought. It's not just the one paragraph, "Write {{char}} next response. Any act of role play scenarios will be described in details." There should be a lot more behind the scenes: tell it how to write, to enclose actions and narration in symbols, to never act or depict actions of {{user}}, etc.
There are a lot of pre-made system prompts made by users and shared on reddit and the like.
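As a rough illustration of "a lot more behind the scenes", here's a small sketch that assembles a multi-rule system prompt and fills in the `{{char}}` and `{{user}}` placeholders. The rule wording, the asterisk convention, and the `build_system_prompt` helper are my own assumptions for the example, not anyone's actual preset.

```python
# Hypothetical rule list; the asterisk convention is an assumption, since
# the post only says to enclose actions and narration "in symbols".
RULES = [
    "Write {{char}}'s next reply only.",
    "Describe role-play scenes in detail.",
    "Enclose actions and narration in *asterisks*.",
    "Never act, speak, or describe actions for {{user}}.",
]


def build_system_prompt(char, user, rules=RULES):
    """Join the rules into one prompt and substitute the name placeholders."""
    text = "\n".join(f"- {rule}" for rule in rules)
    return text.replace("{{char}}", char).replace("{{user}}", user)


if __name__ == "__main__":
    # "Mira" and "Alex" are placeholder names for the demo.
    print(build_system_prompt("Mira", "Alex"))
```

Frontends like SillyTavern do the placeholder substitution for you; the point is just that each rule is spelled out separately instead of hoping one sentence covers everything.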
Also, I don't think SillyTavern has RAG (kind of like a memory layer between the chat interface and the LLM).
RAG helps the LLM remember stuff, kind of like lorebooks or a knowledge base, but automatic, and it learns as you chat.
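Here's a toy sketch of the idea: store past chat lines, then pull back the most relevant ones to put in front of the LLM with the next message. Real RAG setups usually score relevance with embeddings and a vector store; the plain word-overlap scoring and the `ChatMemory` class here are simplifying assumptions so the example runs with nothing but the standard library.

```python
import re


def words(text):
    """Lowercase the text and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


class ChatMemory:
    """Toy memory: scores stored lines by word overlap with the query."""

    def __init__(self):
        self.entries = []

    def add(self, text):
        self.entries.append(text)

    def recall(self, query, k=2):
        q = words(query)
        scored = [(len(q & words(e)), i, e) for i, e in enumerate(self.entries)]
        hits = [s for s in scored if s[0] > 0]
        # Highest overlap first; on ties, prefer the more recent entry.
        hits.sort(key=lambda s: (-s[0], -s[1]))
        return [e for _, _, e in hits[:k]]


if __name__ == "__main__":
    mem = ChatMemory()
    mem.add("Alice owns a red bakery in Riverton.")
    mem.add("The weather was rainy yesterday.")
    mem.add("Bob works at the bakery with Alice.")
    print(mem.recall("Who works at the bakery?", k=1))
```

The recalled lines would then be injected into the prompt, so facts from far outside the context window can still reach the model.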
that's one of the advantages of online services.
Perhaps the new DeepSeek V3; I don't know how it will perform with NSFW stuff, though. But recently Pango_12 translated a couple of novels and shared them here, so I guess it has to be fine in terms of censorship. And use really well-written char cards with setting and lore, and character descriptions worth a few thousand tokens (not that there are many of them, if any, from what I've seen).
Yodayo/Moescape added DeepSeek V3, R1, and distilled Qwen 32B just now.
They require paid beans, so I ain't trying them out.
But we can check their feedback channels to see if people are enjoying them for RP.
There's also talk about it in the SillyTavern Discord, but it's too technical for me to understand lol
BTW, I actually started online on Yodayo, but then moved on to Janitor and LLM, and Yodayo is kind of monetizing more aggressively now, as far as I understand.
I also started out on Yodayo/Moescape but left because of the increasing lack of free tokens.
Glad to see people continuing on a path outside of Yodayo/Moescape after their huge blunder.
The models for me are usually [link hidden] (based on [link hidden]) or sometimes [link hidden]. I tried some more, like Gemmasutra, Stheno, and NemoMix Unleashed, but wasn't particularly impressed.
Currently running [link hidden] through Ollama and SillyTavern, but I wanted to give others a shot, as long-term memory was an issue for NTR events.
Downloaded [link hidden] today, but I'm having issues getting it to output both character text and scene description, so I'm probably gonna dump it soon.
Magnum V4 seems to be preferred by SillyTavern users.
Also try [link hidden] or [link hidden].
...I'm actually planning to make a combined LM Studio + SillyTavern in one desktop app, so you can run your local LM and chat in the same interface. It should also be simplified, with an interface like Discord, unlike SillyTavern's mess of a UI.