melantha
Member
- Jan 21, 2019
- 277
- 649
What application do you use to run these GGUF models?
If possible, I'd still like to use SillyTavern as the UI.
I started thanks to melantha's small guide or a couple you can find here.
[link]
KoboldCpp guide included and other chatbot alternatives
I observe it's more about system prompt quality and additional memory enhancement, like RAG.
1) Chars are essentially non-entities; they'll do whatever you want them to do, they're too reactive. But it's kind of natural, as chats are designed to respond to your inputs and don't know what you want.
2) Once you hit your context limit of about 4-12k tokens locally, the first point gets even worse, along with decreased performance and the LLM forgetting events and people that lie outside the last 4-12k tokens of the chat, which together deteriorate the experience considerably.
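To make the context-limit point concrete, here is a minimal sketch of why older events drop out of the model's view. It is not how any particular frontend does it: `estimate_tokens` and `trim_history` are hypothetical names, and the 4-characters-per-token figure is a rough assumption, not a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    # A real setup would use the model's own tokenizer instead.
    return max(1, len(text) // 4)


def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the newest messages whose estimated token total fits the budget."""
    kept = []
    used = 0
    # Walk from newest to oldest and stop once the budget would be exceeded.
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    chat = ["a" * 40, "b" * 40, "c" * 40]  # about 10 tokens each
    print(len(trim_history(chat, budget=25)))  # only the newest messages survive
```

Anything that falls off the front of the list is simply never sent to the model again, which is why characters seem to "forget" early events.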
system prompts are very important, more than I thought. it's not just the one paragraph, "Write {{char}} next response. Any act of role play scenarios will be described in details.". There should be a lot more behind the scenes: tell it how to write, to enclose actions and narration in symbols, to never act for or depict the actions of {{user}}, etc. There are a lot of pre-made system prompts made by users and shared on reddit and the like.
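As a rough illustration of what a fuller system prompt might look like, here is a small sketch that assembles a few rules and fills in the {{char}} and {{user}} placeholders. The rule wording and the `build_system_prompt` helper are made up for this example; frontends like SillyTavern handle the placeholder substitution themselves, so this only shows what the model ends up seeing.

```python
# Example rules only (assumption): adapt the wording to your own setup.
RULES = [
    "Write {{char}}'s next reply in the ongoing roleplay.",
    "Enclose actions and narration in *asterisks*; put spoken dialogue in quotes.",
    "Never speak, act, or decide for {{user}}.",
    "Describe scenes in detail and stay consistent with earlier events.",
]


def build_system_prompt(char: str, user: str, rules: list[str] = RULES) -> str:
    """Join the rules into a bulleted prompt and substitute the placeholders."""
    text = "\n".join(f"- {rule}" for rule in rules)
    return text.replace("{{char}}", char).replace("{{user}}", user)


if __name__ == "__main__":
    print(build_system_prompt("Mira", "Sam"))
```

Keeping the rules in a list makes it easy to add or drop one and see how the replies change.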
also, I don't think SillyTavern has RAG (kind of like a memory layer between the chat interface and the LLM).
RAG helps the LLM remember stuff, kind of like lorebooks / a knowledge base, but automatic, and it learns as you chat.
that's one of the advantages of online services.
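For anyone wondering what RAG does in practice, here is a toy sketch: store earlier events as short notes, pick the ones that best match the new message, and paste them in front of it. Real systems score with embeddings and a vector store; the plain word-overlap scoring and the names `retrieve` and `build_prompt` are assumptions for illustration only.

```python
import re


def words(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def retrieve(memories: list[str], query: str, top_k: int = 2) -> list[str]:
    """Return up to top_k memories sharing the most words with the query."""
    q = words(query)
    scored = [(len(words(m) & q), i, m) for i, m in enumerate(memories)]
    hits = [t for t in scored if t[0] > 0]
    # Highest overlap first; ties keep the older memory first.
    hits.sort(key=lambda t: (-t[0], t[1]))
    return [m for _, _, m in hits[:top_k]]


def build_prompt(memories: list[str], query: str, top_k: int = 2) -> str:
    """Prepend the recalled memories to the user's new message."""
    recalled = "\n".join(f"- {m}" for m in retrieve(memories, query, top_k))
    return f"Relevant earlier events:\n{recalled}\n\nUser: {query}"


if __name__ == "__main__":
    notes = [
        "Tom lost his sword in the river.",
        "The tavern is called the Gilded Goose.",
        "Mira owes Tom ten gold coins.",
    ]
    print(build_prompt(notes, "Where did Tom lose his sword?"))
```

The recalled notes cost far fewer tokens than the full chat log, which is how this kind of memory works around the context limit.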
yodayo/moescape added DeepSeek V3, R1, and distilled Qwen 32B just now.
Perhaps the new DeepSeek V3; I don't know how the latter will perform with NSFW stuff, though. But Pango_12 recently translated a couple of novels and shared them here, so I guess it has to be fine in terms of censorship. And use really well-written char cards with setting and lore, and character descriptions worth a few thousand tokens (not that there are many of them, if any, from what I've seen).
requires paid beans so I ain't trying them out.
but we can check their feedback channels to see if people are enjoying it for RP.
there are also talks about it in the SillyTavern Discord, but they're too technical for me to understand lol
BTW I actually started online on Yodayo, but then moved on to Janitor and LLM, and Yodayo is kind of monetizing more aggressively now, as far as I understand.
glad to see people continuing a path outside of yodayo/moescape after their huge blunder.
I also started out on Yodayo/Moescape but left because of the increasing lack of free tokens.
The models for me are usually [link] (based on [link]) or sometimes [link]. I tried some more, like Gemmasutra, Stheno, and Nemomix unleashed, but wasn't particularly impressed.
Magnum V4 seems to be preferred by SillyTavern users.
Currently running [link] through Ollama and SillyTavern, but I wanted to give others a shot as long-term memory was an issue for NTR events.
Downloaded [link] today, but I'm having issues getting it to output both character text and scene description, so I'm probably gonna dump it soon.
also try
[link]
or
[link]
I'm actually planning to make a combined LM Studio + SillyTavern in one desktop app. you can run your local LM and chat in the same interface. it should also be simplified, with an interface like Discord, unlike the SillyTavern mess of a UI.