Request Tool KoboldAI + Erebus Model for text-based adventure

ruswolf

Newbie
Mar 10, 2019
23
14
For local usage
- easy to install and use
Erebus 20B works on an RTX 2060 with 4-bit quantization (it can be downloaded through the provided UI), and there is also a large list of other available models

- a general-purpose web interface for large language models; it works with many types of models downloaded from the Internet, but requires some knowledge for proper configuration.
 

ruswolf

Newbie
Mar 10, 2019
23
14
I also have a question: do you know a model with a context large enough to write a long story with a plot and stay consistent?
 

Gluttonous

Newbie
Feb 18, 2018
47
12
I also have a question: do you know a model with a context large enough to write a long story with a plot and stay consistent?
koboldcpp can raise the model's context limit up to 32K, although I usually use the 8K limit. Also I recommend these two 20B models.
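For reference, the context limit can also be set when launching koboldcpp from the command line instead of the launcher GUI. A rough sketch (the model filename is a placeholder; check the flag names against your version's `--help`):

```shell
# Launch koboldcpp with an 8K context, offloading all layers to the GPU
# via CUDA. "Emerhyst-20B.Q4_K_M.gguf" is a placeholder path.
python koboldcpp.py --model Emerhyst-20B.Q4_K_M.gguf \
    --contextsize 8192 --gpulayers 99 --usecublas
```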

 
  • Like
Reactions: Dir.Fred

ruswolf

Newbie
Mar 10, 2019
23
14
koboldcpp can raise the model's context limit up to 32K, although I usually use the 8K limit. Also I recommend these two 20B models.

Thank you. In that case, another question: how strongly does increasing the context affect the required memory? Some models advertised as 8K I only actually managed to launch with a 1-2K context.
 

Gluttonous

Newbie
Feb 18, 2018
47
12
Thank you. In that case, another question: how strongly does increasing the context affect the required memory? Some models advertised as 8K I only actually managed to launch with a 1-2K context.
I'm not quite sure what you mean by that. All I can say is that koboldcpp's launch screen indicates that expanding the context limit brings an inevitable increase in perplexity, which is normal. In my personal use, the two 20B models work properly with 8K contexts; I can't make any guarantees about other models.
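On the memory question: context length does cost VRAM, because the KV cache grows linearly with it. A back-of-envelope sketch (the model dimensions below are illustrative guesses for a 20B-class frankenmerge, not read from any particular model's config):

```python
# Rough estimate of how KV-cache memory grows with context length.
# n_layers / n_heads / head_dim are assumed values for a 20B-class
# model; check your model's actual config file.

def kv_cache_bytes(n_layers, n_heads, head_dim, context, bytes_per_elem=2):
    """Keys + values (factor of 2), one fp16 entry per layer/head/position."""
    return 2 * n_layers * n_heads * head_dim * context * bytes_per_elem

GIB = 1024 ** 3
for ctx in (2048, 4096, 8192, 32768):
    gib = kv_cache_bytes(n_layers=62, n_heads=40, head_dim=128, context=ctx) / GIB
    print(f"{ctx:>6} tokens -> ~{gib:.1f} GiB KV cache (fp16)")
```

Doubling the context doubles the cache, which is why a model that loads fine at 2K can fail to start at 8K on the same card.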
 

Gluttonous

Newbie
Feb 18, 2018
47
12
What would be a good model to run on a 4090 for use as an NSFW game NPC chatbot?
Of the two models I recommended above, MLewd-ReMM-L2-Chat-20B-GGUF is better for chat, while Emerhyst-20B-GGUF is better in terms of writing quality; the 4090's 24 GB of video memory should let you run either fully loaded on the GPU with a 4K or 8K context size.
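The 24 GB claim is easy to sanity-check with rough arithmetic. The bits-per-weight figure is an assumption: a Q4_K_M quant averages roughly 4.5 bits per weight once quantization overhead is included.

```python
# Back-of-envelope check that a 20B model quantized to ~4.5 bits per
# weight fits comfortably in a 4090's 24 GB.

def quantized_weights_gib(n_params, bits_per_weight=4.5):
    """Approximate on-disk/VRAM size of quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3

weights = quantized_weights_gib(20e9)
print(f"~{weights:.1f} GiB for weights, leaving headroom in 24 GB "
      "for the KV cache and activations at a 4K-8K context")
```

That works out to roughly 10.5 GiB for the weights alone, so even with several GiB of KV cache at 8K context there is room to spare on a 24 GB card.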
 
  • Like
Reactions: defp and Dir.Fred

HammerAI

New Member
Dec 21, 2023
4
0
If anyone wants to try a new UI for running uncensored models via llama.cpp locally, we just released our Windows desktop app - you can see the link in my profile!
 

HammerAI

New Member
Dec 21, 2023
4
0
Nice, we also really like llama2. If you'd like to try a new UI for running uncensored models via llama.cpp locally, we just released our Windows desktop app - you can see the link in my profile!
 

smirk

Member
Jul 2, 2019
145
197
koboldcpp can raise the model's context limit up to 32K, although I usually use the 8K limit. Also I recommend these two 20B models.

Koboldcpp + Emerhyst-20B-Q4 was actually pretty easy to set up and get running on my 3060Ti w/8k context; it's not super fast but runs well enough. I find the model writes pretty consistently good prose, keeps track of objects, locations and people and generally performs well. It does need a good prompt and some parameter tweaking to get it to write the way you want and I find that it sometimes goes off on long tangents in the middle of the action - as if characters just start day-dreaming and reflecting on all their life choices for no apparent reason.

If anyone has any tips on how to squeeze a little more performance out of this setup without sacrificing too much quality, I'd like to hear it. As with everything AI, VRAM is probably the limiting factor on these mid-tier 30xx cards.
 

Blackjack1982

New Member
Dec 23, 2021
2
1
For local usage
- easy to install and use
Erebus 20B works on an RTX 2060 with 4-bit quantization (it can be downloaded through the provided UI), and there is also a large list of other available models

- a general-purpose web interface for large language models; it works with many types of models downloaded from the Internet, but requires some knowledge for proper configuration.
Can you tell me how to launch Erebus with 4-bit quantization? Simply launching it from the provided interface did not work on my 3090. On Reddit they write that it's impossible to launch without 40 GB of video memory.
 

tooldev

Member
Feb 9, 2018
154
153
Can you tell me how to launch Erebus with 4-bit quantization? Simply launching it from the provided interface did not work on my 3090. On Reddit they write that it's impossible to launch without 40 GB of video memory.
I hope you meant 4 GB, otherwise I'm really outdated GPU-wise
 

Zairus

Member
May 25, 2017
144
1,052
Well, I'm pretty disappointed, I would say. I spent a couple of days on this whole thing and can now say that it's pretty bad. It's basically all about steering the AI onto the road you have in your mind, and while that doesn't sound so bad, in practice it is: you always have to correct it, spending an enormous amount of time to get something decent, constantly going back, fixing things, hoping it will head in the right direction, and if it doesn't, you have to go back and guide it better. At some point it turns into nonsense and starts forgetting everything, so you have to split things up accordingly; the writing itself is also pretty mediocre. Well, maybe things will improve in the future. I actually tried NovelAI before, and while people were happy with it, it was basically not much better, even using all the options they provide. Let's see how it goes.
 
  • Like
Reactions: Kripke