Unity GPTits [v0.4.10] [MultisekaiStudio]

3.30 star(s) 6 Votes

bkku

New Member
Jan 23, 2022
7
1
In offline mode I'm having an issue where sometimes GGML indicates that the reply is finished but it doesn't appear in the game itself. Any clue on what might cause that?
 

ZedBee

New Member
Sep 2, 2017
1
0
Is there a limit to how long a session can be using the OpenAI API? I increased the max tokens and messages-in-memory options, but it seems after a while I enter a response and get a blank response back. I can open a separate new session and it works fine, but I can't continue the other session, and with no save I would lose the progress.
 

IllumiNaughty

Newbie
Aug 5, 2017
29
40
Thanks for the reply, I figured out CLBlast for GPU acceleration. How many threads/layers do you use for Pygmalion? Do you happen to know what the other settings/checkboxes in the configuration are for?
The number of threads and layers depends on what CPU and GPU you are running. My Ryzen 7 5800X maxes out at 16 threads with 16GB of RAM, and my 6800 XT with 16GB of VRAM only wants to go up to 28 layers.
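For reference, if GGML.exe is a llama.cpp-style runner (as suggested later in the thread), thread count and GPU layer offload are typically set with flags like these. The flag names and model filename here are assumptions, not confirmed GGML.exe options; check the binary's --help for the real ones:

```shell
# Hypothetical llama.cpp-style invocation; flag names vary by build.
# --threads:      CPU threads (e.g. 16 on a Ryzen 7 5800X)
# --n-gpu-layers: layers offloaded to the GPU via CLBlast, limited by VRAM
#                 (a 16GB 6800 XT topping out around 28 layers matches this)
./ggml --model pygmalion-6b.ggml.bin --threads 16 --n-gpu-layers 28
```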
 

IllumiNaughty

Newbie
Aug 5, 2017
29
40
Is there a limit to how long a session can be using the OpenAI API? I increased the max tokens and messages-in-memory options, but it seems after a while I enter a response and get a blank response back. I can open a separate new session and it works fine, but I can't continue the other session, and with no save I would lose the progress.
In offline mode I'm having an issue where sometimes GGML indicates that the reply is finished but it doesn't appear in the game itself. Any clue on what might cause that?
I also ran into this issue after a while; I just undo my response and retry. That usually works.
 

Spike9

New Member
Jun 25, 2017
6
1
Any tips to make them remember things longer?
Or any tips so I don't have to clear the chat log? The AI seems to forget mid-roleplay rather fast.
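The forgetting comes from the fixed context window: the model only ever sees a prompt that fits its token budget, so once the chat history outgrows it, the oldest messages get cut off before each generation. A rough sketch of that behavior (whitespace-split words standing in for tokens, all numbers made up):

```python
def build_prompt(persona, history, budget=2048):
    """Keep the persona plus as many recent messages as fit the token budget.

    Tokens are approximated by whitespace-split words for illustration.
    """
    count = lambda text: len(text.split())
    used = count(persona)
    kept = []
    # Walk the history from newest to oldest, keeping whatever still fits.
    for msg in reversed(history):
        if used + count(msg) > budget:
            break  # everything older than this point is "forgotten"
        used += count(msg)
        kept.append(msg)
    return [persona] + list(reversed(kept))

# 50 messages of ~101 words each against a 1000-word budget:
history = [f"message {i} " + "word " * 99 for i in range(50)]
prompt = build_prompt("You are Harlen.", history, budget=1000)
# Only the most recent messages survive; the early ones are dropped.
```

So raising "messages in memory" only helps up to the model's context size; past that, old turns are silently dropped no matter what the setting says.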
 

paicadra

Newbie
Jun 9, 2018
41
51
After some testing of the offline models, I have found Vicuna to be smarter but slow as hell, while Pygmalion is fast but dumb as shit.
Some testing seems to indicate that file size is correlated with performance: smaller models are fast and dumb, larger models are smarter but slow as shit. But I haven't tested enough models by any means to make a solid connection there. I did find a good compromise though:

It behaves pretty much like Vicuna, but generates tokens much faster.
Also, y'all might wanna check out . The GGML.exe appears to be a version of that.
 

Kyookino

New Member
Aug 17, 2018
9
8
In offline mode I've tried several different models. Whenever I type something, I either get a warning that I can't connect to the destination host, or I get no response at all and GGML.exe says "exception: access violation reading" followed by a hex value. Is it something I've done wrong in the settings, or is the program not working correctly for some reason?

Edit: Turns out the "exception: access violation reading" came from putting too big of a number in "Max Tokens". After lowering it I managed to get Pygmalion to work.
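That crash pattern fits a fixed-size context buffer: if "Max Tokens" plus the prompt exceeds the model's context size, generation indexes past the end of the buffer, which in native code surfaces as an access violation. A toy sketch of the failure mode (not GGML's actual code, and the 2048 context size is an assumption):

```python
CONTEXT_SIZE = 2048  # the model's fixed context window (assumed)

def generate(prompt_tokens, max_tokens):
    """Fill a fixed context buffer one token at a time."""
    buffer = [0] * CONTEXT_SIZE
    pos = len(prompt_tokens)
    buffer[:pos] = prompt_tokens
    out = []
    for _ in range(max_tokens):
        buffer[pos] = 1          # fails once pos reaches CONTEXT_SIZE
        out.append(buffer[pos])
        pos += 1
    return out

generate([0] * 100, 1000)        # fits within the buffer: fine
# generate([0] * 100, 4096)      # overruns the buffer and blows up
```

In Python that overrun is a clean IndexError; in a C/C++ binary it is exactly the kind of out-of-bounds read/write that shows up as "access violation reading 0x...".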

Edit 2: I guess saying Pygmalion "works" is pretty generous. It has no idea what the characters are supposed to be like and just spits out random garbage. Apparently Harlen is a 5-year-old boy who loves cars and trucks. I guess I'll wait for this to get a bit more advanced before trying it again.
 
Last edited:

JCD

Newbie
Jan 3, 2018
21
13
Fellas, I'm not that savvy when it comes to setting up AI, but I have a NovelAI subscription. How do I make it work with this?
 

Kyookino

New Member
Aug 17, 2018
9
8
After managing to get other offline models to work, I've realized none of them seem to be very smart. It's a bit ridiculous how big the difference is between the offline models and OpenAI. None of the offline models I've tested can understand anything in the character profile other than the name, and sometimes they mess that up as well. They go from simple responses to spitting out entire Wikipedia entries on psychology, or repeating descriptions of refurbished Japanese lanterns over and over. If I resend my message several times I can sometimes get a response that somewhat matches, but the AI doesn't seem to be able to follow a coherent plot. Eventually the model just breaks down and stops responding, saying HTTP/1.1 503 Service Unavailable.

Unless someone knows how to fix these issues I don't think I'll be able to play this properly.
 

ApxuMpak

New Member
Apr 11, 2019
4
2
I've experimented with offline models (Vicuna mostly), and what is interesting is that CLBlast with my 6700 XT drops generation speed to 0.5 tokens a second, with 20, 40, and 43 layers as well as with 5 and 10 threads.
OpenBLAS works best with 5 threads (tried 5, 10, 12), giving 2.2-2.5 tokens a second on a Ryzen 5600. My problem is that I cannot change the temperature, it is always 1.06, and I cannot figure out how to use the command line to change that.
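If the game forwards extra arguments to the backend, a llama.cpp-style sampling flag might let you override the temperature. This is a guess at the flag names, not a confirmed GGML.exe option; verify against the binary's --help:

```shell
# Hypothetical llama.cpp-style flags; check GGML.exe --help for the real names.
./ggml --model vicuna-13b.ggml.bin --threads 5 --temp 0.7
```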
 