How do I do that exactly? I can handle a lot, but if I need to trim some tokens to get a bit faster response time then I will, cuz I really have no clue what I'm doing in lms other than more tokens = more story. So what exactly do "GPU offload" (out of 30) and "CPU thread pool size" mean?