What model, quantization, and context size are you running?
Define repetition in your case. Does it start repeating the same or similar phrases in every scene? If so, I find that Mistral derived models are very susceptible to that particular behavior.
Give this page a look, filtering on the model size you are comfortable running:
You must be registered to see the links
The author has added metrics tracking repetition, originality, and semantic redundancy (
Semantic Redundancy: Detects when the same concept is expressed multiple times with different wording)
UGI and Willingness are the first two metrics I consider before anything else, but the other values can confirm behaviors you may or may not see in use.