Yeah, that's always been the standard for models around 30B parameters, sadly...
22.5GB model file... wow.
looks at the source model...
WTF
98.36GB!!!
brb, calling Jensen Huang to request an HGX-H200 sample.
Once it's loaded it's super quick though
there is v0.2 apparently
I tried TheBloke_Noromaid-13B-v0.1.1-GPTQ recently. It was very good at chat (wink wink), not so much at prose.
me: create a php mysql pdo class with insert, update, delete
Output generated in 9.21 seconds (55.46 tokens/s, 511 tokens, context 71, seed 646386357)
tell me an erotic story with vulgar language and at least 400 token, describe sexual intercourse in a pornographic way