There's a new paper that proposes a way to reduce model size by 50-70% without drastically nerfing the model's quality. Basically promising something like a 70B model on phones. This guy on Twitter tried it and it's looking promising, but idk if it'll work for image gen.
https://x.com/i/status/2005995329485959507
https://redd.it/1q0a42k
@rStableDiffusion
Brian Roemmele (@BrianRoemmele) on X
BOOM!
It works on LLMs!
I am using Nash Equilibrium on the attention head of an LLM! I may be the first to do this at this level.
I am achieving a 50-70% effective size reduction on a quantization of 4-bit weights shrinking the model and is enabling on…
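The tweet gives no detail on what "Nash Equilibrium on the attention head" actually does, so take the claim with a grain of salt. For context, here is a minimal sketch of the baseline it builds on, plain symmetric 4-bit weight quantization, which by itself already gives an 8x raw reduction from 32-bit floats. All function names here are hypothetical illustrations, not the author's method.

```python
# Illustrative sketch only: the tweet does not describe the Nash
# Equilibrium step, so this shows just ordinary symmetric 4-bit
# quantization. Every name below is hypothetical.

def quantize_4bit(weights):
    """Map floats to signed 4-bit integers in [-8, 7] with one shared scale."""
    scale = max(abs(w) for w in weights) / 7.0 or 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    """Recover approximate floats from the 4-bit codes."""
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.91, -0.07, 0.33]
q, scale = quantize_4bit(weights)
approx = dequantize_4bit(q, scale)
# Each weight now fits in 4 bits instead of 32, an 8x raw shrink,
# before whatever extra 50-70% reduction the tweet is claiming on top.
```

Whether any game-theoretic trick on attention heads can stack another 50-70% on top of this without quality loss is exactly the open question the thread is about.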