This looks really interesting!
Some recent studies have shown that, relative to the performance they demonstrate, most models are nowhere near as compact as they could be. That means we should expect an explosion in the capability of small models like this one as new techniques find ways to squeeze more out of the same parameter budget.
Unfortunately, I couldn't find a recommendation for how much VRAM you need to run this model, though the page does call out that it can run locally, which is awesome!
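For anyone else wondering, a rough back-of-the-envelope estimate is parameter count × bytes per weight, plus some headroom for activations and the KV cache. Here's a minimal sketch; the 7B size and the 1.2× overhead factor are just illustrative assumptions on my part, since I don't know this model's actual parameter count:

```python
# Rough rule of thumb: VRAM (GB) ≈ params (billions) × bytes per weight,
# scaled by an overhead factor for activations and the KV cache.
def estimate_vram_gb(params_billions: float, bytes_per_weight: float,
                     overhead: float = 1.2) -> float:
    """Back-of-the-envelope VRAM estimate in GB (weights only, plus headroom)."""
    return params_billions * bytes_per_weight * overhead

# Illustrative example: a hypothetical 7B-parameter model at common precisions.
for label, bpw in [("fp16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    print(f"7B @ {label}: ~{estimate_vram_gb(7, bpw):.1f} GB")
```

By that rule of thumb, a 4-bit quantized 7B-class model should fit in 8GB with room to spare, while fp16 would not.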
I'll try it out after work and see if it can run on an old 8GB 2070. 😄
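In case it's useful to anyone trying the same thing, here's a quick way to confirm how much VRAM your card actually reports before loading anything (assuming a PyTorch + CUDA setup):

```python
import torch

# Query the first CUDA device's name and total memory before loading a model.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB total VRAM")
else:
    print("No CUDA device detected")
```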