Lol, the suggested hardware for usable performance is ~$50k MSRP for the GPUs alone, and that’s on an SXM5 socket, so it’s all proprietary, extremely expensive, and very specific hardware.
My PC currently has a 7900 XTX, which gives me about 156 GB of combined memory (VRAM plus system RAM), but it literally generates 1-3 words per second even at this level. DDR5 wouldn’t really help, because it’s a memory bandwidth issue.
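
To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch. It assumes a dense model where every weight is read once per generated token, and it ignores compute, KV cache, and transfer overhead. The function name, the 24 GB / 116 GB split, and the bandwidth figures are illustrative assumptions, not measurements from my machine.

```python
def tokens_per_second_bound(vram_gb: float, ram_gb: float,
                            vram_bw_gbs: float, ram_bw_gbs: float) -> float:
    """Rough upper bound on generation speed for a dense model.

    Assumes each token requires streaming all weights once, with part of
    the weights in VRAM and the rest in system RAM.
    """
    seconds_per_token = vram_gb / vram_bw_gbs + ram_gb / ram_bw_gbs
    return 1.0 / seconds_per_token


# Illustrative assumptions: 24 GB of weights in VRAM at ~960 GB/s,
# 116 GB spilled to system RAM at ~80 GB/s.
slow_ram = tokens_per_second_bound(24, 116, 960, 80)

# Hypothetically doubling system RAM bandwidth still leaves the RAM
# portion dominating the time per token.
fast_ram = tokens_per_second_bound(24, 116, 960, 160)

print(f"~{slow_ram:.2f} tok/s vs ~{fast_ram:.2f} tok/s with 2x RAM bandwidth")
```

Under these assumptions the part of the model sitting in system RAM accounts for nearly all of the time per token, so the GPU mostly waits.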
TBH, for most reasonable use cases, 8-bit parameter quantizations that can run on a laptop will give you more or less what you want.
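
For a sense of why that fits on a laptop, a quick sketch of the weight footprint at different bit widths. The 8-billion-parameter count is a hypothetical example, and this ignores quantization metadata, KV cache, and runtime overhead, so real usage will be somewhat higher.

```python
def weight_footprint_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_param / 8


# Hypothetical 8B-parameter model at a few common bit widths.
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_footprint_gb(8, bits):.0f} GB")
```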
Demo Driven Development is wayyy worse.