A Modder Repurposed a Used V100 For LLM Acceleration
… In another test using Gemma 4 E4B, it managed 108 tokens per second, compared to the RTX 3060's 76 tokens per second. It needed nearly 200W of power to do that, but when power-limited to just 100W, it still achieved 95 tokens per second on the same Gemma 4 test, only about 12% slower. …
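To put those numbers in perspective, here is a minimal sketch of the efficiency arithmetic. It treats "nearly 200W" as exactly 200 W, which is an assumption, and it leaves the RTX 3060 out of the efficiency comparison because its power draw is not given here. The names `tokens_per_joule` and `percent_change` are illustrative helpers, not part of any benchmark tool.

```python
# Figures quoted in the article. 200 W is an approximation ("nearly 200W").
V100_FULL = {"tokens_per_s": 108, "watts": 200}
V100_CAPPED = {"tokens_per_s": 95, "watts": 100}
RTX_3060_TOKENS_PER_S = 76  # power draw not stated, so no efficiency figure


def tokens_per_joule(run):
    """Throughput divided by power: tokens generated per joule of energy."""
    return run["tokens_per_s"] / run["watts"]


def percent_change(new, old):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


full_eff = tokens_per_joule(V100_FULL)
capped_eff = tokens_per_joule(V100_CAPPED)

print(f"Full power:  {full_eff:.2f} tokens/J")
print(f"Capped:      {capped_eff:.2f} tokens/J")
print(f"Throughput change when capped: "
      f"{percent_change(V100_CAPPED['tokens_per_s'], V100_FULL['tokens_per_s']):.1f}%")
print(f"Capped V100 vs RTX 3060: "
      f"{percent_change(V100_CAPPED['tokens_per_s'], RTX_3060_TOKENS_PER_S):+.1f}%")
```

Under these assumptions, halving the power budget costs about 12% of the throughput while raising efficiency from roughly 0.54 to 0.95 tokens per joule, and the capped card is still 25% faster than the RTX 3060 on this test.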