Google's $100 AI Ultra is still overpriced, and here's what I use instead
…That pricing shift is unsustainable, and I've started running a local LLM as a hybrid model to offset the costs. It's not as fast, or as capable, but it's…
Tracked topic
Gemma is a family of open-weight language models released by Google for text generation and related NLP tasks.
…Related: I ran local LLMs on Intel's cheapest iGPU, and the results were surprisingly decent. It ain't no match for a dedicated GPU, but you can run some light LLMs…
…File "/usr/local/lib/python3.11/dist-packages/transformers/models/gemma3/modeling_gemma3.py", line 880, in forward
[rank0]: logits = self.lm_head(hidden_states[:, slice_indices, :])
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/usr/local…
…Related: Google's Gemma 4 isn't the smartest local LLM I've run, but it's the one I reach for most. Google's newest Gemma 4 models are both powerful…
…OpenClaw can now run local models through LM Studio on NVIDIA GPUs, unlocking faster on-device performance. 🦥 Unsloth and NVIDIA have teamed up to eliminate hidden bottlenecks that slow down fine…
…giving local models a try. There has never been a better time for that either, particularly after Google released a 12-billion-parameter version of Gemma 4 yesterday, designed to run on…
…Related: 7 self-hosted services I use that can run perfectly on a Raspberry Pi. Not every self-hosted application requires a top-of-the-line workstation. Open WebUI + local LLMs provide…
I made my first macOS utility app that ships with a bundled Gemma 4 model, specifically the Gemma E4B one. It brought my app's DMG to 5.3 GB, but I think that is a small size for the power that this free local model c…
Gemma just crushed Qwen in a local LLM gamedev contest! Device: MacBook Pro M5 Max, 64GB RAM Qwen 3.6 27B: 32 tokens/sec · 18m 04s · 33,946 tokens. Gemma 4 31B: 27 tokens/sec · 3m 51s · 6,209 tokens. So what is more impo…
Hi guys. I have been working on Hitoku Draft, an open-source, voice-first AI assistant that runs entirely locally. I posted about it already, and now it also has transcription with voice editing. Looking for feedback, as …
Update from the lawyer with the V100 server. A few of you asked what I actually ended up running once the dust settled, so here it is. Still just a lawyer, still driving the whole thing through Claude Code, still not ful…
A Claude Code-like agentic workflow AI is too costly for me. Is there any LLM I can run with VS Code on the setup below? 16 GB RAM, Intel Core i7 H-series processor (13th gen), 512 GB NVMe SSD. I want to run the AI as a local agentic workflow with VS Code. I w…
…And the option that really pulled me in was BYOK with any OpenAI-compatible endpoint, which means you can also point it at a local model running through something like LM Studio…
…To run local models, your system will need at least 64GB of RAM. For running larger models, like DeepSeek v4, Pae recommends systems with about 128GB of RAM. But Pae believes local…
…Adversarial attacks against VLMs can leverage open-box optimization techniques such as Projected Gradient Descent (PGD) to craft minimally perceptible pixel perturbations or localized adversarial patches that manipulate model outputs, even for…