Running Ollama on a 15W CPU sounded ridiculous until I got it working with decent results
… The Intel Core i5-10210U was never designed with local LLMs in mind. It's a mobile chip slapped onto a compact mini PC motherboard. …
… Canvas, Artifacts, Codex, and others are transitional designs. We're not there yet. OpenAI, Anthropic, and Google knew that the chatbot mode wasn't going to suffice for coding, designing, and other hands-on projects. …
… They've got multimodal input (text, image, and audio), a 128K context window, and a hybrid attention design that keeps memory use low. On a modern phone with enough RAM and a modern chipset, both of these models can run at surprising speed, complete with tool calling. …
… General models can largely handle most tasks, but specific LLMs designed for use in these areas can outshine larger general counterparts, unlocking more performance without touching hardware or making a single tweak. …
… When a GPU isn't available, LLMs run entirely in system RAM. …
… Related: 7 things I wish I knew when I started self-hosting LLMs. I've been self-hosting LLMs for quite a while now, and these are all of the things I learned over time that I wish I knew at the start. …
… Most NAS devices have an older-generation chip. I have a Ugreen DH4300+ NAS, which runs an 8-core Rockchip processor. …
… However, for the specific job of running large local LLMs, the architecture Apple landed on almost by accident as a power-efficiency play for laptops turned out to be the right shape for a workload nobody was thinking about when it was designed. …
… So, it shouldn’t sound weird when I say I’ve got a bunch of ESP32 MCUs in my arsenal, including tiny system-on-chip boards. …
… It already powers the DGX Spark desktop, which is something that company CEO Jensen Huang himself confirmed when he tied the consumer N1 and N1X chips to the same design. …