Ollama is still the easiest way to start local LLMs, but it's the worst way to keep running them
…Multiple community benchmarks and developer reports have shown that running the same model through Ollama produces fewer tokens per second than running it through llama.cpp directly, and this is a…