According to Ollama, the new implementation processes prompts about 1.6 times faster (prefill speed) and roughly doubles the speed at which it generates responses (decode speed). Macs with M5-series chips are said to see the largest improvements, thanks to Apple's new GPU Neural Accelerators.
The update also includes smarter memory management, which should make AI-powered coding tools and chat assistants feel noticeably more responsive during extended use.
Ollama says the new performance boost should particularly benefit macOS users who run personal assistants like OpenClaw or coding agents like Claude Code, OpenCode, or Codex.
The preview release is available to download as Ollama 0.19 – just make sure you have a Mac with more than 32GB of unified memory to run it. Support is currently limited to Alibaba's Qwen3.5, but Ollama says support for more AI models is planned.
This article, "Ollama Now Runs Faster on Macs Thanks to Apple's MLX Framework," first appeared on MacRumors.com