Running Google Gemma 4 Locally With LM Studio’s New Headless CLI & Claude Code
2026-04-30
Google Gemma 4 26B-A4B is an efficient mixture-of-experts model: it activates only 3.8B of its 25.2B parameters per token, delivering performance comparable to much larger dense models while running at 51 tokens/second on a 48GB MacBook Pro M4.

LM Studio 0.4.0's new headless CLI makes it practical to deploy this model locally and use it with Claude Code. Compared to cloud alternatives, a local setup offers zero API costs, improved privacy, and reduced latency, making it an appealing option for developers who want a capable local AI model without expensive hardware.
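As a rough sketch of the headless workflow, assuming LM Studio's `lms` CLI is installed and on your PATH: start the local server, load the model, and verify the endpoint is up. The model identifier below is an assumption; check `lms ls` for the name as it appears in your own catalog.

```shell
# Start LM Studio's local server without opening the GUI
lms server start

# Load the model into memory (identifier is an assumption -- verify with `lms ls`)
lms load google/gemma-4-26b-a4b

# Sanity check: LM Studio exposes an OpenAI-compatible API on port 1234 by default
curl http://localhost:1234/v1/models
```

Claude Code speaks the Anthropic API rather than the OpenAI-compatible one LM Studio serves, so wiring the two together typically involves pointing Claude Code at the local endpoint via its configuration (or an API-translation layer); consult the Claude Code and LM Studio docs for the exact settings, as the details vary by version.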