The M5 MacBook Air after three weeks: what devs actually got
Technology · March 28, 2026 · 5 min read


By Leon Vasquez · AI-Generated Analysis · Auto-published · 6 sources cited

The MacBook Air M5 is a real but incremental upgrade for developers, and its biggest wins aren't where Apple's marketing points you.

Three weeks after the M5 Air hit shelves on March 11, the dust has settled enough to separate the spec sheet from the daily reality of building software on this machine. I've been digging through benchmarks, Apple's own ML research data, and the thermal throttling reports that reviewers have been flagging since day one.

Here's the short version: if you're on an M3 Air or older, this is a strong upgrade. If you're on an M4 Air, stay seated.

What the M5 actually delivers

The M5 keeps the same 10-core CPU layout as its predecessor (four performance cores, six efficiency cores), built on TSMC's third-generation 3nm process. Apple's headline claim of 15% faster multithreaded CPU performance over the M4 checks out. MacRumors reported a Geekbench 6 multi-core score of 17,073 for the M5 Air, compared to the M4 Air's average of 14,731.

For context, that puts the M5 Air ahead of the M3 Pro MacBook Pro (15,260) in multi-core. A fanless $1,099 laptop outscoring last generation's Pro chip is worth noting.

Wired's review measured roughly 10% CPU improvement and a more meaningful 30% GPU uplift in Cinebench 2026. The GPU gains matter more for developer workflows than the CPU bump, especially for anyone doing Metal compute, shader work, or ML inference.

Memory bandwidth jumped to 153 GB/s from the M4's 120 GB/s, a 28% increase. This directly affects how fast models run during token generation. The Air now supports up to 32GB of unified memory and up to 4TB of storage, and the base configuration ships with 512GB instead of 256GB.
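Because token generation is bandwidth-bound, a back-of-envelope ceiling falls out of those numbers: each generated token reads the full weight set from memory once, so peak tokens per second is roughly bandwidth divided by model size in bytes. A minimal sketch (the 8B-model-at-4-bit example is illustrative, not a figure from Apple's benchmarks):

```python
# Back-of-envelope decode-speed ceiling: each generated token requires
# reading every model weight from memory once, so peak tokens/sec is
# roughly (memory bandwidth) / (model size in bytes). Real throughput
# is lower: this ignores KV-cache reads and activation traffic.

def decode_ceiling_tok_s(bandwidth_gb_s: float, params_b: float,
                         bytes_per_param: float) -> float:
    model_gb = params_b * bytes_per_param
    return bandwidth_gb_s / model_gb

# Example: an 8B-parameter model at 4-bit quantization (~0.5 bytes/param)
m4 = decode_ceiling_tok_s(120, 8, 0.5)   # M4 Air: 120 GB/s
m5 = decode_ceiling_tok_s(153, 8, 0.5)   # M5 Air: 153 GB/s

print(f"M4 ceiling: {m4:.0f} tok/s, M5 ceiling: {m5:.0f} tok/s, "
      f"uplift: {(m5 / m4 - 1) * 100:.0f}%")
```

The ceiling uplift is exactly the bandwidth ratio, about 28%, which lines up with the 19-27% generation-speed improvement Apple measured in practice.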

The ML inference story is where it gets interesting

Apple's Machine Learning Research team published benchmarks comparing MLX inference on M4 and M5 MacBook Pros. The M5's GPU Neural Accelerators, a new per-core matrix multiplication unit, deliver 3.5-4x speedups on time-to-first-token for LLM inference. That's the compute-bound prefill step where the model processes your prompt.

The numbers: an M5 MacBook Pro pushes TTFT under 10 seconds for a dense 14B model and under 3 seconds for a 30B Mixture-of-Experts architecture (Qwen 30B, 4-bit quantized), both using MLX with a 4096-token prompt. Subsequent token generation, which is memory-bandwidth-bound, showed a 19-27% improvement over M4.

These benchmarks were run on a MacBook Pro with 24GB, but the base M5 chip in the Air is the same silicon. With 24GB or 32GB configured, you can comfortably hold an 8B model in BF16, or a 30B MoE at 4-bit quantization in under 18GB of memory. That's a usable local LLM dev setup in a fanless laptop.
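Those memory figures are easy to sanity-check: weight footprint is parameter count times bytes per parameter, plus some headroom for the KV cache and runtime. A rough sketch (the 2 GB overhead figure is an illustrative assumption, not a measured number):

```python
# Rough memory footprint for a local model: params (in billions) times
# bytes per parameter, plus a fudge factor for KV cache, activations,
# and runtime overhead (the 2.0 GB default is an assumption).

def model_footprint_gb(params_b: float, bytes_per_param: float,
                       overhead_gb: float = 2.0) -> float:
    return params_b * bytes_per_param + overhead_gb

print(model_footprint_gb(8, 2.0))    # 8B in BF16 (2 bytes/param)
print(model_footprint_gb(30, 0.5))   # 30B MoE at 4-bit (~0.5 bytes/param)
```

Both land in the 17-18 GB range, which is why 24GB is comfortable and the base 16GB config is not, once you add an IDE and a browser.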

The catch: you need macOS 26.2 or later to unlock the Neural Accelerator support in MLX. Without it, you're leaving the biggest M5 improvement on the table.
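If you script your dev machine setup, a small guard for that requirement is cheap insurance. A sketch using only the standard library (the 26.2 threshold comes from the requirement above; the helper name is mine):

```python
import platform

# Guard for the macOS 26.2 requirement: MLX's Neural Accelerator path
# needs that version or later. platform.mac_ver() returns ("", ...) on
# non-macOS systems, which this treats as "not met".

def meets_min_macos(current: str, minimum: str = "26.2") -> bool:
    def parse(v: str) -> tuple:
        return tuple(int(p) for p in v.split(".") if p.isdigit())
    return bool(current) and parse(current) >= parse(minimum)

if __name__ == "__main__":
    ver = platform.mac_ver()[0]
    status = "Neural Accelerators available" if meets_min_macos(ver) \
        else "update to macOS 26.2+"
    print(f"macOS {ver or 'not detected'}: {status}")
```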

The thermal reality check

Here's where the Air's developer story gets complicated. Tom's Hardware found that the M5 Air starts strong in Cinebench 2026 at around 3,415, then steadily drops to the low 2,300s under sustained load. That's a 32% performance decline once thermal throttling kicks in.
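The 32% figure follows directly from those scores (a quick arithmetic check; 2,322 stands in for the reported "low 2,300s" and is not an exact published score):

```python
# Sanity check on the throttling decline: Tom's Hardware measured ~3,415
# initially, falling to the low 2,300s under sustained load. 2,322 is a
# representative sustained score, assumed for illustration.

initial, sustained = 3415, 2322
decline_pct = (initial - sustained) / initial * 100
print(f"Sustained-load decline: {decline_pct:.0f}%")
```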

Gizmodo noted that even the base M5 MacBook Pro with its single fan showed slight thermal throttling, raising concerns about the fanless Air during extended workloads. The design hasn't changed since the M2 redesign in 2022, still using a thin graphite sheet for heat dissipation.

For short bursts (compile jobs, CI test suites, quick inference runs), the Air delivers full M5 performance. For sustained compilation of a large monorepo or a long training run, you'll hit the thermal wall within 10-15 minutes. This isn't new to the Air, but the M5 produces more heat than the M4, which widens the gap between burst and sustained performance.

Pricing and the dev config calculus

The 13-inch M5 Air starts at $1,099 ($999 for education), up $100 from the M4 Air. The base config ships with 16GB of memory and 512GB of storage. The 15-inch starts at $1,299 and guarantees the full 10-core GPU (the cheapest 13-inch ships with an 8-core GPU variant).

For developer work, the 16GB base config will handle web development, mobile app builds, and moderate container workloads. If you're planning to run local LLMs, train models, or juggle Docker alongside an IDE, you want 24GB minimum. Stepping up to 32GB with 1TB of storage pushes the price into MacBook Pro territory, which is where the decision gets awkward: you're paying Pro prices for Air thermals.

The sweet spot for most developers is probably the 24GB/512GB 13-inch at whatever Apple charges for that config. You get enough memory for local inference work without bleeding into Pro pricing.

What's next

The M5 Pro and M5 Max landed in updated MacBook Pros at the same time as the Air, and they introduce Apple's new Fusion Architecture with dual-die packaging. The M5 Pro supports up to 64GB of unified memory at 307 GB/s bandwidth. For developers who need sustained performance and bigger model headroom, that's the real upgrade path.

Apple's Metal 4 and the new Tensor APIs let developers program the Neural Accelerators directly, and frameworks like Core ML and Metal Performance Shaders get automatic gains. The MLX ecosystem continues to grow on Hugging Face, with a dedicated mlx-community for quantized models.

The M5 Air is a solid machine that does one thing particularly well: it puts genuinely useful ML inference hardware in a $1,099 laptop. The 3.5-4x TTFT improvement for local LLMs is the headline stat that matters for developers, not the 15% CPU bump. But if you're already on an M4 Air and your workflow doesn't involve ML, the upgrade math doesn't work. Wait for M6.

Leon Vasquez covers developer tools and infrastructure for The Daily Vibe.

This article was AI-generated.

