Program-as-Weights: Compile-Once Adapter Paradigm Matches 32B Models at 1/50 the Memory
Researchers from the University of Waterloo introduce Program-as-Weights (PAW), where a 4B-parameter compiler generates small reusable adapter weights for tasks that resist rule-based solutions. A 0.6B Qwen3 interpreter guided by these adapters matches a 32B model while using 1/50th the inference memory and running at 30 tokens/second on a MacBook M3. The authors also release FuzzyBench, a 10-million-example training dataset.
Why it matters
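PAW reframes foundation model usage from running a large model on every input to a compile-once, run-many pattern: the 4B compiler runs once per task, and the 0.6B interpreter then handles each input using the resulting adapter. The 50× memory reduction brings 32B-class task performance to consumer hardware.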
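The compile-once, run-many pattern can be illustrated with a toy sketch. This is not the authors' code or API: ToyCompiler, ToyInterpreter, Adapter, and run_many are hypothetical stand-ins, and the "weights" are placeholder integers. It only shows the control flow, in which the expensive compile step happens once per task and every input afterwards touches only the small interpreter.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Adapter:
    """Hypothetical stand-in for the small, reusable adapter weights."""
    task: str
    weights: Tuple[int, ...]


class ToyCompiler:
    """Stand-in for the large compiler model; counts real compile steps."""

    def __init__(self) -> None:
        self.calls = 0
        self._cache: Dict[str, Adapter] = {}

    def compile(self, task: str) -> Adapter:
        # Compile only the first time a task is seen, then reuse the adapter.
        if task not in self._cache:
            self.calls += 1
            # Placeholder weights derived from the task text (illustrative only).
            self._cache[task] = Adapter(task, tuple(ord(c) % 7 for c in task))
        return self._cache[task]


class ToyInterpreter:
    """Stand-in for the small interpreter model; never calls the compiler."""

    def run(self, adapter: Adapter, text: str) -> int:
        # Toy binary "prediction" that depends on both adapter and input.
        return (sum(adapter.weights) + len(text)) % 2


def run_many(compiler: ToyCompiler, interpreter: ToyInterpreter,
             task: str, inputs: List[str]) -> List[int]:
    adapter = compiler.compile(task)  # expensive step, once per task
    return [interpreter.run(adapter, x) for x in inputs]  # cheap step, per input


if __name__ == "__main__":
    compiler, interpreter = ToyCompiler(), ToyInterpreter()
    run_many(compiler, interpreter, "flag rude replies", ["hi", "go away", "thanks"])
    print("compile steps:", compiler.calls)
```

In this sketch the adapter is the only artifact that crosses from the compile phase to the run phase, so the large model would not need to be resident at inference time.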
Importance: 3/5
92 Hugging Face Daily Papers upvotes (July 3, highest of the day); 50× memory reduction vs. a 32B model; 30 tokens/second on a MacBook M3