ANEMLL is a new open-source project focused on accelerating the porting of Large Language Models (LLMs) to Apple’s Neural Engine (ANE) for on-device inference. Currently, it supports optimized versions of Meta’s LLaMA 3.2 models (1B & 8B), DeepSeek R1 8B, and DeepHermes models. The project provides a pipeline from model conversion to inference, using Core ML along with SwiftUI sample code. The Alpha 0.3.0 release includes tools for converting models, a chat interface for running inference (with conversation history management), and a reference implementation in Swift. To use ANEMLL, you’ll need a macOS Sequoia system with at least 16GB of RAM, the Xcode Command Line Tools, and a Python virtual environment. The project is actively seeking community contributions and showcases integrations such as anemll-server. It’s licensed under the MIT License and can be found on Hugging Face: [https://huggingface.co/anemll](https://huggingface.co/anemll).
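
To give a sense of what "on-device inference via Core ML" looks like, here is a minimal sketch (not ANEMLL's actual API) that loads a converted Core ML model package with coremltools and asks Core ML to schedule work on the Neural Engine. The package name and the input/output feature names are hypothetical; a real ANEMLL conversion defines its own names and shapes.

```python
import numpy as np
import coremltools as ct

# Request CPU + Neural Engine so Core ML can dispatch eligible layers to the ANE.
model = ct.models.MLModel(
    "llama_3_2_1b.mlpackage",          # hypothetical path to a converted model
    compute_units=ct.ComputeUnit.CPU_AND_NE,
)

# Hypothetical input: a batch of token IDs. The actual feature names and
# shapes come from the converted model's description.
token_ids = np.zeros((1, 64), dtype=np.int32)
outputs = model.predict({"input_ids": token_ids})
print(outputs.keys())
```

In practice the heavy lifting (tokenization, KV-cache handling, sampling) is what a project like ANEMLL's chat interface and Swift reference implementation wrap around this kind of Core ML call.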