CoreAI Model Zoo
Pricing
- Model: Free
Summary
Converting a model to run on-device for Apple Silicon is a documentation minefield — undocumented quantization gotchas, format mismatches, and conversion steps that work on one chip variant and silently fail on another. CoreAI-Model-Zoo is a community repository that pre-converts LLMs and vision-language models to Apple's `.aimodel` format and publishes the verified conversion code alongside them.
The repo ships Qwen3.5, Qwen3.6, Gemma 4, GLM-4, and LFM variants already converted, verified against iPhone 17 Pro GPU and ANE, and downloadable from Hugging Face. Conversion code, known gotchas, custom Metal kernels, and a Swift runner are included so teams can replicate or extend the work rather than reverse-engineer it. The larger dense and MoE models — Qwen3.6-27B, Qwen3.6-35B-A3B, GLM-4.7-Flash — are flagged Mac-only, so iPhone deployment is constrained to the smaller quantized variants. There is no API, no inference server, and no tooling outside the Apple ecosystem; teams targeting Android, Windows, or server-side inference will find nothing applicable here.
Bottom line: Pick this when you are shipping an iPhone or Mac app that needs on-device LLM inference and want a verified starting point rather than a blank conversion script — abandon it the moment your deployment target is anything outside Apple Silicon.
Pros
- Pre-converted `.aimodel` files verified on iPhone 17 Pro GPU and ANE, so you skip the conversion trial-and-error that otherwise consumes a sprint before you write a single line of app code.
- Conversion scripts and documented gotchas are published alongside the models, which means when Apple updates the format and your model breaks, you have a reproducible starting point rather than a blank slate.
- Custom Metal kernel examples for ANE versus GPU benchmarking are included, so teams optimizing inference latency on-device have concrete code to profile against rather than guessing at kernel configuration.
- Models in the zoo are Apache-2.0 or MIT licensed, so commercial iOS app deployments are not blocked by license restrictions on the converted artifacts.
- Self-hosted and fully offline — no API calls, no telemetry, no dependency on an external service going down during your demo or your App Store submission review.
Cons
- Larger models (Qwen3.6-27B, Qwen3.6-35B-A3B, GLM-4.7-Flash) are explicitly Mac-only; iPhone deployment is limited to the smaller quantized variants, and teams building iPhone features around a 27B-class model will hit this wall at the architecture decision stage, not at integration.
- Model coverage reflects a single maintainer's conversion queue. When a team needs a model family not in the zoo — Mistral, Phi-4, LLaMA variants — there is no community pipeline to request or submit conversions, so they fork the conversion scripts and maintain their own repo from that point forward.
- There is no inference API, no server runtime, and no cross-platform path; teams that start here and later need Android parity or a backend inference endpoint abandon this entirely and re-implement against a different runtime such as llama.cpp or ONNX Runtime.
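The Mac-only restriction on the larger models noted above is consistent with simple memory arithmetic. The sketch below estimates the size of the weights alone from a parameter count and a bit width. The parameter counts are read off the model names; the 4-bit quantization level is an assumption, since the listing does not state which bit widths the zoo uses, and the function name is illustrative only.

```python
def weight_footprint_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB.

    Ignores the KV cache, activations, and runtime overhead, so the real
    memory requirement is higher than this figure.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Parameter counts taken from the model names; 4-bit is an assumed quantization level.
for name, params_b in [("Qwen3.6-27B", 27.0), ("Qwen3.6-35B-A3B", 35.0)]:
    print(f"{name}: ~{weight_footprint_gib(params_b, 4):.1f} GiB of weights at 4-bit")
```

For an MoE model such as Qwen3.6-35B-A3B, the "A3B" suffix indicates roughly 3B active parameters per token, but all experts generally have to be resident in memory, so the full 35B count is the relevant input here.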
About
- Platforms: iOS 27, macOS 27, iPhone 17 Pro, M4 Max
- API Available: No
- Self-Hosted: Yes
- Last Updated: 2026-06-13T16:22:56.910Z
Best For
Who it's for
- Apple Silicon on-device AI development
- Edge deployment of Qwen, Gemma, and LFM models
- Custom Metal kernel experimentation for Core AI
What it does well
- On-device LLM inference on iPhone and Mac
- Running converted vision-language models locally
- Object detection and segmentation with RF-DETR variants
- Benchmarking Core AI performance with custom kernels
Frequently Asked Questions
- Is CoreAI Model Zoo free?
- Yes — CoreAI Model Zoo is fully free to use. There is no paid tier.
- Is CoreAI Model Zoo open source?
- Yes. CoreAI Model Zoo is open source.
- Can I self-host CoreAI Model Zoo?
- Yes. CoreAI Model Zoo supports self-hosting on your own infrastructure.
- What platforms does CoreAI Model Zoo support?
- CoreAI Model Zoo is available on: iOS 27, macOS 27, iPhone 17 Pro, M4 Max.
CoreAI-Model-Zoo is an open-source community repository that converts open-weight LLMs into Apple’s Core AI `.aimodel` format — the successor to CoreML — and makes the results downloadable, verified, and reproducible. The core workflow is: download a pre-converted model from the linked Hugging Face repos, drop it into an iOS 27 or macOS 27 project, and run inference through the included Swift runner. For teams who need to convert their own models, the `conversion/` directory contains the scripts and the documented gotchas that are otherwise scattered across Apple developer forums.
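The download step of that workflow could be scripted along the following lines. This is a minimal sketch, not the repository's own tooling: it assumes the third-party `huggingface_hub` package, and the repo id, revision, and file patterns are hypothetical placeholders (the listing does not name the Hugging Face repos or say whether an `.aimodel` is a single file or a bundle directory).

```python
from pathlib import Path


def build_download_kwargs(repo_id: str, revision: str, dest: Path) -> dict:
    """Assemble keyword arguments for huggingface_hub.snapshot_download."""
    return {
        "repo_id": repo_id,
        "revision": revision,  # pin a commit hash or tag instead of tracking "main"
        "local_dir": str(dest),
        # Assumed patterns: cover .aimodel as either a file or a bundle directory,
        # plus any JSON config shipped next to it.
        "allow_patterns": ["*.aimodel", "*.aimodel/*", "*.json"],
    }


def fetch_model(repo_id: str, revision: str, dest: Path) -> Path:
    """Download the pinned snapshot and return the local directory."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    dest.mkdir(parents=True, exist_ok=True)
    return Path(snapshot_download(**build_download_kwargs(repo_id, revision, dest)))


# Hypothetical usage; replace the placeholders with a real repo id and revision:
# fetch_model("example-org/example-model-aimodel", "<commit-hash>", Path("models/example"))
```

Pinning `revision` keeps a project on a known snapshot even if the upstream files are replaced later.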
The differentiating feature is verification. The models are not just converted and uploaded — they are confirmed to run on actual hardware (iPhone 17 Pro GPU and ANE), with notes on what broke during conversion and what workaround was applied. That paper trail is the part Apple’s own documentation does not provide, and it is what separates this from a generic model dump.
The repository fits teams doing early-stage on-device AI development on Apple platforms who need a working baseline fast. It breaks — or rather, becomes irrelevant — the moment a project requires cross-platform deployment, a REST inference API, production monitoring, or any model family outside the zoo’s current coverage. There is no versioning policy, no SLA, and the model selection reflects one maintainer’s conversion priorities. Teams that outgrow the zoo’s model list will use the conversion code as a template but end up maintaining their own fork.
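Because there is no versioning policy, a team depending on a particular model file may want its own record of what it actually shipped. One possible approach, sketched here with only the Python standard library, is to keep a SHA-256 manifest of the downloaded model directory and diff it after each re-download. The function names and the manifest layout are illustrative assumptions, not something the repository provides.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weights are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(model_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to model_dir, POSIX style) to its SHA-256."""
    return {
        p.relative_to(model_dir).as_posix(): sha256_of(p)
        for p in sorted(model_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Return paths that were added, removed, or modified between two manifests."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
```

Committing the manifest next to the app code gives a cheap signal that an upstream model changed before the difference shows up as a behavior regression on device.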
The `knowledge/` directory documents Core AI-specific behavior: quantization constraints, MoE handling, MLA architecture notes for GLM-4.7-Flash, and custom Metal kernel experiments for benchmarking ANE versus GPU throughput. The `apps/` directory provides reference Swift integrations. No CI pipeline for model validation is documented in the material reviewed here, so the freshness of any given model file depends on maintainer cadence.
