AIDiveForge

Multimodal LLMs With an API

As of June 2026, AIDiveForge tracks 3 multimodal LLMs with an API. Listings are verified against each tool's live website and re-checked regularly.

Last updated June 4, 2026 · 3 tools

  1. Claude Sonnet 4.5

    Claude Sonnet 4.5 is a large language model from Anthropic with particular strengths in software coding, in agentic tasks where it runs in a loop and uses tools, and in computer use. According to Anthropic, the model can maintain focus for more than 30 hours on complex, multi-step tasks. Pricing is the same as Claude Sonnet 4: $3 per million input tokens and $15 per million output tokens. Anthropic describes it as its most aligned frontier model at the time of release, with large improvements across several areas of alignment compared to previous Claude models.

    Paid
  2. Llama 4 Scout

    Llama 4 Scout carries a 10M-token context window, so you can feed it an entire codebase or a stack of legal documents in a single pass, without chunking pipelines or retrieval workarounds. Its sibling model, Llama 4 Maverick, trades raw context depth for stronger multimodal reasoning, handling interleaved image and text inputs through a native early-fusion architecture rather than a bolted-on vision adapter. Both models ship as open weights, downloadable from Hugging Face after license acceptance, with no API bill if you run them yourself. The limit shows up at inference: the Mixture-of-Experts architecture demands hardware that most teams do not have sitting idle, and using Scout's full 10M-token context window requires more GPU memory than a standard cloud instance provides.

    Free · Open Source
  3. Muse Spark

    Muse Spark is a natively multimodal reasoning model developed by Meta Superintelligence Labs, with support for tool use, visual chain of thought, and multi-agent orchestration.

    Paid

Listings on this page are sourced and verified by the AIDiveForge data pipeline. AIDiveForge is editorially independent — no money changes hands for inclusion.