Cohere Embed v4 vs Llama 3.2 90B Vision Instruct
Cohere Embed v4 and Llama 3.2 90B Vision Instruct are both tracked by AIDiveForge — the former a multimodal embedding model, the latter a multimodal large language model. Below is a side-by-side comparison of pricing, capabilities, platforms, and ownership, sourced from each tool's live website and verified before publishing.

Cohere Embed v4
Cohere Embed v4 transforms text, images, and mixed content into unified vector representations for semantic search, RAG, document clustering, and similarity matching. The model supports 1,536-dimensional embeddings with flexible compression via Matryoshka embeddings (256, 512, 1024, 1536 dimensions). Priced at $0.12/1M text tokens and $0.47/1M image tokens, it delivers multimodal capabilities competitive with text-only alternatives. The API supports batch processing up to 128,000 tokens per request with asymmetric search optimization. Limitation: incompatible with v3 embeddings; corpus re-embedding required for upgrades.
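The Matryoshka property means a full 1,536-dimensional vector can be truncated to a supported smaller size (e.g. 256) and re-normalized, trading accuracy for storage. A minimal sketch of that idea in plain Python (no Cohere SDK; the vector here is synthetic, not a real Embed v4 output):

```python
import math

def truncate_embedding(vec, target_dim):
    """Truncate a Matryoshka-style embedding to a smaller dimension
    and re-normalize to unit length so cosine similarity stays meaningful."""
    shortened = vec[:target_dim]
    norm = math.sqrt(sum(x * x for x in shortened))
    return [x / norm for x in shortened]

# Hypothetical 1,536-d embedding, compressed to 256 dims.
full = [0.01 * ((i % 7) - 3) for i in range(1536)]
compact = truncate_embedding(full, 256)
print(len(compact))  # 256
```

In production you would instead request the smaller dimension directly from the API so the service returns an already-normalized vector.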

Llama 3.2 90B Vision Instruct
Meta's 90B multimodal large language model with vision capabilities, fine-tuned for instruction-following across text and image understanding tasks.
| Attribute | Cohere Embed v4 | Llama 3.2 90B Vision Instruct |
|---|---|---|
| Pricing | Paid | — |
| Price | $0.12 per 1M text tokens; $0.47 per 1M image tokens | — |
| Free trial | No | No |
| Open source | No | Yes |
| Has API | Yes | No |
| Self-hosted option | No | No |
| Platforms | Cohere Platform, AWS Bedrock, Azure AI Foundry, Amazon SageMaker, GitHub Models | — |
| Languages | English and 100+ languages for text input; English for image input | — |
| Released | 2025-04-15 | — |
| Pros | — | — |
| Cons | — | — |
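Since Embed v4's pricing is per-token, a quick back-of-the-envelope cost estimate is straightforward. A small sketch using the published rates from the table above (token counts are illustrative):

```python
TEXT_PRICE_PER_M = 0.12   # USD per 1M text tokens (Cohere Embed v4)
IMAGE_PRICE_PER_M = 0.47  # USD per 1M image tokens

def embed_cost(text_tokens: int, image_tokens: int = 0) -> float:
    """Estimate Embed v4 cost in USD from raw token counts."""
    return (text_tokens / 1_000_000) * TEXT_PRICE_PER_M \
         + (image_tokens / 1_000_000) * IMAGE_PRICE_PER_M

# e.g. embedding a 10M-token text corpus plus 1M image tokens:
print(round(embed_cost(10_000_000, 1_000_000), 2))  # 1.67
```

Note that an incompatible upgrade from v3 would require re-embedding the whole corpus, so this cost recurs in full on migration.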
The two models also serve different roles: Embed v4 produces vector embeddings for search and retrieval, while Llama 3.2 90B Vision Instruct generates text from text-and-image prompts. Beyond that, Llama 3.2 90B Vision Instruct is open source, while only Cohere Embed v4 exposes a public API. Choose based on which difference matters most for your workflow.
Comparison data is sourced and verified by the AIDiveForge data pipeline. AIDiveForge is editorially independent.