Model Hosting APIs
Hosted inference APIs (e.g., Replicate, Together AI, Fireworks AI).
PromptUnit
AI proxy that automatically routes requests to cheaper models while maintaining quality.
Xinference
Open-source library for unified deployment and serving of language, speech, and multimodal models across diverse hardware and…