Is Whissle Gateway free?

Partly. The listing is tagged Freemium: a free browser app is available and the stack can be self-hosted, while the hosted API is paid.

Is Whissle Gateway open source?

Yes. Whissle Gateway is listed as open source, although the specific license is unverified.

Does Whissle Gateway have an API?

Yes. Whissle Gateway exposes a developer API. See the official documentation at https://whissle.ai for details.

Can I self-host Whissle Gateway?

Yes. Whissle Gateway supports self-hosting on your own infrastructure.

What platforms does Whissle Gateway support?

Whissle Gateway is available on: macOS, Linux, WSL, Docker.

License: unverified

Local-run terms: Self-host full stack via Docker on macOS/Linux/WSL after one-line install script; free browser app available.

Paid Hosted API

Whissle Gateway

Freemium, Open Source, API, Self-Hosted, Agentic

Summary

Standard ASR pipelines capture words and throw away everything else — emotion gone, intent gone, speaker context gone, all in the same moment the transcript lands. Whissle is built for the teams who needed that context to actually route, escalate, or act on the call.

Whissle's Stream2Action architecture feeds audio through META-1, a single-pass discriminative model, and returns structured JSON carrying transcription, speaker diarization, emotion, intent, age, gender, and entities simultaneously. Text streaming and video input are on the roadmap rather than available today, with text streaming expected first. The full stack (ASR, LLM, TTS, diarization) runs self-hosted on a single GPU via Docker, which is the core production story here. The cloud API is documented as temporarily down while on-prem infrastructure is reinforced, so teams who need cloud failover have no fallback path right now. For contact center or privacy-sensitive workloads where you control the hardware, the on-prem path is active; for anything cloud-dependent, you are waiting.

Bottom line: Pick Whissle for a self-hosted, single-GPU voice intelligence stack where emotion and intent metadata matter — plan on a different architecture if your deployment requires a live cloud API or video ingestion before those roadmap items ship.

Audio & Voice Transcription / STT

Added on June 18, 2026

Pros

Single-pass emotion, intent, speaker, and entity extraction alongside transcription, so downstream routing logic gets a structured JSON payload instead of raw text that requires a second model call to interpret.
Full stack — ASR, LLM, TTS, diarization — runs on a single GPU via self-hosted Docker, which means teams in regulated industries can keep audio on-prem without stitching together separate self-hosted components.
META-1 processes in real time rather than post-call, so a contact center agent or escalation router receives intent signals while the call is still active — not after it ends.
Provider-agnostic, open-source self-hosted architecture, so teams are not locked to a vendor's cloud pricing model when inference volume scales.
The browser and macOS app extend the same intelligence stack to ambient and on-device scenarios, so developers can prototype voice agents locally before committing to a server deployment.
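The in-call routing described in the pros above can be sketched as a small rule that watches per-segment metadata while the call is still active. This is a minimal illustration, not Whissle's API: the field names (`emotion`, `intent`), the label values, and the window and threshold settings are all assumptions chosen for the example.

```python
from collections import deque

# Assumed label names; the real label set may differ.
NEGATIVE_EMOTIONS = {"angry", "frustrated"}
ESCALATION_INTENTS = {"cancel_subscription", "speak_to_manager"}


class EscalationRouter:
    """Flags a call for escalation while it is still in progress."""

    def __init__(self, window: int = 3, threshold: int = 2):
        self.recent = deque(maxlen=window)  # emotions of the last `window` segments
        self.threshold = threshold

    def observe(self, segment: dict) -> str:
        """Return 'escalate' or 'continue' for one incoming segment."""
        # An explicit escalation intent wins immediately.
        if segment.get("intent") in ESCALATION_INTENTS:
            return "escalate"
        self.recent.append(segment.get("emotion", "neutral"))
        negatives = sum(e in NEGATIVE_EMOTIONS for e in self.recent)
        return "escalate" if negatives >= self.threshold else "continue"


router = EscalationRouter()
stream = [  # hypothetical segments, in arrival order
    {"emotion": "neutral", "intent": "billing_question"},
    {"emotion": "frustrated", "intent": "billing_question"},
    {"emotion": "angry", "intent": "billing_question"},
]
decisions = [router.observe(s) for s in stream]
print(decisions)  # ['continue', 'continue', 'escalate']
```

The sliding window keeps a single negative segment from triggering a handoff, while an explicit escalation intent still routes at once.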

Cons

The cloud API is explicitly offline at the time of listing. Teams that need a hosted endpoint for testing, staging, or production fallback have no active path — they either self-host immediately or wait for service restoration with no stated timeline.
Video input is on a multi-month roadmap and text streaming is listed as coming next month; teams building pipelines that ingest video or require text-stream intelligence today will hit a hard capability gap and need a different tool for those modalities.
Agents Studio — the interface for building and deploying multi-modal voice agents — is listed as cloud-only and coming soon. Teams who need a visual agent-building environment now will find no equivalent on the self-hosted Gateway path, pushing them toward competitors like Vapi or Retell that have live agent-building tooling.
Community stress-test data on single-GPU throughput under sustained concurrent call load is not publicly available. Teams running high-volume contact center deployments cannot size hardware requirements from documented benchmarks — they are provisioning blind until they run their own load tests.
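Until public benchmarks exist, a team's own load test can feed a rough sizing estimate. The sketch below assumes a simple linear model: if one stream costs `real_time_factor` seconds of GPU time per second of audio, one GPU sustains roughly `headroom / real_time_factor` streams. Both function names are hypothetical, the numbers are placeholders rather than measurements of Whissle, and the model ignores batching effects, so the real-time factor should be measured under realistic concurrent load.

```python
import math


def max_concurrent_streams(real_time_factor: float, headroom: float = 0.75) -> int:
    """Estimate how many live streams one GPU can sustain.

    real_time_factor: measured seconds of compute per second of audio.
    headroom: fraction of the GPU's time budget you are willing to use.
    """
    if real_time_factor <= 0:
        raise ValueError("real_time_factor must be positive")
    return math.floor(headroom / real_time_factor)


def gpus_needed(peak_calls: int, real_time_factor: float, headroom: float = 0.75) -> int:
    """Round up to whole GPUs for a given peak number of concurrent calls."""
    per_gpu = max_concurrent_streams(real_time_factor, headroom)
    if per_gpu == 0:
        raise ValueError("a single stream exceeds the per-GPU budget")
    return math.ceil(peak_calls / per_gpu)


# Placeholder inputs: replace with figures from your own load test.
print(max_concurrent_streams(0.1))   # 7
print(gpus_needed(200, 0.1))         # 29
```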

About

Platforms: macOS, Linux, WSL, Docker
API Available: Yes
Self-Hosted: Yes
Last Updated: 2026-06-18T04:32:07.270Z

Best For

Who it's for

Real-time voice and multi-modal processing
Self-hosted or on-prem deployments
Applications needing emotion/intent alongside transcription
Single-GPU voice AI stacks
Developers building streaming agents

What it does well

Contact center voice agents resolving calls quickly
Real-time transcription with speaker and emotion metadata
Stream-to-action pipelines for audio/text/video inputs
On-device or on-prem enterprise search and intelligence
Development of privacy-first voice agents

Traditional ASR returns a transcript. What it discards — tone, speaker identity, emotional state, intent signal — is often the information a contact center agent or routing system actually needs to act. Whissle’s Stream2Action pipeline addresses that gap by running audio (and eventually text and video) through META-1, a multi-modal discriminative model that produces a single structured JSON payload per pass: transcription with punctuation, speaker diarization, emotion tags, intent classification, entity extraction, and speech analysis metrics like fluency and pitch. The output feeds directly into a generative LLM layer, a routing engine, or third-party APIs via webhooks — the vendor describes the architecture as converting any stream into actionable intelligence without a multi-step pipeline.
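To make the shape of such a payload concrete, here is a minimal sketch of decoding one structured result into a typed record before handing it to a router or LLM layer. Every key name (`transcript`, `speaker`, `emotion`, `intent`, `entities`) and the sample values are assumptions for illustration; the actual schema should be taken from the official documentation.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One analysed utterance; all field names are hypothetical stand-ins."""
    text: str
    speaker: str
    emotion: str
    intent: str
    entities: list = field(default_factory=list)


def parse_segment(raw: str) -> Segment:
    """Decode one JSON payload, using neutral defaults for missing keys."""
    data = json.loads(raw)
    return Segment(
        text=data.get("transcript", ""),
        speaker=data.get("speaker", "unknown"),
        emotion=data.get("emotion", "neutral"),
        intent=data.get("intent", "none"),
        entities=data.get("entities", []),
    )


# Illustrative payload only; not copied from Whissle's documentation.
sample = json.dumps({
    "transcript": "I want to cancel my subscription.",
    "speaker": "caller",
    "emotion": "angry",
    "intent": "cancel_subscription",
    "entities": [{"type": "product", "value": "subscription"}],
})

segment = parse_segment(sample)
print(segment.intent, segment.emotion)  # cancel_subscription angry
```

Defaulting missing keys keeps downstream routing code from failing when a given pass omits a field.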

The differentiating claim is speed with depth. The vendor positions META-1 as bridging the gap between fast-but-shallow ASR and deep-but-slow multi-modal LLMs — a single forward pass that returns semantic metadata in real time rather than after the fact. Whether that holds at production call volumes is something the community has not yet stress-tested publicly, and the cloud API being offline limits independent verification.

Whissle fits best for teams that have the infrastructure to self-host and a use case where sending audio to a third-party cloud is a non-starter: contact center voice agents, on-prem enterprise search, or privacy-first voice apps. The Docker install is documented as a one-line curl command that pulls the image and starts it with Docker Compose, covering macOS, Linux, and WSL. The Agents Studio for building and deploying multi-modal voice agents is listed as coming soon on cloud; on-prem agent workflows are available through the Gateway today.

The browser product (macOS, free download) and a desktop macOS app extend Whissle to ambient voice intelligence and on-device AI scenarios. The API surface covers ASR, TTS, LLM, and voice agents with developer docs and streaming examples published — though with cloud services offline, live API testing against the hosted endpoint is blocked until service resumes.
