Google Speech Recognition

Price per Channel

$50.00

By using the Google Speech Recognition (GSR) plugin for UniMRCP Server, IVR platforms can use the Google Cloud Speech API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.
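To make the MRCP side concrete, the following is a minimal sketch of how an MRCPv2 RECOGNIZE request could be framed by a client. It covers MRCPv2 only (MRCPv1 is carried over RTSP and is framed differently). The channel identifier, request ID, and grammar URI are hypothetical placeholders, and the helper name `build_mrcpv2_request` is an assumption for illustration. In practice the IVR platform's MRCP client produces this framing.

```python
def build_mrcpv2_request(method, request_id, headers, body=""):
    """Frame an MRCPv2 request.

    The start line has the form "MRCP/2.0 <message-length> <method> <request-id>",
    where message-length counts every byte of the message, start line included.
    """
    body_bytes = body.encode("utf-8")
    lines = [f"{name}: {value}" for name, value in headers.items()]
    if body_bytes:
        lines.append(f"Content-Length: {len(body_bytes)}")
    rest = ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8") + body_bytes

    # message-length includes its own digits, so iterate until it is stable.
    length = 0
    while True:
        start_line = f"MRCP/2.0 {length} {method} {request_id}\r\n".encode("utf-8")
        total = len(start_line) + len(rest)
        if total == length:
            return start_line + rest
        length = total


request = build_mrcpv2_request(
    "RECOGNIZE",
    1,  # hypothetical request ID
    {
        "Channel-Identifier": "32AECB23433801@speechrecog",  # hypothetical channel
        "Content-Type": "text/uri-list",
    },
    body="http://example.com/grammars/menu.grxml\r\n",  # hypothetical grammar URI
)
print(request.decode("utf-8"))
```

The same framing applies regardless of which recognition plugin is loaded on the server, which is what lets an IVR platform switch engines without changing its MRCP client.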

The Google Cloud Speech API performs speech-to-text conversion powered by machine learning and provides the following main features.

Automatic Speech Recognition

Automatic Speech Recognition (ASR) based on deep-learning neural networks drives applications such as voice search and speech transcription.

Global Vocabulary

Recognizes over 110 languages and variants with an extensive vocabulary.

Streaming Recognition

Returns recognition results while the user is still speaking.

Word Hints

Speech recognition can be customized to a specific context by providing a set of words and phrases that are likely to be spoken. This is especially useful for adding custom words and names to the vocabulary and in voice-control use cases.
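As an illustration of word hints, the sketch below builds the JSON body of a Cloud Speech REST `speech:recognize` request in which the hints travel in `speechContexts`. It shows the shape of the cloud API request, not the plugin's own configuration; how the GSR plugin maps MRCP parameters onto these fields is not described here. The function name, the sample phrases, and the audio placeholder are assumptions.

```python
import json


def build_recognize_body(audio_b64, phrases, language_code="en-US",
                         sample_rate_hz=8000, profanity_filter=False):
    """Build a request body with word hints for Cloud Speech `speech:recognize`."""
    return {
        "config": {
            "encoding": "MULAW",                # common telephony encoding
            "sampleRateHertz": sample_rate_hz,  # 8 kHz is typical for IVR audio
            "languageCode": language_code,
            # Word hints: phrases the caller is likely to say.
            "speechContexts": [{"phrases": list(phrases)}],
            # Inappropriate content filtering, where the language supports it.
            "profanityFilter": profanity_filter,
        },
        "audio": {"content": audio_b64},
    }


body = build_recognize_body(
    audio_b64="<base64-encoded audio>",  # placeholder, not real audio
    phrases=["account balance", "UniMRCP", "transfer funds"],  # hypothetical hints
)
print(json.dumps(body, indent=2))
```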

Noise Robustness

Handles noisy audio from many environments without requiring additional noise cancellation.

Inappropriate Content Filtering

Filters inappropriate content in text results for some languages.

Kaldi Speech Recognition

Price per Channel

$50.00

By using the Kaldi Speech Recognition plugin for UniMRCP Server, IVR platforms can use the Kaldi Speech Recognition Toolkit via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

The Kaldi plugin connects to the Kaldi GStreamer Server, which needs to be installed separately. This integration is primarily intended for teams experienced with Kaldi that build their own speech recognition systems, with particular attention to Deep Neural Networks (DNNs). The plugin allows both easy integration and reuse of existing infrastructure.

PocketSphinx Speech Recognition

Price per Channel

$50.00

By using the PocketSphinx Speech Recognition plugin for UniMRCP Server, IVR platforms can use the PocketSphinx speech recognition engine via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

PocketSphinx is a lightweight, open-source, speaker-independent continuous speech recognition engine.

Julius Speech Recognition

Price per Channel

$50.00

By using the Julius Speech Recognition plugin for UniMRCP Server, IVR platforms can use the Julius speech recognition engine via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

Julius is a high-performance, small-footprint large vocabulary continuous speech recognition (LVCSR) decoder. Based on word N-grams and context-dependent HMMs, it can perform real-time decoding on a range of computers and devices, from microcomputers to cloud servers. The algorithm is based on a two-pass tree-trellis search that incorporates major decoding techniques such as tree-organized lexicon, 1-best / word-pair context approximation, rank/score pruning, N-gram factoring, cross-word context dependency handling, enveloped beam search, Gaussian pruning, and Gaussian selection.

Watson Speech Recognition

Price per Channel

$50.00

By using the Watson Speech Recognition (SR) plugin for UniMRCP Server, IVR platforms can use the IBM Watson Speech to Text API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

The IBM Watson Speech to Text API performs speech transcription powered by machine learning and supports the following main features.

Powerful Real-time Speech Recognition

Automatically transcribe audio in real time. Rapidly identify and transcribe what is being discussed, even from lower-quality audio, across a variety of audio formats and programming interfaces.

Highly Accurate Speech Engine

Customize your model to improve accuracy for the language and content you care most about, such as product names, sensitive subjects, or names of individuals. Recognize different speakers in your audio. Spot specified keywords in real time with high accuracy and confidence.

Built to Support Various Use Cases

Transcribe audio for use cases ranging from real-time transcription of microphone audio to analyzing thousands of audio recordings from your call center to provide meaningful analytics.

Languages

The speech recognition API currently supports seven languages.

Yandex Speech Recognition

Price per Channel

$50.00

By using the Yandex Speech Recognition (SR) plugin for UniMRCP Server, IVR platforms can use the Yandex Cloud Speech to Text API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

The Yandex Speech to Text API performs speech transcription powered by machine learning and supports the following main features.

Real-time Speech Recognition

Automatically transcribe audio in real time using gRPC streaming.

Fault-free Service Infrastructure

The service infrastructure is designed with high loads in mind to ensure that the system is available and fault-free.

Numerous Models

Numerous recognition models such as maps, dates, names and numbers are supported.

Languages

The speech recognition API currently supports two languages.

GoVivace Speech Recognition

Price per Channel

$50.00

By using the GoVivace Speech Recognition plugin for UniMRCP Server, IVR platforms can use the GoVivace Speech APIs via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

The GoVivace plugin connects to the GoVivace Server, which must either be installed separately or be used as a service.

Azure Speech Recognition

Price per Channel

$50.00

By using the Azure Speech Recognition (SR) plugin for UniMRCP Server, IVR platforms can use the Microsoft Azure Speech API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

The Microsoft Azure Speech API performs speech-to-text conversion powered by machine learning and supports the following main features.

Advanced Speech Recognition Technologies

Advanced speech recognition technologies from Microsoft that are used by Cortana, Office Dictation, Office Translator, and other Microsoft products.

Real-time Continuous Recognition

The speech recognition API enables users to transcribe audio into text in real time and can return intermediate results for the words recognized so far. The speech service also supports end-of-speech detection.

Customized Language and Acoustic Models

For scenarios that require customized language and acoustic models, Custom Speech Service lets you create speech models tailored to your application and your users.

Languages

The speech recognition API supports many spoken languages in multiple dialects.

Transcribe Speech Recognition

Price per Channel

$50.00

By using the Amazon Web Services (AWS) Transcribe plugin for UniMRCP Server, IVR platforms can use the AWS Transcribe API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

Amazon Transcribe uses deep learning to convert speech to text quickly and accurately, and provides the following main features:

Easy-to-Read Transcriptions

Amazon Transcribe automatically adds punctuation and formatting so that the output closely matches the quality of manual transcription at a fraction of the time and expense.

Streaming Transcription

You can process audio in batch or in near real-time. Using a secure connection, you can send a live audio stream to the service, and receive a stream of text in response.

Timestamp Generation

Amazon Transcribe returns a timestamp for each word, so that you can easily find a word or phrase in the original recording or add subtitles to video.
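As a rough sketch of how word-level timestamps can be used, the function below locates a phrase in a list of transcript items and returns where it starts and ends in the recording. The item layout is modeled on Amazon Transcribe's batch JSON output (`start_time`, `end_time`, `alternatives`, `type`), which is an assumption here; streaming responses and the plugin's own results may be shaped differently. The sample data and the name `find_phrase` are made up for illustration.

```python
def find_phrase(items, phrase):
    """Return (start, end) in seconds for the first match of `phrase`, else None."""
    target = phrase.lower().split()
    if not target:
        return None
    # Punctuation items carry no timestamps, so keep spoken words only.
    words = [item for item in items if item["type"] == "pronunciation"]
    for i in range(len(words) - len(target) + 1):
        window = words[i:i + len(target)]
        spoken = [w["alternatives"][0]["content"].lower() for w in window]
        if spoken == target:
            return float(window[0]["start_time"]), float(window[-1]["end_time"])
    return None


# Hypothetical sample items, not real service output.
sample_items = [
    {"type": "pronunciation", "start_time": "0.50", "end_time": "0.80",
     "alternatives": [{"content": "Check"}]},
    {"type": "pronunciation", "start_time": "0.85", "end_time": "1.10",
     "alternatives": [{"content": "my"}]},
    {"type": "pronunciation", "start_time": "1.20", "end_time": "1.60",
     "alternatives": [{"content": "account"}]},
    {"type": "pronunciation", "start_time": "1.65", "end_time": "2.05",
     "alternatives": [{"content": "balance"}]},
    {"type": "punctuation", "alternatives": [{"content": "."}]},
]

print(find_phrase(sample_items, "account balance"))
```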

Custom Vocabulary

You can add new words to the base vocabulary to generate more accurate transcriptions for domain-specific words and phrases like product names, technical terminology, or names of individuals.
