Merion Networks

Voice technologies have made a huge leap in the past few years. Advanced contact centers use speech recognition more and more often and, sometimes, voice verification. Technologies of this kind are known as ASR (Automatic Speech Recognition).

We wrote a really cool article about setting up ASR on the Asterisk IP-PBX. As the recognition platform we are going to use Yandex SpeechKit. Are you ready to make a leap into the world of voice recognition?

Important! If you use SpeechKit in commercial projects, you need to sign an agreement with Yandex.

How is it supposed to work?

We will consider the following scenario: your Asterisk receives a call. The caller hears the following message: "Hello! Thank you for your call. If you want to know your order status, just say order; to talk to the secretary, say secretary; and to talk to an operator, say operator." After that, the call is routed to the correct extension.

It all works according to the following scheme:

Integration scheme between Asterisk and Yandex SpeechKit

Getting the API key

The first thing you need to do is get a token from Yandex. Go to the Yandex developer's cabinet and click Get the key:

Getting the key for Yandex SpeechKit

In the key creation field, enter a name for the key. Choose the SpeechKit Cloud service:

Choosing the service for the Yandex API

The token is ready. Copy it and save it on your computer.

Token for advanced speech recognition

Integration script

A little bit of work is left before our Asterisk starts to recognize speech. We will use a PHP script, called through the AGI application in the dialplan. The script is below, with comments in the code.

#!/usr/bin/php -q
<?php
require('phpagi.php'); #connecting the AGI library (PHPAGI);
$agi = new AGI();
$audio = $argv[1]; #value passed from the dialplan: path (without extension) of the audio file with everything the caller said;
$token = 'your_token';
$api_url = 'SPEECHKIT_ASR_URL'; #placeholder for the SpeechKit recognition address, take the real URL from the Yandex documentation;
$theme = "queries"; #language model. queries - general requests on different topics;
$lang = "en-EN"; #recognition language;
$uuid = md5(uniqid(rand(), true)); #unique ID which consists of 32 symbols;
system('sox '.escapeshellarg($audio.'.wav').' -r 16000 -b 16 -c 1 '.escapeshellarg($audio.'-conv.wav')); #converting audio into the required format;
$url = $api_url.'?key='.$token.'&uuid='.$uuid.'&topic='.$theme.'&lang='.$lang;
exec('curl --silent -F "Content-Type=audio/x-pcm;bit=16;rate=16000;" -F '.escapeshellarg('audio=@'.$audio.'-conv.wav').' '.escapeshellarg($url), $result); #sending the CURL request to the Yandex API for recognition;
$result_asr = implode($result); #creating a string from the array;
if (preg_match('!<variant[^>]*>(.*?)</variant>!si', $result_asr, $arr)) {
    $asr_res = $arr[1];
} else {
    $asr_res = '';
} #take the first recognition variant from the result (assuming the response wraps each variant in a <variant> tag);
if (substr_count($asr_res, 'operator') > 0) {
    $ress = 1; #if the word operator is in the results, 1 is returned to the dialplan;
} elseif (substr_count($asr_res, 'order') > 0) {
    $ress = 2; #if the word order is in the results, 2 is returned to the dialplan;
} elseif (substr_count($asr_res, 'secretary') > 0) {
    $ress = 3; #if the word secretary is in the results, 3 is returned to the dialplan;
} else {
    $ress = 0; #if no match is found, 0 is returned to the dialplan;
}
$agi->set_variable("asr", $ress); #return the variable with the recognition value to the dialplan;
@unlink($audio.'.wav');
@unlink($audio.'-conv.wav'); #delete temporary files with the caller's request;
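
The step that pulls the recognized text out of the service response can be sketched on its own. This is only a minimal sketch, assuming the response is XML in which each recognition hypothesis sits in a <variant confidence="..."> element. The function name best_variant and the sample response are hypothetical, made up for illustration, and are not taken from the Yandex documentation.

```php
<?php
// Hypothetical helper: return the text of the variant with the highest
// confidence, or an empty string when the response contains no variants.
// Assumption: hypotheses look like <variant confidence="0.9">text</variant>.
function best_variant(string $xml): string
{
    $pattern = '!<variant\s+confidence="([^"]*)"\s*>(.*?)</variant>!si';
    if (!preg_match_all($pattern, $xml, $matches, PREG_SET_ORDER)) {
        return '';
    }
    $best = '';
    $bestConfidence = -INF;
    foreach ($matches as $match) {
        $confidence = (float) $match[1];
        if ($confidence > $bestConfidence) {
            $bestConfidence = $confidence;
            $best = trim($match[2]);
        }
    }
    return $best;
}

// Made-up sample response, used only to exercise the function.
$sample = '<recognitionResults success="1">'
        . '<variant confidence="0.4">border</variant>'
        . '<variant confidence="0.9">order</variant>'
        . '</recognitionResults>';

echo best_variant($sample), "\n";
```

Picking the highest-confidence variant instead of the first one is a design choice of this sketch. If the real response always lists the best hypothesis first, taking the first match is enough.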

We want to highlight that you can use your own triggers for voice recognition (the words the caller will say). You just need to change the words operator, order and secretary inside the script.
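
One way to keep the triggers easy to change is to hold them in a single table instead of a chain of if/elseif branches. A minimal sketch, assuming the same codes 1, 2, 3 and 0 that the script returns to the dialplan; the names $triggers and match_trigger are hypothetical:

```php
<?php
// Hypothetical trigger table: spoken word => code returned to the dialplan.
$triggers = [
    'operator'  => 1,
    'order'     => 2,
    'secretary' => 3,
];

// Return the code of the first trigger found in the recognized text,
// or 0 when none of the triggers occurs in it.
function match_trigger(string $text, array $triggers): int
{
    foreach ($triggers as $word => $code) {
        // stripos() is case-insensitive for ASCII letters only.
        if (stripos($text, $word) !== false) {
            return $code;
        }
    }
    return 0;
}

echo match_trigger('I want to check my order', $triggers), "\n"; // prints 2
```

With this layout a new trigger is one extra line in the array. The table is checked in order, so if one trigger is a substring of another, put the longer word first.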

The working scheme is simple. By the way, you can download the script itself via the link below:

Download ASR script
After you have downloaded the file, please save it with the .php extension.

Save the script as asr.php in the /var/lib/asterisk/agi-bin directory and execute the following commands on the server:

dos2unix /var/lib/asterisk/agi-bin/asr.php 
chown asterisk:asterisk /var/lib/asterisk/agi-bin/asr.php 
chmod 775 /var/lib/asterisk/agi-bin/asr.php


Now we are going to change the dialplan. Open the file /etc/asterisk/extensions_custom.conf and add the following there:

exten => s,1,Answer()
exten => s,n,Playback(custom/asr)
exten => s,n,Wait(1)
exten => s,n,Record(/tmp/${UNIQUEID}.wav,3,20)
exten => s,n,AGI(asr.php,/tmp/${UNIQUEID})
exten => s,n,Set(varasr=${asr})
exten => s,n,GotoIf($["${varasr}" = "1"]?dial111:ordercheck)
same => n(dial111),Dial(SIP/111,15,rt)
same => n,Hangup()
exten => s,n(ordercheck),GotoIf($["${varasr}" = "2"]?dial222:secretarycheck)
same => n(dial222),Dial(SIP/222,15,rt)
same => n,Hangup()
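; if nothing matched (asr = 0), execution falls through to the next priority and the call also goes to 333
exten => s,n(secretarycheck),GotoIf($["${varasr}" = "3"]?dial333)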
same => n(dial333),Dial(SIP/333,15,rt)
same => n,Hangup()

In this context you greet the caller and tell them about the voice options (using the file /var/lib/asterisk/sounds/ru/custom/asr.wav), and then play an audio signal. The caller then says their request, which is recorded to a file. If the caller said operator, we call the number 111; if the word was order, 222; and if it was secretary, 333. You can change the numbering scheme to fit your dialplan.

We are almost there. Let's set it up through FreePBX. We are going to use the Custom Destinations module. Go to Admin → Custom Destinations and click Add Destination:

Custom Destinations for speech recognition in FreePBX

Set it up the same way as in the screenshot above, then click Submit and Apply Config. Then go to your IVR menu (or it could be just an inbound route). In our example, speech recognition starts when the caller presses 1 in the main interactive voice menu:

Yandex ASR in FreePBX

We've done all the necessary steps. In the near future we'll try to tell you about caller authentication by voice, fully based on open-source solutions.