Speech to text using JavaScript

Published on November 25, 2016

Speech recognition software is everywhere nowadays: Google’s Assistant, Apple’s Siri, Amazon’s Echo, and many others. Almost every big software company has its own AI implementation. It is like the browser wars of 10 years ago, except this time it is an AI war…

Anyway, today we’ll create a simple web application that we can shout at. It will recognize what we say, convert it into text, and show it in a text box. Isn’t that cool?

Demo

To do this, we will use the [SpeechRecognition](https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi) API.

This technology is still new. Not every browser supports it, and some support it only behind a vendor prefix.
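Because support varies, it helps to check for the API before using it. The following is a minimal sketch, not a definitive implementation: `findSpeechRecognition` is a hypothetical helper name, and the list of prefixed names simply mirrors the ones tried in this tutorial.

```javascript
// Hypothetical helper: returns the SpeechRecognition constructor found on
// `scope`, or null when neither the standard nor a prefixed name exists.
function findSpeechRecognition(scope) {
  var names = [
    "SpeechRecognition",
    "webkitSpeechRecognition",
    "mozSpeechRecognition",
    "msSpeechRecognition"
  ];
  for (var i = 0; i < names.length; i++) {
    if (typeof scope[names[i]] === "function") {
      return scope[names[i]];
    }
  }
  return null;
}

// In a browser, pass `window`. Outside a browser an empty object is used,
// so the check simply reports that the API is missing.
var scope = typeof window !== "undefined" ? window : {};
var Recognition = findSpeechRecognition(scope);
console.log(
  Recognition
    ? "Speech recognition is available"
    : "Speech recognition is not supported"
);
```

Reading the names as properties of an object avoids the ReferenceError that a bare, undeclared identifier would throw in a browser that lacks it.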

Ok, now let’s create our index.html.

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Speech To Text</title>
    <link rel="stylesheet" href="./app.css" />
  </head>
  <body>
    <form role="form">
      <legend>Speech to text</legend>
      <em id="speek-now">Speak now...</em>
      <div class="form-group">
        <textarea class="form-control" id="editer" rows="10"></textarea>
      </div>
      <button type="button" class="btn btn-primary" id="again">
        Click Here to Speak
      </button>
    </form>
    <script src="./app.js"></script>
  </body>
</html>

It is a simple form with a textarea and a button.

Now let’s look at app.js.

(function() {
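  // Look the constructor up on window so that a missing name evaluates to
  // undefined instead of throwing a ReferenceError
  var SpeechRecognition =
    window.SpeechRecognition ||
    window.webkitSpeechRecognition ||
    window.mozSpeechRecognition ||
    window.msSpeechRecognition;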

  // Stop if the SpeechRecognition API is not available in this browser
  if (!SpeechRecognition) {
    alert("Your browser doesn't support speech recognition.");
    return;
  }

  var speech = new SpeechRecognition();
  speech.lang = "en-US";

  // Handle any error reported by the recognizer
  speech.onerror = function(event) {
    if (event.error === "not-allowed") {
      alert("Please allow microphone access.");
    } else {
      alert("There was an error. Please see your console.");
      console.log(event);
    }
  };

  // Show the recognized text when a result arrives
  speech.onresult = function(event) {
    document.querySelector("#editer").value = event.results[0][0].transcript;
    toggle();
  };

  // On speak button click (the button's id in index.html is "again")
  document.querySelector("#again").addEventListener("click", function(e) {
    e.preventDefault();
    toggle();
    speech.start();
  });

  var nowflg = true;
  function toggle() {
    document.querySelector("#speek-now").style.visibility = nowflg
      ? "hidden"
      : "visible";
    nowflg = !nowflg;
  }
  toggle();
})();

The first statement looks up the SpeechRecognition constructor, falling back to the vendor-prefixed names when the unprefixed one is missing.

The SpeechRecognition API provides the following event handlers.

"onaudiostart",
  "onaudioend",
  "onend",
  "onerror",
  "onnomatch",
  "onresult",
  "onsoundstart",
  "onsoundend",
  "onspeechend",
  "onstart";

For this tutorial, we will use only the onresult and onerror callbacks.
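Although this tutorial sticks to two callbacks, the other events can be useful too. As a rough sketch, the "Speak now..." hint could follow the recognizer's own `onstart` and `onend` events rather than a hand-maintained flag. `bindIndicator` is a hypothetical helper, and the stand-in objects at the bottom exist only so the snippet runs outside a browser.

```javascript
// Hypothetical helper: shows the indicator while the recognizer is running
// and hides it again when the session ends.
function bindIndicator(speech, indicator) {
  indicator.style.visibility = "hidden";
  speech.onstart = function() {
    indicator.style.visibility = "visible";
  };
  speech.onend = function() {
    indicator.style.visibility = "hidden";
  };
}

// In the page this would be called roughly as:
//   bindIndicator(speech, document.querySelector("#speek-now"));
// Plain objects stand in for the recognizer and the element here.
var demoSpeech = {};
var demoIndicator = { style: {} };
bindIndicator(demoSpeech, demoIndicator);
demoSpeech.onstart();
console.log(demoIndicator.style.visibility);
```

Since `onend` also fires when a session ends because of an error, the hint should not stay visible after a failed attempt.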

In the onerror callback, we handle the case where the user has not granted the browser permission to use the microphone.
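The `error` property of the event is a short code, so the handler could be extended to cover more than the permission case. Here is one possible sketch. `describeSpeechError` is a hypothetical helper and the message wording is only an assumption; the codes themselves are a few of the error codes defined by the Web Speech API.

```javascript
// Hypothetical helper: turns a SpeechRecognition error code into a message
// that can be shown to the user.
function describeSpeechError(code) {
  var messages = {
    "not-allowed": "Please allow microphone access.",
    "service-not-allowed": "Speech recognition is not allowed here.",
    "no-speech": "No speech was detected. Please try again.",
    "audio-capture": "No microphone could be used.",
    "network": "A network error interrupted recognition."
  };
  if (Object.prototype.hasOwnProperty.call(messages, code)) {
    return messages[code];
  }
  // Fall back to a generic message for codes not listed above
  return "Speech recognition failed (" + code + ").";
}

// Inside the page it might be used as:
//   speech.onerror = function(event) {
//     alert(describeSpeechError(event.error));
//   };
console.log(describeSpeechError("no-speech"));
```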

In the onresult callback, we update the text box with the transcript of the speech.
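The code above reads only `event.results[0][0]`, the top alternative of the first result. If the recognizer were configured to return several results (for example with `continuous` set to true, which this tutorial does not do), the transcripts could be joined as in the sketch below. `collectTranscript` is a hypothetical helper, and the sample data is made up purely for illustration.

```javascript
// Hypothetical helper: concatenates the top alternative of every result.
// `results` is array-like; each entry is an array-like list of alternatives
// whose first item is the most likely one.
function collectTranscript(results) {
  var text = "";
  for (var i = 0; i < results.length; i++) {
    text += results[i][0].transcript;
  }
  return text;
}

// Made-up data shaped like event.results, so the snippet runs anywhere.
var sampleResults = [
  [{ transcript: "hello ", confidence: 0.9 }],
  [{ transcript: "world", confidence: 0.8 }]
];
console.log(collectTranscript(sampleResults)); // prints "hello world"

// Inside the page it might be used as:
//   speech.onresult = function(event) {
//     document.querySelector("#editer").value =
//       collectTranscript(event.results);
//   };
```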

Live demo here.