Speech Recognition in a C# WPF Application

Last week, I posted a tutorial on how to save and retrieve images using SQL Server and WPF. This is an extension of that article. In my previous article, an image was displayed in the image box as soon as its name was selected from the list box. In this tutorial, the picture will be loaded from the database and displayed as soon as you speak its name.

Yes, we will be playing with speech recognition technology! (Please refer to this article for Speech to Text)

The database schema used for this tutorial is similar to the one I used in my previous article.

Let's get started! Create a new WPF Application in Visual Studio 2008 and follow the steps below:

1. Add Reference to System.Speech and make Global Declarations

  • Add a reference to the System.Speech assembly:

 

  • Declare a SpeechRecognizer object and a list of strings as fields of the window class, so that every method can use them.
  • We use a list because it lets us add items to it dynamically.
SpeechRecognizer speechReco = new SpeechRecognizer();
List<string> grammerList = new List<string>();
  • Add the namespace at the top:
using System.Speech.Recognition;

 

2. Build the Grammar

  • Get all the names of the pictures from the database into a dataset.
  • Add all the names to ‘grammerList’ in a loop.
  • Create a Choices object (‘myChoices’) from the list converted to an array.
  • Create a GrammarBuilder object to hold ‘myChoices’.
  • Create a Grammar object from the GrammarBuilder object.
  • Load the Grammar object into the SpeechRecognizer object.
  • Enable the SpeechRecognizer object.
  • Attach an event handler to the SpeechRecognized event of the SpeechRecognizer; as soon as it hears a phrase from the grammar list, the event is fired.

Add the following code in the constructor of your Window():

public Window2()
{
    InitializeComponent();

    // Read every picture name from the database.
    // sqlCon is assumed to be an existing SqlConnection field of the window.
    sqlCon.Open();
    DataSet ds = new DataSet();
    SqlDataAdapter sqa = new SqlDataAdapter("select name from picture", sqlCon);
    sqa.Fill(ds);
    sqlCon.Close();

    for (int i = 0; i < ds.Tables[0].Rows.Count; i++)
    {
        grammerList.Add(ds.Tables[0].Rows[i][0].ToString());
    }

    // Build a grammar that contains only the picture names.
    Choices myChoices = new Choices(grammerList.ToArray());
    GrammarBuilder builder = new GrammarBuilder(myChoices);
    Grammar gram = new Grammar(builder);

    speechReco.LoadGrammar(gram);
    speechReco.Enabled = true;

    // Fired whenever the recognizer hears a phrase from the grammar.
    speechReco.SpeechRecognized +=
        new EventHandler<SpeechRecognizedEventArgs>(speechReco_SpeechRecognized);
}
 

The grammar narrows down the set of phrases the speech recognizer has to consider. Limiting the recognition pool to the picture names increases accuracy; without it, the recognizer takes much longer to match the words you speak precisely.
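Since the grammar is built straight from database values, it can help to clean the list first. The sketch below is one way to do that, not part of the original code: GrammarHelper, CleanPhrases and BuildPictureGrammar are hypothetical names, and the cleanup rules (trim, skip blanks, ignore case when removing duplicates) are assumptions about what the picture table might contain. As far as I know, Choices rejects null or empty phrases, which is why they are filtered out here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Speech.Recognition;

// Hypothetical helper, not from the original article.
static class GrammarHelper
{
    // Trim names, drop null or blank entries, and remove duplicates
    // (ignoring case), keeping the first spelling that was seen.
    public static string[] CleanPhrases(IEnumerable<string> names)
    {
        return names
            .Where(n => n != null && n.Trim().Length > 0)
            .Select(n => n.Trim())
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToArray();
    }

    // Build a grammar that recognizes exactly one of the cleaned names.
    public static Grammar BuildPictureGrammar(IEnumerable<string> names)
    {
        string[] phrases = CleanPhrases(names);
        if (phrases.Length == 0)
        {
            // Assumption: an empty grammar is treated as an error here.
            throw new InvalidOperationException("No picture names to recognize.");
        }

        GrammarBuilder builder = new GrammarBuilder(new Choices(phrases));
        Grammar grammar = new Grammar(builder);
        grammar.Name = "PictureNames"; // illustrative label only
        return grammar;
    }
}
```

With a helper like this, the constructor could call speechReco.LoadGrammar(GrammarHelper.BuildPictureGrammar(grammerList)); instead of building the Choices, GrammarBuilder and Grammar objects inline.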

3. Display the Picture

  • Select the picture bytes for the name the user spoke from the database into a dataset.
  • If the dataset contains the picture, display it in the image box.

 

void speechReco_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    // Look up the picture whose name matches the recognized text.
    // A parameter keeps the query valid even if a name contains quotes.
    sqlCon.Open();
    DataSet ds = new DataSet();
    SqlDataAdapter sqa = new SqlDataAdapter(
        "select pic from picture where name = @name", sqlCon);
    sqa.SelectCommand.Parameters.AddWithValue("@name", e.Result.Text);
    sqa.Fill(ds);
    sqlCon.Close();

    // ds itself is never null here, so check for an actual row instead.
    if (ds.Tables.Count > 0 && ds.Tables[0].Rows.Count > 0)
    {
        byte[] data = (byte[])ds.Tables[0].Rows[0][0];

        // Decode the stored bytes (requires a reference to System.Drawing).
        MemoryStream strm = new MemoryStream(data);
        System.Drawing.Image img = System.Drawing.Image.FromStream(strm);

        // Re-encode as BMP and hand the stream to a WPF BitmapImage.
        MemoryStream ms = new MemoryStream();
        img.Save(ms, System.Drawing.Imaging.ImageFormat.Bmp);
        ms.Seek(0, SeekOrigin.Begin);

        BitmapImage bi = new BitmapImage();
        bi.BeginInit();
        bi.StreamSource = ms;
        bi.EndInit();

        imageshow.Source = bi;
    }
}
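As an alternative to going through System.Drawing, the lookup and the decoding can be sketched with WPF types only. This is a minimal sketch under a few assumptions: PictureHelper, LoadPictureBytes and ToBitmap are hypothetical names, the table and column names are taken from the queries in this tutorial, and the stored bytes are assumed to be in a format WPF can decode (for example PNG, JPEG or BMP).

```csharp
using System.Data.SqlClient;
using System.IO;
using System.Windows.Media.Imaging;

// Hypothetical helper, not from the original article.
static class PictureHelper
{
    // Returns the stored bytes for the named picture,
    // or null when no row matches or the column is NULL.
    public static byte[] LoadPictureBytes(SqlConnection con, string name)
    {
        using (SqlCommand cmd = new SqlCommand(
            "select pic from picture where name = @name", con))
        {
            // The parameter keeps quotes in a name from breaking the query.
            cmd.Parameters.AddWithValue("@name", name);
            con.Open();
            try
            {
                return cmd.ExecuteScalar() as byte[];
            }
            finally
            {
                con.Close();
            }
        }
    }

    // Decodes image bytes directly into a BitmapImage.
    public static BitmapImage ToBitmap(byte[] data)
    {
        BitmapImage bi = new BitmapImage();
        using (MemoryStream ms = new MemoryStream(data))
        {
            bi.BeginInit();
            // OnLoad decodes during EndInit, so the stream can be closed afterwards.
            bi.CacheOption = BitmapCacheOption.OnLoad;
            bi.StreamSource = ms;
            bi.EndInit();
        }
        bi.Freeze(); // make the image read-only and safe to share across threads
        return bi;
    }
}
```

Inside the SpeechRecognized handler this might be used as: byte[] data = PictureHelper.LoadPictureBytes(sqlCon, e.Result.Text); followed by if (data != null) imageshow.Source = PictureHelper.ToBitmap(data);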

Run the program and you will see a problem!

When you speak the name of a picture, you will see that name get highlighted in the list box, but no picture is displayed in the image box!

As soon as the event fires, the list box item with that name is highlighted automatically and the picture does not appear.

To set this straight, disable the list box in the window's Loaded event so that the code works properly:

listbox.IsEnabled = false;

 

Now if I say “Yellow”, the application will retrieve a picture named “Yellow” from the database:

 

 

Likewise, saying “Green” into the mic will get you a picture named “Green” from the database:

Easy, isn't it? Questions are welcome! If you have a question, please ask it as a reply to this post. 🙂