Tuesday, March 12, 2013

SayItNow! A free text-to-speech program that reads text aloud and can save it to WAV files

While helping one of my relatives learn English two years ago, I noticed that she really liked to hear the English pronunciation of words and sentences as quickly as possible, so I wrote a program called SayItNow! to help her, using C# and WinForms controls.

When I started to install it on her PC, I realized it would not help much: whenever she was offline, she had no way to listen to the words. The problem had an easy solution: I could extend the software to save the audio to WAV or MP3 files so that she could carry the voice files anywhere.

I also had a chance to look at Windows RT and develop some software for it, and as you know, you need to use the .Net RT framework on such devices. Just after I decided to finish my old work and port my solution to Windows RT, I found that, unfortunately, Windows RT doesn't support TTS! :)

In the end I decided to port the software to regular Windows 8 Pro so that it can read text aloud and save it to audio (WAV) files for listening offline. Writing this software took only 4 hours! What's more, it can now load a website as human-readable text and read it using the Microsoft TTS engine. I used Syncfusion components to rebuild the tool and gave it a new version number: 1.9 (not big enough to be version 2.0!). This is free software and you may use it for personal use (you are not allowed to distribute it by any means).

You can download the executable setup from CNET Download.com:
http://download.cnet.com/SayItNow/3000-2140_4-75886035.html?tag=mncol;1

Or directly from this website:

http://www.xogasoft.com/downloads/sayitnow/setup.exe


Note: you need to have .NET 4 installed on your PC to run SayItNow!


Now I want to explain some aspects of the code that you need to be aware of.
To use the TTS functionality, add a reference to the System.Speech assembly to your project, then add the following line to your source code:

using System.Speech.Synthesis;

The next step is to create an instance of the SpeechSynthesizer class:

private SpeechSynthesizer _sp = new SpeechSynthesizer();


With this instance you can play back the text as speech using a pre-selected voice:


        private void ReadText(string text, int rate, int volume, string voice)
        {
            _sp.Rate = rate;       // speaking rate, from -10 to 10
            _sp.Volume = volume;   // output volume, from 0 to 100
            _sp.SelectVoice(voice);

            try
            {
                // Detach the handler first so it is never subscribed twice.
                _sp.SpeakCompleted -= sp_SpeakCompleted;
                _sp.SetOutputToDefaultAudioDevice();

                _sp.SpeakCompleted += sp_SpeakCompleted;
                var textToRead = new PromptBuilder();
                textToRead.AppendText(text);
                // Returns immediately; sp_SpeakCompleted fires when playback ends.
                _sp.SpeakAsync(textToRead);

            }
            catch (OperationCanceledException e)
            {
                Logger.Log(e);
            }
        }
Note: As you can see, I have used the asynchronous version of the Speak method (SpeakAsync), so I have attached a handler to the SpeakCompleted event as a callback. With the synchronous Speak method you don't need it.
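
The sp_SpeakCompleted handler itself is not listed in this post. As a rough sketch only, a handler with the signature that the SpeakCompleted event expects could look like the following. The DescribeOutcome helper and the class name are hypothetical and exist only for illustration:

```csharp
using System;
using System.Speech.Synthesis;

static class SpeakCompletedSketch
{
    // Hypothetical helper: turns the outcome of an asynchronous speak
    // operation into a short status message.
    public static string DescribeOutcome(bool cancelled, Exception error)
    {
        if (error != null) return "failed: " + error.Message;
        return cancelled ? "cancelled" : "completed";
    }

    // Same signature as the handler attached with
    // _sp.SpeakCompleted += sp_SpeakCompleted;
    public static void sp_SpeakCompleted(object sender, SpeakCompletedEventArgs e)
    {
        // e.Cancelled is true when SpeakAsyncCancelAll() interrupted the prompt.
        Console.WriteLine(DescribeOutcome(e.Cancelled, e.Error));
    }
}
```

In a WinForms application you would probably update a status label or re-enable a button here instead of writing to the console.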

The voice parameter must be a string taken from the list of TTS voice names installed on the target machine. Here is the code that extracts these names and adds them to a combo box that I named ctrlVoice:


var voices = _sp.GetInstalledVoices();
foreach (var installedVoice in voices)
{
   ctrlVoice.Items.Add(new VoiceListItem(installedVoice));
}
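
VoiceListItem is a small wrapper class that is not listed in this post. A minimal sketch of what such a wrapper might look like is given here, assuming its only job is to show the voice name in the combo box. The Voice property is a hypothetical name:

```csharp
using System.Speech.Synthesis;

// Hypothetical wrapper; the real VoiceListItem class is not shown in the post.
public sealed class VoiceListItem
{
    public InstalledVoice Voice { get; private set; }

    public VoiceListItem(InstalledVoice voice)
    {
        Voice = voice;
    }

    // A ComboBox displays its items through ToString(), and the click
    // handlers pass ctrlVoice.SelectedItem.ToString() on to SelectVoice(),
    // so this should return exactly the installed voice name.
    public override string ToString()
    {
        return Voice.VoiceInfo.Name;
    }
}
```

InstalledVoice also has an Enabled property, so the loop could skip voices that are installed but disabled.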



And here is the part of the code that saves the voice when the user clicks the save button. Because the output goes to a file instead of the audio device, even a long text is saved very quickly and nothing is played through the speakers:


        private void SaveVoice(string text, int rate, int volume, string voice, string fileName)
        {
            _sp.Rate = rate;
            _sp.Volume = volume;
            _sp.SelectVoice(voice);

            try
            {
                _sp.SetOutputToWaveFile(fileName + ".wav");
                var textToRead = new PromptBuilder();
                textToRead.AppendText(text);
                _sp.Speak(textToRead);

                DisposeSpeechSynthesizer();
            }
            catch (OperationCanceledException e)
            {
                Logger.Log(e);
            }
        }
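
The DisposeSpeechSynthesizer helper is not listed here. If all you need is to write a WAV file and be sure the file is closed afterwards, one possible approach (an assumption on my side, not necessarily what the helper does) is a short-lived synthesizer inside a using block. The class and method names below are hypothetical:

```csharp
using System.Speech.Synthesis;

static class WaveFileSketch
{
    // Renders text to a WAV file with a short-lived synthesizer.
    public static void SaveToWave(string text, string path)
    {
        using (var synth = new SpeechSynthesizer())
        {
            synth.SetOutputToWaveFile(path);
            synth.Speak(text);          // blocks until rendering is finished
            synth.SetOutputToNull();    // detach from the file so it is closed
        }
    }
}
```

For example, WaveFileSketch.SaveToWave("Hello", @"C:\Temp\hello.wav") would write the file using the default voice, provided that the folder exists and at least one TTS voice is installed.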


The above methods (ReadText and SaveVoice) are used in my code as shown below:

        private void btnSay_Click(object sender, EventArgs e)
        {
            _sp.SpeakAsyncCancelAll();

            var text = txtMemo.Text;
            var rate = ctrlRate.Value;
            var volume = ctrlVolume.Value;
            var voice = ctrlVoice.SelectedItem.ToString();

            Task.Factory.StartNew(() => ReadText(text,rate,volume,voice));

        }

        private void btnSave_Click(object sender, EventArgs e)
        {
            _sp.SpeakAsyncCancelAll();

            // Read all control values on the UI thread before starting the task.
            var text = txtMemo.Text;
            var rate = ctrlRate.Value;
            var volume = ctrlVolume.Value;
            var voice = ctrlVoice.SelectedItem.ToString();
            var lineByLine = chLinebyLine.Checked;

            Task.Factory.StartNew(() =>
            {
                if (lineByLine)
                {
                    // Split on both '\r' and '\n' so that Windows line endings
                    // do not leave a trailing '\r' on each line.
                    var strings = text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
                    foreach (var s in strings)
                    {
                        if (s.Trim().Length != 0)
                        {
                            SaveVoice(s, rate, volume, voice, GetFileName(s));
                        }
                    }
                }
                else
                {
                    if (text.Trim().Length != 0)
                    {
                        SaveVoice(text, rate, volume, voice, GetFileName(text));
                    }
                }

            });

        }

The second method has an if branch that decides whether to save everything in a single file or in one file per line. I hope you find these explanations useful when you develop your own TTS software using the Microsoft TTS engine.
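
GetFileName is another helper that is not listed above. Purely as an illustration, a stand-in that builds a safe file name from the first characters of the text could be sketched like this. The maxLength parameter, the "untitled" fallback and the class name are my assumptions here, and the real helper may also choose the target folder:

```csharp
using System;
using System.IO;
using System.Text;

static class FileNameSketch
{
    // Hypothetical stand-in for GetFileName: builds a file name (without
    // extension) from the first characters of the text.
    public static string GetFileName(string text, int maxLength = 40)
    {
        var invalid = Path.GetInvalidFileNameChars();
        var sb = new StringBuilder();
        foreach (var c in text.Trim())
        {
            if (sb.Length >= maxLength) break;
            // Replace characters that are not allowed in file names.
            sb.Append(Array.IndexOf(invalid, c) >= 0 ? '_' : c);
        }
        return sb.Length > 0 ? sb.ToString() : "untitled";
    }
}
```

Since SaveVoice appends ".wav" itself, the helper returns the name without an extension.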






