| <h2 id="overview">Overview</h2> |
| |
| <p>Chrome provides native support for speech on Windows (using SAPI |
| 5), Mac OS X, and Chrome OS, using speech synthesis capabilities |
| provided by the operating system. On all platforms, the user can |
| install extensions that register themselves as alternative speech |
| engines.</p> |
| |
| <h2 id="generating_speech">Generating speech</h2> |
| |
| <p>Call <code>speak()</code> from your extension or |
| packaged app to speak. For example:</p> |
| |
| <pre>chrome.tts.speak('Hello, world.');</pre> |
| |
| <p>To stop speaking immediately, just call <code>stop()</code>: |
| |
| <pre>chrome.tts.stop();</pre> |
| |
| <p>You can provide options that control various properties of the speech, |
| such as its rate, pitch, and more. For example:</p> |
| |
| <pre>chrome.tts.speak('Hello, world.', {'rate': 2.0});</pre> |
| |
| <p>It's also a good idea to specify the language so that a synthesizer |
| supporting that language (and regional dialect, if applicable) is chosen.</p> |
| |
| <pre>chrome.tts.speak( |
| 'Hello, world.', {'lang': 'en-US', 'rate': 2.0});</pre> |
| |
| <p>By default, each call to <code>speak()</code> interrupts any |
| ongoing speech and speaks immediately. To determine if a call would be |
| interrupting anything, you can call <code>isSpeaking()</code>. In |
| addition, you can use the <code>enqueue</code> option to cause this |
| utterance to be added to a queue of utterances that will be spoken |
| when the current utterance has finished.</p> |
| |
| <pre>chrome.tts.speak( |
| 'Speak this first.'); |
| chrome.tts.speak( |
| 'Speak this next, when the first sentence is done.', {'enqueue': true}); |
| </pre> |
| |
| <p>A complete description of all options can be found in the |
| $ref:tts.speak below. |
| Not all speech engines will support all options.</p> |
| |
| <p>To catch errors and make sure you're calling <code>speak()</code> |
| correctly, pass a callback function that takes no arguments. Inside |
| the callback, check |
| $ref:runtime.lastError |
| to see if there were any errors.</p> |
| |
| <pre>chrome.tts.speak( |
| utterance, |
| options, |
| function() { |
| if (chrome.runtime.lastError) { |
| console.log('Error: ' + chrome.runtime.lastError.message); |
| } |
| });</pre> |
| |
| <p>The callback returns right away, before the engine has started |
| generating speech. The purpose of the callback is to alert you to |
| syntax errors in your use of the TTS API, not to catch all possible |
| errors that might occur in the process of synthesizing and outputting |
| speech. To catch these errors too, you need to use an event listener, |
| described below.</p> |
| |
| <h2 id="events">Listening to events</h2> |
| |
| <p>To get more real-time information about the status of synthesized speech, |
| pass an event listener in the options to <code>speak()</code>, like this:</p> |
| |
| <pre>chrome.tts.speak( |
| utterance, |
| { |
| onEvent: function(event) { |
| console.log('Event ' + event.type ' at position ' + event.charIndex); |
| if (event.type == 'error') { |
| console.log('Error: ' + event.errorMessage); |
| } |
| } |
| }, |
| callback);</pre> |
| |
| <p>Each event includes an event type, the character index of the current |
| speech relative to the utterance, and for error events, an optional |
| error message. The event types are:</p> |
| |
| <ul> |
| <li><code>'start'</code>: The engine has started speaking the utterance. |
| <li><code>'word'</code>: A word boundary was reached. Use |
| <code>event.charIndex</code> to determine the current speech |
| position. |
| <li><code>'sentence'</code>: A sentence boundary was reached. Use |
| <code>event.charIndex</code> to determine the current speech |
| position. |
| <li><code>'marker'</code>: An SSML marker was reached. Use |
| <code>event.charIndex</code> to determine the current speech |
| position. |
| <li><code>'end'</code>: The engine has finished speaking the utterance. |
| <li><code>'interrupted'</code>: This utterance was interrupted by another |
| call to <code>speak()</code> or <code>stop()</code> and did not |
| finish. |
| <li><code>'cancelled'</code>: This utterance was queued, but then |
| cancelled by another call to <code>speak()</code> or |
| <code>stop()</code> and never began to speak at all. |
| <li><code>'error'</code>: An engine-specific error occurred and |
| this utterance cannot be spoken. |
| Check <code>event.errorMessage</code> for details. |
| </ul> |
| |
| <p>Four of the event types—<code>'end'</code>, <code>'interrupted'</code>, |
| <code>'cancelled'</code>, and <code>'error'</code>—are <i>final</i>. |
| After one of those events is received, this utterance will no longer |
| speak and no new events from this utterance will be received.</p> |
| |
| <p>Some voices may not support all event types, and some voices may not |
| send any events at all. If you do not want to use a voice unless it sends |
| certain events, pass the events you require in the |
| <code>requiredEventTypes</code> member of the options object, or use |
| <code>getVoices()</code> to choose a voice that meets your requirements. |
| Both are documented below.</p> |
| |
| <h2 id="ssml">SSML markup</h2> |
| |
| <p>Utterances used in this API may include markup using the |
| <a href="http://www.w3.org/TR/speech-synthesis">Speech Synthesis Markup |
| Language (SSML)</a>. If you use SSML, the first argument to |
| <code>speak()</code> should be a complete SSML document with an XML |
| header and a top-level <code><speak></code> tag, not a document |
| fragment.</p> |
| |
| <p>For example:</p> |
| |
| <pre>chrome.tts.speak( |
| '<?xml version="1.0"?>' + |
| '<speak>' + |
| ' The <emphasis>second</emphasis> ' + |
| ' word of this sentence was emphasized.' + |
| '</speak>');</pre> |
| |
| <p>Not all speech engines will support all SSML tags, and some may not support |
| SSML at all, but all engines are required to ignore any SSML they don't |
| support and to still speak the underlying text.</p> |
| |
| <h2 id="choosing_voice">Choosing a voice</h2> |
| |
| <p>By default, Chrome chooses the most appropriate voice for each |
| utterance you want to speak, based on the language and gender. On most |
| Windows, Mac OS X, and Chrome OS systems, speech synthesis provided by |
| the operating system should be able to speak any text in at least one |
| language. Some users may have a variety of voices available, though, |
| from their operating system and from speech engines implemented by other |
| Chrome extensions. In those cases, you can implement custom code to choose |
| the appropriate voice, or to present the user with a list of choices.</p> |
| |
| <p>To get a list of all voices, call <code>getVoices()</code> and pass it |
| a function that receives an array of <code>TtsVoice</code> objects as its |
| argument:</p> |
| |
| <pre>chrome.tts.getVoices( |
| function(voices) { |
| for (var i = 0; i < voices.length; i++) { |
| console.log('Voice ' + i + ':'); |
| console.log(' name: ' + voices[i].voiceName); |
| console.log(' lang: ' + voices[i].lang); |
| console.log(' gender: ' + voices[i].gender); |
| console.log(' extension id: ' + voices[i].extensionId); |
| console.log(' event types: ' + voices[i].eventTypes); |
| } |
| });</pre> |