Getting Started with Web Speech Synthesis API and Svelte
Browsers get new APIs all the time, and one of them is Web Speech Synthesis. Let's explore it with Svelte.
Getting Started with the API
The entry point of the API is the speechSynthesis object. To get the list of available voices we can do:
console.log(speechSynthesis.getVoices())
On Chrome on my OSX laptop, that returns a list of 67 voices.
So let's just create a new page, and put that in JavaScript, and... it returns an empty array. What just happened?
Unfortunately the Web Speech Synthesis API is terribly designed. The list of voices is populated asynchronously, which is fair enough, but instead of returning a promise or firing some event like onSpeechSynthesisReady, getVoices() will just happily return an empty array if you call it too early.
There is a voiceschanged event (also available as the onvoiceschanged handler). Nothing in the spec says or even implies that it will only trigger once during page load, and I think a browser could work the other way around (have the voice list pre-populated, so no event triggers), but it seems to work fine in Chrome.
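One way to hedge against the pre-populated case is to check getVoices() first and only fall back to the event if the list is still empty. This is a minimal sketch, not a tested cross-browser solution: voicesFromEvent is a hypothetical helper name, and the synth parameter is my own addition that stands in for speechSynthesis so the function can be tried with a stub.

```javascript
// Resolve with the voice list, whether it is already populated or arrives later.
// `synth` is expected to be window.speechSynthesis in a browser (assumption:
// it supports addEventListener, as it does in the Chrome example above).
function voicesFromEvent(synth) {
  return new Promise((resolve) => {
    const voices = synth.getVoices()
    if (voices.length > 0) {
      // The list was ready before we asked, so no event is needed
      resolve(voices)
      return
    }
    // Otherwise wait for the first voiceschanged event and read the list again
    synth.addEventListener(
      "voiceschanged",
      () => resolve(synth.getVoices()),
      { once: true }
    )
  })
}
```

In a page this would be called as voicesFromEvent(speechSynthesis). It still never resolves if the list stays empty and the event never fires, so it only covers one of the two failure modes.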
For any production use it would likely need a lot more robust code and some cross-browser, cross-OS testing. Arguably a polling loop that checks every 16ms until the list is non-empty or some max timeout has elapsed might even be more robust, but we're just exploring the API here.
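The polling idea could look roughly like this. Treat it as a sketch under assumptions: waitForVoices, intervalMs and maxWaitMs are made-up names, the 2000ms default is arbitrary, and synth is a parameter I added as a stand-in for speechSynthesis so the loop can be run against a stub.

```javascript
// Poll synth.getVoices() until it returns a non-empty list or maxWaitMs elapses.
// `synth` is expected to be window.speechSynthesis in a browser.
function waitForVoices(synth, { intervalMs = 16, maxWaitMs = 2000 } = {}) {
  return new Promise((resolve, reject) => {
    const startedAt = Date.now()
    const check = () => {
      const voices = synth.getVoices()
      if (voices.length > 0) {
        resolve(voices)
      } else if (Date.now() - startedAt >= maxWaitMs) {
        // Give up instead of polling forever
        reject(new Error("No voices available after " + maxWaitMs + "ms"))
      } else {
        setTimeout(check, intervalMs)
      }
    }
    check()
  })
}
```

Because it returns a promise, it could be dropped into a Svelte {#await} block the same way as an event-based promise, with a {:catch} branch for the timeout case.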
Display list of available voices
To get started, we can wrap this event in a promise and use Svelte's {#await} block to wait for it.
<script>
  let voicesPromise = new Promise((resolve) => {
    speechSynthesis.addEventListener("voiceschanged", ev => {
      resolve(speechSynthesis.getVoices())
    })
  })
</script>

<div>Available Voices:</div>
{#await voicesPromise then voices}
  <ul>
    {#each voices as voice}
      <li>{voice.name} - {voice.lang}</li>
    {/each}
  </ul>
{/await}

<style>
  :global(body) {
    margin: 0;
    min-height: 100vh;
    display: flex;
    flex-direction: column;
    justify-content: center;
    align-items: center;
  }
</style>
Say something
Now it's just a matter of adding some radio buttons for voice selection, a text input for the text to say, and a button to run it, and it works.
<script>
  let loading = true
  let voices = []

  speechSynthesis.addEventListener("voiceschanged", ev => {
    loading = false
    voices = speechSynthesis.getVoices()
  })

  let text = "Hello, world!"
  let voiceIndex = 0
  $: voice = voices[voiceIndex]

  function sayIt() {
    let u = new SpeechSynthesisUtterance(text)
    u.voice = voice
    speechSynthesis.speak(u)
  }
</script>

<div>
  <label>Text to say:
    <input bind:value={text} />
  </label>
  <button on:click={sayIt}>Say it</button>
</div>

{#if loading}
  <div>Please wait for voices to load</div>
{:else}
  <div>Available Voices:</div>
  <ul>
    {#each voices as v, i}
      <li>
        <label>
          <input type="radio" bind:group={voiceIndex} value={i}>
          {v.name} - {v.lang}
        </label>
      </li>
    {/each}
  </ul>
{/if}

<style>
  :global(body) {
    margin: 0;
    min-height: 100vh;
    display: flex;
    flex-direction: column;
    justify-content: center;
    align-items: center;
  }
</style>
Some notes:
- I changed from a promise to a loading flag, as I want the array of voices in the script as well as in the template, and it's slightly easier this way.
- We need to create new SpeechSynthesisUtterance(text) instead of just doing the more obvious voice.say("some text"). The SpeechSynthesisUtterance object has some additional properties like rate (speed) and pitch, so you can use it for speed reading, or for UwU voice etc.
- The API has some additional events for when the speech starts, ends etc., so you might consider listening to them to know if the browser is speaking right now, if you need some visual feedback as well.
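To make the last two notes concrete, here is a rough sketch that sets rate and pitch on the utterance and uses its start, end and error events to report whether the browser is speaking. sayWithFeedback and onSpeakingChange are hypothetical names of mine, and the synth and Utterance parameters exist only so the function can be tried outside a browser; by default they point at the real globals.

```javascript
// Speak `text` and report speaking state through a callback.
// rate and pitch map directly onto the SpeechSynthesisUtterance properties.
function sayWithFeedback(
  text,
  { voice = null, rate = 1, pitch = 1, onSpeakingChange = () => {} } = {},
  synth = globalThis.speechSynthesis,
  Utterance = globalThis.SpeechSynthesisUtterance
) {
  const u = new Utterance(text)
  if (voice) u.voice = voice
  u.rate = rate   // 1 is normal speed, higher is faster
  u.pitch = pitch // 1 is normal pitch, higher is squeakier
  // The utterance fires events over its lifetime; these three are enough
  // to drive a simple "speaking right now" indicator
  u.addEventListener("start", () => onSpeakingChange(true))
  u.addEventListener("end", () => onSpeakingChange(false))
  u.addEventListener("error", () => onSpeakingChange(false))
  synth.speak(u)
  return u
}
```

In the Svelte component above, sayIt could call something like sayWithFeedback(text, { voice, rate: 2, onSpeakingChange: (s) => speaking = s }) with an extra speaking variable, and the template could disable the button while it is true.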