Instead of the Web Speech API, you can use the Google Cloud Text-to-Speech (TTS) API.
Prerequisite
To do so, you need to complete the installation and initialization steps specified in this documentation. Make sure to complete everything up to the “Install the client library” step.
- Create a Google Cloud account
- Create Text-to-Speech API credentials (you should have a project ID at this point).
- Install the Google Cloud CLI
- Initialize it
- Create local authentication credentials for your Google Account (you will use a browser to sign in to your Google Cloud console)
Note that Google Cloud TTS only works in a server environment. If you try to use the following API in client-side code, you will run into a 403 error or a TypeError, depending on the framework of your choice.
API
GoogleCloudTTSGenerator(sound, config)
This function returns an AudioBuffer of the synthesized speech.
sound (object)
The sound object has the following properties.
- speech (string, required): The text to synthesize.
- language (string, required if necessary): A BCP-47 code (or other standards) for the language (default: en-US).
- pitch (number, optional): A detune amount ranging from -20 to 20. The default is 0. You could use data generated from a prerendered/compiled queue item.
- speechRate (number, optional): The speed of speech, which must be greater than 0 (0 itself is not acceptable). The default is 1.
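As an illustration, a sound object with every documented property set might look like the sketch below. The validateSound helper is hypothetical (it is not part of the library); it only restates the constraints listed above as checks.

```javascript
// Example `sound` object using the documented properties.
const sound = {
  speech: 'Hello, world.', // required: the text to synthesize
  language: 'en-US',       // BCP-47 language code
  pitch: 0,                // detune amount, from -20 to 20
  speechRate: 1            // speed of speech, must be greater than 0
};

// Hypothetical helper (not part of the library): returns a list of
// problems found in a `sound` object, or an empty array if none.
function validateSound(candidate) {
  const problems = [];
  if (typeof candidate.speech !== 'string') {
    problems.push('speech must be a string');
  }
  if (
    candidate.pitch !== undefined &&
    (typeof candidate.pitch !== 'number' || candidate.pitch < -20 || candidate.pitch > 20)
  ) {
    problems.push('pitch must be a number between -20 and 20');
  }
  if (
    candidate.speechRate !== undefined &&
    !(typeof candidate.speechRate === 'number' && candidate.speechRate > 0)
  ) {
    problems.push('speechRate must be a number greater than 0');
  }
  return problems;
}
```

Running a check like this before calling the generator is optional; it simply surfaces out-of-range values early instead of waiting for the API to reject them.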
config (object)
The config object has the following properties.
- language (optional): Same as above. If specified in sound, config.language is ignored.
- speechRate (optional): Same as above. If specified in sound, config.speechRate is ignored.
- ssmlGender (string, optional): The gender of the output voice. This should be one of NEUTRAL (default), FEMALE, or MALE.
- audioEncoding (string, optional): The output audio file encoding. Refer to the Google Cloud documentation. The default is MULAW (wav format). Technically, setting it to MP3 should not affect the use of the AudioQueue.getFullAudio method.
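The precedence between sound and config can be sketched as a small function. resolveSpeechOptions is a hypothetical helper written for this illustration, not the library's implementation. It assumes that a value in sound wins over the same value in config, that speechRate follows the same rule as language, and that the documented defaults apply when neither is set.

```javascript
// Hypothetical helper (not the library's code): merges `sound` and `config`
// following the precedence and defaults described in this section.
function resolveSpeechOptions(sound, config = {}) {
  return {
    // A value set on `sound` overrides the one on `config`.
    language: sound.language ?? config.language ?? 'en-US',
    speechRate: sound.speechRate ?? config.speechRate ?? 1,
    // `pitch` is only documented on `sound`.
    pitch: sound.pitch ?? 0,
    // These two are only documented on `config`.
    ssmlGender: config.ssmlGender ?? 'NEUTRAL',
    audioEncoding: config.audioEncoding ?? 'MULAW'
  };
}

// `sound.language` wins; `speechRate` falls back to `config`.
const resolved = resolveSpeechOptions(
  { speech: 'Bonjour', language: 'fr-FR' },
  { language: 'en-GB', speechRate: 1.5, ssmlGender: 'FEMALE' }
);
```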
An example using SvelteKit
Write a POST method function in your routes/api/TTS/+server.js file.
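A minimal sketch of such a handler follows; treat it as an outline under stated assumptions rather than the exact code. The createTTSHandler factory, the { sound, config } request body, the speech.wav file name, and the output directory are all hypothetical. It also assumes that the value returned by GoogleCloudTTSGenerator is binary audio data (such as a Buffer or Uint8Array) that can be written to disk as-is. The generator is passed in as an argument so the sketch does not guess the library's import path.

```javascript
// Hypothetical factory: `generate` stands in for GoogleCloudTTSGenerator,
// and `outputDir` is an assumed, already existing directory on the server.
function createTTSHandler(generate, outputDir) {
  // SvelteKit calls the handler with an event whose `request` is a standard Request.
  return async function POST({ request }) {
    // Assumed request body shape: { sound, config }.
    const { sound, config = {} } = await request.json();

    // `speech` is the only property documented as strictly required.
    if (!sound || typeof sound.speech !== 'string') {
      return new Response(JSON.stringify({ error: 'sound.speech is required' }), {
        status: 400,
        headers: { 'Content-Type': 'application/json' }
      });
    }

    // Assumption: the result is binary audio data (e.g. a Buffer or Uint8Array).
    const audio = await generate(sound, config);

    // Dynamic imports keep the sketch loadable as a module or a plain script.
    const { writeFile } = await import('node:fs/promises');
    const { join } = await import('node:path');
    const file = join(outputDir, 'speech.wav'); // assumed file name
    await writeFile(file, audio);

    // Tell the client where the audio was saved.
    return new Response(JSON.stringify({ file }), {
      status: 200,
      headers: { 'Content-Type': 'application/json' }
    });
  };
}

// In routes/api/TTS/+server.js you would then export the handler, e.g.:
// export const POST = createTTSHandler(GoogleCloudTTSGenerator, 'static/audio');
```

Because the handler only runs on the server, the Google Cloud credentials never reach the browser, which is consistent with the server-only restriction noted earlier.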
Add a function to your client-side file, such as routes/.../+page.svelte.
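The client-side function could look roughly like the sketch below. requestSpeech is a hypothetical name, and the sketch assumes that the endpoint lives at /api/TTS, accepts a JSON body of the form { sound, config }, and replies with JSON. The fetchImpl parameter is only there so the function can be exercised without a running server.

```javascript
// Hypothetical client-side helper, e.g. inside the <script> of +page.svelte.
// It never calls Google Cloud directly; it only talks to your own endpoint.
async function requestSpeech(sound, config = {}, fetchImpl = fetch) {
  const response = await fetchImpl('/api/TTS', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // Assumed body shape: the two arguments of GoogleCloudTTSGenerator.
    body: JSON.stringify({ sound, config })
  });

  if (!response.ok) {
    throw new Error(`TTS request failed with status ${response.status}`);
  }

  // Assumed: the server replies with JSON describing the result.
  return response.json();
}
```

A call such as requestSpeech({ speech: 'Hello' }, { ssmlGender: 'FEMALE' }) would then send the request from the browser while the synthesis itself stays on the server.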
This code saves the output audio in your server directory.