Voxygen Cloud is a service that transforms text into human-like, expressive speech and generates high-quality audio messages in a variety of languages and expressive styles.
With its easy-to-use API, Voxygen Cloud helps you enrich your voice interactions, bring colour and personality to your content, and connect with customers like never before.
Because it integrates easily with existing solutions and products, Voxygen Cloud lets you reinvent the customer experience and create new services. You can also create your own unique, personalised voice as a stand-out component of your brand identity.
The Voxygen Cloud API is a "REST-like" API. Any client application can send text to be vocalised through an HTTP request containing all of the necessary information and optional parameters (voice, audio format, speaking rate, pitch tuning, …). The service responds immediately with the corresponding speech audio data.
A main URL specifies the network address of the API service.
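As an illustration of the request described above, the following minimal sketch assembles a request URL from the text and a few optional parameters. The base URL and every parameter name (`text`, `voice`, `format`, `rate`, `pitch`) are placeholders chosen for this example; the actual address and parameter names are those given in the API reference for your account.

```python
from urllib.parse import urlencode

# Placeholder address: the real main URL is provided with your account.
BASE_URL = "https://tts.example.invalid/api"


def build_request_url(text, voice, audio_format="mp3", **optional):
    """Build a request URL carrying the text and its synthesis options.

    All parameter names here are hypothetical, not the documented ones.
    """
    params = {"text": text, "voice": voice, "format": audio_format}
    # Optional settings (for example rate or pitch) are sent only when set.
    params.update({k: v for k, v in optional.items() if v is not None})
    # urlencode percent-encodes non-ASCII characters as UTF-8.
    return f"{BASE_URL}?{urlencode(params)}"
```

Calling `build_request_url("Bonjour à tous", "Voice1", rate=1.2)` yields a URL whose query string contains the encoded text, the voice, the format and the rate.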
A user account is required to access the Voxygen Cloud service. The account is defined by a login and a password. The client application must include the login in each request to the service. The password must never be sent to Voxygen Cloud; instead, the client application uses it to compute an HMAC and includes that HMAC value in each request.
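The signing scheme described above can be sketched as follows. This text does not specify the hash function, the exact message that is signed, or the names of the request fields, so this example assumes HMAC-SHA256 over the sorted, URL-encoded parameters and uses the hypothetical field names `user` and `hmac`. Treat it as an outline of the principle rather than the service's actual scheme.

```python
import hashlib
import hmac
from urllib.parse import urlencode


def sign_request(params: dict, login: str, password: str) -> dict:
    """Return a copy of params with the login and an HMAC added.

    Assumptions (not taken from the Voxygen documentation): SHA-256 as the
    hash, the sorted URL-encoded parameters as the signed message, and the
    field names "user" and "hmac".
    """
    signed = dict(params)
    signed["user"] = login  # the login is sent with every request
    message = urlencode(sorted(signed.items())).encode("utf-8")
    # The password serves only as the HMAC key; it is never put in the request.
    digest = hmac.new(password.encode("utf-8"), message, hashlib.sha256).hexdigest()
    signed["hmac"] = digest
    return signed
```

The point of this design is that the service, which also knows the password, can recompute the HMAC and compare it with the received value, so the password itself never travels over the network.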
Voxygen Cloud complies with the W3C recommendations Speech Synthesis Markup Language (SSML 1.0 and 1.1) and Pronunciation Lexicon Specification (PLS 1.0).
The API accepts input as raw text or as SSML (UTF-8 encoded). SSML gives you full control over several aspects of the speech, such as pauses, specific pronunciations, acronyms, numbers and dates.
You can also adjust the rate, the pitch or the volume of the speech.
SSML extension tags can also be added within the text to further customise the audio output, for example with a background music mix, audio fade controls or synchronisation.
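To make the SSML input more concrete, here is a minimal sketch that builds a small UTF-8 encoded SSML document using only standard W3C elements (`speak`, `break`, `prosody`). The helper name `build_ssml` and its default values are illustrative. The Voxygen extension tags are not shown because their names are not given here, and which attribute values the service honours is left to the SSML compliance documentation.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"  # W3C SSML namespace


def build_ssml(greeting, body, lang="en-GB", pause="500ms", rate="slow"):
    """Return a UTF-8 encoded SSML document with a pause and a slower passage."""
    ssml = (
        f'<speak version="1.1" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"{escape(greeting)}"                  # escape &, < and > in plain text
        f"<break time={quoteattr(pause)}/>"    # explicit pause between sentences
        f"<prosody rate={quoteattr(rate)}>{escape(body)}</prosody>"
        "</speak>"
    )
    ET.fromstring(ssml)  # raises ParseError if the document is not well-formed
    return ssml.encode("utf-8")
```

For example, `build_ssml("Hello & welcome.", "Your order has shipped.")` returns bytes that can be sent as the text input of a request.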
Voxygen Cloud lets you choose among several audio formats, such as .RAW, .WAV, .AU, .MP3 or .OGG.
| Format | Encoding options |
| .RAW, .WAV, .AU | 16-bit PCM, G.711 (A-law, μ-law) |
| .MP3 | Bitrate 16, 31, 64, 96, 128 or 160. Quality from 0 to 9 |
| .OGG | Quality from 0.0 to 1.0 |
For all formats, the speech signal can be sampled at any frequency from 6 kHz to 48 kHz. The speech signal can also be mixed with external audio files.
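A client may want to check its audio settings against these limits before sending a request. The sketch below covers only the format name, the sampling frequency and the quality ranges stated above. The function and parameter names are hypothetical, and the service itself remains the authority on what it accepts.

```python
FORMATS = {"raw", "wav", "au", "mp3", "ogg"}
# Quality ranges as stated in the format table: MP3 0 to 9, OGG 0.0 to 1.0.
QUALITY_RANGE = {"mp3": (0, 9), "ogg": (0.0, 1.0)}
MIN_RATE_HZ, MAX_RATE_HZ = 6000, 48000


def check_audio_options(fmt, sample_rate_hz, quality=None):
    """Validate audio settings and return the normalised format name.

    Raises ValueError when a setting falls outside the documented limits.
    """
    fmt = fmt.lower().lstrip(".")  # accept ".MP3" as well as "mp3"
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if not MIN_RATE_HZ <= sample_rate_hz <= MAX_RATE_HZ:
        raise ValueError(f"sampling frequency out of range: {sample_rate_hz} Hz")
    if quality is not None:
        if fmt not in QUALITY_RANGE:
            raise ValueError(f"quality does not apply to format: {fmt}")
        low, high = QUALITY_RANGE[fmt]
        if not low <= quality <= high:
            raise ValueError(f"quality out of range for {fmt}: {quality}")
    return fmt
```

Failing early on the client side avoids a round trip for requests that the documented limits already rule out.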
The following languages are supported by Voxygen Cloud:
| Language | Regional variants |
| French | France, Belgium, Switzerland, Senegal, Ivory Coast, Cameroon, Niger |
| English | United Kingdom, United States |
| Dutch | Netherlands, Belgium (Flemish) |
Voxygen Studio is the perfect companion to Voxygen Cloud. Its easy-to-use graphical user interface, accessible through a web browser, lets you prepare the text, adapt silences and pauses, modify the pronunciation of specific words and introduce other SSML tags, so that you can seamlessly create high-quality audio messages in Voxygen Cloud with Voxygen’s expressive voices.
It gives you access to many editing and audio optimisation features to take full advantage of the Voxygen Cloud service.
Once the voice and the language are chosen, you can start editing the text of your message.
Over the years, Voxygen has developed unique know-how and expertise to deliver personalised Brand Voices of the highest quality, tailored to customers’ needs. Voxygen’s Expressive Voice creation process has been optimised so that you can launch your Brand Voice in a smooth and timely manner.
Once developed, Voxygen’s Expressive Voices can be gradually enriched by incorporating domain-specific vocabulary, paralinguistic features and additional expressivity through state-of-the-art features (Smart Lexicons, Domain-Specific Corpus, Multilingual Voices). Your investment is always protected.
Not only is your Brand Voice created to reflect your identity and enriched to meet your needs, but Voxygen’s technology and tools also give you full control to fine-tune your messages, through high-level standard interfaces (SSML, PLS).