Text to Speech

Edgen AI Text-to-Speech SDK

The Edgen AI Text-to-Speech SDK generates speech from text in real time. It supports English and Spanish and, under optimal conditions, returns natural-sounding audio in less than 400 ms.


Key Features

  • High Accuracy: Achieves natural voices in ideal conditions.
  • Multilingual Support: Currently supports English and Spanish.
  • Real-Time Processing: Returns generated audio in under 400 ms.
  • Simple Integration: Easily integrate with Node.js and other modern tech stacks.

Getting Started

Setting Up the Client

To get started, install the Edgen AI library and use it to configure the client:

import EdgenAI from "edgenai";
import fs from "fs";
 
// Initialize the EdgenAI client with your API key
const client = new EdgenAI({
  apiKey: "YOUR_API_KEY",
});
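
A hardcoded key is fine for a quick test, but it is often safer to load it from the environment. The sketch below shows one way to do that. The readApiKey helper and the EDGENAI_API_KEY variable name are assumptions made for illustration; neither is part of the SDK.

```javascript
// Hypothetical helper (not part of the SDK): read the API key from an
// environment-like object. EDGENAI_API_KEY is an assumed variable name.
function readApiKey(env = process.env) {
  const key = (env.EDGENAI_API_KEY || "").trim();
  if (key === "") {
    throw new Error("EDGENAI_API_KEY is not set");
  }
  return key;
}

// Example call with an explicit object instead of the real environment
const apiKey = readApiKey({ EDGENAI_API_KEY: "demo-key" });
```

The returned value can then be passed as apiKey when constructing the client, in place of the hardcoded string.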

Generating Speech from Text

Here’s an example of how to create speech using the SDK:

async function createSpeechFromText() {
  // Create the required text
  const text = "Hello, how are you?";
 
  // Create the speech
  const response = await client.textToSpeech.createSpeech(text);
 
  // Decode the base64-encoded audio and save it to a file
  fs.writeFileSync("speech.mp3", Buffer.from(response.speech, "base64"));
}
 
// Log any failure instead of leaving the promise rejection unhandled
createSpeechFromText().catch(console.error);

By default, the SDK returns the audio as a base64-encoded string in the speech field of the response:

{
  "speech": "the base64 encoded audio"
}
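
Because the audio arrives as base64 text, it has to be decoded into raw bytes before it is written to disk or played. The following is a minimal sketch of that step, assuming the response has the shape shown above. The decodeSpeech helper is hypothetical and not part of the SDK; Buffer is Node.js's built-in binary type.

```javascript
// Hypothetical helper (not part of the SDK): convert the base64 `speech`
// field of a response into a Buffer of raw audio bytes.
function decodeSpeech(response) {
  if (!response || typeof response.speech !== "string") {
    throw new TypeError("Expected a response with a base64 `speech` string");
  }
  return Buffer.from(response.speech, "base64");
}

// Stand-in for a real SDK response, built from placeholder bytes
const sampleResponse = {
  speech: Buffer.from("placeholder audio").toString("base64"),
};
const audioBytes = decodeSpeech(sampleResponse);
```

The resulting Buffer can be passed directly to fs.writeFileSync to produce a playable file.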

Errors

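When a request fails, the response contains a status code and an error message. In the example below, [300-500] is a placeholder for a numeric status code in that range: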

{
    "status": [300-500],
    "error": "Issue with speech generation"
}
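
One way to handle this shape in application code is to convert it into a regular Error before continuing. The sketch below assumes the error body is available as a plain object with a numeric status and a string error. Whether the SDK throws or returns this body is not specified here, and the toSpeechError helper is hypothetical.

```javascript
// Hypothetical helper (not part of the SDK): turn a body of the documented
// error shape into an Error, or return null if the body is not an error.
function toSpeechError(body) {
  const isError =
    body !== null &&
    typeof body === "object" &&
    typeof body.status === "number" &&
    typeof body.error === "string";
  if (!isError) {
    return null;
  }
  const err = new Error(`Speech request failed (${body.status}): ${body.error}`);
  err.status = body.status; // keep the status code for callers that branch on it
  return err;
}

// Example with a body shaped like the error response above
const failure = toSpeechError({ status: 500, error: "Issue with speech generation" });
if (failure) {
  console.error(failure.message);
}
```

A successful body such as { "speech": "..." } yields null, so the same check can sit in front of the decoding step.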