# Add your own TTS Provider for Voiceovers

> Add your own API for generating VoiceOver audio files (requires a developer)

## Overview

VRseBuilder's Text-to-Speech (TTS) system converts text strings into `AudioClip` assets for VoiceOvers in your story. The system is designed to be extensible: you can integrate any TTS service (Google Cloud TTS, Amazon Polly, ElevenLabs, or a custom local solution) by implementing a provider interface.

This guide walks through creating a custom TTS provider, configuring it, and making it available in VRseBuilder.

***

## How TTS Providers Work

TTS providers act as adapters between VRseBuilder and external speech synthesis services. When the system needs to generate a VoiceOver, it calls your provider's conversion methods, which handle the actual audio generation (typically via API requests or local processing).

Two interfaces are available in `VRseBuilder.Core.Systems.TextToSpeech`:

| Interface                  | Use Case                                                  |
| -------------------------- | --------------------------------------------------------- |
| `ITextToSpeechProvider`    | Basic single-language TTS                                 |
| `IVRBTextToSpeechProvider` | Multi-language support and batch processing (recommended) |

The `IVRBTextToSpeechProvider` interface extends the base interface with language-specific and batch conversion methods, making it the better choice for most implementations.

***

## 1. Create the Provider Class

Create a new C# script in your project. Place it in a `Providers` folder under `Runtime` to ensure VoiceOvers generate correctly when using LiveLink.

```csharp
using System.Threading.Tasks;
using UnityEngine;
using VRseBuilder.Core.Systems.TextToSpeech;

public class MyCustomTTSProvider : IVRBTextToSpeechProvider
{
    private TextToSpeechConfiguration _configuration;

    // Implementation follows...
}
```

> **Important:** Your provider class must be under a `Runtime` assembly. This ensures VO generation works when editing via LiveLink.

***

## 2. Implement SetConfig

Store the configuration object passed by the system. This object contains global settings like API keys, cache directory paths, and other provider-specific values.

```csharp
public void SetConfig(TextToSpeechConfiguration configuration)
{
    _configuration = configuration;
}
```

***

## 3. Implement Conversion Methods

Implement the core methods that generate audio from text.

### ConvertTextToSpeech (Default Language)

This method handles single-text conversion using a default language. Typically, delegate to the language-specific method.

```csharp
public async Task<AudioClip> ConvertTextToSpeech(string text)
{
    return await ConvertTextToSpeechForLanguage(text, "en");
}
```

### ConvertTextToSpeechForLanguage

This method converts text for a specific language code (e.g., `"en"`, `"es"`, `"de"`).

```csharp
public async Task<AudioClip> ConvertTextToSpeechForLanguage(string text, string language)
{
    if (string.IsNullOrWhiteSpace(text))
        return null;

    return await FetchAudioFromService(text, language);
}
```
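
Some services expect full locale identifiers (such as `en-US`) rather than the short codes shown above. The sketch below shows one way to translate between the two before building a request. `TtsLanguageMap`, `ToLocale`, and the specific locale values are hypothetical names chosen for illustration; adjust the table to whatever your service documents.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps the short language codes used in this guide
// to the locale identifiers an external service might expect.
public static class TtsLanguageMap
{
    // Case-insensitive lookup so "EN" and "en" resolve to the same locale.
    private static readonly Dictionary<string, string> Locales =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "en", "en-US" },
            { "es", "es-ES" },
            { "de", "de-DE" },
        };

    // Returns the mapped locale, or the fallback when the code is missing or unknown.
    public static string ToLocale(string language, string fallback = "en-US")
    {
        if (string.IsNullOrWhiteSpace(language))
            return fallback;

        return Locales.TryGetValue(language.Trim(), out var locale) ? locale : fallback;
    }
}
```

Inside `ConvertTextToSpeechForLanguage`, you could call `TtsLanguageMap.ToLocale(language)` and pass the result to your request code instead of the raw code.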

### ConvertMultipleTextToSpeech (Batch Processing)

Implement batch processing for generating multiple clips. VRseBuilder's core logic checks whether an AudioClip already exists before calling the provider, so only missing clips trigger generation.

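```csharp
public async Task<AudioClip[]> ConvertMultipleTextToSpeech(string[] texts, string[] languages)
{
    // Guard against null or mismatched inputs before indexing both arrays.
    if (texts == null || languages == null || texts.Length != languages.Length)
    {
        Debug.LogError("TTS batch request needs exactly one language per text.");
        return new AudioClip[0];
    }

    var audioClips = new AudioClip[texts.Length];

    // Convert sequentially; a failed item leaves a null entry at its index.
    for (int i = 0; i < texts.Length; i++)
    {
        audioClips[i] = await ConvertTextToSpeechForLanguage(texts[i], languages[i]);
    }

    return audioClips;
}
```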

> **Tip:** Add caching logic using `_configuration.StreamingAssetCacheDirectoryName` to avoid redundant API calls.
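
As a starting point for such caching, the sketch below derives a stable file path from a text/language pair by hashing both together. `TtsCachePath` and its `For` method are hypothetical names, and the `.mp3` extension is an assumption about the audio format your service returns.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: derives a stable cache file path for a text/language pair.
public static class TtsCachePath
{
    public static string For(string cacheRoot, string text, string language, string extension = ".mp3")
    {
        // Use a placeholder folder name when no language is supplied.
        string lang = string.IsNullOrEmpty(language) ? "default" : language;

        // Hash language and text together so the same line in two languages
        // never maps to the same file name.
        byte[] input = Encoding.UTF8.GetBytes(lang + "\n" + text);

        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(input);
            string name = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
            return Path.Combine(cacheRoot, lang, name + extension);
        }
    }
}
```

A provider could build `cacheRoot` with `Path.Combine(Application.streamingAssetsPath, _configuration.StreamingAssetCacheDirectoryName)` (assuming the configured name is relative to StreamingAssets), check `File.Exists` on the resulting path, and only call the service when the file is missing.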

***

## 4. Add Configuration Settings

If your provider requires custom settings (API keys, voice IDs, endpoint URLs), add them to the shared configuration class.

### Define Your Settings

Open `Assets/VRseBuilder/_Core/Runtime/Systems/TextToSpeech/Runtime/TextToSpeechConfiguration.cs` and add public fields for your settings:

```csharp
// In TextToSpeechConfiguration.cs

[Header("My Custom Provider Settings")]
public string MyCustomTTSAPIEndpoint = "https://api.example.com/tts";
public string MyCustomApiKey = "your_api_key_here";
public string MyCustomVoiceId = "default_voice";
```

### Access Settings in Your Provider

Reference these fields through the stored configuration object:

```csharp
private async Task<AudioClip> FetchAudioFromService(string text, string language)
{
    string apiKey = _configuration.MyCustomApiKey;
    string endpoint = _configuration.MyCustomTTSAPIEndpoint;

    // Build and send your request with these values; step 5 shows a full implementation.
    return null; // Placeholder so this snippet compiles.
}
```

### Configure via Project Settings

1. Navigate to **Edit → Project Settings → VRseBuilder → Text To Speech**
2. Your new fields appear under the header you defined
3. Enter your API keys and other settings

> **Note:** You can also edit settings directly on the `TextToSpeechConfiguration` asset at `Assets/VRseBuilder/Resources/TextToSpeechConfiguration.asset`, but using Project Settings is the recommended approach.

***

## 5. Implement Web API Requests

For providers that call an external API, use `UnityWebRequestMultimedia.GetAudioClip`, which creates a `UnityWebRequest` with a `DownloadHandlerAudioClip` attached, to fetch the audio.

```csharp
using UnityEngine.Networking;

private async Task<AudioClip> FetchAudioFromService(string text, string language)
{
    // The "text" and "lang" query parameter names are placeholders;
    // use the ones your service expects.
    string url = _configuration.MyCustomTTSAPIEndpoint +
                 $"?text={UnityWebRequest.EscapeURL(text)}&lang={UnityWebRequest.EscapeURL(language)}";

    // AudioType.MPEG assumes the service returns MP3 audio.
    using (var request = UnityWebRequestMultimedia.GetAudioClip(url, AudioType.MPEG))
    {
        // Set authorization headers if required
        request.SetRequestHeader("Authorization", "Bearer " + _configuration.MyCustomApiKey);

        var operation = request.SendWebRequest();

        // Yield until the request completes so the main thread is not blocked.
        while (!operation.isDone)
            await Task.Yield();

        if (request.result != UnityWebRequest.Result.Success)
        {
            Debug.LogError($"TTS Request Failed: {request.error}");
            return null;
        }

        return DownloadHandlerAudioClip.GetContent(request);
    }
}
```
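
Some services accept the text in a JSON request body sent with POST rather than in the URL. The following is a minimal sketch of that variant, not a definitive implementation: `TtsPayload`, `TtsPostRequest`, the JSON field names, and the MP3 response format are all assumptions to replace with your service's actual contract.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;
using UnityEngine;
using UnityEngine.Networking;

// Hypothetical request body; rename the fields to match your service.
[Serializable]
public class TtsPayload
{
    public string text;
    public string language;
}

// Sketch only: posts JSON and reads the response as an AudioClip.
public static class TtsPostRequest
{
    public static async Task<AudioClip> FetchAsync(string url, string apiKey, string text, string language)
    {
        string json = JsonUtility.ToJson(new TtsPayload { text = text, language = language });

        using (var request = new UnityWebRequest(url, UnityWebRequest.kHttpVerbPOST))
        {
            // Attach the JSON body and an audio download handler manually.
            request.uploadHandler = new UploadHandlerRaw(Encoding.UTF8.GetBytes(json));
            request.downloadHandler = new DownloadHandlerAudioClip(url, AudioType.MPEG);
            request.SetRequestHeader("Content-Type", "application/json");
            request.SetRequestHeader("Authorization", "Bearer " + apiKey);

            var operation = request.SendWebRequest();

            // Yield until the request completes so the main thread is not blocked.
            while (!operation.isDone)
                await Task.Yield();

            if (request.result != UnityWebRequest.Result.Success)
            {
                Debug.LogError($"TTS request failed: {request.error}");
                return null;
            }

            return DownloadHandlerAudioClip.GetContent(request);
        }
    }
}
```

From `FetchAudioFromService`, you could call `TtsPostRequest.FetchAsync(_configuration.MyCustomTTSAPIEndpoint, _configuration.MyCustomApiKey, text, language)` and return its result.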

***

## 6. Select Your Provider

The `TextToSpeechProviderFactory` automatically discovers all classes implementing `ITextToSpeechProvider`. No manual registration is required.

After compiling your script:

1. Navigate to **Edit → Project Settings → VRseBuilder → Text To Speech**
2. Open the **Provider** dropdown
3. Select your provider (`MyCustomTTSProvider`)

Your provider is now active and will be used for VoiceOver generation.
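
For intuition about why no registration step is needed, the sketch below shows the general reflection technique for finding every class that implements an interface. It illustrates the idea only and may not match what `TextToSpeechProviderFactory` does internally; `IExampleProvider`, `ExampleProvider`, and `ProviderDiscovery` are stand-in names so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Stand-in types; in a VRseBuilder project the interface would be ITextToSpeechProvider.
public interface IExampleProvider { }
public class ExampleProvider : IExampleProvider { }

public static class ProviderDiscovery
{
    // Returns every concrete class assignable to TInterface in the loaded assemblies.
    public static List<Type> FindImplementations<TInterface>()
    {
        var result = new List<Type>();

        foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies())
        {
            Type[] types;
            try
            {
                types = assembly.GetTypes();
            }
            catch (ReflectionTypeLoadException e)
            {
                // Some assemblies only load partially; keep the types that did load.
                types = e.Types.Where(t => t != null).ToArray();
            }

            result.AddRange(types.Where(t =>
                t.IsClass && !t.IsAbstract && typeof(TInterface).IsAssignableFrom(t)));
        }

        return result;
    }
}
```

A scan like this only sees types in assemblies that are compiled and loaded, which is consistent with the provider appearing in the dropdown only after your script compiles.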

***

## Limitations

The **VO Preview** panel (available via **StoryEditWindows → VO Preview** in version 0.6.1+) currently does not support custom TTS providers. It only supports changing settings for:

* `OpenAITextToSpeechProvider`
* `VRBAPITextToSpeechProvider`

Custom providers must be configured through Project Settings.

***

## Best Practices

| Practice                  | Recommendation                                                                                  |
| ------------------------- | ----------------------------------------------------------------------------------------------- |
| Error Handling            | Wrap API calls in try-catch blocks and return `null` on failure to prevent crashes              |
| Async Pattern             | Use `await` for web requests to avoid blocking the main thread                                  |
| Logging                   | Use `VRseLogger` (recommended) or `Debug.Log` to track request status                           |
| Reference Implementations | Review `OpenAITextToSpeechProvider.cs` and `VRBAPITextToSpeechProvider.cs` for working examples |
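
The error-handling and async recommendations can be combined in one small wrapper, sketched below. `SafeTts` and `TryAsync` are hypothetical names; the logger is passed in as a delegate so you can route messages to `VRseLogger` or `Debug.LogError` as you prefer.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: runs an async TTS call and converts exceptions into a null result.
public static class SafeTts
{
    public static async Task<T> TryAsync<T>(Func<Task<T>> work, Action<string> logError) where T : class
    {
        try
        {
            return await work();
        }
        catch (Exception e)
        {
            // Report the failure and return null so callers can skip the clip.
            logError?.Invoke($"TTS request failed: {e.Message}");
            return null;
        }
    }
}
```

In a provider, this could look like `return await SafeTts.TryAsync(() => FetchAudioFromService(text, language), msg => Debug.LogError(msg));`.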