API Reference
Use our REST API to generate audio programmatically.
Authentication
All API requests require authentication using a Bearer token. You can create API tokens in your dashboard.
Authorization: Bearer YOUR_API_TOKEN
Base URL
https://voicethistext.com/api/v1
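As a rough illustration of how the token and base URL fit together, here is a minimal Python sketch using only the standard library. The helper name `build_request` is hypothetical and not part of the API. It only constructs the request; nothing is sent until the object is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

BASE_URL = "https://voicethistext.com/api/v1"


def build_request(token, path, method="GET", body=None):
    """Build (but do not send) an authenticated request. Hypothetical helper."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        # JSON bodies are sent with the same Content-Type the curl examples use.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


req = build_request("YOUR_API_TOKEN", "/providers")
print(req.get_method(), req.full_url)
```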
Overview
Jump to a section:
- Providers (list, voices, models)
- Audio Generations (list, create, get, update)
- Variants (list, create)
- Transcripts (get, queue, upload)
- Webhooks (list, create/update, delete)
- Embed (public data, transcript, download)
- Generation Status
- Rate Limiting
How it works
VoiceThisText connects to your TTS provider accounts (ElevenLabs, OpenAI, etc.). To generate audio:
1. Get your providers: Call /providers to list your connected TTS providers.
2. Choose a voice: Call /providers/{id}/voices to list available voices for a provider.
3. Generate audio: POST to /audio-generations with the provider_id from step 1 and the voice_id from step 2.
4. Get transcripts (optional): Call /audio-generations/{id}/transcript, or queue generation with POST.
5. Subscribe to webhooks (optional): Use /webhook-subscriptions to receive completion and failure notifications.
Note: You need both a provider_id (your connected provider) and a voice_id (the voice from that provider) to generate audio.
Each provider has a default model configured, but you can override it with the optional model_id parameter.
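The first three steps can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a definitive client: `call(method, path, body)` is a hypothetical transport function, the list endpoints are assumed to use GET, and picking the first active provider and first voice is an arbitrary choice. A canned `fake_call` stands in for the network so the sketch runs offline.

```python
def generate_audio(call, text, model_id=None):
    """Run steps 1-3. `call(method, path, body)` is a hypothetical transport
    that sends the request and returns the decoded JSON response."""
    # Step 1: pick the first active provider connection.
    providers = call("GET", "/providers", None)["data"]
    provider = next(p for p in providers if p["is_active"])
    # Step 2: pick a voice offered by that provider (here simply the first).
    voices = call("GET", f"/providers/{provider['id']}/voices", None)["data"]
    voice = voices[0]
    # Step 3: create the generation; model_id is only sent when overriding.
    payload = {"text": text, "provider_id": provider["id"], "voice_id": voice["id"]}
    if model_id is not None:
        payload["model_id"] = model_id
    return call("POST", "/audio-generations", payload)["data"]


def fake_call(method, path, body):
    """Stand-in transport with canned responses, so the sketch runs offline."""
    if path == "/providers":
        return {"data": [{"id": "550e8400-e29b-41d4-a716-446655440000", "is_active": True}]}
    if path.endswith("/voices"):
        return {"data": [{"id": "21m00Tcm4TlvDq8ikWAM", "name": "Rachel"}]}
    return {"data": {**body, "status": "pending"}}


generation = generate_audio(fake_call, "Hello, this is a test.")
print(generation["voice_id"], generation["status"])
```

In real use, `fake_call` would be replaced by a function that sends the request with the Bearer token and decodes the JSON response.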
Endpoints
All endpoints in this section are under /api/v1 and require a Bearer API token unless noted as public.
/providers
List all connected TTS providers for your organization.
Example Response
{
"data": [
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"name": "My ElevenLabs",
"provider": "elevenlabs",
"provider_label": "ElevenLabs",
"is_active": true,
"created_at": "2026-02-01T12:00:00Z"
}
]
}
/providers/{id}/voices
List all available voices for a specific provider connection.
Example Response
{
"data": [
{
"id": "21m00Tcm4TlvDq8ikWAM",
"compound_id": "550e8400-e29b-41d4-a716-446655440000:21m00Tcm4TlvDq8ikWAM",
"name": "Rachel",
"language": "en-US",
"language_name": "English (US)",
"gender": "female",
"provider": "ElevenLabs",
"connection_name": "My ElevenLabs"
}
]
}
/providers/{id}/models
List all available models for a specific provider connection, including adjustable voice settings keys.
Example Response
{
"data": [
{
"id": "eleven_multilingual_v2",
"name": "Eleven Multilingual v2",
"voice_settings": ["stability", "similarity_boost"]
},
{
"id": "eleven_turbo_v2_5",
"name": "Eleven Turbo v2.5",
"voice_settings": ["stability", "similarity_boost"]
}
]
}
/audio-generations
List audio generations for your organization.
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| per_page | integer | Number of results per page (default 15, max 100) |
Example Response
{
"data": [
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"text": "Hello, this is a test.",
"provider_id": "550e8400-e29b-41d4-a716-446655440000",
"voice_id": "21m00Tcm4TlvDq8ikWAM",
"model_id": null,
"language": "en-US",
"status": "completed",
"audio_url": "https://storage.voicethistext.com/audio/...",
"transcript_status": "completed",
"transcript_url": "https://storage.voicethistext.com/transcripts/..."
}
],
"links": { ... },
"meta": { ... }
}
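A small Python sketch of building the list URL with `per_page`. The function name `list_url` is hypothetical, and clamping on the client side is a convenience assumed here (the lower bound of 1 is also an assumption); the server may enforce its own limits differently.

```python
from urllib.parse import urlencode

BASE_URL = "https://voicethistext.com/api/v1"


def list_url(per_page=15):
    """Return the list URL, keeping per_page within the documented maximum of 100."""
    per_page = max(1, min(int(per_page), 100))  # lower bound of 1 is an assumption
    return f"{BASE_URL}/audio-generations?{urlencode({'per_page': per_page})}"


print(list_url(50))
print(list_url(500))  # clamped to the documented maximum
```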
POST /audio-generations
Create a new audio generation from text or structured content.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Required without content | Plain text to convert to speech |
| content | array | Required without text | Structured content payload (used by the editor) if you want to preserve formatting |
| provider_id | uuid | Yes | Provider UUID from /providers |
| voice_id | string | Yes | Voice ID from /providers/{id}/voices |
| model_id | string | No | Override the provider default model |
| language | string | No | Explicit language hint (BCP-47 preferred, e.g. en-US, zh-HK, yue-HK). If omitted, language is auto-detected and then falls back to the selected voice language. |
| generate_transcript | boolean | No | Queue transcript generation after audio completes |
| voice_settings | object | No | Provider-specific settings (keys from /providers/{id}/models response) |
Example Request
curl -X POST https://voicethistext.com/api/v1/audio-generations \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"text": "Hello, this is a test.",
"provider_id": "550e8400-e29b-41d4-a716-446655440000",
"voice_id": "21m00Tcm4TlvDq8ikWAM",
"language": "en-US",
"generate_transcript": true,
"voice_settings": { "stability": 0.5 }
}'
Example Response
{
"data": {
"id": "660e8400-e29b-41d4-a716-446655440001",
"text": "Hello, this is a test.",
"provider_id": "550e8400-e29b-41d4-a716-446655440000",
"voice_id": "21m00Tcm4TlvDq8ikWAM",
"model_id": null,
"language": "en-US",
"temperature": null,
"speed": null,
"status": "pending",
"audio_url": null,
"transcript_status": "pending",
"transcript_url": null,
"created_at": "2026-02-01T12:00:00.000000Z",
"updated_at": "2026-02-01T12:00:00.000000Z"
}
}
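The request body rules can be checked on the client before sending. The sketch below is one way to do that in Python; `build_generation_payload` is a hypothetical helper. It assumes that sending both `text` and `content` is acceptable, since the table only says each is required when the other is absent.

```python
OPTIONAL_FIELDS = {"model_id", "language", "generate_transcript", "voice_settings"}


def build_generation_payload(provider_id, voice_id, text=None, content=None, **optional):
    """Assemble a create-generation body, rejecting obviously invalid input."""
    if text is None and content is None:
        raise ValueError("either text or content is required")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {"provider_id": provider_id, "voice_id": voice_id}
    if text is not None:
        payload["text"] = text
    if content is not None:
        payload["content"] = content
    payload.update(optional)  # only documented optional fields reach this point
    return payload


payload = build_generation_payload(
    "550e8400-e29b-41d4-a716-446655440000",
    "21m00Tcm4TlvDq8ikWAM",
    text="Hello, this is a test.",
    language="en-US",
    voice_settings={"stability": 0.5},
)
print(sorted(payload))
```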
/audio-generations/{id}
Get a specific audio generation by its UUID.
Example Response
{
"data": {
"id": "550e8400-e29b-41d4-a716-446655440000",
"text": "Hello, this is a test.",
"status": "completed",
"audio_url": "https://storage.voicethistext.com/audio/...",
"transcript_status": "completed",
"transcript_url": "https://storage.voicethistext.com/transcripts/..."
}
}
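Because a new generation starts as pending, a client that does not use webhooks may want to poll this endpoint. A minimal Python sketch follows. `fetch` is a hypothetical function that returns the decoded JSON for one generation, and treating "failed" as a terminal status is an assumption inferred from the audio-generation.failed webhook event, not something this page states.

```python
import time

TERMINAL = {"completed", "failed"}  # "failed" is assumed, not documented here


def wait_for_audio(fetch, generation_id, interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll until the generation reaches a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        generation = fetch(generation_id)["data"]
        if generation["status"] in TERMINAL:
            return generation
        if time.monotonic() >= deadline:
            raise TimeoutError(f"generation {generation_id} still {generation['status']}")
        sleep(interval)


# Offline demo: a fake fetch that reports "pending" twice, then "completed".
_statuses = iter(["pending", "pending", "completed"])


def fake_fetch(generation_id):
    return {"data": {"id": generation_id, "status": next(_statuses)}}


result = wait_for_audio(fake_fetch, "550e8400-e29b-41d4-a716-446655440000", sleep=lambda s: None)
print(result["status"])
```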
PUT /audio-generations/{id}
Update the generation's text, which regenerates the audio, or update only variant_label, which does not.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Required without variant_label | New text to convert. Triggers regeneration. |
| variant_label | string \| null | Required without text | Human-readable label for this version (e.g. Cantonese). Does not regenerate audio. |
Example Request
curl -X PUT https://voicethistext.com/api/v1/audio-generations/550e8400-... \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{ "text": "Updated text for regeneration." }'
Example Request (Rename only)
curl -X PUT https://voicethistext.com/api/v1/audio-generations/550e8400-... \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{ "variant_label": "Cantonese" }'
/audio-generations/{id}/variants
List all versions in the same variant family, including the main generation.
{
"data": [
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"main_generation_id": null,
"is_variant": false,
"variant_type": "original",
"variant_label": "Cantonese",
"language": "zh-HK",
"voice_id": "yue-HK-Standard-A"
}
]
}
POST /audio-generations/{id}/variants
Create a new variant from an existing generation by overriding voice and/or language. Requires Pro plan or higher.
curl -X POST https://voicethistext.com/api/v1/audio-generations/550e8400-.../variants \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"language": "zh-CN",
"voice_id": "cmn-CN-Standard-A",
"variant_label": "Mandarin"
}'
/audio-generations/{id}/transcript
Get transcript words for a completed audio generation.
Returns 422 if the audio is not completed yet.
{
"has_transcript": true,
"words": [
{ "word": "Hello", "start": 0.0, "end": 0.4 },
{ "word": "world", "start": 0.4, "end": 0.8 }
]
}
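One common use of the word timings is highlighting the word being spoken at a given playback position. The Python sketch below shows one way to look that up. The function name `word_at` is hypothetical, and it assumes the words array is sorted by start time, as in the example response.

```python
from bisect import bisect_right


def word_at(words, t):
    """Return the word spoken at time t (seconds), or None if t falls in a gap."""
    starts = [w["start"] for w in words]
    i = bisect_right(starts, t) - 1  # last word whose start is <= t
    if i >= 0 and t < words[i]["end"]:
        return words[i]["word"]
    return None


words = [
    {"word": "Hello", "start": 0.0, "end": 0.4},
    {"word": "world", "start": 0.4, "end": 0.8},
]
print(word_at(words, 0.2), word_at(words, 0.5), word_at(words, 0.9))
```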
POST /audio-generations/{id}/transcript
Queue transcript generation for a completed audio file.
Returns 202 when queued, or 422 if audio is not completed or a transcript already exists.
{
"message": "Transcript generation has been queued."
}
/audio-generations/{id}/transcript
Upload a custom transcript (replaces any generated transcript).
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| words | array | Yes | List of word timing objects |
| words[].word, words[].start, words[].end | string, number, number | Yes | Each word with start/end timestamps in seconds |
{
"message": "Transcript uploaded successfully.",
"has_transcript": true,
"word_count": 2
}
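Before uploading a custom transcript, it can help to sanity-check the words array locally. The Python sketch below is one possible check. `validate_words` is a hypothetical helper, and the rules that timestamps are non-negative and that end is not before start are assumptions made here, not documented server-side validation.

```python
def validate_words(words):
    """Return a list of problems found; an empty list means the payload looks sane."""
    problems = []
    for i, item in enumerate(words):
        word, start, end = item.get("word"), item.get("start"), item.get("end")
        if not isinstance(word, str) or not word:
            problems.append(f"words[{i}].word must be a non-empty string")
        # bool is a subclass of int in Python, so exclude it explicitly.
        numeric = all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in (start, end))
        if not numeric:
            problems.append(f"words[{i}].start and .end must be numbers")
        elif start < 0 or end < start:
            problems.append(f"words[{i}] has an invalid time range")
    return problems


good = [{"word": "Hello", "start": 0.0, "end": 0.4}, {"word": "world", "start": 0.4, "end": 0.8}]
bad = [{"word": "Hi", "start": 1.0, "end": 0.5}, {"word": "", "start": "0", "end": 1}]
print(validate_words(good))
print(validate_words(bad))
```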
/webhook-subscriptions
List webhook subscriptions for the organization.
{
"data": [
{
"id": "wh_123",
"url": "https://example.com/webhooks",
"events": ["audio-generation.completed", "audio-generation.failed"],
"is_active": true,
"created_at": "2026-02-01T12:00:00Z"
}
]
}
/webhook-subscriptions
Create or update a subscription for a URL.
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string (URL) | Yes | Endpoint that will receive events |
| events | array | Yes | One or more of: audio-generation.completed, audio-generation.failed, audio-generation.deleted |
{
"data": {
"id": "wh_123",
"url": "https://example.com/webhooks",
"events": ["audio-generation.completed", "audio-generation.failed"],
"is_active": true,
"secret": "whsec_****************************************",
"created_at": "2026-02-01T12:00:00Z"
}
}
Responses include secret only when a new subscription is created. Webhook requests include X-VTT-Signature, X-VTT-Event, and X-VTT-Delivery-ID headers.
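This page does not describe how X-VTT-Signature is computed, so the following Python sketch is purely an assumption: it treats the header as a hex-encoded HMAC-SHA256 of the raw request body, keyed with the subscription secret. Confirm the actual scheme before relying on it. The function name `verify_signature` is hypothetical.

```python
import hashlib
import hmac


def verify_signature(secret, raw_body, signature_header):
    """Check a webhook signature. ASSUMED scheme: hex HMAC-SHA256 over the raw body."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))


# Offline demo with a made-up secret and body.
secret = "whsec_example"
body = b'{"event": "audio-generation.completed"}'
signature = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
print(verify_signature(secret, body, signature))
print(verify_signature(secret, body + b" ", signature))
```

Whatever the real scheme is, the signature should be checked against the raw bytes of the request body, before any JSON parsing or re-serialization.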
/webhook-subscriptions/{id}
Delete a webhook subscription.
/embed/{generation}/data
Public data for the JS embed player (no authentication).
{
"uuid": "550e8400-e29b-41d4-a716-446655440000",
"title": "Sample title",
"text": "Original input text...",
"audio_url": "https://storage.voicethistext.com/audio/...",
"has_transcript": true,
"variants": [
{
"uuid": "550e8400-e29b-41d4-a716-446655440000",
"variant_label": "Cantonese",
"language": "zh-HK",
"voice_id": "yue-HK-Standard-A",
"audio_url": "https://storage.voicethistext.com/audio/..."
},
{
"uuid": "660e8400-e29b-41d4-a716-446655440001",
"variant_label": "Mandarin",
"language": "zh-CN",
"voice_id": "cmn-CN-Standard-A",
"audio_url": "https://storage.voicethistext.com/audio/..."
}
],
"settings": {
"branding": true
}
}
Selector label priority is: variant_label → title → fallback formatting. Main generation defaults to Original when no explicit label is set.
/embed/{generation}/transcript
Public transcript JSON used by the embed player.
/embed/{generation}/download
Downloads or redirects to a signed URL for the audio file.
Generation Status
Audio generations go through the following statuses:
Rate Limiting
API requests are rate limited based on your plan. Rate limit information is included in response headers:
- X-RateLimit-Limit: Maximum requests per minute
- X-RateLimit-Remaining: Remaining requests in the current window
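A client can read these headers to decide whether to pause before the next request. The Python sketch below is a conservative illustration. `seconds_to_wait` is a hypothetical helper, and since no reset header is documented here, it assumes that waiting one full minute is enough once the remaining count reaches zero.

```python
def seconds_to_wait(headers, window_seconds=60.0):
    """Return how long to pause before the next request, based on response headers."""
    # Header names are case-insensitive in HTTP, so normalize before lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    remaining = normalized.get("x-ratelimit-remaining")
    if remaining is None:
        return 0.0  # no rate limit information available
    # Assumption: a full window is a safe wait once the budget is exhausted.
    return window_seconds if int(remaining) <= 0 else 0.0


print(seconds_to_wait({"X-RateLimit-Limit": "60", "X-RateLimit-Remaining": "12"}))
print(seconds_to_wait({"X-RateLimit-Limit": "60", "X-RateLimit-Remaining": "0"}))
```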