Licence
MIT
Version
0.3.53
Deps
2
Size
45 kB
Vulns
0
Weekly
213
@pie-players/tts-server-polly
AWS Polly provider for server-side text-to-speech with native speech marks support.
For the cross-package TTS architecture and browser/server flow, see TTS Architecture. This README focuses on AWS Polly-specific provider configuration and behavior.
Overview
This package provides a server-side TTS provider that uses AWS Polly to generate high-quality neural speech with millisecond-precise word timing through speech marks.
Features
- Native Speech Marks - Millisecond-accurate word timing from AWS Polly
- Neural Voices - High-quality neural TTS (default) or standard voices
- 25+ Languages - Wide language support
- Full SSML - Supports Speech Synthesis Markup Language
- Parallel Requests - Audio and speech marks fetched simultaneously
- 60+ Voices - Multiple voices per language
Installation
npm install @pie-players/tts-server-pollyUsage
Basic Setup
import { PollyServerProvider } from '@pie-players/tts-server-polly';
const provider = new PollyServerProvider();
await provider.initialize({
region: 'us-east-1',
credentials: {
accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
},
engine: 'neural', // or 'standard'
defaultVoice: 'Joanna',
});Synthesize Speech
const result = await provider.synthesize({
text: 'Hello world, this is a test of AWS Polly text to speech.',
voice: 'Joanna', // Optional, uses defaultVoice if not specified
includeSpeechMarks: true,
});
console.log('Audio:', result.audio); // Buffer
console.log('Speech marks:', result.speechMarks); // Array of word timings
console.log('Duration:', result.metadata.duration, 'seconds');List Available Voices
// Get all neural voices
const voices = await provider.getVoices();
// Filter by language
const spanishVoices = await provider.getVoices({ language: 'es-ES' });
// Filter by gender
const femaleVoices = await provider.getVoices({ gender: 'female' });Speech Marks Example
const result = await provider.synthesize({
text: 'Hello world',
includeSpeechMarks: true,
});
// result.speechMarks:
// [
// { time: 0, type: 'word', start: 0, end: 5, value: 'Hello' },
// { time: 340, type: 'word', start: 6, end: 11, value: 'world' }
// ]Configuration
PollyProviderConfig
interface PollyProviderConfig {
region: string; // AWS region (required)
credentials?: { // AWS credentials (optional if using IAM)
accessKeyId: string;
secretAccessKey: string;
sessionToken?: string;
};
engine?: 'neural' | 'standard'; // Voice engine (default: 'neural')
defaultVoice?: string; // Default voice ID (default: 'Joanna')
}Environment Variables
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=your_key_id
AWS_SECRET_ACCESS_KEY=your_secret_keyCapabilities
| Feature | Support |
|---|---|
| Speech Marks | Native |
| SSML | Full |
| Pitch Control | SSML only |
| Rate Control | SSML |
| Volume Control | Client-side |
| Max Text Length | 3000 chars |
| Audio Format | MP3 |
Cost
- Standard voices: $4 per 1M characters
- Neural voices: $16 per 1M characters
- Speech marks: Included (no extra charge)
Supported Voices
Popular voices include:
- English (US): Joanna, Matthew, Ivy, Kendra, Joey
- English (UK): Amy, Brian, Emma
- Spanish: Lucia, Conchita, Enrique
- French: Celine, Mathieu
- German: Marlene, Hans
- Italian: Carla, Giorgio
- Portuguese: Vitoria, Ricardo
Use getVoices() for complete list.
Error Handling
import { TTSError, TTSErrorCode } from '@pie-players/tts-server-core';
try {
const result = await provider.synthesize({ text: 'Hello' });
} catch (error) {
if (error instanceof TTSError) {
console.error('Error code:', error.code);
console.error('Message:', error.message);
console.error('Provider:', error.providerId);
}
}AWS IAM Permissions
Required IAM permissions:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"polly:SynthesizeSpeech",
"polly:DescribeVoices"
],
"Resource": "*"
}
]
}License
MIT