This guide will walk you through installation, authentication, and core features.

If you're using the legacy Session-based API (`fish_audio_sdk`), see the [migration guide](/archive/python-sdk-legacy/migration-guide) to upgrade to the new SDK.

## Installation

Install via pip (Python 3.9 or higher required):

```bash
pip install fish-audio-sdk
```

For audio playback utilities, install with the `utils` extra:

```bash
pip install "fish-audio-sdk[utils]"
```

Sign up for a free Fish Audio account to get started with our API.

1. Go to [fish.audio/auth/signup](https://fish.audio/auth/signup)
2. Fill in your details to create an account, then complete the steps to verify it.
3. Log in to your account and navigate to the [API section](https://fish.audio/app/api-keys)

Once you have an account, you'll need an API key to authenticate your requests.

1. Log in to your [Fish Audio Dashboard](https://fish.audio/app/api-keys/)
2. Navigate to the API Keys section
3. Click "Create New Key", give it a descriptive name, and set an expiration if desired
4. Copy your key and store it securely

Keep your API key secret! Never commit it to version control or share it publicly.

Configure your API key using environment variables:

```bash
export FISH_API_KEY=your_api_key_here
```

Or create a `.env` file in your project root:

```bash
FISH_API_KEY=your_api_key_here
```

## Quick Start

Get started with the [`FishAudio`](/api-reference/sdk/python/client#fishaudio-objects) client in less than a minute:

```python Synchronous
from fishaudio import FishAudio
from fishaudio.utils import play, save

# Initialize client (reads from FISH_API_KEY environment variable)
client = FishAudio()

# Generate and play audio
audio = client.tts.convert(text="Hello, playing from Fish Audio!")
play(audio)

# Generate and save audio
audio = client.tts.convert(text="Saving this audio to a file!")
save(audio, "output.mp3")
```

```python Asynchronous
import asyncio
from fishaudio import AsyncFishAudio
from fishaudio.utils import play, save

async def main():
    # Initialize async client
    client = AsyncFishAudio()

    # Generate and play audio
    audio = await client.tts.convert(text="Hello, playing from Fish Audio!")
    play(audio)

    # Generate and save audio
    audio = await client.tts.convert(text="Saving this audio to a file!")
    save(audio, "output.mp3")

asyncio.run(main())
```

## Core Features

### Text-to-Speech

Fully customizable text-to-speech generation:

```python Synchronous
from fishaudio import FishAudio
from fishaudio.utils import play

client = FishAudio()

# With a specific voice
audio = client.tts.convert(
    text="Custom voice",
    reference_id="bf322df2096a46f18c579d0baa36f41d"  # Adrian
)
play(audio)
```

```python Asynchronous
import asyncio
from fishaudio import AsyncFishAudio
from fishaudio.utils import play

async def main():
    client = AsyncFishAudio()

    # With a specific voice
    audio = await client.tts.convert(
        text="Custom voice",
        reference_id="bf322df2096a46f18c579d0baa36f41d"  # Adrian
    )
    play(audio)

asyncio.run(main())
```

```python Synchronous
from fishaudio import FishAudio
from fishaudio.utils import play

client = FishAudio()

# With speed control
audio = client.tts.convert(
    text="I'm talking pretty fast, is this still too slow?",
    speed=1.5  # 1.5x speed
)
play(audio)
```

```python Asynchronous
import asyncio
from fishaudio import AsyncFishAudio
from fishaudio.utils import play

async def main():
    client = AsyncFishAudio()

    # With speed control
    audio = await client.tts.convert(
        text="I'm talking pretty fast, is this still too slow?",
        speed=1.5  # 1.5x speed
    )
    play(audio)

asyncio.run(main())
```

Create reusable configurations with [`TTSConfig`](/api-reference/sdk/python/types#ttsconfig-objects). [`Prosody`](/api-reference/sdk/python/types#prosody-objects) controls speech characteristics like speed and volume:

```python Synchronous
from fishaudio import FishAudio
from fishaudio.types import TTSConfig, Prosody
from fishaudio.utils import play

client = FishAudio()

# Define config once
my_config = TTSConfig(
    prosody=Prosody(speed=1.2, volume=-5),
    reference_id="933563129e564b19a115bedd57b7406a",  # Sarah
    format="wav",
    latency="balanced"
)

# Reuse across multiple generations
audio1 = client.tts.convert(text="Welcome to our product demonstration.", config=my_config)
audio2 = client.tts.convert(text="Let me show you the key features.", config=my_config)
audio3 = client.tts.convert(text="Thank you for watching this tutorial.", config=my_config)

play(audio1)
play(audio2)
play(audio3)
```

```python Asynchronous
import asyncio
from fishaudio import AsyncFishAudio
from fishaudio.types import TTSConfig, Prosody
from fishaudio.utils import play

async def main():
    client = AsyncFishAudio()

    # Define config once
    my_config = TTSConfig(
        prosody=Prosody(speed=1.2, volume=-5),
        reference_id="933563129e564b19a115bedd57b7406a",  # Sarah
        format="wav",
        latency="balanced"
    )

    # Reuse across multiple generations
    audio1 = await client.tts.convert(text="Welcome to our product demonstration.", config=my_config)
    audio2 = await client.tts.convert(text="Let me show you the key features.", config=my_config)
    audio3 = await client.tts.convert(text="Thank you for watching this tutorial.", config=my_config)

    play(audio1)
    play(audio2)
    play(audio3)

asyncio.run(main())
```

For chunk-by-chunk processing, use [`stream()`](/api-reference/sdk/python/resources#stream), which returns an `AudioStream` (iterable). For real-time streaming with dynamic text, see [Real-time Streaming](#real-time-streaming) below.

Learn more in the [Text-to-Speech guide](/developer-guide/sdk-guide/python/text-to-speech).

### Speech-to-Text

Transcribe audio to text for various use cases:

```python Synchronous
from fishaudio import FishAudio

client = FishAudio()

# Transcribe audio
with open("audio.wav", "rb") as f:
    result = client.asr.transcribe(
        audio=f.read(),
        language="en"  # Optional: specify language
    )

print(result.text)

# Access segments
for segment in result.segments:
    print(f"[{segment.start:.2f}s - {segment.end:.2f}s] {segment.text}")
```

```python Asynchronous
import asyncio
from fishaudio import AsyncFishAudio

async def main():
    client = AsyncFishAudio()

    # Transcribe audio
    with open("audio.wav", "rb") as f:
        result = await client.asr.transcribe(
            audio=f.read(),
            language="en"  # Optional: specify language
        )

    print(result.text)

    # Access segments
    for segment in result.segments:
        print(f"[{segment.start:.2f}s - {segment.end:.2f}s] {segment.text}")

asyncio.run(main())
```

Learn more in the [Speech-to-Text guide](/developer-guide/sdk-guide/python/speech-to-text).

### Real-time Streaming

Stream dynamically generated text for conversational AI and live applications.
Perfect for integrating with LLM streaming responses, live captions, and chatbot interactions:

```python Synchronous
from fishaudio import FishAudio
from fishaudio.utils import play

client = FishAudio()

# Stream dynamically generated text (e.g., from LLM)
def text_chunks():
    yield "Hello, "
    yield "this is "
    yield "streaming text!"

audio_stream = client.tts.stream_websocket(
    text_chunks(),
    latency="balanced"
)
play(audio_stream)
```

```python Asynchronous
import asyncio
from fishaudio import AsyncFishAudio
from fishaudio.utils import play

async def main():
    client = AsyncFishAudio()

    # Stream dynamically generated text
    async def text_chunks():
        yield "Hello, "
        yield "this is "
        yield "streaming text!"

    audio_stream = await client.tts.stream_websocket(
        text_chunks(),
        latency="balanced"
    )
    play(audio_stream)

asyncio.run(main())
```

Learn more in the [WebSocket Streaming guide](/developer-guide/sdk-guide/python/websocket).

### Voice Cloning

**Instant voice cloning** - Clone a voice on-the-fly using [`ReferenceAudio`](/api-reference/sdk/python/types#referenceaudio-objects):

```python Synchronous
from fishaudio import FishAudio
from fishaudio.types import ReferenceAudio

client = FishAudio()

# Instant voice cloning
with open("reference.wav", "rb") as f:
    audio = client.tts.convert(
        text="This will sound like the reference voice",
        references=[ReferenceAudio(
            audio=f.read(),
            text="Text spoken in the reference audio"
        )]
    )
```

```python Asynchronous
import asyncio
from fishaudio import AsyncFishAudio
from fishaudio.types import ReferenceAudio

async def main():
    client = AsyncFishAudio()

    # Instant voice cloning
    with open("reference.wav", "rb") as f:
        audio = await client.tts.convert(
            text="This will sound like the reference voice",
            references=[ReferenceAudio(
                audio=f.read(),
                text="Text spoken in the reference audio"
            )]
        )

asyncio.run(main())
```

**Voice models** - Create persistent voice models for repeated use:

```python Synchronous
from fishaudio import FishAudio

client = FishAudio()

# Create persistent voice model
with open("voice_sample.wav", "rb") as f:
    voice = client.voices.create(
        title="My Custom Voice",
        voices=[f.read()],
        description="Custom voice clone"
    )

print(f"Created voice: {voice.id}")
```

```python Asynchronous
import asyncio
from fishaudio import AsyncFishAudio

async def main():
    client = AsyncFishAudio()

    # Create persistent voice model
    with open("voice_sample.wav", "rb") as f:
        voice = await client.voices.create(
            title="My Custom Voice",
            voices=[f.read()],
            description="Custom voice clone"
        )

    print(f"Created voice: {voice.id}")

asyncio.run(main())
```

Learn more in the [Voice Cloning guide](/developer-guide/sdk-guide/python/voice-cloning).

## Client Initialization

The recommended approach using environment variables:

```python
from fishaudio import FishAudio

# Automatically reads from FISH_API_KEY environment variable
client = FishAudio()
```

Provide the API key directly:

```python
from fishaudio import FishAudio

client = FishAudio(api_key="your_api_key")
```

Never commit API keys to version control. Use environment variables or secret management systems.

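
As one way to follow that advice, the key can be read from the environment and checked before a client is constructed. The sketch below uses only the standard library; `load_api_key` is a hypothetical helper name, not part of the SDK, and the `FISH_API_KEY` variable name is the one used in this guide.

```python
import os
from typing import Mapping, Optional


def load_api_key(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "FISH_API_KEY",
) -> str:
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of the SDK.
    """
    # Fall back to the real process environment when no mapping is given.
    source = os.environ if env is None else env
    key = source.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it or load it from a secret manager"
        )
    return key


# Demonstration with an in-memory mapping instead of the real environment.
demo_key = load_api_key({"FISH_API_KEY": "example-key"})
print(demo_key)
```

The returned value could then be passed explicitly, for example `FishAudio(api_key=load_api_key())`, so that a missing key is reported with a clear message before any request is made.
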
Configure a custom base URL:

```python
from fishaudio import FishAudio

client = FishAudio(
    api_key="your_api_key",
    base_url="https://your-proxy-domain.com"
)
```

## Sync vs Async

The SDK provides both synchronous and asynchronous clients:

```python Synchronous
from fishaudio import FishAudio

# For typical applications
client = FishAudio()
audio = client.tts.convert(text="Hello!")
```

```python Asynchronous
import asyncio
from fishaudio import AsyncFishAudio

async def main():
    # For async applications (web servers, concurrent tasks)
    client = AsyncFishAudio()
    audio = await client.tts.convert(text="Hello!")

asyncio.run(main())
```

Use [`AsyncFishAudio`](/api-reference/sdk/python/client#asyncfishaudio-objects) when:

* Building async web applications (FastAPI, Sanic, etc.)
* Processing multiple requests concurrently
* Integrating with other async libraries
* You need maximum performance

## Resource Clients

The SDK organizes functionality into resource clients:

| Resource | Description | Key Methods |
| --- | --- | --- |
| [`client.tts`](/api-reference/sdk/python/resources#ttsclient-objects) | Text-to-speech | `convert()`, `stream()`, `stream_websocket()` |
| [`client.asr`](/api-reference/sdk/python/resources#asrclient-objects) | Speech recognition | `transcribe()` |
| [`client.voices`](/api-reference/sdk/python/resources#voicesclient-objects) | Voice management | `list()`, `get()`, `create()`, `update()`, `delete()` |
| [`client.account`](/api-reference/sdk/python/resources#accountclient-objects) | Account info | `get_credits()`, `get_package()` |

## Utility Functions

The SDK includes helpful utilities (requires `utils` extra):

```python
from fishaudio.utils import save, play, stream

# Save audio to file
save(audio, "output.mp3")

# Play audio (automatically detects environment)
play(audio)  # Works in Jupyter, regular Python, etc.

# Stream audio in real-time (requires mpv)
stream(audio_iterator)
```

Use [`play()`](/api-reference/sdk/python/utils#play) for playback and [`save()`](/api-reference/sdk/python/utils#save) for writing audio files.

Learn more in the [API Reference - Utils](/api-reference/sdk/python/utils).

## Error Handling

The SDK provides a comprehensive exception hierarchy:

```python
from fishaudio import FishAudio
from fishaudio.exceptions import (
    FishAudioError,
    AuthenticationError,
    RateLimitError,
    ValidationError
)

client = FishAudio()

try:
    audio = client.tts.convert(text="Hello!")
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded. Please wait before retrying.")
except ValidationError as e:
    print(f"Invalid request: {e}")
except FishAudioError as e:
    print(f"API error: {e}")
```

The SDK includes exceptions for [`AuthenticationError`](/api-reference/sdk/python/exceptions#authenticationerror-objects), [`RateLimitError`](/api-reference/sdk/python/exceptions#ratelimiterror-objects), [`ValidationError`](/api-reference/sdk/python/exceptions#validationerror-objects), and [`FishAudioError`](/api-reference/sdk/python/exceptions#fishaudioerror-objects) for common error scenarios.

Learn more in the [API Reference - Exceptions](/api-reference/sdk/python/exceptions).

## Next Steps

* Set up API keys and client configuration
* Generate natural-sounding speech
* Clone voices and manage voice models
* Transcribe audio to text
* Real-time audio streaming
* Complete API documentation

## Resources

* [GitHub Repository](https://github.com/fishaudio/fish-audio-python)
* [PyPI Package](https://pypi.org/project/fish-audio-sdk/)
* [Migration Guide](/archive/python-sdk-legacy/migration-guide) - Upgrade from legacy SDK
* [Best Practices](/developer-guide/best-practices/) - Production-ready tips
* [API Reference](/api-reference/sdk/python/) - Detailed documentation