Documentation Index
Fetch the complete documentation index at: https://docs.cedarcopilot.com/llms.txt
Use this file to discover all available pages before exploring further.
Voice Endpoint Format
Cedar OS provides two approaches for handling voice, depending on your provider configuration:
- Mastra/Custom backends: Direct voice endpoint handling
- AI SDK/OpenAI providers: Automatic transcription and speech generation
Provider-Specific Voice Handling
Mastra and Custom Backends
When using Mastra or custom backends, Cedar OS sends voice data directly to your voice endpoint. You have full control over:
- Audio transcription
- Response generation
- Text-to-speech synthesis
- Response format
AI SDK and OpenAI Providers
When using AI SDK or OpenAI providers, Cedar OS automatically:
- Transcribes audio using OpenAI’s Whisper model
- Generates a text response using the configured LLM
- Optionally generates speech using OpenAI’s TTS model (when `useBrowserTTS` is false)
Request Format (Mastra/Custom)
Cedar OS sends voice data to your endpoint as a multipart form data request.
Voice Settings Structure
The `settings` field contains a JSON object with the following structure:
- `language`: Language code for speech recognition/synthesis
- `voiceId`: Voice identifier for TTS (provider-specific)
- `pitch`, `rate`, `volume`: Voice modulation parameters
- `useBrowserTTS`: Whether to use the browser’s built-in TTS
- `autoAddToMessages`: Whether to add voice interactions to chat history
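Based on the fields above, the settings payload can be modeled as follows (a sketch; which fields are optional is an assumption):

```typescript
// Sketch of the voice settings JSON. Field optionality is an assumption
// based on the descriptions above.
interface VoiceSettings {
  language: string;            // e.g. "en-US", used for recognition/synthesis
  voiceId?: string;            // provider-specific TTS voice
  pitch?: number;              // voice modulation parameters
  rate?: number;
  volume?: number;
  useBrowserTTS?: boolean;     // skip server-side TTS when true
  autoAddToMessages?: boolean; // record voice turns in chat history
}

const settings: VoiceSettings = {
  language: "en-US",
  voiceId: "alloy",
  rate: 1.0,
  useBrowserTTS: false,
  autoAddToMessages: true,
};
```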
Context
The `context` field contains stringified additional context from the Cedar state, which may include:
- Current chat messages
- Application state
- User-defined context
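Putting the request pieces together, a request of this shape can be sketched with the standard `FormData` API; `settings` and `context` are the fields described above, while the `audio` field name is an assumption:

```typescript
// Sketch of the multipart form data request Cedar OS sends to a voice
// endpoint. The "audio" field name is an assumption; "settings" and
// "context" are the fields documented above.
function buildVoiceRequest(
  audio: Blob,
  settings: object,
  context: object
): FormData {
  const form = new FormData();
  form.append("audio", audio, "recording.webm");     // captured microphone audio
  form.append("settings", JSON.stringify(settings)); // voice settings JSON
  form.append("context", JSON.stringify(context));   // stringified Cedar state context
  return form;
}
```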
Response Format (All Providers)
Your endpoint can return different types of responses:
1. JSON Response (Recommended)
- `text`: The text response from the assistant
- `transcription`: The transcribed user input
- `audioData`: Base64-encoded audio response
- `audioUrl`: URL to an audio file
- `audioFormat`: MIME type of the audio
- `usage`: Token usage statistics
- `object`: Structured response for actions
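A minimal successful JSON response using these fields might look like this (values are illustrative; every field except `text` is treated as optional here, which is an assumption):

```typescript
// Illustrative JSON response body for a voice endpoint, using the fields
// listed above. Values are examples only.
const voiceResponse = {
  text: "The weather in Paris is sunny.",
  transcription: "What's the weather in Paris?",
  audioFormat: "audio/mpeg", // MIME type of the audio content
  audioUrl: "https://example.com/replies/123.mp3",
  usage: { promptTokens: 42, completionTokens: 12 },
};
```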
2. Audio Response
Return raw audio data with appropriate content type.
3. Plain Text Response
Implementation Example (Mastra)
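Here's a framework-agnostic sketch of such an endpoint using the Web-standard `Request`/`Response` types; adapt it to your server's handler signature. `transcribe`, `synthesize`, and `generateReply` are placeholders for whatever speech and LLM services you wire up, and the `audio` field name is an assumption:

```typescript
// Sketch of a voice endpoint handler. transcribe(), synthesize(), and
// generateReply() are placeholders for your STT/TTS/LLM services
// (Whisper, Google, Azure, ...).
async function transcribe(audio: Blob, language: string): Promise<string> {
  return "placeholder transcription"; // call your STT service here
}

async function synthesize(text: string, voiceId?: string): Promise<Uint8Array> {
  return new Uint8Array(); // call your TTS service here
}

async function generateReply(transcription: string): Promise<string> {
  return `You said: ${transcription}`; // call your agent/LLM here
}

async function voiceEndpoint(req: Request): Promise<Response> {
  const form = await req.formData();
  const audio = form.get("audio"); // assumed field name
  const settings = JSON.parse((form.get("settings") as string | null) ?? "{}");
  if (!(audio instanceof Blob)) {
    return new Response("Missing audio", { status: 400 });
  }

  const transcription = await transcribe(audio, settings.language ?? "en-US");
  const text = await generateReply(transcription);

  // Skip server-side TTS when the client uses the browser's built-in TTS.
  const body: Record<string, unknown> = { text, transcription };
  if (!settings.useBrowserTTS) {
    const mp3 = await synthesize(text, settings.voiceId);
    body.audioData = Buffer.from(mp3).toString("base64");
    body.audioFormat = "audio/mpeg";
  }
  return Response.json(body);
}
```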
Voice Response Handling
Cedar OS provides a unified `handleLLMVoice` function that processes voice responses consistently across all providers:
- Audio Playback: Handles base64 audio data, audio URLs, or browser TTS
- Message Integration: Automatically adds transcriptions and responses to chat history
- Action Execution: Processes structured responses to trigger state changes
Structured Responses
Cedar OS supports structured responses that can trigger actions in your application.
SetState Response
To execute a state change, return a setState response; Cedar then calls `myCustomState.setValue(42)` in your Cedar state.
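For illustration, a response that triggers that setter might look like this; the field names inside the `object` payload (`type`, `stateKey`, `setterKey`, `args`) are assumptions to check against your Cedar version:

```typescript
// Illustrative structured voice response. The shape of the "object" payload
// is an assumption; adjust the field names to your Cedar version.
const setStateResponse = {
  text: "Setting the value to 42.",
  transcription: "Set the value to forty-two.",
  object: {
    type: "setState",
    stateKey: "myCustomState", // a state registered with Cedar
    setterKey: "setValue",     // a custom setter on that state
    args: [42],
  },
};
```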
Error Handling
Return appropriate HTTP status codes:
- 200 OK: Successful response
- 400 Bad Request: Invalid request format
- 401 Unauthorized: Missing or invalid API key
- 500 Internal Server Error: Server-side error
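For instance, a handler can validate the incoming request before doing any work and map failures to these codes (a sketch using Web-standard types; the `audio` field name is an assumption):

```typescript
// Validate the incoming voice request and map failures to the status codes
// above. Returns an error Response, or null when the request looks valid.
async function validateVoiceRequest(req: Request): Promise<Response | null> {
  if (!req.headers.get("Authorization")) {
    return new Response("Missing or invalid API key", { status: 401 });
  }
  let form: FormData;
  try {
    form = await req.formData();
  } catch {
    return new Response("Expected multipart form data", { status: 400 });
  }
  if (!(form.get("audio") instanceof Blob)) { // assumed field name
    return new Response("Missing audio field", { status: 400 });
  }
  return null; // valid: proceed to transcription and reply generation
}
```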
Voice Configuration
Configure voice settings when initializing Cedar.
Provider-Specific Notes
OpenAI/AI SDK
- Transcription: Uses Whisper model (`whisper-1`)
- Speech: Uses TTS model (`tts-1`) with configurable voices
- Audio format: MP3 (audio/mpeg)
Mastra/Custom
- Full control over transcription and TTS services
- Can integrate with any speech service (Google, Azure, AWS, etc.)
- Flexible audio format support
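Tying it together, the configuration step from the Voice Configuration section above might look like this at initialization time. This is a sketch only: the provider component and the `voiceSettings` prop name are assumptions to match against your Cedar OS version, and the settings object mirrors the fields documented earlier on this page:

```tsx
// Sketch only: the provider component and voiceSettings prop name are
// assumptions; the settings fields mirror the Voice Settings Structure above.
<CedarCopilot
  llmProvider={{ provider: "mastra", baseURL: "https://your-backend.example.com" }}
  voiceSettings={{
    language: "en-US",
    voiceId: "alloy",
    useBrowserTTS: false,
    autoAddToMessages: true,
  }}
>
  <App />
</CedarCopilot>
```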

