Listen API — Real-time STT
Raw real-time speech-to-text without AI cleaning. Create a session, get a WebSocket URL, and connect directly. You receive interim and final transcripts, which suits live captions, voice commands, or your own processing pipeline.
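To illustrate how interim and final transcripts differ in practice, here is a minimal sketch of a caption buffer. It assumes the message shapes used in the Python example that follows ("partial" and "sentence" messages with a "text" field) and assumes that each partial replaces the previous one until a sentence arrives. `CaptionBuffer` is a hypothetical helper, not part of the SDK.

```python
import json


class CaptionBuffer:
    """Hypothetical helper: keeps finalized sentences plus the latest interim text."""

    def __init__(self):
        self.finals = []   # sentences the server has finalized
        self.interim = ""  # latest partial text, overwritten on every update

    def handle(self, raw):
        """Apply one JSON message and return the text a caption line would show."""
        data = json.loads(raw)
        kind = data.get("type")
        if kind == "partial":
            # Assumption: a partial supersedes the previous partial.
            self.interim = data.get("text", "")
        elif kind == "sentence":
            # Assumption: a sentence finalizes the text the partials were building.
            self.finals.append(data.get("text", ""))
            self.interim = ""
        # Any other type (for example "ready") leaves the buffer unchanged.
        pieces = self.finals + ([self.interim] if self.interim else [])
        return " ".join(pieces)


buf = CaptionBuffer()
for raw in (
    '{"type": "ready"}',
    '{"type": "partial", "text": "hello wor"}',
    '{"type": "sentence", "text": "Hello world."}',
):
    print(repr(buf.handle(raw)))
```

Keeping interim text separate from finalized text means a live caption can redraw the current line freely while the committed transcript only ever grows.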
Python
import json

import websockets.sync.client
from sayd_ai import Sayd

client = Sayd(api_key="sk-your-key")

# Create a real-time STT session (no AI cleaning)
session = client.listen.create(
    language="multi",   # Recommended. "auto", "en", "zh", or "multi" (multilingual)
    sample_rate=16000,  # Recommended. 8000 also supported.
    codec="pcm16",      # "pcm16", "opus", or "opus_fs320"
)

# Connect to the WebSocket URL directly
print(f"Session: {session.session_id}")
print(f"Connect: {session.websocket_url}")

# Use any WebSocket client to stream audio
with websockets.sync.client.connect(session.websocket_url) as ws:
    ready = json.loads(ws.recv())  # {"type": "ready"}

    # Send audio chunks (PCM16, 100 ms each); call once per chunk
    ws.send(audio_bytes)

    # Signal end of audio. This is sent before the read loop below,
    # because that loop runs until the connection closes.
    ws.send(json.dumps({"type": "end"}))

    # Receive live transcripts
    for msg in ws:
        data = json.loads(msg)
        if data["type"] == "partial":
            print(f"\r {data['text']}", end="", flush=True)
        elif data["type"] == "sentence":
            print(f"\n[final] {data['text']}")

# List & retrieve sessions
sessions = client.listen.list(limit=10)
detail = client.listen.get(session.session_id)
Parameters
| Parameter | Values | Description |
|---|---|---|
| language | "auto" | Auto-detect language (single language per session) |
| | "en" | English (also accepts en-US, en-GB etc.) |
| | "zh" | Chinese (also accepts zh-CN, zh-TW etc.) |
| | "multi" | Recommended. Multi-language mode: automatically detects and switches between languages within the same session. |
| sample_rate | 8000 / 16000 | Audio sample rate in Hz (16000 recommended for best accuracy) |
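The Python example sends PCM16 audio in 100 ms chunks, so the chunk size in bytes follows from the sample rate. The sketch below works that arithmetic out and splits a buffer accordingly. PCM16 means 2 bytes per sample; mono audio is an assumption here, since the channel count is not stated, and `chunk_size_bytes` and `iter_chunks` are hypothetical helper names.

```python
def chunk_size_bytes(sample_rate, chunk_ms=100, bytes_per_sample=2, channels=1):
    """Bytes in one chunk of uncompressed PCM audio (mono is an assumption)."""
    samples_per_chunk = sample_rate * chunk_ms // 1000
    return samples_per_chunk * bytes_per_sample * channels


def iter_chunks(pcm, sample_rate, chunk_ms=100):
    """Yield consecutive chunks of a PCM16 byte buffer; the last one may be shorter."""
    size = chunk_size_bytes(sample_rate, chunk_ms)
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


# One second of silence at 16000 Hz, 16-bit mono
one_second = bytes(16000 * 2)
chunks = list(iter_chunks(one_second, sample_rate=16000))
print(chunk_size_bytes(16000), chunk_size_bytes(8000), len(chunks))
```

At 16000 Hz a 100 ms chunk is 1600 samples, or 3200 bytes; at 8000 Hz it is 800 samples, or 1600 bytes. Each yielded chunk can be passed to `ws.send` as in the example above.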
WebSocket Connection
POST /api/listen returns a pre-authenticated websocket_url with the API key embedded as a query parameter. Connect directly — no additional auth headers needed.
Example: wss://api2.memorion.me/v1/listen/stream/{session_id}?api_key=...&external_user_id=...
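Because the API key travels in the query string, printing or logging websocket_url verbatim also writes the key to that output. A minimal sketch of masking it before logging, using only the standard library; `redact_url` is a hypothetical helper and the host in the demo is a placeholder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def redact_url(url, secret_params=("api_key",)):
    """Return the URL with the values of secret query parameters masked."""
    parts = urlsplit(url)
    query = [
        (key, "REDACTED" if key in secret_params else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL with the same shape as the example above
url = "wss://example.invalid/v1/listen/stream/abc?api_key=sk-secret&external_user_id=u1"
print(redact_url(url))
```

The unmodified URL is still what gets passed to the WebSocket client; only the copy that goes to logs is masked.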
API Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /api/listen | Create a Listen session (returns WebSocket URL) |
| GET | /api/listen | List Listen sessions |
| GET | /api/listen/{id} | Get Listen session details & transcripts |