Emotion from audio.
Not just words.

Continuous Prosody Intelligence (CPI) — an ML model that runs in parallel with ASR, streaming emotion signals to your agent in real time. Fine-tune it on your data with LoRA.

Parallel to ASR · LangChain tool · LoRA fine-tuning

What you get

Word-level emotion labels synced to your transcript. Valence, arousal, and custom taxonomy states for every utterance.
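
For illustration, a word-aligned result might take the shape below. The field names are assumptions, not the published schema; consult the API reference for the real one.

Python
# Hypothetical word-level result shape (illustrative only).
result = {
    "words": [
        {"word": "Hi,", "start": 0.30, "end": 0.52,
         "emotion": "neutral", "valence": 0.50, "arousal": 0.40,
         "escalation": "low"},
        # ...one entry per transcript word
    ]
}

for w in result["words"]:
    print(f"{w['word']:>8}  {w['emotion']:<9} v={w['valence']:.2f} a={w['arousal']:.2f}")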

[Interactive demo: audio playback with word-level emotion labels. Sample readout at 00:30 for "Hi," shows Emotion: Neutral, Valence: 0.50, Arousal: 0.40, Escalation: Low, alongside pitch (f0), energy (dB), and intensity traces. States: Neutral, Happy, Frustrated, Angry, Calm, Surprised, Grateful.]

Built for Production

Low-latency streaming, simple APIs, and battle-tested performance.

Sub-200ms Latency

Real-time streaming analysis optimized for live voice applications.

Prosody Features

Pitch, energy, jitter, shimmer, and voiced ratio—ready for your ML pipeline.

Intent Detection

Go beyond transcription. Understand emotional intent from how words are spoken.

Simple Integration

REST API, WebSocket streaming, and SDKs for Python and JavaScript.

Privacy First

On-premise deployment available. Your audio data stays yours.

Scalable

800+ QPS per node. Horizontal scaling for any workload.

Integrates with your stack

Deepgram
OpenAI
AssemblyAI
Twilio
Salesforce
HubSpot
AWS
GCP
LangChain
Retell
Vapi
Bland AI

Architecture

State space model (Mamba-based) with multi-modal feature fusion. O(n) complexity for streaming. Trained on multilingual speech emotion corpora.

Model Pipeline

Audio Input → Feature Extraction → Fusion Layer (256d) → SSM ×4 → Global Pool → Output Heads

  • Feature Extraction: prosody (28 dimensions) + phonetic (4 dimensions)
  • SSM stack: four state space blocks run in sequence
  • Output heads: Emotion (softmax) and VAD (regression)
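
As a rough sketch of the pipeline above: dimensions come from the diagram, but the layer internals here are stand-ins, not the production architecture (GRUs approximate the Mamba-style SSM blocks; both scan in O(n)).

Python
import torch
import torch.nn as nn

class ProsodySSMSketch(nn.Module):
    """Illustrative stand-in for the diagrammed pipeline."""

    def __init__(self, n_states: int = 7, d: int = 256):
        super().__init__()
        self.fusion = nn.Linear(28 + 4, d)  # prosody (28d) + phonetic (4d) -> 256d
        self.blocks = nn.ModuleList(
            [nn.GRU(d, d, batch_first=True) for _ in range(4)]  # SSM 1-4 stand-ins
        )
        self.emotion_head = nn.Linear(d, n_states)  # softmax over the taxonomy
        self.vad_head = nn.Linear(d, 2)             # valence/arousal regression

    def forward(self, prosody, phonetic):
        x = self.fusion(torch.cat([prosody, phonetic], dim=-1))
        for block in self.blocks:
            x, _ = block(x)          # O(n) recurrent scan over the sequence
        pooled = x.mean(dim=1)       # global pool over time
        return self.emotion_head(pooled).softmax(dim=-1), self.vad_head(pooled)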

Feature Extraction

  • Prosodic (sample values): Pitch (F0) 185 Hz, Energy −12.4 dB, Jitter 1.2%, Shimmer 3.8%
  • Spectral: 125 Hz / 1 kHz / 8 kHz bands
  • Temporal: 4.2 syllables/sec, 0.34 s avg pause, 78% voiced ratio
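
If you want comparable features for your own pipeline, librosa covers the basics. The jitter proxy below is rough; dedicated tools such as Praat (via parselmouth) track the panel's jitter/shimmer definitions more closely.

Python
import librosa
import numpy as np

y, sr = librosa.load("recording.wav", sr=16000)

# Pitch (F0) with the YIN estimator over a typical speech range.
f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)

# Frame energy in dB from RMS.
rms = librosa.feature.rms(y=y)[0]
energy_db = 20 * np.log10(rms + 1e-10)

# Crude jitter proxy: mean relative frame-to-frame F0 change.
jitter = np.mean(np.abs(np.diff(f0))) / np.mean(f0)

print(f"F0 ~{np.median(f0):.0f} Hz, energy {np.mean(energy_db):.1f} dB, "
      f"jitter ~{100 * jitter:.1f}%")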

Benchmark Results

  • IEMOCAP: 68.4% unweighted accuracy
  • RAVDESS: 74.2% weighted accuracy
  • CREMA-D: 71.8% weighted accuracy
  • MSP-IMPROV: 67.1% unweighted accuracy

Evaluated on standard speech emotion recognition benchmarks. ProsodySSM outperforms transformer baselines while maintaining O(n) complexity.

Features

Ship emotion-aware agents faster.

LoRA Fine-tuning

Train on your labeled data. LoRA adapters deliver domain-specific emotion detection without full model retraining; see the sketch after this list.

  • Domain-specific adaptation
  • Your data, your model
  • Custom taxonomy training
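
A minimal sketch of a fine-tuning run, assuming a `fine_tune` method on the SDK client; the method name and parameters are hypothetical stand-ins for the actual interface.

Python
from prosody import ProsodyClient

client = ProsodyClient(api_key="your-key")

# Hypothetical call: trains a LoRA adapter on your labeled audio
# without touching the base model weights.
job = client.fine_tune(
    training_data="labeled_calls.jsonl",  # audio references + emotion labels
    method="lora",
    rank=8,                               # small adapter, fast to train
    taxonomy=["calm", "confused", "frustrated", "escalating"],
)
print(job.id, job.status)                 # poll until the adapter is ready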

Tone Contracts

Define rules: frustration → escalate, confusion → clarify. The API triggers actions when emotion scores cross your thresholds; an example follows this list.

  • Configurable thresholds
  • Webhook on trigger
  • Chain multiple actions
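
A contract definition could look like the sketch below; the `contracts.create` call and its field names are illustrative assumptions, not the documented API.

Python
from prosody import ProsodyClient

client = ProsodyClient(api_key="your-key")

# Hypothetical contract: sustained frustration above 0.7 fires a
# webhook, then tags the call; actions run in order.
contract = client.contracts.create(
    name="escalate-on-frustration",
    when={"emotion": "frustrated", "threshold": 0.7, "consecutive": 2},
    then=[
        {"action": "webhook", "url": "https://example.com/escalate"},
        {"action": "tag_call", "tag": "needs-human"},
    ],
)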

LangChain Tool

pip install langchain-prosody. Expose emotion as a tool call or callback in your agent; see the sketch after this list.

  • langchain-prosody package
  • Tool & callback support
  • Agent emotion memory
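
langchain-prosody presumably ships a ready-made tool; absent its exact exports, here is how the client can be wired in by hand with LangChain's standard @tool decorator (the wrapper itself is an assumption).

Python
from langchain_core.tools import tool
from prosody import ProsodyClient

client = ProsodyClient(api_key="your-key")

@tool
def detect_emotion(audio_path: str) -> dict:
    """Return emotion, valence, and arousal for an audio file."""
    result = client.analyze(audio_file=audio_path, features=["emotion", "prosody"])
    return {"emotion": result.emotion, "valence": result.valence, "arousal": result.arousal}

# Pass it to any tool-calling agent, e.g. tools=[detect_emotion].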

ASR Integration

Runs alongside your existing transcription. Emotion scores align to word timestamps automatically; see the example after this list.

  • Works with any STT provider
  • Word-level emotion alignment
  • Drop-in, no pipeline changes
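
Alignment itself is simple once both streams carry timestamps. The shapes below are illustrative: word timings as your STT provider emits them, emotion frames as CPI windows.

Python
# Word timestamps from your STT provider, normalized to seconds.
words = [
    {"word": "Hi,", "start": 0.30, "end": 0.52},
    {"word": "thanks", "start": 0.55, "end": 0.80},
]

# Emotion frames (illustrative shape): one score per analysis window.
frames = [
    {"t0": 0.0, "t1": 0.5, "emotion": "neutral", "valence": 0.50},
    {"t0": 0.5, "t1": 1.0, "emotion": "happy", "valence": 0.72},
]

def frame_for(word):
    """Pick the emotion frame overlapping the word's midpoint."""
    mid = (word["start"] + word["end"]) / 2
    return next(f for f in frames if f["t0"] <= mid < f["t1"])

for w in words:
    f = frame_for(w)
    print(w["word"], f["emotion"], f["valence"])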

Custom Taxonomies

Map model outputs to your labels: base emotions → domain-specific states via configurable thresholds. An example follows this list.

  • Per-vertical mapping
  • Configurable thresholds
  • Admin dashboard
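
Locally, such a mapping is just thresholded rules over base outputs; the hosted version manages the same thing from the admin dashboard. The labels below are made-up examples for a support desk.

Python
# Threshold rules mapping base emotions to domain-specific states.
TAXONOMY = {
    "churn_risk":  lambda e: e["emotion"] == "angry" and e["arousal"] > 0.7,
    "escalate":    lambda e: e["emotion"] == "frustrated" and e["valence"] < 0.3,
    "upsell_open": lambda e: e["emotion"] in ("happy", "grateful"),
}

def map_state(e: dict) -> str:
    """First matching rule wins; fall back to neutral."""
    return next((label for label, rule in TAXONOMY.items() if rule(e)), "neutral")

print(map_state({"emotion": "frustrated", "valence": 0.2, "arousal": 0.8}))  # escalate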

Integrations

Webhooks, REST API, and native Salesforce/HubSpot connectors. Or just read from the SDK. A receiver sketch follows this list.

  • Native CRM connectors
  • Webhook events
  • REST & GraphQL APIs
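
On the receiving end, a webhook handler is a few lines. The event payload below is illustrative, and FastAPI is just one option.

Python
from fastapi import FastAPI, Request

app = FastAPI()

@app.post("/prosody/events")
async def handle_event(request: Request):
    event = await request.json()
    # Illustrative payload: {"call_id": "...", "trigger": "...",
    #                        "emotion": "frustrated", "score": 0.82}
    if event.get("trigger") == "escalate-on-frustration":
        route_to_human(event["call_id"])
    return {"ok": True}

def route_to_human(call_id: str) -> None:
    ...  # hand off to your telephony or CRM stack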

Voice AI Agents

Give your AI the ability to detect frustration, urgency, or satisfaction in real-time and respond appropriately.

Call Analytics

Automatically flag calls with negative sentiment. Surface coaching opportunities. Track emotion trends over time.

Quality Assurance

Monitor 100% of calls instead of 2%. Emotion scoring adds a dimension transcription alone can't capture.

Simple Integration

Add emotion detection to your voice pipeline in a few lines.

Python
from prosody import ProsodyClient

client = ProsodyClient(api_key="your-key")

result = client.analyze(
    audio_file="recording.wav",
    features=["emotion", "prosody"]
)

print(result.emotion)  # "happy"
print(result.valence)  # 0.72
print(result.arousal)  # 0.65
JavaScript
import { Prosody } from '@prosody/sdk';

const client = new Prosody({ apiKey: 'your-key' });

const result = await client.analyze({
  audio: audioBlob,
  features: ['emotion', 'prosody']
});

console.log(result.emotion);  // "happy"
console.log(result.valence);  // 0.72
console.log(result.arousal);  // 0.65
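
For live audio over the WebSocket path, the flow is roughly as follows. The endpoint URL and message shapes are assumptions; a production client would also read and write concurrently rather than in lockstep.

Python
import asyncio
import json
import websockets  # pip install websockets

# Hypothetical streaming endpoint; check the streaming docs for the
# real URL, auth scheme, and payload shapes.
URI = "wss://api.example.com/v1/stream?api_key=your-key&features=emotion,prosody"

async def stream(audio_chunks):
    async with websockets.connect(URI) as ws:
        for chunk in audio_chunks:           # raw PCM frames from your call
            await ws.send(chunk)
            msg = json.loads(await ws.recv())
            print(msg["emotion"], msg["valence"], msg["arousal"])

asyncio.run(stream(audio_chunks=[]))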

Ready to build?

Start with our free tier. Upgrade when you need more capacity.

Get API Key

Contact

Questions about integration, pricing, or custom fine-tuning? Reach out.

San Francisco, CA

Send us a message