◆ Module 01 · Voice · Sara

JARVIS for revenue.
EU sovereign. AI Act compliant.

Sara is one agent · three faces. She cold-calls prospects (face A) · navigates the cockpit by voice (face B) · answers text questions with citations (face C). ≤800ms P50 latency. ES/EN/PT native. Resemble PerTH watermark shipped ahead of the AI Act deadline. Sierra has chat. 11x bolted voice onto email. Sara is voice-first from day one.

≤800ms

End-to-end P50 latency

Native languages · ES/EN/PT

≤6%

WER ES target · jiwer eval CI

✓

AI Act art 50/52 · pre-deadline 2026-08-02

Talk to Sara · 20 min · Hear voice samples

◆ 1 agent · 3 faces

Same Sara. Different surface.

Outbound voice · cold calls

Sara dials prospects · books meetings · respects business hours + 5-min SLA

→ 8 canonical SalesGPT conversation stages · Introduction → Qualification → Value Prop → Close
→ Rapport neurology (pacing + mirroring + sub-modalities V/A/K) baked-in
→ Real-time objection handling · 21 Cardone objections × 3 V/A/K modalities = 63 variants
→ Cal.com booking during the call · automatic opportunity creation in Twenty

Output: Booked meetings → Twenty CRM opportunity stage transition

UX voice · platform navigation

Sara responds to cockpit voice commands · "lanza audiencia ICP-01" · "abre el pipeline"

→ Cmd-K + "Sara, ..." command palette · grammar verb+object+filter
→ 6 autopilot patterns: propose-approve · batch confirm · escalation gate · explanation on demand · undo · confidence indicator
→ Cross-touchpoint seal · same Sara across 12 cockpit surfaces
→ Permission gates HITL · 3 tiers (silent · 4s countdown auto · explicit verb)

Output: Workflow invocation · cockpit navigation · agent handoffs
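The verb+object+filter grammar can be pictured with a small parser. This is a minimal sketch under stated assumptions, not Sara's actual command handling: the `Command` type, the `parse_command` function, the article list and the "Sara," wake prefix handling are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Articles dropped before parsing (assumed list, illustration only).
ARTICLES = {"el", "la", "los", "las", "the"}
WAKE_PREFIX = "sara,"


@dataclass
class Command:
    verb: str
    obj: str
    filter: Optional[str] = None


def parse_command(utterance: str) -> Command:
    """Split an utterance into verb, object and an optional trailing filter."""
    text = utterance.strip()
    if text.lower().startswith(WAKE_PREFIX):
        text = text[len(WAKE_PREFIX):]
    tokens = [t for t in text.split() if t.lower() not in ARTICLES]
    if len(tokens) < 2:
        raise ValueError(f"expected at least verb + object, got: {utterance!r}")
    verb, obj, rest = tokens[0].lower(), tokens[1].lower(), tokens[2:]
    # Everything after the object is treated as the filter, if present.
    return Command(verb, obj, " ".join(rest) or None)
```

With this sketch, "lanza audiencia ICP-01" parses to verb `lanza`, object `audiencia`, filter `ICP-01`, while "abre el pipeline" has no filter.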

Text brain · async Q&A

Sara answers text questions · pulls from Foundation Pack + audit trail + agent memory

→ Foundation Pack 7-file context (market_positioning · audience_delight · creator_style · etc)
→ mem0 + Letta agent memory · per-tenant + per-agent
→ Graphiti temporal knowledge graph · "what did we agree about X last week"
→ Cites sources · pulls from `ross_governance.agent_outbound_log` for FinOps + audit

Output: Cited responses · Twenty data lookups · multi-agent handoff

◆ Production stack · 10 layers

Open-stack. Auditable.

Layer	Tool	License	Detail
STT	faster-whisper	MIT	WER ES ≤6% target
VAD	silero-VAD	MIT	voice activity detect ≤10ms
Smart-turn	Pipecat smart-turn	Apache 2.0	turn detection · barge-in support
TTS	ElevenLabs + Chatterbox	commercial + Apache 2.0	David PVC voice clone + brand voice
Voice clone	ElevenLabs PVC	commercial	3 min audio training (Wispr capture)
Watermark	Resemble PerTH + AudioSeal Meta	Apache 2.0	AI Act art 52 · deadline 2026-08-02
Orchestrator	Vapi (primary) + LiveKit (backup)	commercial	end-to-end ≤800ms P50
Noise suppression	DeepFilterNet	MIT	echo cancel + noise gate
Guardrails	NeMo Guardrails	Apache 2.0	prompt-injection + PII redaction
Eval	jiwer (WER)	Apache 2.0	per-language quality regression

8 of 10 layers are MIT/Apache OSS · 2 commercial (ElevenLabs PVC + Vapi orchestrator) · ROSS pays for these · ad-spend pass-through model · OSS-heavy stack designed for high gross margin.
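The Eval row tracks word error rate (WER) per language against the ≤6% ES target. As a rough sketch of what that metric measures, the block below computes WER with a plain word-level edit distance. It illustrates the metric only and is not the jiwer-based CI job; the `wer` and `passes_target` helpers, the lowercasing step and the sample sentences in the usage note are assumptions.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def passes_target(score: float, target: float = 0.06) -> bool:
    """True when the score is at or below the target (6% by default)."""
    return score <= target
```

For example, a reference of "abre el pipeline de ventas" transcribed as "abre el pipeline de venta" has one substitution in five words, so `wer` returns 0.2 and `passes_target` is False.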

◆ Stage 4 · new mechanism · 8 unique capabilities

Sara covers 8 · the competition covers 2.

If you know the voice-AI space, you know that Vapi/Bland/Retell compete on latency. ElevenLabs competes on TTS quality. Hume competes on emotion. But none of them operates the CRM in real time, has cross-channel memory, a per-tenant soul.md, or a SHA-256 audit trail. Here is the positioning map · plan v4 INVESTOR §4.

Voice realtime <600ms

Pipecat smart-turn + WhisperLive WebSocket + preemptive_generation · p50 target validated

Operates the CRM in real time

"Mueve Tekniker a closed_won" by voice · executes · logs · responds · all during the call

Cross-channel memory

mem0 + Letta + Graphiti · Sara remembers the email from 3 weeks ago + Slack + the CRM note · unified context

HITL → graduated Autopilot

5 tiers · silent · countdown 4s · explicit verb · batch confirm · full autopilot · per-capability tunable
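One way to picture per-capability tuning is a table from capability to tier plus a single gate check. The sketch below is hypothetical: the tier semantics, the example assignments in `TIER_BY_CAPABILITY`, and the fallback to the explicit-verb gate for unlisted capabilities are assumptions made for illustration, not the shipped policy.

```python
from enum import Enum


class Tier(Enum):
    SILENT = "silent"
    COUNTDOWN = "countdown"
    EXPLICIT_VERB = "explicit_verb"
    BATCH_CONFIRM = "batch_confirm"
    FULL_AUTOPILOT = "full_autopilot"


COUNTDOWN_SECONDS = 4  # from the "countdown 4s" tier

# Hypothetical per-capability settings (illustration only).
TIER_BY_CAPABILITY = {
    "summarize": Tier.SILENT,
    "create_task": Tier.COUNTDOWN,
    "move_stage": Tier.EXPLICIT_VERB,
}


def tier_for(capability: str) -> Tier:
    # Assumption: capabilities without a setting get the explicit-verb gate.
    return TIER_BY_CAPABILITY.get(capability, Tier.EXPLICIT_VERB)


def may_execute(capability: str, confirmed: bool = False,
                cancelled: bool = False) -> bool:
    """Decide whether an action may run, given the human's response so far."""
    tier = tier_for(capability)
    if tier in (Tier.SILENT, Tier.FULL_AUTOPILOT):
        return True
    if tier is Tier.COUNTDOWN:
        # Assumed to run once the countdown elapses unless it was cancelled.
        return not cancelled
    # Explicit verb and batch confirm both wait for a confirmation.
    return confirmed
```

Under these assumptions `may_execute("move_stage")` stays False until the user confirms, while `may_execute("create_task")` proceeds unless the countdown is cancelled.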

EU AI Act compliant by design

Art 50 disclosure on the first turn · art 52 watermark via Resemble PerTH · enforced ahead of the 2 Aug 2026 deadline

Audit trail SHA-256 chained

Every Sara action is logged to ross_governance · 7y retention · forensic-grade · AI Act art 12
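A SHA-256 chained log makes tampering detectable because each entry's hash covers the previous entry's hash. The following is a minimal sketch of that general technique, assuming JSON-serialisable payloads and an all-zero genesis hash; the function names and the entry layout are hypothetical and do not describe the actual `ross_governance` schema.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, payload: dict) -> dict:
    """Append a new entry that is linked to the current tail of the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"payload": payload, "prev_hash": prev_hash,
             "hash": entry_hash(prev_hash, payload)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing any stored payload after the fact changes its recomputed hash, so `verify` returns False for the whole log from that point on.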

MCP server first-class

Sara exposes an MCP endpoint · Claude Code · Claude Desktop · any MCP-aware client can operate her

soul.md per-tenant persona

Personality + boundaries + tone + escalation rules per tenant · NO generic Sara · each account gets its own Sara

◆ Public demo · 90-second walkthrough · Cmd-J

Press Cmd-J on any cockpit surface · talk to Sara · this is what she already understands.

→ "Mueve Tekniker a closed_won"

→ "Puntúa MEDDIC el deal Erle Cloud"

→ "Redacta follow-up para PAL Robotics"

→ "Resume la call de ayer · 3 bullets"

→ "Crea tarea · llamar a BBVA mañana 10:00"

→ "Lanza secuencia ICP-01 outbound"

→ "Qué deals están at-risk esta semana"

→ "Reagenda Iberdrola · timezone Madrid"

8 canonical intents · move_stage · score_meddic · draft_followup · summarize · create_task · launch_sequence · risk_query · reschedule.
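The eight intents can be pictured as a routing table from utterance to intent name. This is a minimal sketch, assuming simple keyword matching on the demo phrases above; the `INTENT_KEYWORDS` table and the `route` function are hypothetical and stand in for whatever language understanding Sara actually uses.

```python
from typing import Optional

# Hypothetical trigger phrases taken from the demo utterances (illustration only).
INTENT_KEYWORDS = {
    "move_stage": ("mueve",),
    "score_meddic": ("puntúa",),
    "draft_followup": ("redacta",),
    "summarize": ("resume",),
    "create_task": ("crea tarea",),
    "launch_sequence": ("lanza secuencia",),
    "risk_query": ("at-risk",),
    "reschedule": ("reagenda",),
}


def route(utterance: str) -> Optional[str]:
    """Return the first intent whose trigger phrase appears in the utterance."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return None  # no canonical intent matched
```

An utterance that matches none of the triggers returns `None`, which a caller could treat as a request for clarification.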

Continue to Stage 3 · ROSS Hour €299 → Stage 5 · investor view →

◆ What Sara does · 3 faces

One agent. Three surfaces.

Face A · voice outbound

Calls, qualifies, books

Dials prospects, respects business hours, books in Cal.com and creates the opportunity in Twenty. AI Act art 50 disclosure + dual PerTH / AudioSeal watermark on every call. ES / EN / PT.

Face B · voice navigation

Operate the cockpit by talking

Responds to voice commands in the cockpit: "lanza la audiencia ICP-01", "abre el pipeline". Graduated human-in-the-loop (HITL) permission gates before any action with side effects.

Face C · ⌘J in text

Ask from any screen

Press ⌘J on any cockpit surface: move a stage, score MEDDIC, draft a follow-up, launch a sequence, create a task. Cites sources and logs every action to the audit trail.

Talk to Sara now → See Sara in the cockpit →

Hear Sara on a call before you decide

20 min with David.
Sara answers half the call.

Book demo · See Maya (Module 02)
