feat: Add Streamlit dashboard with Blueprint compliance (v2.1.0)

Dashboard Features:
- 8 navigation sections: Overview, Outcomes, Poor CX, FCR, Churn, Agent, Call Explorer, Export
- Beyond Brand Identity styling (colors #6D84E3, Outfit font)
- RCA Sankey diagram (Driver → Outcome → Churn Risk flow)
- Correlation heatmaps (driver co-occurrence, driver-outcome)
- Outcome Deep Dive (root causes, correlation, duration analysis)
- Export functionality (Excel, HTML, JSON)

Blueprint Compliance:
- FCR: 4 categories (Primera Llamada/Rellamada × Sin/Con Riesgo de Fuga)
- Churn: Binary view (Sin Riesgo de Fuga / En Riesgo de Fuga)
- Agent: Talento Para Replicar / Oportunidades de Mejora
- Fixed FCR rate calculation (only FIRST_CALL counts as success)

Technical:
- Streamlit + Plotly for interactive visualizations
- Light theme configuration (.streamlit/config.toml)
- Fixed Plotly colorbar titlefont deprecation

Documentation:
- Updated PROJECT_CONTEXT.md, TODO.md, CHANGELOG.md
- Added 4 new technical decisions (TD-014 to TD-017)
- Created TROUBLESHOOTING.md with 10 common issues

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
sujucu70
2026-01-19 16:27:30 +01:00
commit 75e7b9da3d
110 changed files with 28247 additions and 0 deletions

WORKFLOW.md Normal file

@@ -0,0 +1,279 @@
# CXInsights - Development Workflow
## Checkpoints Overview
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                            DEVELOPMENT WORKFLOW                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  CP1 → CP2 → CP3 → CP4 → CP5 → CP6 → CP7 → CP8 → [CP9]                      │
│   │     │     │     │     │     │     │     │      │                        │
│   │     │     │     │     │     │     │     │      └─ Optimization          │
│   │     │     │     │     │     │     │     └─ E2E Pipeline                 │
│   │     │     │     │     │     │     └─ RCA Aggregation                    │
│   │     │     │     │     │     └─ Compression                              │
│   │     │     │     │     └─ Inference Engine                               │
│   │     │     │     └─ Feature Extraction                                   │
│   │     │     └─ RCA Schemas                                                │
│   │     └─ Transcription Module                                             │
│   └─ Project Setup & Contracts                                              │
│                                                                             │
│  Each checkpoint has explicit STOP/GO criteria.                             │
│  Do NOT advance without approval of the previous checkpoint.               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## CHECKPOINT 1 — Project Setup & Contracts
**Objective:** Lock down structure, contracts, and versioning before writing any logic.
### Tasks
- [x] Create the folder structure per ARCHITECTURE.md
- [x] Initialize the Git repo with a .gitignore (data, .env, outputs)
- [x] Create requirements.txt with pinned versions
- [x] Create .env.example with all required variables
- [x] Create README.md with:
  - product description
  - installation (virtualenv)
  - configuration (.env)
  - execution flow (high level)
- [x] Create config/rca_taxonomy.yaml (Round 1 frozen)
- [x] Create config/settings.yaml (batch_size, limits, retries)
- [x] Create config/schemas/:
  - call_analysis_v1.py (Pydantic)
  - include: schema_version, prompt_version, model_id
### Rules
- ❌ Do not implement functional logic
- ❌ Do not call external APIs
### Deliverable
- Output of `tree -L 3`
- Review of contracts and structure
### STOP/GO Criteria
- [ ] Structure complete and consistent with ARCHITECTURE.md
- [ ] Pydantic contracts compile
- [ ] .gitignore protects sensitive data
---
## CHECKPOINT 2 — Transcription Module (Isolated & Auditable)
**Objective:** Reliable, comparable STT with real metrics.
### Tasks
- [x] Implement src/transcription/base.py
  - Transcriber interface
- [x] Implement AssemblyAITranscriber
  - async batch
  - retries + backoff
  - capture provider, job_id
- [x] Implement models:
  - Transcript
  - SpeakerTurn
  - include: audio_duration_sec, language, provider, created_at
- [x] Implement basic audio metadata extraction (ffprobe)
- [x] Tests:
  - 1 short audio file (mock or real)
  - validate structure + minimal diarization
- [x] Notebook 01_transcription_validation.ipynb:
  - 5-10 real audio files
  - measure: latency, real cost/min, diarization quality
### STOP/GO Criteria
- [ ] Acceptable quality
- [ ] Real cost known
- [ ] STT provider decision
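
The "async batch" and "retries + backoff" items above can be sketched with the standard library alone. This is a minimal illustration, not the project's AssemblyAITranscriber: `with_retries`, `transcribe_batch`, the `transcribe_one` callable, and all default values are hypothetical names and numbers chosen for the example. The `sleep` parameter is injectable so the backoff schedule can be tested without waiting.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Run `call`, retrying failures with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles on each attempt, is capped at max_delay,
            # and gets up to 10% random jitter added.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await sleep(delay * (1 + random.uniform(0, 0.1)))
    raise RuntimeError("max_attempts must be >= 1")


async def transcribe_batch(paths, transcribe_one, concurrency: int = 3):
    """Transcribe many files concurrently, bounding the jobs in flight."""
    gate = asyncio.Semaphore(concurrency)

    async def worker(path):
        async with gate:
            return await with_retries(lambda: transcribe_one(path))

    # gather() returns results in the same order as `paths`.
    return await asyncio.gather(*(worker(p) for p in paths))
```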
---
## CHECKPOINT 3 — RCA Schemas & Data Contracts (NO LLM)
**Objective:** Define what an analyzed call means.
### Tasks
- [x] Implement src/models/call_analysis.py:
  - CallAnalysis
  - RCALabel
  - EvidenceSpan
  - Event
- [x] Mandatory rules:
  - separate observed vs inferred
  - structured events[] (HOLD, TRANSFER, ESCALATION…)
  - per-call status (success/partial/failed)
  - traceability: schema_version, prompt_version, model_id
- [x] Create data/examples/:
  - lost sale
  - poor CX
  - mixed
  - with real evidence and events
### STOP/GO Criteria
- [ ] Does the schema capture EVERYTHING needed?
- [ ] Is it auditable without reading free text?
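
To make the mandatory rules above concrete, here is a rough sketch of how such a contract could look. It uses stdlib dataclasses instead of Pydantic so that it runs without dependencies, and it is not the contents of src/models/call_analysis.py: the class names come from the task list, but every field name other than `schema_version`, `prompt_version`, and `model_id` is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Source(str, Enum):
    OBSERVED = "observed"  # directly present in the transcript or audio
    INFERRED = "inferred"  # concluded by the model


class Status(str, Enum):
    SUCCESS = "success"
    PARTIAL = "partial"
    FAILED = "failed"


@dataclass(frozen=True)
class EvidenceSpan:
    text: str
    start_sec: float
    end_sec: float

    def __post_init__(self):
        if self.end_sec < self.start_sec:
            raise ValueError("end_sec must be >= start_sec")


@dataclass(frozen=True)
class Event:
    type: str  # e.g. "HOLD", "TRANSFER", "ESCALATION"
    at_sec: float
    source: Source = Source.OBSERVED


@dataclass(frozen=True)
class RCALabel:
    code: str
    confidence: float
    evidence: List[EvidenceSpan]
    source: Source = Source.INFERRED

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        if not self.evidence:
            raise ValueError("every label needs at least one evidence span")


@dataclass
class CallAnalysis:
    call_id: str
    status: Status
    # Traceability fields named in the checkpoint.
    schema_version: str
    prompt_version: str
    model_id: str
    events: List[Event] = field(default_factory=list)
    labels: List[RCALabel] = field(default_factory=list)
```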
---
## CHECKPOINT 4 — Feature & Event Extraction (Deterministic)
**Objective:** Take out of the LLM's hands what it should not infer.
### Tasks
- [x] Implement src/features/event_detector.py:
  - HOLD_START / HOLD_END
  - TRANSFER
  - SILENCE
- [x] Implement src/features/turn_metrics.py:
  - talk ratio
  - interruptions
- [x] Enrich Transcript → TranscriptWithEvents
### STOP/GO Criteria
- [ ] Events are coherent
- [ ] Stable causal basis for inference
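
As an illustration of what "deterministic" means here, the turn metrics and the SILENCE event can be computed from timestamps alone, with no model involved. The sketch below is one plausible reading of those tasks, not the code in src/features/: the `Turn` shape, the overlap-based definition of an interruption, and the 5-second silence threshold are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Turn:
    speaker: str  # e.g. "agent" or "customer"
    start_sec: float
    end_sec: float


def talk_ratio(turns: List[Turn]) -> Dict[str, float]:
    """Share of total speaking time per speaker."""
    totals: Dict[str, float] = {}
    for t in turns:
        totals[t.speaker] = totals.get(t.speaker, 0.0) + (t.end_sec - t.start_sec)
    spoken = sum(totals.values())
    return {s: d / spoken for s, d in totals.items()} if spoken else {}


def count_interruptions(turns: List[Turn]) -> int:
    """Count turns that start before a different speaker's previous turn ends."""
    ordered = sorted(turns, key=lambda t: t.start_sec)
    return sum(
        1
        for prev, cur in zip(ordered, ordered[1:])
        if cur.speaker != prev.speaker and cur.start_sec < prev.end_sec
    )


def detect_silences(turns: List[Turn], min_gap_sec: float = 5.0) -> List[dict]:
    """Emit a SILENCE event for each gap between consecutive turns >= min_gap_sec."""
    ordered = sorted(turns, key=lambda t: t.start_sec)
    events = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur.start_sec - prev.end_sec
        if gap >= min_gap_sec:
            events.append(
                {"type": "SILENCE", "start_sec": prev.end_sec, "duration_sec": gap}
            )
    return events
```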
---
## CHECKPOINT 5 — Inference Engine (MAP Stage, Single Pass)
**Objective:** Consistent, explainable, and controlled inference.
### Tasks
- [x] Create a single MAP prompt:
  - sales + CX + RCA + reasoning
  - force complete JSON
- [x] Implement LLMClient:
  - strict JSON
  - retries + repair
  - token logging
- [x] Implement BatchInference:
  - configurable batch_size
  - incremental saving
  - safe resume
- [x] Tests:
  - evidence required
  - confidence < 0.6 when evidence is weak
- [x] Notebook 02_inference_validation.ipynb:
  - 10 real calls
  - manually review evidence
  - cost per call
### STOP/GO Criteria
- [ ] Does the LLM avoid hallucinating?
- [ ] Is the evidence defensible?
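
The BatchInference requirements above (configurable batch size, incremental saving, safe resume) can be illustrated with an append-only JSON Lines file. This is a sketch under assumptions, not the project's implementation: the JSONL format, the `call_id` key, and the function names `load_done_ids` and `run_batches` are invented for the example, and `analyze` stands in for the real LLM call.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Set


def load_done_ids(out_path: Path) -> Set[str]:
    """Collect call_ids already written, ignoring unreadable lines."""
    done: Set[str] = set()
    if out_path.exists():
        for line in out_path.read_text(encoding="utf-8").splitlines():
            try:
                done.add(json.loads(line)["call_id"])
            except (json.JSONDecodeError, KeyError, TypeError):
                continue  # e.g. a partial write from an interrupted run
    return done


def run_batches(
    calls: List[Dict],
    analyze: Callable[[Dict], Dict],
    out_path: Path,
    batch_size: int = 10,
) -> int:
    """Analyze calls not yet in out_path; return how many were processed."""
    done = load_done_ids(out_path)
    pending = [c for c in calls if c["call_id"] not in done]

    # If a previous run died mid-write, terminate the partial line first
    # so the next record starts on its own line.
    if out_path.exists() and out_path.stat().st_size > 0:
        if not out_path.read_bytes().endswith(b"\n"):
            with out_path.open("a", encoding="utf-8") as fh:
                fh.write("\n")

    processed = 0
    for i in range(0, len(pending), batch_size):
        batch = pending[i : i + batch_size]
        results = [analyze(c) for c in batch]
        # Persist after every batch so a crash loses at most one batch.
        with out_path.open("a", encoding="utf-8") as fh:
            for r in results:
                fh.write(json.dumps(r, ensure_ascii=False) + "\n")
        processed += len(results)
    return processed
```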
---
## CHECKPOINT 6 — Transcript Compression (Baseline, not optional)
**Objective:** Cost and latency control by design.
### Tasks
- [x] Implement CompressedTranscript:
  - customer intent
  - agent offers
  - objections
  - resolution statements
- [x] Validate token reduction (>60%)
- [x] Force use of the compressed transcript in inference
### STOP/GO Criteria
- [ ] Predictable cost
- [ ] Stable latency at 20k
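
The ">60%" token reduction check is a simple ratio, sketched below. The whitespace split is only a crude stand-in for a real tokenizer and will not match billed token counts; `token_reduction`, `meets_target`, and the pluggable `count` argument are hypothetical names for this example.

```python
from typing import Callable


def whitespace_tokens(text: str) -> int:
    """Crude token proxy: number of whitespace-separated words."""
    return len(text.split())


def token_reduction(
    original: str,
    compressed: str,
    count: Callable[[str], int] = whitespace_tokens,
) -> float:
    """Fraction of tokens removed: 1 - compressed / original."""
    before = count(original)
    if before == 0:
        raise ValueError("original transcript has no tokens")
    return 1.0 - count(compressed) / before


def meets_target(original: str, compressed: str, target: float = 0.60) -> bool:
    """True when the reduction strictly exceeds the target."""
    return token_reduction(original, compressed) > target
```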
---
## CHECKPOINT 7 — Aggregation & RCA Trees (Deterministic Core)
**Objective:** Go from calls to causes.
### Tasks
- [x] Implement statistics:
  - frequency
  - conditional probabilities
- [x] Define severity_score with explicit rules
- [x] Implement RCATreeBuilder (deterministic)
- [x] (Optional) LLM for narrative only
- [x] Notebook 04_aggregation_validation.ipynb:
  - 100 calls
  - numbers add up
  - RCA prioritizes well
### STOP/GO Criteria
- [ ] Is the tree actionable?
- [ ] Does it reflect real impact?
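
The two statistics named above reduce to counting. The sketch below shows driver frequency and P(outcome | driver) over a list of per-call records. The record shape (`drivers`, `outcome`) and the function names are assumptions for illustration, and the sketch does not attempt severity_score or the RCATreeBuilder.

```python
from collections import Counter
from typing import Dict, List, Tuple


def driver_frequency(calls: List[dict]) -> Dict[str, float]:
    """Fraction of calls in which each driver appears at least once."""
    if not calls:
        return {}
    counts = Counter(d for c in calls for d in set(c["drivers"]))
    return {d: n / len(calls) for d, n in counts.items()}


def outcome_given_driver(calls: List[dict]) -> Dict[Tuple[str, str], float]:
    """P(outcome | driver) = calls with both / calls with the driver."""
    with_driver: Counter = Counter()
    joint: Counter = Counter()
    for c in calls:
        for d in set(c["drivers"]):  # set(): count each driver once per call
            with_driver[d] += 1
            joint[(c["outcome"], d)] += 1
    return {(o, d): n / with_driver[d] for (o, d), n in joint.items()}
```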
---
## CHECKPOINT 8 — End-to-End Pipeline & Delivery
**Objective:** Real operation without human intervention.
### Tasks
- [x] Implement CXInsightsPipeline
  - manifests per stage
  - full/partial resume
- [x] Implement exports:
  - Excel
  - PDF
  - JSON
- [x] Main CLI
- [x] Notebook 05_full_pipeline_test.ipynb:
  - 50 calls
  - measure total time
  - measure total cost
### STOP/GO Criteria
- [ ] Stable pipeline
- [ ] Reproducible outputs
---
## CHECKPOINT 9 — Optimization & Benchmarking (Optional)
**Objective:** Maximize ROI.
### Tasks
- [ ] Hash-based caching
- [ ] Batch size benchmarks
- [ ] Compare STT providers
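
Since this checkpoint is still open, the following is only one possible shape for hash-based caching, not a description of existing code. It keys the cache on the transcript together with `prompt_version` and `model_id`, so that changing either one invalidates old results. The function names, the one-JSON-file-per-key layout, and the `analyze` callable are all hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cache_key(transcript: str, prompt_version: str, model_id: str) -> str:
    """Stable SHA-256 over everything that affects the analysis result."""
    payload = json.dumps(
        {
            "transcript": transcript,
            "prompt_version": prompt_version,
            "model_id": model_id,
        },
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_analyze(
    transcript: str,
    prompt_version: str,
    model_id: str,
    analyze: Callable[[str], dict],
    cache_dir: Path,
) -> dict:
    """Return the cached result if present; otherwise compute and store it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(transcript, prompt_version, model_id)}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    result = analyze(transcript)
    path.write_text(json.dumps(result, ensure_ascii=False), encoding="utf-8")
    return result
```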
---
## Progress Tracking
| Checkpoint | Status | Date Started | Date Completed | Notes |
|------------|--------|--------------|----------------|-------|
| CP1 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP2 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP3 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP4 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP5 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP6 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP7 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP8 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP9 | ⏳ Optional | - | - | - |