# PROJECT_CONTEXT.md

> **This file is your north star. ALWAYS read it first.**

---

## 1. What is CXInsights?

CXInsights is a standalone pipeline for analyzing Spanish-language call center recordings (5,000-20,000 calls per batch). It automatically identifies the root causes of lost sales and poor customer experience through transcription, feature extraction, LLM inference, and statistical aggregation.

---

## 2. Problem it solves

**For whom:** Call center analytics teams (BeyondCX.ai → Entelgy pilot)

**Why it matters:**
- Thousands of calls per day are impossible to review manually
- Causes of lost sales stay hidden in conversations
- CX metrics are based on surveys, not on real behavior
- Actionable insights need verifiable evidence

---

## 3. Current project status

| Field | Value |
|-------|-------|
| **Last updated** | 2026-01-19 |
| **Phase** | Production Ready (v2.1 Dashboard + Blueprint Compliance) |
| **Completeness** | 100% (9/9 checkpoints + Dashboard) |

### Completed checkpoints
- [x] CP1: Project Setup & Contracts
- [x] CP2: Transcription Module
- [x] CP3: RCA Schemas & Data Contracts
- [x] CP4: Feature & Event Extraction
- [x] CP5: Inference Engine
- [x] CP6: Transcript Compression
- [x] CP7: Aggregation & RCA Trees
- [x] CP8: End-to-End Pipeline & Delivery
- [x] **CP-GAPS: v2.0 Blueprint Alignment** (2026-01-19)
- [x] **CP-DASH: Streamlit Dashboard** ← NEW (2026-01-19)

### Pending checkpoints
- [ ] CP9: Optimization & Benchmarking (OPTIONAL)

### v2.0 Blueprint Alignment (completed 2026-01-19)
- [x] Gap Analysis vs BeyondCX Blueprints (4 docs)
- [x] FCR Detection Module (FIRST_CALL/REPEAT_CALL/UNKNOWN)
- [x] Churn Risk Classification (NO_RISK/AT_RISK/UNKNOWN)
- [x] Agent Skill Assessment (GOOD_PERFORMER/NEEDS_IMPROVEMENT/MIXED)
- [x] Enhanced RCALabel with origin, corrective_action, replicable_practice
- [x] Prompt v2.0 with all new fields
- [x] Updated aggregation statistics for v2.0 metrics

### Streamlit Dashboard (completed 2026-01-19)
- [x] Beyond Brand Identity styling (colors, typography)
- [x] 8 sections: Overview, Outcomes, Poor CX, FCR, Churn, Agent, Call Explorer, Export
- [x] RCA Sankey Diagram (Driver → Outcome → Churn Risk)
- [x] Correlation Heatmaps (co-occurrence, driver-outcome)
- [x] Outcome Deep Dive (root causes, correlation, duration analysis)
- [x] Export functionality (Excel, HTML, JSON)
- [x] Blueprint terminology compliance (FCR 4 categories, Churn "Sin Riesgo"/"En Riesgo" (no risk/at risk), "Talento" (talent))

---

## 4. Tech stack (decisions made)

| Component | Decision | Rationale |
|------------|----------|-----------|
| **STT** | AssemblyAI | Best Spanish diarization, competitive cost (~$0.04/call) |
| **LLM** | OpenAI GPT-4o-mini (default) | Cost-effective, JSON strict mode, good Spanish |
| **Data Models** | Pydantic v2 | Type safety, validation, serialization |
| **Storage** | Filesystem JSON | Simplicity, debuggability, checkpoint/resume |
| **Async** | asyncio + aiohttp | Batch processing with rate limiting |
| **Deploy** | Local Python CLI | Phase 1 MVP, no infrastructure overhead |
| **Excel Export** | openpyxl | Standard, no external dependencies |
| **PDF Export** | HTML fallback (weasyprint optional) | Works without system dependencies |
| **Dashboard** | Streamlit + Plotly | Rapid development, interactive charts |
| **Brand Styling** | Custom CSS + Beyond colors | Corporate identity compliance |
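
The "Async" row mentions batch processing with rate limiting. One simple way to approximate that with `asyncio` is to bound the number of in-flight requests with a semaphore. The sketch below is illustrative only: `process_call`, `process_batch`, and `max_concurrent` are hypothetical names, the sleep stands in for a real STT or LLM request, and the project's actual limiter may work differently (for example, by requests per minute).

```python
import asyncio


async def process_call(call_id: str) -> str:
    """Placeholder for one unit of work (e.g. an STT or LLM request)."""
    await asyncio.sleep(0.01)  # stands in for network latency
    return f"{call_id}:done"


async def process_batch(call_ids: list[str], max_concurrent: int = 5) -> list[str]:
    """Process all calls, never running more than max_concurrent at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(call_id: str) -> str:
        # Each task waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await process_call(call_id)

    # gather returns results in the same order as the input awaitables.
    return await asyncio.gather(*(bounded(c) for c in call_ids))


results = asyncio.run(process_batch([f"call_{i:03d}" for i in range(20)]))
```

A semaphore caps concurrency rather than requests per second, so it is a coarse limiter; it is shown here because it needs no dependencies beyond the standard library.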

---

## 5. Project structure (mental map)

```
cxinsights/
├── cli.py                     [✅ Done] Main entry point
├── src/
│   ├── transcription/         [✅ Done] AssemblyAI STT with diarization
│   │   ├── base.py            Interface + MockTranscriber
│   │   ├── assemblyai.py      AssemblyAI implementation
│   │   └── models.py          Transcript, SpeakerTurn
│   ├── features/              [✅ Done] Deterministic event extraction
│   │   ├── event_detector.py  HOLD, TRANSFER, SILENCE detection
│   │   └── turn_metrics.py    Talk ratio, interruptions
│   ├── compression/           [✅ Done] Token reduction (>60%)
│   │   ├── compressor.py      Spanish regex patterns
│   │   └── models.py          CompressedTranscript
│   ├── inference/             [✅ Done] LLM-based RCA extraction
│   │   ├── analyzer.py        CallAnalyzer with batch processing
│   │   ├── llm_client.py      OpenAI client with retry/repair
│   │   └── prompts.py         Spanish MAP prompt
│   ├── aggregation/           [✅ Done] Statistics & RCA trees
│   │   ├── statistics.py      Frequency calculations
│   │   ├── severity.py        Weighted severity scoring
│   │   └── rca_tree.py        Deterministic tree builder
│   ├── pipeline/              [✅ Done] Orchestration
│   │   ├── models.py          Manifest, Config, Stages
│   │   └── pipeline.py        CXInsightsPipeline
│   ├── exports/               [✅ Done] Output generation
│   │   ├── json_export.py     Summary + individual analyses
│   │   ├── excel_export.py    Multi-sheet workbook
│   │   └── pdf_export.py      Executive HTML report
│   └── models/                [✅ Done] Core data contracts
│       └── call_analysis.py   CallAnalysis, RCALabel, Evidence
├── config/
│   ├── rca_taxonomy.yaml      [✅ Done] Lost Sales + Poor CX drivers
│   └── settings.yaml          [✅ Done] Batch size, limits, retries
├── tests/
│   └── unit/                  [✅ Done] Comprehensive test suite
├── notebooks/                 [✅ Done] Validation notebooks 01-05
├── dashboard/                 [✅ Done] Streamlit visualization
│   ├── app.py                 Main dashboard application
│   ├── config.py              Beyond brand colors, CSS
│   ├── data_loader.py         Batch data loading utilities
│   ├── components.py          Plotly visualization components
│   └── exports.py             Export functionality
├── .streamlit/
│   └── config.toml            [✅ Done] Theme configuration
└── data/
    ├── examples/              [✅ Done] Sample CallAnalysis JSONs
    └── output/                Generated results go here
```

---

## 6. How to navigate this project (for Claude Code)

| If you need to... | Read... |
|-----------------|--------|
| Understand the architecture | `docs/ARCHITECTURE.md` |
| Implement features | `docs/MODULE_GUIDES.md` |
| Review technical decisions | `docs/TECHNICAL_DECISIONS.md` |
| Troubleshoot | `docs/TROUBLESHOOTING.md` |
| Check costs/performance | `docs/BENCHMARKS.md` |
| Look up data schemas | `docs/DATA_CONTRACTS.md` |
| Get started quickly | `docs/QUICK_START.md` |

---

## 7. Business context

| Field | Value |
|-------|-------|
| **Primary user** | BeyondCX.ai team (Susana) |
| **Target customer** | Entelgy (demo/pilot) |
| **Call language** | Spanish (Spain/LATAM) |
| **Typical volume** | 5,000-20,000 calls per batch |

### Critical KPIs

| KPI | Target | Status |
|-----|--------|--------|
| Cost per call | < €0.50 | TBD (benchmark pending) |
| Processing time | < 24h for 5k calls | TBD |
| RCA accuracy | > 80% (manual validation) | TBD |
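
The KPI targets above translate directly into a per-batch budget ceiling and a minimum throughput. The following is a small worked calculation using only the numbers in this document (the €0.50 cost target, the 24-hour target for 5,000 calls, and the 5,000-20,000 batch range); the function names are hypothetical.

```python
MAX_COST_PER_CALL_EUR = 0.50  # "Cost per call" KPI target
MAX_HOURS_FOR_5K = 24         # "Processing time" KPI target for 5,000 calls


def max_batch_budget_eur(n_calls: int) -> float:
    """Upper bound on total batch cost that still meets the cost KPI."""
    return n_calls * MAX_COST_PER_CALL_EUR


def min_calls_per_hour(n_calls: int = 5_000, hours: float = MAX_HOURS_FOR_5K) -> float:
    """Minimum sustained throughput needed to meet the processing-time KPI."""
    return n_calls / hours


budget_small = max_batch_budget_eur(5_000)    # 2500.0 EUR
budget_large = max_batch_budget_eur(20_000)   # 10000.0 EUR
throughput = min_calls_per_hour()             # roughly 208 calls per hour
```

In other words, a 5,000-call batch must stay under €2,500 and sustain a little over 208 calls per hour end to end to meet both targets.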

---

## 8. Pending decisions (to avoid repeating analysis)

| Decision | Status | Notes |
|----------|--------|-------|
| AssemblyAI as STT provider | ✅ DECIDED | Best Spanish diarization |
| OpenAI GPT-4o-mini as default LLM | ✅ DECIDED | Cost-effective, configurable |
| Streamlit dashboard | ✅ DECIDED | Implemented with Beyond branding |
| Multi-language support | ⏳ PENDING | Phase 2 |
| DuckDB for analytics | ⏳ PENDING | Consider for large batches |

---

## 9. Prohibitions (to avoid over-engineering)

- ❌ **DO NOT** design for integration with BeyondDiagnosticPrototipo (Phase 2)
- ❌ **DO NOT** assume outcome labels (sale, churn) are available in the audio
- ❌ **DO NOT** implement features without validating them with the user
- ❌ **DO NOT** change the RCA taxonomy without explicit approval
- ❌ **DO NOT** add heavy dependencies (Docker, DBs) in Phase 1
- ❌ **DO NOT** optimize prematurely without real benchmarks

---

## 10. Design principles (immutable)

### OBSERVED vs INFERRED

Every piece of data is classified as one of:
- **OBSERVED**: Deterministic, extracted without an LLM (duration, events, metrics)
- **INFERRED**: Requires an LLM; MUST include `evidence_spans[]` with timestamps

### Evidence-backed RCA

```
RCALabel WITHOUT evidence = REJECTED
```

Every inferred driver requires:
- `driver_code`: Taxonomy code
- `confidence`: 0.0-1.0 (< 0.6 when the evidence is weak)
- `evidence_spans[]`: At least 1 span with text and timestamps
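
A minimal sketch of how the evidence rule could be enforced at construction time. The project's real contract lives in `src/models/call_analysis.py` and uses Pydantic v2; this version uses standard-library dataclasses only to stay dependency-free. The class names `EvidenceSpan` and `RCALabelSketch`, the fields `start_time`/`end_time` (assumed to be seconds), the driver code `PRICE_TOO_HIGH`, and the sample quote are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceSpan:
    text: str
    start_time: float  # assumed unit: seconds from call start
    end_time: float


@dataclass(frozen=True)
class RCALabelSketch:
    driver_code: str
    confidence: float
    evidence_spans: tuple[EvidenceSpan, ...] = ()

    def __post_init__(self) -> None:
        # Enforce the documented range and the "no evidence = rejected" rule.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within 0.0-1.0")
        if len(self.evidence_spans) < 1:
            raise ValueError("RCALabel without evidence is rejected")


def try_build(**kwargs) -> str:
    """Return 'accepted' or a 'rejected: ...' message instead of raising."""
    try:
        RCALabelSketch(**kwargs)
        return "accepted"
    except ValueError as exc:
        return f"rejected: {exc}"


span = EvidenceSpan(text="es demasiado caro", start_time=42.0, end_time=44.5)
ok = try_build(driver_code="PRICE_TOO_HIGH", confidence=0.8, evidence_spans=(span,))
bad = try_build(driver_code="PRICE_TOO_HIGH", confidence=0.8)
```

Validating in the constructor means a label without evidence can never exist as an object, so downstream aggregation does not need to re-check it.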

### Traceability

Every output includes:
```python
Traceability(
    schema_version="1.0.0",
    prompt_version="v2.0",  # Updated from v1.0
    model_id="gpt-4o-mini"
)
```

### v2.0 Analysis Dimensions

The v2.0 prompt (Blueprint-aligned) includes:
- **FCR Status**: FIRST_CALL / REPEAT_CALL / UNKNOWN
- **Churn Risk**: NO_RISK / AT_RISK / UNKNOWN
- **Agent Classification**: GOOD_PERFORMER / NEEDS_IMPROVEMENT / MIXED
- **Driver Origin**: AGENT / CUSTOMER / COMPANY / PROCESS
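
The four dimensions above are closed value sets, so one way to model them is as string-valued enums. This is a sketch under that assumption: the class names are hypothetical and may not match the actual definitions in `src/models/`; only the member values come from this document.

```python
import json
from enum import Enum


class FCRStatus(str, Enum):
    FIRST_CALL = "FIRST_CALL"
    REPEAT_CALL = "REPEAT_CALL"
    UNKNOWN = "UNKNOWN"


class ChurnRisk(str, Enum):
    NO_RISK = "NO_RISK"
    AT_RISK = "AT_RISK"
    UNKNOWN = "UNKNOWN"


class AgentClassification(str, Enum):
    GOOD_PERFORMER = "GOOD_PERFORMER"
    NEEDS_IMPROVEMENT = "NEEDS_IMPROVEMENT"
    MIXED = "MIXED"


class DriverOrigin(str, Enum):
    AGENT = "AGENT"
    CUSTOMER = "CUSTOMER"
    COMPANY = "COMPANY"
    PROCESS = "PROCESS"


# Parsing a raw LLM value: unknown strings raise ValueError instead of passing through.
parsed = FCRStatus("REPEAT_CALL")

# The str mixin lets members serialize as plain JSON strings.
serialized = json.dumps({"fcr_status": FCRStatus.FIRST_CALL})
```

Rejecting out-of-vocabulary values at parse time keeps malformed LLM output from reaching the aggregation step.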

---

## 11. Main commands

```bash
# Run the full pipeline
python cli.py run my_batch -i data/audio -o data/output

# Check the status of a batch
python cli.py status my_batch

# With specific options
python cli.py run my_batch --model gpt-4o --formats json,excel,pdf

# Without compression (more tokens, higher cost)
python cli.py run my_batch --no-compression

# Without resume (start from scratch)
python cli.py run my_batch --no-resume

# Launch the visualization dashboard
python -m streamlit run dashboard/app.py
# Dashboard available at http://localhost:8510
```
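
To illustrate what resume versus `--no-resume` could mean with filesystem JSON storage, here is a minimal checkpoint sketch. It is not the project's implementation: the `manifest.json` file name, the `completed_stages` key, and the stage order are assumptions (the stage names are borrowed from the `src/` directories), and the real logic lives in `src/pipeline/`.

```python
import json
import tempfile
from pathlib import Path

# Assumed stage order; names borrowed from the src/ layout.
STAGES = ["transcription", "features", "compression", "inference", "aggregation", "exports"]


def load_manifest(path: Path, resume: bool = True) -> dict:
    """Reuse the saved manifest when resuming; otherwise start empty."""
    if resume and path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"completed_stages": []}


def run_batch(batch_dir: Path, resume: bool = True) -> list[str]:
    """Run pending stages and return the names of the stages executed."""
    manifest_path = batch_dir / "manifest.json"
    manifest = load_manifest(manifest_path, resume)
    executed = []
    for stage in STAGES:
        if stage in manifest["completed_stages"]:
            continue  # already checkpointed, skip on resume
        # Real stage work would happen here.
        manifest["completed_stages"].append(stage)
        # Persist after every stage so a crash loses at most one stage.
        manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
        executed.append(stage)
    return executed


with tempfile.TemporaryDirectory() as tmp:
    batch_dir = Path(tmp)
    first_run = run_batch(batch_dir)                 # runs every stage
    second_run = run_batch(batch_dir)                # nothing left to do
    fresh_run = run_batch(batch_dir, resume=False)   # ignores the checkpoint
```

Writing the manifest after each stage is what makes the JSON files double as both a debugging aid and a checkpoint.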

---

## 12. Critical files (do not modify without review)

| File | Reason |
|---------|-------|
| `config/rca_taxonomy.yaml` | Defines all drivers - changes affect inference |
| `src/models/call_analysis.py` | Central contract - changes break downstream consumers |
| `src/inference/prompts.py` | MAP prompt - changes affect RCA quality |
| `src/aggregation/severity.py` | Severity formula - changes affect prioritization |

---

**Last updated**: 2026-01-19 | **Author**: Claude Code | **Version**: 2.0.0 (Blueprint Aligned)