# CXInsights - Development Workflow

## Checkpoints Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                         DEVELOPMENT WORKFLOW                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  CP1 → CP2 → CP3 → CP4 → CP5 → CP6 → CP7 → CP8 → [CP9]                    │
│   │     │     │     │     │     │     │     │      │                       │
│   │     │     │     │     │     │     │     │      └─ Optimization         │
│   │     │     │     │     │     │     │     └─ E2E Pipeline                │
│   │     │     │     │     │     │     └─ RCA Aggregation                   │
│   │     │     │     │     │     └─ Compression                             │
│   │     │     │     │     └─ Inference Engine                              │
│   │     │     │     └─ Feature Extraction                                  │
│   │     │     └─ RCA Schemas                                               │
│   │     └─ Transcription Module                                            │
│   └─ Project Setup & Contracts                                             │
│                                                                             │
│  Each checkpoint has explicit STOP/GO criteria.                            │
│  Do NOT advance until the previous checkpoint is approved.                 │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

---

## CHECKPOINT 1 — Project Setup & Contracts

**Objective:** Lock down structure, contracts, and versioning before writing any logic.

### Tasks

- [x] Create the folder structure per ARCHITECTURE.md
- [x] Initialize the Git repo with a .gitignore (data, .env, outputs)
- [x] Create requirements.txt with pinned versions
- [x] Create .env.example with all required variables
- [x] Create README.md with:
  - product description
  - installation (virtualenv)
  - configuration (.env)
  - execution flow (high level)
- [x] Create config/rca_taxonomy.yaml (Round 1 frozen)
- [x] Create config/settings.yaml (batch_size, limits, retries)
- [x] Create config/schemas/:
  - call_analysis_v1.py (Pydantic)
  - include: schema_version, prompt_version, model_id

### Rules

- ❌ Do not implement functional logic
- ❌ Do not call external APIs

### Deliverable

- Output of `tree -L 3`
- Review of contracts and structure

### STOP/GO Criteria

- [ ] Structure is complete and consistent with ARCHITECTURE.md
- [ ] Pydantic contracts compile
- [ ] .gitignore protects sensitive data

---

## CHECKPOINT 2 — Transcription Module (Isolated & Auditable)

**Objective:** Reliable, comparable STT with real metrics.

### Tasks

- [x] Implement src/transcription/base.py
  - Transcriber interface
- [x] Implement AssemblyAITranscriber
  - async batch
  - retries + backoff
  - capture provider, job_id
- [x] Implement models:
  - Transcript
  - SpeakerTurn
  - include: audio_duration_sec, language, provider, created_at
- [x] Implement basic audio metadata extraction (ffprobe)
- [x] Tests:
  - 1 short audio file (mock or real)
  - validate structure + minimal diarization
- [x] Notebook 01_transcription_validation.ipynb:
  - 5–10 real audio files
  - measure: latency, real cost/min, diarization quality
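
The interface, models, and retry behaviour listed above could be sketched roughly as follows. This is a minimal stdlib-only illustration, not the project's actual code: the field types, the `with_retries` helper, and its parameters (`max_attempts`, `base_delay`, `sleep`) are assumptions, and a real implementation would catch only transient provider errors rather than every exception.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, List, Protocol, TypeVar

T = TypeVar("T")


@dataclass
class SpeakerTurn:
    speaker: str
    text: str
    start_sec: float
    end_sec: float


@dataclass
class Transcript:
    turns: List[SpeakerTurn]
    audio_duration_sec: float
    language: str
    provider: str
    job_id: str
    created_at: str  # assumed to be an ISO 8601 timestamp


class Transcriber(Protocol):
    """Any STT backend only has to provide this one method."""

    def transcribe(self, audio_path: str) -> Transcript: ...


def with_retries(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on failure with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:  # hypothetical: narrow this to transient errors
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)  # 0.5, 1.0, 2.0, ...
            sleep(delay + random.uniform(0, delay * 0.1))  # up to 10% jitter
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting, which fits the "mock or real" test item above.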

### STOP/GO Criteria

- [ ] Acceptable quality
- [ ] Real cost known
- [ ] STT provider decision made

---

## CHECKPOINT 3 — RCA Schemas & Data Contracts (NO LLM)

**Objective:** Define what an analyzed call means.

### Tasks

- [x] Implement src/models/call_analysis.py:
  - CallAnalysis
  - RCALabel
  - EvidenceSpan
  - Event
- [x] Mandatory rules:
  - separate observed vs inferred
  - structured events[] (HOLD, TRANSFER, ESCALATION…)
  - per-call status (success/partial/failed)
  - traceability: schema_version, prompt_version, model_id
- [x] Create data/examples/:
  - lost sale
  - poor CX
  - mixed
  - with real evidence and events
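
To make the mandatory rules concrete, here is one possible shape for the contract. It uses stdlib dataclasses only so the sketch stays dependency-free; the project itself specifies Pydantic. All field names other than `schema_version`, `prompt_version`, and `model_id` are hypothetical, and the "evidence required" check borrows the rule from the Checkpoint 5 tests.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class EventType(str, Enum):
    HOLD = "HOLD"
    TRANSFER = "TRANSFER"
    ESCALATION = "ESCALATION"


class Status(str, Enum):
    SUCCESS = "success"
    PARTIAL = "partial"
    FAILED = "failed"


@dataclass
class EvidenceSpan:
    turn_index: int  # hypothetical pointer into the transcript
    quote: str


@dataclass
class Event:
    type: EventType
    start_sec: float


@dataclass
class RCALabel:
    code: str
    confidence: float
    evidence: List[EvidenceSpan]

    def __post_init__(self) -> None:
        # An inferred label without evidence cannot be audited.
        if not self.evidence:
            raise ValueError("an RCA label needs at least one evidence span")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


@dataclass
class CallAnalysis:
    call_id: str
    status: Status
    # Traceability fields named in the checklist.
    schema_version: str
    prompt_version: str
    model_id: str
    # Observed facts and inferred labels live in separate fields.
    observed_events: List[Event] = field(default_factory=list)
    inferred_labels: List[RCALabel] = field(default_factory=list)
```

Keeping `observed_events` and `inferred_labels` as distinct fields means a reviewer can audit what was detected deterministically without reading any free text.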

### STOP/GO Criteria

- [ ] Does the schema capture EVERYTHING needed?
- [ ] Is it auditable without reading free text?

---

## CHECKPOINT 4 — Feature & Event Extraction (Deterministic)

**Objective:** Move out of the LLM anything it should not have to infer.

### Tasks

- [x] Implement src/features/event_detector.py:
  - HOLD_START / HOLD_END
  - TRANSFER
  - SILENCE
- [x] Implement src/features/turn_metrics.py:
  - talk ratio
  - interruptions
- [x] Enrich Transcript → TranscriptWithEvents
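
As an illustration of deterministic extraction, the sketch below derives SILENCE events, talk ratio, and interruptions purely from turn timestamps. The `Turn` type, the function names, and the 5-second silence threshold are assumptions for the example; the real detectors may use different definitions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    speaker: str
    start_sec: float
    end_sec: float


def detect_silences(turns: List[Turn], min_gap_sec: float = 5.0) -> List[Tuple[float, float]]:
    """Return (start, end) gaps between consecutive turns lasting at least min_gap_sec."""
    ordered = sorted(turns, key=lambda t: t.start_sec)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start_sec - prev.end_sec >= min_gap_sec:
            gaps.append((prev.end_sec, cur.start_sec))
    return gaps


def talk_ratio(turns: List[Turn], speaker: str) -> float:
    """Share of total speaking time that belongs to the given speaker."""
    total = sum(t.end_sec - t.start_sec for t in turns)
    if total == 0:
        return 0.0
    spoken = sum(t.end_sec - t.start_sec for t in turns if t.speaker == speaker)
    return spoken / total


def count_interruptions(turns: List[Turn]) -> int:
    """Count turns that start before a different speaker's previous turn has ended."""
    ordered = sorted(turns, key=lambda t: t.start_sec)
    return sum(
        1
        for prev, cur in zip(ordered, ordered[1:])
        if cur.speaker != prev.speaker and cur.start_sec < prev.end_sec
    )
```

For example, with turns `agent 0-10`, `customer 9-15`, `agent 22-30`, the customer's turn overlaps the agent's by one second (one interruption) and the 7-second gap before the last turn is reported as a silence.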

### STOP/GO Criteria

- [ ] Consistent events
- [ ] Stable causal basis for inference

---

## CHECKPOINT 5 — Inference Engine (MAP Stage, Single Pass)

**Objective:** Consistent, explainable, and controlled inference.

### Tasks

- [x] Create a single MAP prompt:
  - sales + CX + RCA + reasoning
  - force complete JSON
- [x] Implement LLMClient:
  - strict JSON
  - retries + repair
  - token logging
- [x] Implement BatchInference:
  - configurable batch_size
  - incremental saving
  - safe resume
- [x] Tests:
  - evidence required
  - confidence < 0.6 when evidence is weak
- [x] Notebook 02_inference_validation.ipynb:
  - 10 real calls
  - review evidence manually
  - cost per call
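
The BatchInference requirements (configurable batch size, incremental saving, safe resume) can be illustrated with an append-only JSONL file. This is a sketch under assumptions: `run_batches`, `load_done_ids`, the `call_id` key, and the JSONL layout are hypothetical, and `analyze` stands in for the real LLM call.

```python
import json
import os
from typing import Callable, Dict, List, Set


def load_done_ids(out_path: str) -> Set[str]:
    """Collect the call IDs already written to the results file."""
    done: Set[str] = set()
    if not os.path.exists(out_path):
        return done
    with open(out_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                done.add(json.loads(line)["call_id"])
            except (json.JSONDecodeError, KeyError):
                continue  # ignore a truncated line left by a crash
    return done


def run_batches(
    call_ids: List[str],
    analyze: Callable[[str], Dict],
    out_path: str,
    batch_size: int = 10,
) -> int:
    """Analyze pending calls in batches, appending results after each batch.

    Returns the number of calls processed in this run.
    """
    done = load_done_ids(out_path)
    pending = [cid for cid in call_ids if cid not in done]
    processed = 0
    for i in range(0, len(pending), batch_size):
        batch = pending[i : i + batch_size]
        results = [analyze(cid) for cid in batch]
        # Append once per batch so a crash loses at most one batch of work.
        with open(out_path, "a", encoding="utf-8") as fh:
            for result in results:
                fh.write(json.dumps(result) + "\n")
        processed += len(batch)
    return processed
```

Re-running `run_batches` with the same `out_path` skips every call already on disk, so a resumed run does not pay for the same inference twice.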

### STOP/GO Criteria

- [ ] Does the LLM avoid hallucinating?
- [ ] Is the evidence defensible?

---

## CHECKPOINT 6 — Transcript Compression (Baseline, not optional)

**Objective:** Cost and latency control by design.

### Tasks

- [x] Implement CompressedTranscript:
  - customer intent
  - agent offers
  - objections
  - resolution statements
- [x] Validate token reduction (>60%)
- [x] Force use of the compressed transcript in inference
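
One way to check the >60% reduction target is sketched below. The `render` format and the whitespace-based token count are assumptions made to keep the example self-contained; a real check would plug in the tokenizer of the model actually used through the `count_tokens` parameter.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CompressedTranscript:
    customer_intent: str
    agent_offers: List[str] = field(default_factory=list)
    objections: List[str] = field(default_factory=list)
    resolution_statements: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Serialize to the text that would be sent to the LLM (hypothetical format)."""
        parts = [f"INTENT: {self.customer_intent}"]
        parts += [f"OFFER: {o}" for o in self.agent_offers]
        parts += [f"OBJECTION: {o}" for o in self.objections]
        parts += [f"RESOLUTION: {r}" for r in self.resolution_statements]
        return "\n".join(parts)


def whitespace_tokens(text: str) -> int:
    """Crude token proxy: whitespace-separated words, not a real tokenizer."""
    return len(text.split())


def token_reduction(
    original: str,
    compressed: str,
    count_tokens: Callable[[str], int] = whitespace_tokens,
) -> float:
    """Fraction of tokens removed: 1 - compressed / original."""
    original_tokens = count_tokens(original)
    if original_tokens == 0:
        raise ValueError("original transcript is empty")
    return 1.0 - count_tokens(compressed) / original_tokens
```

A validation step could then assert `token_reduction(raw_text, compressed.render()) > 0.60` for each call in the sample.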

### STOP/GO Criteria

- [ ] Predictable cost
- [ ] Stable latency at 20k

---

## CHECKPOINT 7 — Aggregation & RCA Trees (Deterministic Core)

**Objective:** Move from calls to causes.

### Tasks

- [x] Implement statistics:
  - frequency
  - conditional probabilities
- [x] Define severity_score with explicit rules
- [x] Implement RCATreeBuilder (deterministic)
- [x] (Optional) LLM for narrative only
- [x] Notebook 04_aggregation_validation.ipynb:
  - 100 calls
  - numbers add up
  - RCA prioritizes correctly
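
The two statistics named above can be computed deterministically from per-call results. The sketch assumes each call is a dict with a `labels` list and an `outcome` string; those keys and the example label names are hypothetical, not the project's taxonomy.

```python
from collections import Counter
from typing import Dict, List


def label_frequency(calls: List[Dict]) -> Counter:
    """Number of calls in which each label appears (counted once per call)."""
    counts: Counter = Counter()
    for call in calls:
        counts.update(set(call["labels"]))
    return counts


def conditional_probability(calls: List[Dict], label: str, outcome: str) -> float:
    """Estimate P(outcome | label) as a simple ratio of call counts."""
    with_label = [c for c in calls if label in c["labels"]]
    if not with_label:
        return 0.0
    hits = sum(1 for c in with_label if c["outcome"] == outcome)
    return hits / len(with_label)


calls = [
    {"labels": ["PRICE"], "outcome": "lost_sale"},
    {"labels": ["PRICE", "LONG_HOLD"], "outcome": "lost_sale"},
    {"labels": ["LONG_HOLD"], "outcome": "sale"},
    {"labels": ["PRICE"], "outcome": "sale"},
]
freq = label_frequency(calls)                                      # PRICE: 3, LONG_HOLD: 2
p_lost_given_price = conditional_probability(calls, "PRICE", "lost_sale")      # 2 of 3 calls
p_lost_given_hold = conditional_probability(calls, "LONG_HOLD", "lost_sale")   # 1 of 2 calls
```

Because these are plain counts, the "numbers add up" check in the notebook reduces to comparing them against the raw call list.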

### STOP/GO Criteria

- [ ] Is the tree actionable?
- [ ] Does it reflect real impact?

---

## CHECKPOINT 8 — End-to-End Pipeline & Delivery

**Objective:** Real operation without human intervention.

### Tasks

- [x] Implement CXInsightsPipeline
  - per-stage manifests
  - full/partial resume
- [x] Implement exports:
  - Excel
  - PDF
  - JSON
- [x] Main CLI
- [x] Notebook 05_full_pipeline_test.ipynb:
  - 50 calls
  - measure total time
  - measure total cost
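
Per-stage manifests with resume could work along these lines. This is a simplified sketch, not the CXInsightsPipeline implementation: the `run_pipeline` function, the one-JSON-file-per-stage layout, and the `status` field are assumptions.

```python
import json
from pathlib import Path
from typing import Callable, List, Tuple


def run_pipeline(stages: List[Tuple[str, Callable[[], None]]], manifest_dir: Path) -> List[str]:
    """Run stages in order, skipping any stage whose manifest says "completed".

    Returns the names of the stages actually executed in this run.
    """
    manifest_dir.mkdir(parents=True, exist_ok=True)
    executed: List[str] = []
    for name, stage_fn in stages:
        manifest = manifest_dir / f"{name}.json"
        if manifest.exists():
            if json.loads(manifest.read_text(encoding="utf-8")).get("status") == "completed":
                continue  # already done in a previous run
        stage_fn()  # if this raises, no manifest is written for the stage
        manifest.write_text(
            json.dumps({"stage": name, "status": "completed"}), encoding="utf-8"
        )
        executed.append(name)
    return executed
```

Since the manifest is written only after a stage returns, a crash mid-stage leaves that stage unmarked and the next run resumes from it, while earlier stages are skipped.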

### STOP/GO Criteria

- [ ] Stable pipeline
- [ ] Reproducible outputs

---

## CHECKPOINT 9 — Optimization & Benchmarking (Optional)

**Objective:** Maximize ROI.

### Tasks

- [ ] Hash-based caching
- [ ] Batch size benchmarks
- [ ] Compare STT providers
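
Hash-based caching is still an open task, so the following is only one possible approach. It assumes the cache key should cover the transcript text plus `prompt_version` and `model_id`, so that changing either invalidates old results; `cache_key` and `HashCache` are hypothetical names, and the in-memory dict stands in for whatever store is eventually chosen.

```python
import hashlib
import json
from typing import Callable, Dict


def cache_key(transcript_text: str, prompt_version: str, model_id: str) -> str:
    """SHA-256 over everything that can change the inference result."""
    payload = json.dumps(
        {"transcript": transcript_text, "prompt_version": prompt_version, "model_id": model_id},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class HashCache:
    """In-memory cache; a persistent store could expose the same method."""

    def __init__(self) -> None:
        self._store: Dict[str, Dict] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Dict]) -> Dict:
        # compute() runs only on a cache miss.
        if key not in self._store:
            self._store[key] = compute()
        return self._store[key]
```

Identical transcripts analyzed with the same prompt and model then hit the cache, while bumping `prompt_version` forces a fresh inference.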

---

## Progress Tracking

| Checkpoint | Status | Date Started | Date Completed | Notes |
|------------|--------|--------------|----------------|-------|
| CP1 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP2 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP3 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP4 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP5 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP6 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP7 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP8 | ✅ Completed | 2026-01-19 | 2026-01-19 | Approved |
| CP9 | ⏳ Optional | - | - | - |