# TODO.md

> Prioritized list of pending tasks

---

## Completed Checkpoints

### CP1: Project Setup & Contracts ✅
- [x] Create folder structure
- [x] Initialize Git repo
- [x] Create requirements.txt
- [x] Create .env.example
- [x] Create README.md
- [x] Create config/rca_taxonomy.yaml
- [x] Create config/settings.yaml
- [x] Create Pydantic schemas

### CP2: Transcription Module ✅
- [x] Implement Transcriber interface
- [x] Implement AssemblyAITranscriber
- [x] Implement models (Transcript, SpeakerTurn)
- [x] Unit tests
- [x] Notebook 01_transcription_validation.ipynb

### CP3: RCA Schemas & Data Contracts ✅
- [x] Implement CallAnalysis
- [x] Implement RCALabel, EvidenceSpan
- [x] Implement Event
- [x] Separate observed vs inferred
- [x] Create data/examples/

### CP4: Feature & Event Extraction ✅
- [x] Implement event_detector.py
- [x] Implement turn_metrics.py
- [x] Unit tests

### CP5: Inference Engine ✅
- [x] Create single MAP prompt
- [x] Implement LLMClient with JSON strict
- [x] Implement BatchInference with resume
- [x] Tests for mandatory evidence
- [x] Notebook 02_inference_validation.ipynb

### CP6: Transcript Compression ✅
- [x] Implement CompressedTranscript
- [x] Validate >60% token reduction
- [x] Integrate into inference
- [x] Notebook 03_compression_validation.ipynb

### CP7: Aggregation & RCA Trees ✅
- [x] Implement statistics.py
- [x] Define severity_score with explicit rules
- [x] Implement RCATreeBuilder
- [x] Notebook 04_aggregation_validation.ipynb

### CP8: End-to-End Pipeline ✅
- [x] Implement CXInsightsPipeline
- [x] Implement per-stage manifests
- [x] Implement resume
- [x] Implement exports (JSON, Excel, PDF)
- [x] Main CLI
- [x] Notebook 05_full_pipeline_test.ipynb

### CP-GAPS: v2.0 Blueprint Alignment ✅ (2026-01-19)
- [x] Gap Analysis vs BeyondCX Blueprints (4 Word docs)
- [x] Update rca_taxonomy.yaml with new driver categories
  - [x] churn_risk drivers
  - [x] fcr_failure drivers
  - [x] agent_skills (positive + improvement_needed)
- [x] Update call_analysis.py models with new fields
  - [x] FCRStatus enum
  - [x] ChurnRisk enum
  - [x] AgentClassification enum
  - [x] DriverOrigin enum
  - [x] AgentSkillIndicator model
  - [x] Enhanced RCALabel with origin, corrective_action, replicable_practice
  - [x] Updated CallAnalysis with new fields
- [x] Create prompt v2.0 (config/prompts/call_analysis/v2.0/)
  - [x] system.txt
  - [x] user.txt
  - [x] schema.json
- [x] Update versions.yaml to active v2.0
- [x] Update prompt_manager.py with TaxonomyTexts
- [x] Update analyzer.py to parse new fields
- [x] Update aggregation models and statistics for v2.0
- [x] Update tests for v2.0 compatibility

### CP-DASH: Streamlit Dashboard ✅ (2026-01-19)
- [x] Create dashboard structure (app.py, config.py, data_loader.py, components.py)
- [x] Implement Beyond Brand Identity styling
  - [x] Colors: Black #000000, Blue #6D84E3, Grey #B1B1B0
  - [x] Light theme configuration (.streamlit/config.toml)
  - [x] Custom CSS with Outfit font
- [x] Implement 8 dashboard sections
  - [x] Overview (KPIs, outcomes, drivers, FCR, churn)
  - [x] Outcomes Analysis
  - [x] Poor CX Analysis
  - [x] FCR Analysis
  - [x] Churn Risk Analysis
  - [x] Agent Performance
  - [x] Call Explorer
  - [x] Export Insights
- [x] Advanced visualizations
  - [x] RCA Sankey Diagram (Driver → Outcome → Churn Risk)
  - [x] Correlation Heatmaps (co-occurrence, driver-outcome)
  - [x] Outcome Deep Dive (root causes, correlation, duration)
- [x] Export functionality
  - [x] Excel multi-sheet workbook
  - [x] HTML executive summary report
  - [x] JSON raw data export
- [x] Blueprint terminology compliance
  - [x] FCR: 4 categories (Primera Llamada/Rellamada × Sin/Con Riesgo)
  - [x] Churn: Sin Riesgo de Fuga / En Riesgo de Fuga
  - [x] Agent: Talento Para Replicar / Oportunidades de Mejora

---

## High Priority (Pending)

- [ ] **Run real benchmark with v2.0** - Run the pipeline on 50-100 real calls
- [ ] **Measure actual costs** - Document actual STT + LLM costs
- [ ] **Validate v2.0 RCA accuracy** - Manual review of 20 calls with the new fields
- [x] **Documentation** - Complete stubs in docs/ ✅
- [x] **Test v2.0 with real transcripts** - Validated with batch test-07 (30 calls) ✅
- [x] **Update exports for v2.0** - Dashboard includes all new fields ✅
- [x] **Streamlit Dashboard** - Implemented with Beyond branding ✅

---

## Medium Priority (CP9 - Optional)

- [ ] Caching by transcript hash
- [ ] Batch size benchmarks (find the optimum)
- [ ] Compare STT providers (Whisper, Google)
- [ ] Compare LLM providers (Claude vs GPT-4o)
- [ ] DuckDB for analytics on large batches

---

## Low Priority (Phase 2)

- [x] Streamlit Dashboard ✅ (completed 2026-01-19)
- [ ] Docker containerization
- [ ] CI/CD pipeline
- [ ] REST API (FastAPI)
- [ ] Multi-language support
- [ ] Real-time processing
- [ ] BeyondDiagnosticPrototipo integration
- [ ] Campaign tracking (Blueprint KPI 2)
- [ ] Customer value analysis (Blueprint Pillar 4)
- [ ] Sales cycle optimization analysis

---

## Backlog (Ideas)

- [ ] Automatic prompt tuning based on validation results
- [ ] A/B testing of prompts
- [ ] Confidence calibration
- [ ] Active learning loop
- [ ] Cost anomaly detection

---

**Last updated**: 2026-01-19 (v2.1 Dashboard + Blueprint Compliance completed)