BeyondCX_Insights/docs/TODO.md
sujucu70 75e7b9da3d feat: Add Streamlit dashboard with Blueprint compliance (v2.1.0)
Dashboard Features:
- 8 navigation sections: Overview, Outcomes, Poor CX, FCR, Churn, Agent, Call Explorer, Export
- Beyond Brand Identity styling (colors #6D84E3, Outfit font)
- RCA Sankey diagram (Driver → Outcome → Churn Risk flow)
- Correlation heatmaps (driver co-occurrence, driver-outcome)
- Outcome Deep Dive (root causes, correlation, duration analysis)
- Export functionality (Excel, HTML, JSON)

Blueprint Compliance:
- FCR: 4 categories (Primera Llamada/Rellamada × Sin/Con Riesgo de Fuga)
- Churn: Binary view (Sin Riesgo de Fuga / En Riesgo de Fuga)
- Agent: Talento Para Replicar / Oportunidades de Mejora
- Fixed FCR rate calculation (only FIRST_CALL counts as success)

Technical:
- Streamlit + Plotly for interactive visualizations
- Light theme configuration (.streamlit/config.toml)
- Fixed Plotly colorbar titlefont deprecation

Documentation:
- Updated PROJECT_CONTEXT.md, TODO.md, CHANGELOG.md
- Added 4 new technical decisions (TD-014 to TD-017)
- Created TROUBLESHOOTING.md with 10 common issues

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 16:27:30 +01:00

# TODO.md
> Prioritized list of pending tasks
---
## Completed Checkpoints
### CP1: Project Setup & Contracts ✅
- [x] Create folder structure
- [x] Initialize Git repo
- [x] Create requirements.txt
- [x] Create .env.example
- [x] Create README.md
- [x] Create config/rca_taxonomy.yaml
- [x] Create config/settings.yaml
- [x] Create Pydantic schemas
### CP2: Transcription Module ✅
- [x] Implement Transcriber interface
- [x] Implement AssemblyAITranscriber
- [x] Implement models (Transcript, SpeakerTurn)
- [x] Unit tests
- [x] Notebook 01_transcription_validation.ipynb
### CP3: RCA Schemas & Data Contracts ✅
- [x] Implement CallAnalysis
- [x] Implement RCALabel, EvidenceSpan
- [x] Implement Event
- [x] Separate observed vs inferred
- [x] Create data/examples/
### CP4: Feature & Event Extraction ✅
- [x] Implement event_detector.py
- [x] Implement turn_metrics.py
- [x] Unit tests
### CP5: Inference Engine ✅
- [x] Create a single MAP prompt
- [x] Implement LLMClient with strict JSON
- [x] Implement BatchInference with resume
- [x] Tests for mandatory evidence
- [x] Notebook 02_inference_validation.ipynb
### CP6: Transcript Compression ✅
- [x] Implement CompressedTranscript
- [x] Validate >60% token reduction
- [x] Integrate into inference
- [x] Notebook 03_compression_validation.ipynb
### CP7: Aggregation & RCA Trees ✅
- [x] Implement statistics.py
- [x] Define severity_score with explicit rules
- [x] Implement RCATreeBuilder
- [x] Notebook 04_aggregation_validation.ipynb
### CP8: End-to-End Pipeline ✅
- [x] Implement CXInsightsPipeline
- [x] Implement per-stage manifests
- [x] Implement resume
- [x] Implement exports (JSON, Excel, PDF)
- [x] Main CLI
- [x] Notebook 05_full_pipeline_test.ipynb
### CP-GAPS: v2.0 Blueprint Alignment ✅ (2026-01-19)
- [x] Gap Analysis vs BeyondCX Blueprints (4 Word documents)
- [x] Update rca_taxonomy.yaml with new driver categories
  - [x] churn_risk drivers
  - [x] fcr_failure drivers
  - [x] agent_skills (positive + improvement_needed)
- [x] Update call_analysis.py models with new fields
  - [x] FCRStatus enum
  - [x] ChurnRisk enum
  - [x] AgentClassification enum
  - [x] DriverOrigin enum
  - [x] AgentSkillIndicator model
  - [x] Enhanced RCALabel with origin, corrective_action, replicable_practice
  - [x] Updated CallAnalysis with new fields
- [x] Create prompt v2.0 (config/prompts/call_analysis/v2.0/)
  - [x] system.txt
  - [x] user.txt
  - [x] schema.json
- [x] Update versions.yaml to active v2.0
- [x] Update prompt_manager.py with TaxonomyTexts
- [x] Update analyzer.py to parse new fields
- [x] Update aggregation models and statistics for v2.0
- [x] Update tests for v2.0 compatibility
### CP-DASH: Streamlit Dashboard ✅ (2026-01-19)
- [x] Create dashboard structure (app.py, config.py, data_loader.py, components.py)
- [x] Implement Beyond Brand Identity styling
  - [x] Colors: Black #000000, Blue #6D84E3, Grey #B1B1B0
  - [x] Light theme configuration (.streamlit/config.toml)
  - [x] Custom CSS with Outfit font
- [x] Implement 8 dashboard sections
  - [x] Overview (KPIs, outcomes, drivers, FCR, churn)
  - [x] Outcomes Analysis
  - [x] Poor CX Analysis
  - [x] FCR Analysis
  - [x] Churn Risk Analysis
  - [x] Agent Performance
  - [x] Call Explorer
  - [x] Export Insights
- [x] Advanced visualizations
  - [x] RCA Sankey Diagram (Driver → Outcome → Churn Risk)
  - [x] Correlation Heatmaps (co-occurrence, driver-outcome)
  - [x] Outcome Deep Dive (root causes, correlation, duration)
- [x] Export functionality
  - [x] Excel multi-sheet workbook
  - [x] HTML executive summary report
  - [x] JSON raw data export
- [x] Blueprint terminology compliance
  - [x] FCR: 4 categories (Primera Llamada/Rellamada × Sin/Con Riesgo)
  - [x] Churn: Sin Riesgo de Fuga / En Riesgo de Fuga
  - [x] Agent: Talento Para Replicar / Oportunidades de Mejora
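
For reference, the four FCR categories and the rule that only FIRST_CALL counts as success could be expressed roughly as below. This is a minimal sketch, not the project's code: the enum here is a hypothetical stand-in for the real `FCRStatus`, `REPEAT_CALL` is an assumed member name, and `fcr_category` / `fcr_rate` are illustrative helpers.

```python
from enum import Enum


class FCRStatus(Enum):
    # Hypothetical stand-in for the project's FCRStatus enum.
    # Only FIRST_CALL is named in this document; REPEAT_CALL is an assumption.
    FIRST_CALL = "first_call"
    REPEAT_CALL = "repeat_call"


def fcr_category(status: FCRStatus, at_risk: bool) -> str:
    """Map a call to one of the 4 Blueprint FCR categories."""
    call = "Primera Llamada" if status is FCRStatus.FIRST_CALL else "Rellamada"
    risk = "Con Riesgo de Fuga" if at_risk else "Sin Riesgo de Fuga"
    return f"{call} / {risk}"


def fcr_rate(statuses: list[FCRStatus]) -> float:
    """Share of calls resolved on first contact; only FIRST_CALL is a success."""
    if not statuses:
        return 0.0
    successes = sum(1 for s in statuses if s is FCRStatus.FIRST_CALL)
    return successes / len(statuses)
```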
---
## High priority (Pending)
- [ ] **Run real benchmark with v2.0** - Run the pipeline on 50-100 real calls
- [ ] **Measure actual costs** - Document actual STT + LLM costs
- [ ] **Validate v2.0 RCA accuracy** - Manual review of 20 calls with the new fields
- [x] **Documentation** - Complete stubs in docs/ ✅
- [x] **Test v2.0 with real transcripts** - Validated with batch test-07 (30 calls) ✅
- [x] **Update exports for v2.0** - Dashboard includes all new fields ✅
- [x] **Dashboard Streamlit** - Implemented with Beyond branding ✅
---
## Medium priority (CP9 - Optional)
- [ ] Caching by transcript hash
- [ ] Batch size benchmarks (find the optimum)
- [ ] Compare STT providers (Whisper, Google)
- [ ] Compare LLM providers (Claude vs GPT-4o)
- [ ] DuckDB for analytics on large batches
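
One possible shape for the transcript-hash caching item above, as a sketch only: results are stored as JSON files keyed by the SHA-256 of the transcript text, so an unchanged transcript is never re-analyzed. The names `transcript_key`, `cached_analysis`, and the `analyze` callback are hypothetical and not part of the existing pipeline.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def transcript_key(text: str) -> str:
    """Stable cache key: SHA-256 hex digest of the UTF-8 transcript text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cached_analysis(
    text: str,
    cache_dir: Path,
    analyze: Callable[[str], dict],  # hypothetical: wraps the LLM call
) -> dict:
    """Return the cached result for this transcript, computing it on a miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{transcript_key(text)}.json"
    if path.exists():
        # Cache hit: skip the expensive analysis entirely.
        return json.loads(path.read_text(encoding="utf-8"))
    result = analyze(text)
    path.write_text(json.dumps(result, ensure_ascii=False), encoding="utf-8")
    return result
```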
---
## Low priority (Phase 2)
- [x] Streamlit dashboard ✅ (completed 2026-01-19)
- [ ] Docker containerization
- [ ] CI/CD pipeline
- [ ] API REST (FastAPI)
- [ ] Multi-language support
- [ ] Real-time processing
- [ ] BeyondDiagnosticPrototipo integration
- [ ] Campaign tracking (Blueprint KPI 2)
- [ ] Customer value analysis (Blueprint Pillar 4)
- [ ] Sales cycle optimization analysis
---
## Backlog (Ideas)
- [ ] Automatic prompt tuning based on validation results
- [ ] A/B testing of prompts
- [ ] Confidence calibration
- [ ] Active learning loop
- [ ] Cost anomaly detection
---
**Last updated**: 2026-01-19 (v2.1 Dashboard + Blueprint Compliance completed)