BeyondCX_Insights/docs/MODULE_GUIDES.md
sujucu70 75e7b9da3d feat: Add Streamlit dashboard with Blueprint compliance (v2.1.0)
Dashboard Features:
- 8 navigation sections: Overview, Outcomes, Poor CX, FCR, Churn, Agent, Call Explorer, Export
- Beyond Brand Identity styling (colors #6D84E3, Outfit font)
- RCA Sankey diagram (Driver → Outcome → Churn Risk flow)
- Correlation heatmaps (driver co-occurrence, driver-outcome)
- Outcome Deep Dive (root causes, correlation, duration analysis)
- Export functionality (Excel, HTML, JSON)

Blueprint Compliance:
- FCR: 4 categories (Primera Llamada/Rellamada × Sin/Con Riesgo de Fuga)
- Churn: Binary view (Sin Riesgo de Fuga / En Riesgo de Fuga)
- Agent: Talento Para Replicar / Oportunidades de Mejora
- Fixed FCR rate calculation (only FIRST_CALL counts as success)

Technical:
- Streamlit + Plotly for interactive visualizations
- Light theme configuration (.streamlit/config.toml)
- Fixed Plotly colorbar titlefont deprecation

Documentation:
- Updated PROJECT_CONTEXT.md, TODO.md, CHANGELOG.md
- Added 4 new technical decisions (TD-014 to TD-017)
- Created TROUBLESHOOTING.md with 10 common issues

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 16:27:30 +01:00


# MODULE_GUIDES.md
> Implementation guide for each module
---
## Guide: Transcription Module
### Files involved
```
src/transcription/
├── __init__.py
├── base.py # Interface Transcriber + MockTranscriber
├── assemblyai.py # AssemblyAITranscriber
└── models.py # Transcript, SpeakerTurn, TranscriptMetadata
```
### How it works
1. An audio file is passed to the transcriber
2. AssemblyAI processes it with speaker diarization (agent/customer)
3. It returns a `Transcript` with `SpeakerTurn[]` and metadata
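
The flow above can be sketched as follows. This is a minimal illustration, not the project's code: the field names (`speaker`, `text`, `start`, `end`, `duration_sec`), the `transcribe()` signature, and the mock's contents are all assumptions. The real definitions live in `base.py` and `models.py`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpeakerTurn:
    speaker: str   # "agent" or "customer" (assumed labels)
    text: str
    start: float   # seconds from call start (assumed unit)
    end: float


@dataclass
class TranscriptMetadata:
    audio_path: str
    duration_sec: float


@dataclass
class Transcript:
    turns: list = field(default_factory=list)
    metadata: Optional[TranscriptMetadata] = None


class Transcriber(ABC):
    """Hypothetical provider interface: audio path in, Transcript out."""

    @abstractmethod
    def transcribe(self, audio_path: str) -> Transcript: ...


class MockTranscriber(Transcriber):
    """Returns a fixed two-turn transcript without calling any API."""

    def transcribe(self, audio_path: str) -> Transcript:
        turns = [
            SpeakerTurn("agent", "Buenos días, ¿en qué puedo ayudarle?", 0.0, 2.5),
            SpeakerTurn("customer", "Quiero cancelar mi contrato.", 2.8, 5.0),
        ]
        return Transcript(turns, TranscriptMetadata(audio_path, duration_sec=5.0))


transcript = MockTranscriber().transcribe("call_001.mp3")
```

Under this shape, a new provider would only need to subclass `Transcriber` and return the same `Transcript` structure.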
### How to test
```bash
pytest tests/unit/test_transcription.py -v
```
### How to extend
- For a new provider: implement the `Transcriber` interface
- To modify the output: edit `models.py`
### Troubleshooting
- "API key invalid" → Check `ASSEMBLYAI_API_KEY` in `.env`
- "Audio format not supported" → Convert the audio to MP3 or WAV
---
## Guide: Feature Extraction Module
### Files involved
```
src/features/
├── __init__.py
├── event_detector.py # HOLD, TRANSFER, SILENCE detection
└── turn_metrics.py # Talk ratio, interruptions
```
### How it works
1. A transcript comes in
2. Regexes and rules detect events (HOLD, TRANSFER, etc.)
3. Metrics are computed (talk ratio, speaking time)
4. The transcript is enriched with `detected_events[]`
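
As a rough sketch of steps 2 and 3, the snippet below runs regex rules over each turn and computes an agent talk ratio. The trigger phrases, the dict-based turn layout, and the helper names `detect_events` and `talk_ratio` are hypothetical; the actual rules live in `event_detector.py` and `turn_metrics.py`.

```python
import re

# Hypothetical Spanish trigger phrases, for illustration only.
EVENT_PATTERNS = {
    "HOLD_START": re.compile(r"(un momento|le pongo en espera)", re.IGNORECASE),
    "TRANSFER": re.compile(r"le (paso|transfiero) (con|al)", re.IGNORECASE),
}


def detect_events(turns):
    """Return one event dict per pattern match, tagged with the turn index."""
    events = []
    for index, turn in enumerate(turns):
        for event_type, pattern in EVENT_PATTERNS.items():
            if pattern.search(turn["text"]):
                events.append({"type": event_type, "turn_index": index})
    return events


def talk_ratio(turns, speaker="agent"):
    """Share of total speaking time attributed to `speaker` (0.0 if no speech)."""
    total = sum(t["end"] - t["start"] for t in turns)
    spoken = sum(t["end"] - t["start"] for t in turns if t["speaker"] == speaker)
    return spoken / total if total else 0.0


turns = [
    {"speaker": "agent", "text": "Un momento, por favor.", "start": 0.0, "end": 2.0},
    {"speaker": "customer", "text": "Vale, gracias.", "start": 2.0, "end": 3.0},
    {"speaker": "agent", "text": "Le transfiero con facturación.", "start": 3.0, "end": 6.0},
]
detected_events = detect_events(turns)
agent_ratio = talk_ratio(turns)
```

Here the agent speaks for 5 of the 6 seconds, and the two agent turns trigger one `HOLD_START` and one `TRANSFER` event.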
### Supported events
- `HOLD_START` / `HOLD_END`
- `TRANSFER`
- `ESCALATION`
- `SILENCE` (> threshold)
- `INTERRUPTION`
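
The `SILENCE` rule could look roughly like this, assuming silence is measured as the gap between the end of one turn and the start of the next. The 3-second default and the event fields are placeholders; the guide does not state the real threshold.

```python
def find_silences(turns, threshold_sec=3.0):
    """Return a SILENCE event for each gap between consecutive turns above the threshold."""
    events = []
    for previous, current in zip(turns, turns[1:]):
        gap = current["start"] - previous["end"]
        if gap > threshold_sec:
            events.append({"type": "SILENCE", "start": previous["end"], "duration": gap})
    return events


turns = [
    {"start": 0.0, "end": 4.0},
    {"start": 4.5, "end": 9.0},    # 0.5 s gap: below the threshold
    {"start": 14.0, "end": 16.0},  # 5.0 s gap: reported as SILENCE
]
silences = find_silences(turns)
```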
### How to test
```bash
pytest tests/unit/test_features.py -v
```
---
## Guide: Compression Module
### Files involved
```
src/compression/
├── __init__.py
├── compressor.py # TranscriptCompressor
└── models.py # CompressedTranscript, CustomerIntent, etc.
```
### How it works
1. The full transcript comes in
2. Spanish-language regexes extract:
   - Customer intents (cancel, inquire)
   - Agent offers (discount, upgrade)
   - Objections (price, competition)
   - Resolutions
3. It generates a `CompressedTranscript` with a >60% reduction
### Spanish patterns
```python
INTENT_PATTERNS = {
    IntentType.CANCEL: [r"quiero\s+cancelar", r"dar\s+de\s+baja"],
    IntentType.INQUIRY: [r"quería\s+saber", r"información\s+sobre"],
}
```
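
To show how such a pattern table might be applied, here is a self-contained sketch. The `IntentType` enum values and the `extract_intents` helper are assumptions made for this example; only the patterns themselves come from the guide.

```python
import re
from enum import Enum


class IntentType(Enum):
    # Member values are placeholders; the real enum lives in models.py.
    CANCEL = "cancel"
    INQUIRY = "inquiry"


INTENT_PATTERNS = {
    IntentType.CANCEL: [r"quiero\s+cancelar", r"dar\s+de\s+baja"],
    IntentType.INQUIRY: [r"quería\s+saber", r"información\s+sobre"],
}


def extract_intents(text):
    """Return the intent types with at least one matching pattern, in table order."""
    found = []
    for intent, patterns in INTENT_PATTERNS.items():
        if any(re.search(p, text, re.IGNORECASE) for p in patterns):
            found.append(intent)
    return found


cancel_intents = extract_intents("Hola, quiero  cancelar la línea.")
inquiry_intents = extract_intents("Quería saber el precio del upgrade.")
```

Because the patterns use `\s+`, extra whitespace between words (common in transcribed text) still matches.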
### How to test
```bash
pytest tests/unit/test_compression.py -v
```
---
## Guide: Inference Module
### Files involved
```
src/inference/
├── __init__.py
├── analyzer.py # CallAnalyzer (main class)
├── llm_client.py # OpenAIClient
└── prompts.py # Spanish MAP prompt
```
### How it works
1. A `CompressedTranscript` comes in
2. The prompt is built from the taxonomy plus the transcript
3. The LLM generates JSON with:
   - `outcome`
   - `lost_sales_drivers[]` with evidence
   - `poor_cx_drivers[]` with evidence
4. The response is parsed into a `CallAnalysis`
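
Step 4 might be sketched as below. The `Driver` fields (`code`, `evidence`), the sample outcome `"NO_SALE"`, and the driver code `"PRICE"` are invented for illustration; the real schema is defined by the prompt in `prompts.py` and the project's models.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Driver:
    code: str       # hypothetical field name
    evidence: str   # quote from the transcript supporting the driver


@dataclass
class CallAnalysis:
    outcome: str
    lost_sales_drivers: list = field(default_factory=list)
    poor_cx_drivers: list = field(default_factory=list)


def parse_response(raw: str) -> CallAnalysis:
    """Parse the LLM's JSON text; missing driver lists default to empty."""
    data = json.loads(raw)
    return CallAnalysis(
        outcome=data["outcome"],
        lost_sales_drivers=[Driver(**d) for d in data.get("lost_sales_drivers", [])],
        poor_cx_drivers=[Driver(**d) for d in data.get("poor_cx_drivers", [])],
    )


raw = '''{
  "outcome": "NO_SALE",
  "lost_sales_drivers": [{"code": "PRICE", "evidence": "es demasiado caro"}],
  "poor_cx_drivers": []
}'''
analysis = parse_response(raw)
```

A production parser would also need to handle malformed JSON and unexpected keys, which this sketch leaves out.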
### Configuration
```python
AnalyzerConfig(
    model="gpt-4o-mini",
    use_compression=True,
    max_concurrent=5,
)
```
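
The guide does not say how `max_concurrent` is enforced. One common way to cap in-flight LLM calls is an `asyncio.Semaphore`, sketched here with a stand-in coroutine instead of a real API call. The names `analyze_all` and `fake_analyze` are hypothetical.

```python
import asyncio


async def analyze_all(transcripts, analyze_one, max_concurrent=5):
    """Run analyze_one over all transcripts, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(transcript):
        async with semaphore:
            return await analyze_one(transcript)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(t) for t in transcripts))


async def fake_analyze(transcript):
    # Stand-in for the real LLM request.
    await asyncio.sleep(0)
    return transcript.upper()


results = asyncio.run(analyze_all(["a", "b", "c"], fake_analyze, max_concurrent=2))
```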
### How to test
```bash
pytest tests/unit/test_inference.py -v
```
---
## Guide: Aggregation Module
### Files involved
```
src/aggregation/
├── __init__.py
├── statistics.py # StatisticsCalculator
├── severity.py # SeverityCalculator
├── rca_tree.py # RCATreeBuilder
└── models.py # DriverFrequency, RCATree, etc.
```
### How it works
1. A `List[CallAnalysis]` comes in
2. Statistics: frequencies per driver
3. Severity: weighted score
4. RCA Tree: sorted hierarchical tree
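
The statistics step can be illustrated with a small frequency count. This sketch assumes each analysis is a dict with a `poor_cx_drivers` list of driver codes and that a driver is counted once per call; both are assumptions, as are the driver names.

```python
from collections import Counter


def driver_frequencies(analyses):
    """Return (driver, call_count, share_of_calls) tuples, most frequent first."""
    counts = Counter()
    for analysis in analyses:
        # set() so a driver repeated within one call is counted once
        counts.update(set(analysis["poor_cx_drivers"]))
    total = len(analyses)
    return [(driver, n, n / total) for driver, n in counts.most_common()]


analyses = [
    {"poor_cx_drivers": ["LONG_HOLD", "TRANSFER"]},
    {"poor_cx_drivers": ["LONG_HOLD"]},
    {"poor_cx_drivers": []},
    {"poor_cx_drivers": ["LONG_HOLD", "LONG_HOLD"]},
]
frequencies = driver_frequencies(analyses)
```

With these four calls, `LONG_HOLD` appears in three (share 0.75) and `TRANSFER` in one (share 0.25).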
### Severity formula
```python
severity = (
    base_severity * 0.4 +
    frequency_factor * 0.3 +
    confidence_factor * 0.2 +
    co_occurrence_factor * 0.1
) * 100
```
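
A worked instance of the formula, wrapped in a hypothetical `severity_score` function. It assumes every factor is normalized to the range 0 to 1; since the weights sum to 1.0, the result then falls between 0 and 100.

```python
def severity_score(base_severity, frequency_factor, confidence_factor, co_occurrence_factor):
    """Weighted severity on a 0-100 scale; every input is assumed to lie in [0, 1]."""
    return (
        base_severity * 0.4
        + frequency_factor * 0.3
        + confidence_factor * 0.2
        + co_occurrence_factor * 0.1
    ) * 100


# 0.8*0.4 + 0.5*0.3 + 0.9*0.2 + 0.2*0.1 = 0.32 + 0.15 + 0.18 + 0.02 = 0.67
score = severity_score(0.8, 0.5, 0.9, 0.2)
```

The example inputs give a score of about 67, with base severity contributing almost half of it.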
### How to test
```bash
pytest tests/unit/test_aggregation.py -v
```
---
## Guide: Pipeline Module
### Files involved
```
src/pipeline/
├── __init__.py
├── models.py # PipelineManifest, StageManifest, Config
└── pipeline.py # CXInsightsPipeline
```
### Stages
1. TRANSCRIPTION
2. FEATURE_EXTRACTION
3. COMPRESSION
4. INFERENCE
5. AGGREGATION
6. EXPORT
### Resume
- A JSON manifest is saved per batch
- `get_resume_stage()` detects where to resume
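
Resume detection could work along these lines. The manifest layout (`stages` keyed by stage name, each with a `status`), the status strings, and the helper name `find_resume_stage` are assumptions for this sketch; the real `get_resume_stage()` and `PipelineManifest` may differ.

```python
import json
from enum import Enum


class Stage(Enum):
    TRANSCRIPTION = 1
    FEATURE_EXTRACTION = 2
    COMPRESSION = 3
    INFERENCE = 4
    AGGREGATION = 5
    EXPORT = 6


def find_resume_stage(manifest):
    """Return the first stage not marked completed, or None if all are done."""
    for stage in Stage:  # Enum iteration follows definition order
        if manifest["stages"].get(stage.name, {}).get("status") != "completed":
            return stage
    return None


manifest = json.loads('''{
  "batch_id": "batch_001",
  "stages": {
    "TRANSCRIPTION": {"status": "completed"},
    "FEATURE_EXTRACTION": {"status": "completed"},
    "COMPRESSION": {"status": "failed"}
  }
}''')
resume_stage = find_resume_stage(manifest)
```

In this example the pipeline would restart at `COMPRESSION`, skipping the two stages already completed.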
### How to test
```bash
pytest tests/unit/test_pipeline.py -v
```
---
## Guide: Exports Module
### Files involved
```
src/exports/
├── __init__.py
├── json_export.py # Summary + analyses
├── excel_export.py # Multi-sheet workbook
└── pdf_export.py # HTML executive report
```
### Formats
- **JSON**: `summary.json` + `analyses/*.json`
- **Excel**: 5 sheets (Summary, Lost Sales, Poor CX, Details, Patterns)
- **PDF/HTML**: Executive report with metrics
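
The JSON layout above can be produced with a few lines of standard-library code. This is a sketch only: the `export_json` name, the `call_id` key used for file names, and the sample payloads are assumptions, not the contents of `json_export.py`.

```python
import json
import tempfile
from pathlib import Path


def export_json(output_dir, summary, analyses):
    """Write summary.json plus one analyses/<call_id>.json file per call."""
    output_dir = Path(output_dir)
    analyses_dir = output_dir / "analyses"
    analyses_dir.mkdir(parents=True, exist_ok=True)
    (output_dir / "summary.json").write_text(
        json.dumps(summary, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    for analysis in analyses:
        path = analyses_dir / f"{analysis['call_id']}.json"
        path.write_text(json.dumps(analysis, ensure_ascii=False, indent=2), encoding="utf-8")
    return output_dir


with tempfile.TemporaryDirectory() as tmp:
    out = export_json(tmp, {"total_calls": 2}, [{"call_id": "c1"}, {"call_id": "c2"}])
    written = sorted(p.name for p in (out / "analyses").iterdir())
    summary_exists = (out / "summary.json").exists()
```

`ensure_ascii=False` keeps Spanish characters such as `ñ` readable in the exported files.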
### How to test
```bash
pytest tests/unit/test_pipeline.py::TestExports -v
```
---
**Last updated**: 2026-01-19