sujucu70 7ddb8a2ee5 feat: Add Render.com deployment support with production data

Render Configuration:
- render.yaml for declarative deployment
- requirements-dashboard.txt (lightweight deps for cloud)
- Updated .streamlit/config.toml for production
- Updated app.py to auto-detect production vs local data

Production Data:
- Added data/production/test-07/ with 30 real call analyses
- Updated .gitignore to allow data/production/

Documentation:
- Added Render.com section to DEPLOYMENT.md with step-by-step guide

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-19 16:45:57 +01:00

README.md

CXInsights

Automated pipeline for analyzing Spanish-language contact center conversations. It identifies the root causes of lost sales and poor customer experience (CX) by analyzing call transcripts.

Value Proposition

CXInsights identifies, automatically and based on evidence:

Why sales opportunities are lost during calls
Why customers receive a poor experience
Which causes are the most frequent and the highest priority

It answers key questions:

At what point in the flow is the sale lost?
Which behaviors or processes cause frustration?
What are the root causes of poor CX or potential churn?

Installation

Prerequisites

Python 3.11+
ffmpeg (optional, for audio validation)
AssemblyAI and OpenAI accounts

Setup

# 1. Clone the repository
git clone https://github.com/tu-org/cxinsights.git
cd cxinsights

# 2. Create a virtual environment
python -m venv .venv

# Windows
.venv\Scripts\activate

# Linux/Mac
source .venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. (Optional) Install PII support
pip install -r requirements-pii.txt
python -m spacy download es_core_news_md

# 5. (Optional) Install development dependencies
pip install -r requirements-dev.txt

Configuration

1. Environment variables

# Copy the template
cp .env.example .env

# Edit it with your API keys
# Windows: notepad .env
# Linux/Mac: nano .env

Required variables:

Variable	Description
`ASSEMBLYAI_API_KEY`	AssemblyAI API key for transcription
`OPENAI_API_KEY`	OpenAI API key for LLM analysis

2. Throttling configuration

Adjust to match your API tier:

# .env
MAX_CONCURRENT_TRANSCRIPTIONS=30  # AssemblyAI
LLM_REQUESTS_PER_MINUTE=200       # OpenAI (Tier 1: 200, Tier 2: 2000)
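The two settings above describe different kinds of limits: a cap on in-flight transcriptions and a cap on LLM request rate. The following is a minimal sketch of how such limits could be enforced with `asyncio`. The `RateLimiter` class and `run_bounded` helper are hypothetical names, not the project's actual implementation.

```python
import asyncio
import os
import time

# Read the limits from the environment, falling back to the values shown above.
MAX_CONCURRENT = int(os.getenv("MAX_CONCURRENT_TRANSCRIPTIONS", "30"))
LLM_RPM = int(os.getenv("LLM_REQUESTS_PER_MINUTE", "200"))


class RateLimiter:
    """Hypothetical limiter: spaces calls evenly, at most `per_minute` per minute."""

    def __init__(self, per_minute: int) -> None:
        self._interval = 60.0 / per_minute
        self._next_slot = 0.0
        self._lock = asyncio.Lock()

    async def wait(self) -> None:
        # Reserve the next free time slot under the lock, then sleep outside it.
        async with self._lock:
            now = time.monotonic()
            slot = max(now, self._next_slot)
            self._next_slot = slot + self._interval
        await asyncio.sleep(slot - now)


async def run_bounded(jobs, limit: int):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)

    async def guarded(job):
        async with sem:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))
```

In this sketch, transcription jobs would go through `run_bounded(jobs, MAX_CONCURRENT)`, while each LLM call would first `await` a shared `RateLimiter(LLM_RPM).wait()`.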

Execution Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                         CXInsights PIPELINE                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  1. VALIDATION        User uploads audio → Validation + cost estimate      │
│         ↓                                                                   │
│  2. TRANSCRIPTION     Audio → Transcript (AssemblyAI)                      │
│         ↓                                                                   │
│  3. FEATURES          Transcript → Events + Metrics (deterministic)        │
│         ↓                                                                   │
│  4. COMPRESSION       Transcript → CompressedTranscript (reduction >60%)   │
│         ↓                                                                   │
│  5. INFERENCE         CompressedTranscript → Labels (LLM)                  │
│         ↓                                                                   │
│  6. VALIDATION        Labels → Quality Gate (evidence required)            │
│         ↓                                                                   │
│  7. AGGREGATION       Labels → RCA Trees (statistical)                     │
│         ↓                                                                   │
│  8. OUTPUTS           RCA Trees → PDF + Excel + JSON                       │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
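Stage 6 in the diagram only lets labels through when they carry evidence. Below is a minimal sketch of such a gate. It assumes a hypothetical `Label` record (the real schema of `call_labels.json` is not shown in this README) and assumes evidence is a list of transcript quotes.

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    """Hypothetical label shape, for illustration only."""
    call_id: str
    cause: str
    confidence: float
    evidence: list[str] = field(default_factory=list)


def quality_gate(labels: list[Label]) -> tuple[list[Label], list[Label]]:
    """Split labels into (accepted, rejected) based on non-empty evidence."""
    accepted: list[Label] = []
    rejected: list[Label] = []
    for label in labels:
        # A label passes only if at least one evidence quote is non-blank.
        has_evidence = any(quote.strip() for quote in label.evidence)
        (accepted if has_evidence else rejected).append(label)
    return accepted, rejected
```

Returning the rejected labels alongside the accepted ones keeps them available for auditing instead of silently dropping them.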

Usage

Cost estimation

python -m cxinsights.pipeline.cli estimate --input ./data/raw/audio/mi_batch

Run the full pipeline

python -m cxinsights.pipeline.cli run \
    --input ./data/raw/audio/mi_batch \
    --batch-id mi_batch

Run by stage

# Transcription only
python -m cxinsights.pipeline.cli run --batch-id mi_batch --stages transcription

# Inference only (requires existing transcripts)
python -m cxinsights.pipeline.cli run --batch-id mi_batch --stages inference

Resume from checkpoint

python -m cxinsights.pipeline.cli resume --batch-id mi_batch
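The README does not describe how checkpoints are stored. As a rough sketch, assuming a hypothetical JSON file per batch that lists completed stages, resuming comes down to skipping whatever is already recorded. Only `transcription` and `inference` appear as `--stages` values here, so the other stage names are guesses.

```python
import json
from pathlib import Path

# Assumed stage identifiers, in the order of the execution-flow diagram.
STAGES = [
    "validation", "transcription", "features", "compression",
    "inference", "quality_gate", "aggregation", "outputs",
]


def load_completed(checkpoint: Path) -> set[str]:
    """Return the stages recorded as finished, or an empty set if no file exists."""
    if not checkpoint.exists():
        return set()
    data = json.loads(checkpoint.read_text(encoding="utf-8"))
    return set(data["completed"])


def mark_completed(checkpoint: Path, stage: str) -> None:
    """Record a finished stage, keeping pipeline order in the file."""
    done = load_completed(checkpoint) | {stage}
    payload = {"completed": [s for s in STAGES if s in done]}
    checkpoint.write_text(json.dumps(payload), encoding="utf-8")


def remaining_stages(checkpoint: Path) -> list[str]:
    """Stages still to run, in pipeline order."""
    done = load_completed(checkpoint)
    return [s for s in STAGES if s not in done]
```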

Expected Inputs

Audio format

MP3, WAV, M4A
Typical duration: 6-8 minutes (AHT, average handle time)

Naming convention

{call_id}_{YYYYMMDD}_{queue}.mp3

Example: CALL001_20240115_ventas-movil.mp3
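A file name that follows this convention can be split back into its parts with a regular expression. This is a sketch, not the project's parser: `parse_call_filename` and `CallFile` are illustrative names, and it assumes that neither `call_id` nor `queue` contains an underscore.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# Assumption: no underscores inside call_id or queue, and only the three
# audio formats listed above are accepted.
FILENAME_RE = re.compile(
    r"^(?P<call_id>[^_]+)_(?P<day>\d{8})_(?P<queue>[^_.]+)\.(?:mp3|wav|m4a)$",
    re.IGNORECASE,
)


class CallFile(NamedTuple):
    call_id: str
    call_date: date
    queue: str


def parse_call_filename(name: str) -> Optional[CallFile]:
    """Return the parsed parts, or None if the name does not follow the convention."""
    match = FILENAME_RE.match(name)
    if match is None:
        return None
    try:
        call_date = datetime.strptime(match["day"], "%Y%m%d").date()
    except ValueError:  # eight digits that are not a real calendar date
        return None
    return CallFile(match["call_id"], call_date, match["queue"])
```

Returning `None` instead of raising lets a validation step collect all badly named files in one pass before reporting them.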

Optional metadata (CSV)

call_id,date,queue,duration
CALL001,2024-01-15,ventas-movil,420
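A CSV in this shape can be read with the standard library. The sketch below uses illustrative names (`CallMetadata`, `load_metadata`) and assumes the `duration` column is in seconds, which matches the 6-8 minute typical duration but is not stated explicitly.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date
from typing import TextIO


@dataclass(frozen=True)
class CallMetadata:
    call_id: str
    call_date: date
    queue: str
    duration_s: int  # assumption: the duration column is in seconds


def load_metadata(stream: TextIO) -> dict[str, CallMetadata]:
    """Index metadata rows by call_id."""
    rows = csv.DictReader(stream)
    return {
        row["call_id"]: CallMetadata(
            call_id=row["call_id"],
            call_date=date.fromisoformat(row["date"]),
            queue=row["queue"],
            duration_s=int(row["duration"]),
        )
        for row in rows
    }


# Usage with the sample row shown above.
sample = io.StringIO(
    "call_id,date,queue,duration\n"
    "CALL001,2024-01-15,ventas-movil,420\n"
)
metadata = load_metadata(sample)
```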

Outputs

File	Description
`transcripts.json`	Transcripts with speaker diarization
`call_labels.json`	Per-call RCA labels with evidence
`rca_trees.json`	Root cause trees
`executive_summary.pdf`	Executive report (2-3 pages)
`raw_analytics.xlsx`	Full dataset for exploration

Project Structure

cxinsights/
├── src/
│   ├── transcription/    # STT (AssemblyAI)
│   ├── features/         # Deterministic extraction
│   ├── inference/        # LLM analysis
│   ├── validation/       # Quality gate
│   ├── aggregation/      # RCA trees
│   ├── visualization/    # Exports
│   └── pipeline/         # Orchestration
├── config/
│   ├── rca_taxonomy.yaml # Frozen taxonomy
│   └── settings.yaml     # Configuration
├── data/                 # Data (gitignored)
├── tests/                # Tests
└── notebooks/            # Validation

Documentation

PRODUCT_SPEC.md - Product specification
docs/ARCHITECTURE.md - Pipeline architecture
docs/TECH_STACK.md - Technology stack
docs/PROJECT_STRUCTURE.md - Detailed project structure
docs/DEPLOYMENT.md - Deployment guide

Quality KPIs

KPI	Target
Usable transcripts	90%
Mean RCA confidence	≥ 0.70
Processing time (5K calls)	< 24h
Cost per call	< €0.50

License

Proprietary - BeyondCX.ai
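These targets can be checked mechanically once a batch has finished. The sketch below is illustrative only: `BatchStats` and its field names are hypothetical, and the 90% target is read as a minimum (at least 90%).

```python
from dataclasses import dataclass


@dataclass
class BatchStats:
    """Hypothetical per-batch summary; field names are illustrative."""
    usable_transcript_rate: float  # fraction in [0, 1]
    mean_rca_confidence: float     # in [0, 1]
    hours_per_5k_calls: float
    cost_per_call_eur: float


def kpi_report(stats: BatchStats) -> dict[str, bool]:
    """Map each KPI from the table to True if its target is met."""
    return {
        "usable_transcripts": stats.usable_transcript_rate >= 0.90,
        "mean_rca_confidence": stats.mean_rca_confidence >= 0.70,
        "processing_time": stats.hours_per_5k_calls < 24,
        "cost_per_call": stats.cost_per_call_eur < 0.50,
    }
```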