feat: Add Streamlit dashboard with Blueprint compliance (v2.1.0)

Dashboard Features:
- 8 navigation sections: Overview, Outcomes, Poor CX, FCR, Churn, Agent, Call Explorer, Export
- Beyond Brand Identity styling (colors #6D84E3, Outfit font)
- RCA Sankey diagram (Driver → Outcome → Churn Risk flow)
- Correlation heatmaps (driver co-occurrence, driver-outcome)
- Outcome Deep Dive (root causes, correlation, duration analysis)
- Export functionality (Excel, HTML, JSON)

Blueprint Compliance:
- FCR: 4 categories (Primera Llamada/Rellamada × Sin/Con Riesgo de Fuga)
- Churn: Binary view (Sin Riesgo de Fuga / En Riesgo de Fuga)
- Agent: Talento Para Replicar / Oportunidades de Mejora
- Fixed FCR rate calculation (only FIRST_CALL counts as success)

Technical:
- Streamlit + Plotly for interactive visualizations
- Light theme configuration (.streamlit/config.toml)
- Fixed Plotly colorbar titlefont deprecation

Documentation:
- Updated PROJECT_CONTEXT.md, TODO.md, CHANGELOG.md
- Added 4 new technical decisions (TD-014 to TD-017)
- Created TROUBLESHOOTING.md with 10 common issues

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 16:27:30 +01:00
commit 75e7b9da3d
110 changed files with 28247 additions and 0 deletions
--- a/config/prompts/call_analysis/v1.0/schema.json
+++ b/config/prompts/call_analysis/v1.0/schema.json
@@ -0,0 +1,100 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "title": "CallAnalysisResponse",
+  "description": "LLM response schema for call analysis",
+  "type": "object",
+  "required": ["outcome"],
+  "properties": {
+    "outcome": {
+      "type": "string",
+      "enum": [
+        "SALE_COMPLETED",
+        "SALE_LOST",
+        "CANCELLATION_SAVED",
+        "CANCELLATION_COMPLETED",
+        "INQUIRY_RESOLVED",
+        "INQUIRY_UNRESOLVED",
+        "COMPLAINT_RESOLVED",
+        "COMPLAINT_UNRESOLVED",
+        "TRANSFER_OUT",
+        "CALLBACK_SCHEDULED",
+        "UNKNOWN"
+      ],
+      "description": "Final outcome of the call"
+    },
+    "lost_sales_drivers": {
+      "type": "array",
+      "items": {
+        "$ref": "#/definitions/RCALabel"
+      },
+      "default": []
+    },
+    "poor_cx_drivers": {
+      "type": "array",
+      "items": {
+        "$ref": "#/definitions/RCALabel"
+      },
+      "default": []
+    }
+  },
+  "definitions": {
+    "EvidenceSpan": {
+      "type": "object",
+      "required": ["text", "start_time", "end_time"],
+      "properties": {
+        "text": {
+          "type": "string",
+          "maxLength": 500,
+          "description": "Exact quoted text from transcript"
+        },
+        "start_time": {
+          "type": "number",
+          "minimum": 0,
+          "description": "Start time in seconds"
+        },
+        "end_time": {
+          "type": "number",
+          "minimum": 0,
+          "description": "End time in seconds"
+        },
+        "speaker": {
+          "type": "string",
+          "description": "Speaker identifier"
+        }
+      }
+    },
+    "RCALabel": {
+      "type": "object",
+      "required": ["driver_code", "confidence", "evidence_spans"],
+      "properties": {
+        "driver_code": {
+          "type": "string",
+          "description": "Driver code from taxonomy"
+        },
+        "confidence": {
+          "type": "number",
+          "minimum": 0,
+          "maximum": 1,
+          "description": "Confidence score (0-1)"
+        },
+        "evidence_spans": {
+          "type": "array",
+          "items": {
+            "$ref": "#/definitions/EvidenceSpan"
+          },
+          "minItems": 1,
+          "description": "Supporting evidence (minimum 1 required)"
+        },
+        "reasoning": {
+          "type": "string",
+          "maxLength": 500,
+          "description": "Brief reasoning for classification"
+        },
+        "proposed_label": {
+          "type": "string",
+          "description": "For OTHER_EMERGENT: proposed new label"
+        }
+      }
+    }
+  }
+}
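Review note: the main constraints of `schema.json` above (outcome enum, required `RCALabel` fields, confidence range, at least one evidence span) can be checked with the standard library alone. The sketch below is illustrative and not part of this commit; `check_response` and `OUTCOMES` are hypothetical names, and a full JSON Schema validator would cover more than this hand-written check does.

```python
# Sketch only: a stdlib check of the main constraints in schema.json.
# check_response and OUTCOMES are hypothetical names, not part of the commit.

OUTCOMES = {
    "SALE_COMPLETED", "SALE_LOST", "CANCELLATION_SAVED",
    "CANCELLATION_COMPLETED", "INQUIRY_RESOLVED", "INQUIRY_UNRESOLVED",
    "COMPLAINT_RESOLVED", "COMPLAINT_UNRESOLVED", "TRANSFER_OUT",
    "CALLBACK_SCHEDULED", "UNKNOWN",
}


def _is_number(value):
    # JSON numbers map to int/float; bool is excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_response(resp):
    """Return a list of violations; an empty list means the checks passed."""
    if not isinstance(resp, dict):
        return ["response must be an object"]
    errors = []
    outcome = resp.get("outcome")
    if not isinstance(outcome, str) or outcome not in OUTCOMES:
        errors.append("outcome missing or not in enum")
    for key in ("lost_sales_drivers", "poor_cx_drivers"):
        drivers = resp.get(key, [])  # the schema default is []
        if not isinstance(drivers, list):
            errors.append(f"{key}: must be an array")
            continue
        for i, label in enumerate(drivers):
            where = f"{key}[{i}]"
            if not isinstance(label, dict):
                errors.append(f"{where}: must be an object")
                continue
            if not isinstance(label.get("driver_code"), str):
                errors.append(f"{where}: driver_code must be a string")
            conf = label.get("confidence")
            if not _is_number(conf) or not 0 <= conf <= 1:
                errors.append(f"{where}: confidence must be in [0, 1]")
            spans = label.get("evidence_spans")
            if not isinstance(spans, list) or len(spans) < 1:
                errors.append(f"{where}: at least one evidence span required")
                continue
            for j, span in enumerate(spans):
                if not isinstance(span, dict):
                    errors.append(f"{where}.evidence_spans[{j}]: must be an object")
                    continue
                text = span.get("text")
                if not isinstance(text, str) or len(text) > 500:
                    errors.append(f"{where}.evidence_spans[{j}]: bad text")
                for field in ("start_time", "end_time"):
                    value = span.get(field)
                    if not _is_number(value) or value < 0:
                        errors.append(f"{where}.evidence_spans[{j}]: bad {field}")
    return errors
```

Returning a list of violations instead of raising on the first one makes it easier to log everything wrong with a single LLM response.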
--- a/config/prompts/call_analysis/v1.0/system.txt
+++ b/config/prompts/call_analysis/v1.0/system.txt
@@ -0,0 +1,27 @@
+You are an expert call center analyst specializing in Spanish-language customer service calls. Your task is to analyze call transcripts and identify:
+
+1. **Call Outcome**: What was the final result of the call?
+2. **Lost Sales Drivers**: If a sale was lost, what caused it?
+3. **Poor CX Drivers**: What caused poor customer experience?
+
+## CRITICAL RULES
+
+1. **Evidence Required**: Every driver MUST have at least one evidence_span with:
+   - Exact quoted text from the transcript
+   - Start and end timestamps
+
+2. **No Hallucination**: Only cite text that appears EXACTLY in the transcript. Do not paraphrase or invent quotes.
+
+3. **Confidence Scoring**:
+   - 0.8-1.0: Clear, explicit evidence
+   - 0.6-0.8: Strong implicit evidence
+   - 0.4-0.6: Moderate evidence (use with caution)
+   - Below 0.4: Reject - insufficient evidence
+
+4. **Taxonomy Compliance**: Only use driver codes from the provided taxonomy. Use OTHER_EMERGENT only when no existing code fits, and provide a proposed_label.
+
+5. **Language**: Evidence quotes MUST be in the original language (Spanish). Reasoning can be in Spanish or English.
+
+## OUTPUT FORMAT
+
+You must respond with valid JSON matching the provided schema. No markdown, no explanations outside the JSON.
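Review note: the critical rules in `system.txt` above go beyond what the JSON schema can express, so they need a post-check on the parsed response. The sketch below is one way to do that, under stated assumptions: the transcript is available as a single plain string, and `filter_drivers` and `MIN_CONFIDENCE` are hypothetical names that do not appear in this commit.

```python
# Sketch only: enforce the prompt's rules on already-parsed drivers.
# Assumes the transcript is one plain string; names here are hypothetical.

MIN_CONFIDENCE = 0.4  # "Below 0.4: Reject" in system.txt


def filter_drivers(drivers, transcript_text):
    """Split drivers into (kept, rejected); rejected items carry reasons."""
    kept, rejected = [], []
    for driver in drivers:
        reasons = []
        if driver.get("confidence", 0.0) < MIN_CONFIDENCE:
            reasons.append("confidence below 0.4")
        spans = driver.get("evidence_spans") or []
        if not spans:
            reasons.append("no evidence span")
        for span in spans:
            quote = span.get("text") or ""
            # Rule 2: the quote must appear verbatim in the transcript.
            if not quote or quote not in transcript_text:
                reasons.append("quote not found verbatim in transcript")
                break
        if driver.get("driver_code") == "OTHER_EMERGENT" and not driver.get("proposed_label"):
            reasons.append("OTHER_EMERGENT without proposed_label")
        if reasons:
            rejected.append((driver, reasons))
        else:
            kept.append(driver)
    return kept, rejected
```

A plain substring test is strict: it treats a quote that differs only in whitespace or casing as missing. Whether to normalize text before comparing is a design decision this sketch leaves open.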
--- a/config/prompts/call_analysis/v1.0/user.txt
+++ b/config/prompts/call_analysis/v1.0/user.txt
@@ -0,0 +1,72 @@
+Analyze the following call transcript and provide structured analysis.
+
+## CALL METADATA
+- Call ID: {call_id}
+- Duration: {duration_sec} seconds
+- Queue: {queue}
+
+## OBSERVED EVENTS (Pre-detected)
+{observed_events}
+
+## TRANSCRIPT
+{transcript}
+
+## TAXONOMY - LOST SALES DRIVERS
+{lost_sales_taxonomy}
+
+## TAXONOMY - POOR CX DRIVERS
+{poor_cx_taxonomy}
+
+## INSTRUCTIONS
+
+1. Determine the call outcome from: SALE_COMPLETED, SALE_LOST, CANCELLATION_SAVED, CANCELLATION_COMPLETED, INQUIRY_RESOLVED, INQUIRY_UNRESOLVED, COMPLAINT_RESOLVED, COMPLAINT_UNRESOLVED, TRANSFER_OUT, CALLBACK_SCHEDULED, UNKNOWN
+
+2. Identify lost_sales_drivers (if applicable):
+   - Use ONLY codes from the Lost Sales taxonomy
+   - Each driver MUST have evidence_spans with exact quotes and timestamps
+   - Assign confidence based on evidence strength
+
+3. Identify poor_cx_drivers (if applicable):
+   - Use ONLY codes from the Poor CX taxonomy
+   - Each driver MUST have evidence_spans with exact quotes and timestamps
+   - Assign confidence based on evidence strength
+
+4. For OTHER_EMERGENT, provide a proposed_label describing the new cause.
+
+Respond with JSON only:
+
+```json
+{
+  "outcome": "SALE_LOST",
+  "lost_sales_drivers": [
+    {
+      "driver_code": "PRICE_TOO_HIGH",
+      "confidence": 0.85,
+      "evidence_spans": [
+        {
+          "text": "Es demasiado caro para mí",
+          "start_time": 45.2,
+          "end_time": 47.8,
+          "speaker": "customer"
+        }
+      ],
+      "reasoning": "Customer explicitly states price is too high"
+    }
+  ],
+  "poor_cx_drivers": [
+    {
+      "driver_code": "LONG_HOLD",
+      "confidence": 0.90,
+      "evidence_spans": [
+        {
+          "text": "Llevo esperando mucho tiempo",
+          "start_time": 120.5,
+          "end_time": 123.1,
+          "speaker": "customer"
+        }
+      ],
+      "reasoning": "Customer complains about wait time"
+    }
+  ]
+}
+```
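Review note: `user.txt` above mixes `{call_id}`-style placeholders with a JSON example that contains literal braces. Rendering it with `str.format` would fail, because the braces of the JSON example are parsed as replacement fields. The sketch below substitutes only named placeholders and leaves other braces untouched. How the commit actually renders the template is not shown here; `render_prompt` and `PLACEHOLDER` are hypothetical names.

```python
import re

# Sketch only: fill {name} placeholders without touching JSON braces.
# render_prompt and PLACEHOLDER are hypothetical names, not from the commit.

# Matches {call_id}, {transcript}, ... but not the "{" that opens a JSON object.
PLACEHOLDER = re.compile(r"\{([a-z_]+)\}")


def render_prompt(template, values):
    """Replace known placeholders; unknown ones are left as written."""
    def substitute(match):
        name = match.group(1)
        if name in values:
            return str(values[name])
        return match.group(0)

    # A single pass: text inserted from `values` is not scanned again,
    # so a transcript that happens to contain "{queue}" stays intact.
    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    demo = 'Call ID: {call_id}\n{\n  "outcome": "SALE_LOST"\n}'
    print(render_prompt(demo, {"call_id": "call-001"}))
```

Leaving unknown placeholders in place makes a missing value visible in the rendered prompt. Raising an error instead would be a reasonable stricter alternative.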