feat: Add Streamlit dashboard with Blueprint compliance (v2.1.0)

Dashboard Features:
- 8 navigation sections: Overview, Outcomes, Poor CX, FCR, Churn, Agent, Call Explorer, Export
- Beyond Brand Identity styling (colors #6D84E3, Outfit font)
- RCA Sankey diagram (Driver → Outcome → Churn Risk flow)
- Correlation heatmaps (driver co-occurrence, driver-outcome)
- Outcome Deep Dive (root causes, correlation, duration analysis)
- Export functionality (Excel, HTML, JSON)

Blueprint Compliance:
- FCR: 4 categories (Primera Llamada/Rellamada × Sin/Con Riesgo de Fuga, i.e. First Call/Repeat Call × Without/With Churn Risk)
- Churn: Binary view (Sin Riesgo de Fuga / En Riesgo de Fuga, i.e. not at risk / at risk of churn)
- Agent: Talento Para Replicar (talent to replicate) / Oportunidades de Mejora (improvement opportunities)
- Fixed FCR rate calculation (only FIRST_CALL counts as success)

Technical:
- Streamlit + Plotly for interactive visualizations
- Light theme configuration (.streamlit/config.toml)
- Fixed Plotly colorbar titlefont deprecation

Documentation:
- Updated PROJECT_CONTEXT.md, TODO.md, CHANGELOG.md
- Added 4 new technical decisions (TD-014 to TD-017)
- Created TROUBLESHOOTING.md with 10 common issues

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Commit 75e7b9da3d by sujucu70, 2026-01-19 16:27:30 +01:00
110 changed files with 28247 additions and 0 deletions

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 02 - Inference Engine Validation\n",
"\n",
"**Checkpoint 5 validation notebook**\n",
"\n",
"This notebook validates the inference engine components:\n",
"1. LLMClient with JSON strict mode and retries\n",
"2. PromptManager with versioned templates\n",
"3. CallAnalyzer for single-call analysis\n",
"4. BatchAnalyzer with checkpointing\n",
"\n",
"**Note**: Uses mocked LLM responses to avoid API costs during validation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"sys.path.insert(0, '..')\n",
"\n",
"import json\n",
"from pathlib import Path\n",
"from datetime import datetime\n",
"from unittest.mock import AsyncMock, MagicMock, patch\n",
"\n",
"# Project imports\n",
"from src.inference.client import LLMClient, LLMClientConfig, LLMResponse\n",
"from src.inference.prompt_manager import (\n",
" PromptManager,\n",
" PromptTemplate,\n",
" format_events_for_prompt,\n",
" format_transcript_for_prompt,\n",
" load_taxonomy_for_prompt,\n",
")\n",
"from src.inference.analyzer import CallAnalyzer, AnalyzerConfig\n",
"from src.models.call_analysis import (\n",
" CallAnalysis,\n",
" CallOutcome,\n",
" ProcessingStatus,\n",
" Event,\n",
" EventType,\n",
")\n",
"from src.transcription.models import SpeakerTurn, Transcript, TranscriptMetadata\n",
"\n",
"print(\"Imports successful!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Prompt Manager Validation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Initialize prompt manager\n",
"prompts_dir = Path('../config/prompts')\n",
"manager = PromptManager(prompts_dir)\n",
"\n",
"print(f\"Prompts directory: {prompts_dir}\")\n",
"print(f\"Available prompt types: {manager.list_prompt_types()}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Load call analysis prompt\n",
"template = manager.load('call_analysis', 'v1.0')\n",
"\n",
"print(f\"Template name: {template.name}\")\n",
"print(f\"Template version: {template.version}\")\n",
"print(f\"System prompt length: {len(template.system)} chars\")\n",
"print(f\"User prompt length: {len(template.user)} chars\")\n",
"print(f\"Has schema: {template.schema is not None}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test template rendering\n",
"system, user = template.render(\n",
" call_id=\"TEST001\",\n",
" transcript=\"AGENT: Hola, buenos días\\nCUSTOMER: Quiero cancelar\",\n",
" duration_sec=120.5,\n",
" queue=\"ventas\",\n",
" observed_events=\"- HOLD_START at 30.0s\",\n",
" lost_sales_taxonomy=\"- PRICE_TOO_HIGH: Customer mentions price concerns\",\n",
" poor_cx_taxonomy=\"- LONG_HOLD: Extended hold times\",\n",
")\n",
"\n",
"print(\"=== SYSTEM PROMPT (first 500 chars) ===\")\n",
"print(system[:500])\n",
"print(\"\\n=== USER PROMPT (first 500 chars) ===\")\n",
"print(user[:500])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test taxonomy loading\n",
"lost_sales_tax, poor_cx_tax = load_taxonomy_for_prompt(\n",
" Path('../config/rca_taxonomy.yaml')\n",
")\n",
"\n",
"print(\"=== LOST SALES TAXONOMY ===\")\n",
"print(lost_sales_tax[:500] if lost_sales_tax else \"(empty)\")\n",
"print(\"\\n=== POOR CX TAXONOMY ===\")\n",
"print(poor_cx_tax[:500] if poor_cx_tax else \"(empty)\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. LLMClient Validation (Mocked)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test LLMResponse cost estimation\n",
"response = LLMResponse(\n",
" content='{\"outcome\": \"LOST_SALE\"}',\n",
" prompt_tokens=1000,\n",
" completion_tokens=500,\n",
" total_tokens=1500,\n",
" success=True,\n",
" model=\"gpt-4o-mini\",\n",
")\n",
"\n",
"print(f\"Response success: {response.success}\")\n",
"print(f\"Total tokens: {response.total_tokens}\")\n",
"print(f\"Estimated cost: ${response.cost_estimate_usd:.6f}\")\n",
"print(f\"Model: {response.model}\")"
]
},
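{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cost figure above is simple token arithmetic. A back-of-envelope sketch with ASSUMED per-million-token prices (illustrative only; the real rates are whatever `cost_estimate_usd` encodes and depend on provider pricing):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Back-of-envelope check of token-based pricing. Prices below are ASSUMED\n",
"# for illustration; real rates depend on the provider and may change.\n",
"ASSUMED_INPUT_USD_PER_1M = 0.15   # illustrative prompt-token price\n",
"ASSUMED_OUTPUT_USD_PER_1M = 0.60  # illustrative completion-token price\n",
"\n",
"def sketch_cost(prompt_tokens, completion_tokens):\n",
"    return (\n",
"        prompt_tokens * ASSUMED_INPUT_USD_PER_1M / 1_000_000\n",
"        + completion_tokens * ASSUMED_OUTPUT_USD_PER_1M / 1_000_000\n",
"    )\n",
"\n",
"print(f\"Sketch cost for 1000/500 tokens: ${sketch_cost(1000, 500):.6f}\")"
]
},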
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test JSON parsing with mocked client\n",
"with patch.dict('os.environ', {'OPENAI_API_KEY': 'test-key'}):\n",
" client = LLMClient()\n",
" \n",
" # Test various JSON formats\n",
" test_cases = [\n",
" ('{\"key\": \"value\"}', \"Plain JSON\"),\n",
" ('```json\\n{\"key\": \"value\"}\\n```', \"Markdown block\"),\n",
" ('Here is the result: {\"key\": \"value\"} done.', \"Embedded JSON\"),\n",
" ('not json', \"Invalid\"),\n",
" ]\n",
" \n",
" for content, desc in test_cases:\n",
" result = client._parse_json(content)\n",
" print(f\"{desc}: {result}\")"
]
},
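{
"cell_type": "markdown",
"metadata": {},
"source": [
"Retries are listed as an `LLMClient` feature but are not exercised directly above. A standalone sketch of retry-on-invalid-JSON, using a hypothetical `fake_llm` stand-in rather than the real client:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Standalone retry sketch: call until the response parses as JSON or\n",
"# attempts run out. fake_llm is a hypothetical stand-in, NOT LLMClient.\n",
"import json\n",
"\n",
"_responses = iter(['not json', 'still not json', '{\"outcome\": \"LOST_SALE\"}'])\n",
"\n",
"def fake_llm():\n",
"    return next(_responses)\n",
"\n",
"def complete_with_retries(max_attempts=3):\n",
"    for attempt in range(1, max_attempts + 1):\n",
"        raw = fake_llm()\n",
"        try:\n",
"            return json.loads(raw), attempt\n",
"        except json.JSONDecodeError:\n",
"            print(f\"Attempt {attempt}: invalid JSON, retrying\")\n",
"    return None, max_attempts\n",
"\n",
"parsed, attempts = complete_with_retries()\n",
"print(f\"Parsed on attempt {attempts}: {parsed}\")"
]
},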
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Formatting Functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test event formatting\n",
"events = [\n",
" Event(event_type=EventType.HOLD_START, start_time=10.0),\n",
" Event(event_type=EventType.HOLD_END, start_time=45.0),\n",
" Event(event_type=EventType.SILENCE, start_time=60.0, duration_sec=8.5),\n",
" Event(event_type=EventType.TRANSFER, start_time=120.0),\n",
"]\n",
"\n",
"events_text = format_events_for_prompt(events)\n",
"print(\"=== FORMATTED EVENTS ===\")\n",
"print(events_text)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test transcript formatting\n",
"turns = [\n",
" SpeakerTurn(speaker=\"agent\", text=\"Hola, buenos días, gracias por llamar.\", start_time=0.0, end_time=2.5),\n",
" SpeakerTurn(speaker=\"customer\", text=\"Hola, quiero información sobre los precios.\", start_time=3.0, end_time=5.0),\n",
" SpeakerTurn(speaker=\"agent\", text=\"Claro, ¿qué producto le interesa?\", start_time=5.5, end_time=7.0),\n",
" SpeakerTurn(speaker=\"customer\", text=\"El plan premium, pero es muy caro.\", start_time=7.5, end_time=10.0),\n",
"]\n",
"\n",
"transcript_text = format_transcript_for_prompt(turns)\n",
"print(\"=== FORMATTED TRANSCRIPT ===\")\n",
"print(transcript_text)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test truncation\n",
"long_turns = [\n",
" SpeakerTurn(speaker=\"agent\", text=\"A\" * 3000, start_time=0.0, end_time=30.0),\n",
" SpeakerTurn(speaker=\"customer\", text=\"B\" * 3000, start_time=30.0, end_time=60.0),\n",
" SpeakerTurn(speaker=\"agent\", text=\"C\" * 3000, start_time=60.0, end_time=90.0),\n",
"]\n",
"\n",
"truncated = format_transcript_for_prompt(long_turns, max_chars=5000)\n",
"print(f\"Truncated length: {len(truncated)} chars\")\n",
"print(f\"Contains truncation marker: {'truncated' in truncated}\")"
]
},
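{
"cell_type": "markdown",
"metadata": {},
"source": [
"Truncation to a character budget can be sketched as a running total. The helper below is hypothetical; the real logic lives in `format_transcript_for_prompt`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical char-budget truncation helper (illustrative; the real logic\n",
"# lives in format_transcript_for_prompt).\n",
"def truncate_texts(texts, max_chars):\n",
"    out, used = [], 0\n",
"    for t in texts:\n",
"        if used + len(t) > max_chars:\n",
"            out.append(\"[... truncated ...]\")\n",
"            break\n",
"        out.append(t)\n",
"        used += len(t)\n",
"    return \"\\n\".join(out)\n",
"\n",
"demo = truncate_texts([\"A\" * 3000, \"B\" * 3000, \"C\" * 3000], max_chars=5000)\n",
"print(f\"Sketch output length: {len(demo)} chars\")"
]
},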
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. CallAnalyzer Validation (Mocked LLM)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create test transcript\n",
"test_transcript = Transcript(\n",
" call_id=\"VAL001\",\n",
" turns=[\n",
" SpeakerTurn(speaker=\"agent\", text=\"Hola, buenos días.\", start_time=0.0, end_time=1.5),\n",
" SpeakerTurn(speaker=\"customer\", text=\"Hola, quiero cancelar mi servicio.\", start_time=2.0, end_time=4.0),\n",
" SpeakerTurn(speaker=\"agent\", text=\"¿Puedo preguntar el motivo?\", start_time=4.5, end_time=6.0),\n",
" SpeakerTurn(speaker=\"customer\", text=\"Es demasiado caro para mí.\", start_time=6.5, end_time=8.5),\n",
" SpeakerTurn(speaker=\"agent\", text=\"Entiendo. ¿Le puedo ofrecer un descuento?\", start_time=9.0, end_time=11.0),\n",
" SpeakerTurn(speaker=\"customer\", text=\"No gracias, ya tomé la decisión.\", start_time=11.5, end_time=13.5),\n",
" ],\n",
" metadata=TranscriptMetadata(\n",
" audio_duration_sec=60.0,\n",
" language=\"es\",\n",
" provider=\"assemblyai\",\n",
" ),\n",
")\n",
"\n",
"print(f\"Test transcript: {test_transcript.call_id}\")\n",
"print(f\"Turns: {len(test_transcript.turns)}\")\n",
"print(f\"Duration: {test_transcript.metadata.audio_duration_sec}s\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Mock LLM response for lost sale\n",
"mock_llm_response = {\n",
" \"outcome\": \"LOST_SALE\",\n",
" \"lost_sales_drivers\": [\n",
" {\n",
" \"driver_code\": \"PRICE_TOO_HIGH\",\n",
" \"confidence\": 0.92,\n",
" \"evidence_spans\": [\n",
" {\n",
" \"text\": \"Es demasiado caro para mí\",\n",
" \"start_time\": 6.5,\n",
" \"end_time\": 8.5,\n",
" \"speaker\": \"customer\"\n",
" }\n",
" ],\n",
" \"reasoning\": \"Customer explicitly states the service is too expensive\"\n",
" },\n",
" {\n",
" \"driver_code\": \"RETENTION_ATTEMPT_FAILED\",\n",
" \"confidence\": 0.85,\n",
" \"evidence_spans\": [\n",
" {\n",
" \"text\": \"No gracias, ya tomé la decisión\",\n",
" \"start_time\": 11.5,\n",
" \"end_time\": 13.5,\n",
" \"speaker\": \"customer\"\n",
" }\n",
" ],\n",
" \"reasoning\": \"Customer rejected discount offer indicating firm decision\"\n",
" }\n",
" ],\n",
" \"poor_cx_drivers\": []\n",
"}\n",
"\n",
"print(\"Mock LLM response prepared\")\n",
"print(json.dumps(mock_llm_response, indent=2))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test analyzer with mocked LLM\n",
"with patch.dict('os.environ', {'OPENAI_API_KEY': 'test-key'}):\n",
" # Create mock LLM client\n",
" mock_client = MagicMock(spec=LLMClient)\n",
" mock_client.complete.return_value = LLMResponse(\n",
" content=json.dumps(mock_llm_response),\n",
" parsed_json=mock_llm_response,\n",
" prompt_tokens=500,\n",
" completion_tokens=200,\n",
" total_tokens=700,\n",
" success=True,\n",
" model=\"gpt-4o-mini\",\n",
" )\n",
" \n",
" # Create analyzer with mock client\n",
" analyzer = CallAnalyzer(\n",
" llm_client=mock_client,\n",
" config=AnalyzerConfig(\n",
" prompt_version=\"v1.0\",\n",
" min_confidence_threshold=0.3,\n",
" ),\n",
" )\n",
" \n",
" # Analyze\n",
" result = analyzer.analyze(test_transcript, batch_id=\"validation\")\n",
" \n",
" print(f\"Analysis status: {result.status}\")\n",
" print(f\"Outcome: {result.outcome}\")\n",
" print(f\"Lost sales drivers: {len(result.lost_sales_drivers)}\")\n",
" print(f\"Poor CX drivers: {len(result.poor_cx_drivers)}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Validate result structure\n",
"print(\"=== CALL ANALYSIS RESULT ===\")\n",
"print(f\"Call ID: {result.call_id}\")\n",
"print(f\"Batch ID: {result.batch_id}\")\n",
"print(f\"Status: {result.status}\")\n",
"print(f\"Outcome: {result.outcome}\")\n",
"\n",
"print(\"\\n=== OBSERVED FEATURES ===\")\n",
"print(f\"Audio duration: {result.observed.audio_duration_sec}s\")\n",
"print(f\"Events: {len(result.observed.events)}\")\n",
"print(f\"Agent talk ratio: {result.observed.agent_talk_ratio:.2%}\")\n",
"\n",
"print(\"\\n=== LOST SALES DRIVERS ===\")\n",
"for driver in result.lost_sales_drivers:\n",
" print(f\" - {driver.driver_code} (conf: {driver.confidence:.2f})\")\n",
" print(f\" Evidence: \\\"{driver.evidence_spans[0].text}\\\"\")\n",
"\n",
"print(\"\\n=== TRACEABILITY ===\")\n",
"print(f\"Schema version: {result.traceability.schema_version}\")\n",
"print(f\"Prompt version: {result.traceability.prompt_version}\")\n",
"print(f\"Model ID: {result.traceability.model_id}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Validate JSON serialization\n",
"result_dict = result.model_dump()\n",
"result_json = json.dumps(result_dict, indent=2, default=str)\n",
"\n",
"print(f\"Serialized JSON length: {len(result_json)} chars\")\n",
"print(\"\\n=== SAMPLE OUTPUT (first 1500 chars) ===\")\n",
"print(result_json[:1500])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Validation of Evidence Requirements"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from src.models.call_analysis import RCALabel, EvidenceSpan\n",
"\n",
"# Test: RCALabel requires evidence\n",
"print(\"Testing evidence requirements...\")\n",
"\n",
"# Valid: with evidence\n",
"try:\n",
" valid_label = RCALabel(\n",
" driver_code=\"PRICE_TOO_HIGH\",\n",
" confidence=0.9,\n",
" evidence_spans=[\n",
" EvidenceSpan(text=\"Es muy caro\", start_time=10.0, end_time=12.0)\n",
" ],\n",
" )\n",
" print(\"✓ Valid label with evidence created successfully\")\n",
"except Exception as e:\n",
" print(f\"✗ Unexpected error: {e}\")\n",
"\n",
"# Invalid: without evidence\n",
"try:\n",
" invalid_label = RCALabel(\n",
" driver_code=\"PRICE_TOO_HIGH\",\n",
" confidence=0.9,\n",
" evidence_spans=[], # Empty!\n",
" )\n",
" print(\"✗ Should have raised error for empty evidence\")\n",
"except ValueError as e:\n",
" print(f\"✓ Correctly rejected: {e}\")"
]
},
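{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the evidence contract tested above is the kind of constraint a Pydantic model can express declaratively. A self-contained sketch (assumes Pydantic v2; NOT the project's actual `RCALabel` definition, which may enforce this differently):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Self-contained Pydantic sketch of the evidence contract (assumes\n",
"# Pydantic v2; NOT the project's actual RCALabel definition).\n",
"from pydantic import BaseModel, Field\n",
"\n",
"class SpanSketch(BaseModel):\n",
"    text: str\n",
"    start_time: float\n",
"    end_time: float\n",
"\n",
"class LabelSketch(BaseModel):\n",
"    driver_code: str\n",
"    confidence: float = Field(ge=0.0, le=1.0)\n",
"    evidence_spans: list[SpanSketch] = Field(min_length=1)\n",
"\n",
"try:\n",
"    LabelSketch(driver_code=\"X\", confidence=0.9, evidence_spans=[])\n",
"except Exception as exc:\n",
"    print(f\"Rejected as expected: {type(exc).__name__}\")"
]
},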
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test: OTHER_EMERGENT requires proposed_label\n",
"print(\"\\nTesting OTHER_EMERGENT requirements...\")\n",
"\n",
"evidence = [EvidenceSpan(text=\"test\", start_time=0, end_time=1)]\n",
"\n",
"# Valid: with proposed_label\n",
"try:\n",
" emergent_valid = RCALabel(\n",
" driver_code=\"OTHER_EMERGENT\",\n",
" confidence=0.7,\n",
" evidence_spans=evidence,\n",
" proposed_label=\"NEW_PATTERN_DISCOVERED\",\n",
" )\n",
" print(f\"✓ OTHER_EMERGENT with proposed_label: {emergent_valid.proposed_label}\")\n",
"except Exception as e:\n",
" print(f\"✗ Unexpected error: {e}\")\n",
"\n",
"# Invalid: without proposed_label\n",
"try:\n",
" emergent_invalid = RCALabel(\n",
" driver_code=\"OTHER_EMERGENT\",\n",
" confidence=0.7,\n",
" evidence_spans=evidence,\n",
" # No proposed_label!\n",
" )\n",
" print(\"✗ Should have raised error for missing proposed_label\")\n",
"except ValueError as e:\n",
" print(f\"✓ Correctly rejected: {e}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test: Confidence bounds\n",
"print(\"\\nTesting confidence bounds...\")\n",
"\n",
"evidence = [EvidenceSpan(text=\"test\", start_time=0, end_time=1)]\n",
"\n",
"# Valid: confidence in range\n",
"for conf in [0.0, 0.5, 1.0]:\n",
" try:\n",
" label = RCALabel(\n",
" driver_code=\"TEST\",\n",
" confidence=conf,\n",
" evidence_spans=evidence,\n",
" )\n",
" print(f\"✓ Confidence {conf} accepted\")\n",
" except Exception as e:\n",
" print(f\"✗ Confidence {conf} rejected: {e}\")\n",
"\n",
"# Invalid: out of range\n",
"for conf in [-0.1, 1.5]:\n",
" try:\n",
" label = RCALabel(\n",
" driver_code=\"TEST\",\n",
" confidence=conf,\n",
" evidence_spans=evidence,\n",
" )\n",
" print(f\"✗ Confidence {conf} should have been rejected\")\n",
" except ValueError as e:\n",
" print(f\"✓ Confidence {conf} correctly rejected\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Batch Analyzer Configuration"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from src.inference.batch_analyzer import BatchAnalyzer, BatchAnalyzerConfig, BatchCheckpoint\n",
"\n",
"# Test checkpoint serialization\n",
"checkpoint = BatchCheckpoint(\n",
" batch_id=\"test_batch_001\",\n",
" total_calls=100,\n",
" processed_call_ids=[\"CALL001\", \"CALL002\", \"CALL003\"],\n",
" failed_call_ids={\"CALL004\": \"LLM timeout\"},\n",
" success_count=3,\n",
" partial_count=0,\n",
" failed_count=1,\n",
")\n",
"\n",
"print(\"=== CHECKPOINT ===\")\n",
"print(f\"Batch ID: {checkpoint.batch_id}\")\n",
"print(f\"Total: {checkpoint.total_calls}\")\n",
"print(f\"Processed: {len(checkpoint.processed_call_ids)}\")\n",
"print(f\"Failed: {len(checkpoint.failed_call_ids)}\")\n",
"\n",
"# Test round-trip\n",
"checkpoint_dict = checkpoint.to_dict()\n",
"restored = BatchCheckpoint.from_dict(checkpoint_dict)\n",
"\n",
"print(f\"\\nRound-trip successful: {restored.batch_id == checkpoint.batch_id}\")"
]
},
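{
"cell_type": "markdown",
"metadata": {},
"source": [
"Resuming from a checkpoint reduces to filtering already-handled IDs out of the work list. A stdlib sketch of that step (the real resume logic lives inside `BatchAnalyzer`):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of checkpoint-based resume: skip IDs already recorded. Mirrors\n",
"# the filtering BatchAnalyzer is expected to perform on resume.\n",
"all_call_ids = [f\"CALL{n:03d}\" for n in range(1, 7)]\n",
"processed = set(checkpoint.processed_call_ids)\n",
"failed = set(checkpoint.failed_call_ids)  # retry or skip, per policy\n",
"\n",
"remaining = [cid for cid in all_call_ids if cid not in processed | failed]\n",
"print(f\"Remaining after resume: {remaining}\")"
]
},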
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test batch config\n",
"config = BatchAnalyzerConfig(\n",
" batch_size=10,\n",
" max_concurrent=5,\n",
" requests_per_minute=200,\n",
" save_interval=10,\n",
")\n",
"\n",
"print(\"=== BATCH CONFIG ===\")\n",
"print(f\"Batch size: {config.batch_size}\")\n",
"print(f\"Max concurrent: {config.max_concurrent}\")\n",
"print(f\"Requests/minute: {config.requests_per_minute}\")\n",
"print(f\"Save interval: {config.save_interval}\")\n",
"print(f\"Checkpoint dir: {config.checkpoint_dir}\")\n",
"print(f\"Output dir: {config.output_dir}\")"
]
},
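{
"cell_type": "markdown",
"metadata": {},
"source": [
"A common reading of these knobs: `requests_per_minute` sets a minimum delay between request starts, and `max_concurrent` caps in-flight requests via a semaphore. A small sketch of that interpretation (not the actual `BatchAnalyzer` internals):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of how these knobs are commonly interpreted (illustrative; the\n",
"# actual pacing lives inside BatchAnalyzer).\n",
"min_delay_sec = 60.0 / config.requests_per_minute  # gap between request starts\n",
"print(f\"Minimum delay between requests: {min_delay_sec:.3f}s\")\n",
"print(f\"At most {config.max_concurrent} requests in flight (semaphore cap)\")\n",
"print(f\"Implied ceiling: {config.requests_per_minute} requests per minute\")"
]
},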
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Summary\n",
"\n",
"### Components Validated:\n",
"\n",
"1. **PromptManager** ✓\n",
" - Loads versioned prompts from config/prompts/\n",
" - Template rendering with safe_substitute\n",
" - Taxonomy loading for RCA drivers\n",
"\n",
"2. **LLMClient** ✓\n",
" - Cost estimation based on tokens\n",
" - JSON parsing (plain, markdown blocks, embedded)\n",
" - Usage statistics tracking\n",
"\n",
"3. **CallAnalyzer** ✓\n",
" - Combines observed features + LLM inference\n",
" - Produces CallAnalysis with full traceability\n",
" - Evidence validation enforced\n",
"\n",
"4. **BatchAnalyzer** ✓\n",
" - Checkpoint serialization/restoration\n",
" - Configurable concurrency and rate limiting\n",
" - Incremental saving support\n",
"\n",
"5. **Data Contracts** ✓\n",
" - Evidence required for all RCA labels\n",
" - Confidence bounds enforced (0-1)\n",
" - OTHER_EMERGENT requires proposed_label\n",
"\n",
"### Ready for:\n",
"- Integration with real OpenAI API\n",
"- Batch processing of transcripts\n",
"- Checkpoint/resume for long-running jobs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"=\"*50)\n",
"print(\"CHECKPOINT 5 - INFERENCE ENGINE VALIDATION COMPLETE\")\n",
"print(\"=\"*50)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.11.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}