Skip to content

Instantly share code, notes, and snippets.

@manoelneto
Created January 26, 2026 20:46
Show Gist options
  • Select an option

  • Save manoelneto/f0e7422591067b4b22dc82baeb76c346 to your computer and use it in GitHub Desktop.

Select an option

Save manoelneto/f0e7422591067b4b22dc82baeb76c346 to your computer and use it in GitHub Desktop.
Guided Dialog Confidence - Current Behavior vs Stakeholder Requirements

Guided Dialog Confidence - Current Behavior

Overview

Stakeholders want to display field-level confidence scores for each extracted value in a guided dialog. However, the current implementation only stores a single confidence value per extraction turn, and this value gets overwritten on each subsequent turn.


Example Scenario: Leave Request Form

Fields to collect:

  1. leave_type (required)
  2. start_date (required)
  3. employee_email (optional)

Turn-by-Turn Walkthrough

Turn 1: Ask for leave_type

Assistant: "What type of leave are you requesting?"

User: "I need parental leave"

LLM Returns:

{
  "leave_type": "parental_leave",
  "confidence": 0.95
}

Database State:

{
  "extracted_data": {
    "leave_type": "parental_leave",
    "confidence": 0.95
  },
  "fields_asked": ["leave_type"]
}

Turn 2: Ask for start_date

Assistant: "When would you like your leave to start?"

User: "sometime in march I think"

LLM Returns:

{
  "start_date": "2025-03-01",
  "confidence": 0.6
}

Database State:

{
  "extracted_data": {
    "leave_type": "parental_leave",
    "start_date": "2025-03-01",
    "confidence": 0.6
  },
  "fields_asked": ["leave_type", "start_date"]
}

⚠️ Notice: Turn 1's confidence (0.95) is now lost - replaced by Turn 2's confidence (0.6)


Turn 3: Ask for employee_email

Assistant: "What's your email address?"

User: "[email protected]"

LLM Returns:

{
  "employee_email": "[email protected]",
  "confidence": 0.99
}

Final Database State:

{
  "extracted_data": {
    "leave_type": "parental_leave",
    "start_date": "2025-03-01",
    "employee_email": "[email protected]",
    "confidence": 0.99
  },
  "fields_asked": ["leave_type", "start_date", "employee_email"]
}

Result: Only Turn 3's confidence (0.99) remains. All previous confidence values are lost.


What Stakeholders Want vs What We Have

Field Extracted Value Actual Confidence Stored?
leave_type "parental_leave" 0.95 ❌ Lost
start_date "2025-03-01" 0.60 ❌ Lost
employee_email "[email protected]" 0.99 ✅ Only because it's last

What We Can Show Today

leave_type:     "parental_leave"        confidence: N/A
start_date:     "2025-03-01"            confidence: N/A
employee_email: "[email protected]" confidence: 0.99 (misleading - only last turn)

What Stakeholders Want

leave_type:     "parental_leave"         confidence: 0.95 ✓ High
start_date:     "2025-03-01"             confidence: 0.60 ⚠️ Low
employee_email: "[email protected]" confidence: 0.99 ✓ High

The Problem

  1. Confidence is per-turn, not per-field - The LLM returns one confidence value per extraction call
  2. Values get overwritten - Each turn's merge_extracted_data! overwrites the previous confidence
  3. No history is kept - There's no log of per-turn or per-field confidence

Real-World Impact

In the example above, start_date was extracted with low confidence (0.6) because the user said "sometime in march I think" - a vague response.

But in the final output, the only confidence shown is 0.99 (from the email field), which incorrectly suggests all data is high-quality.


What Would Be Needed

To support per-field confidence, the data structure would need to change from:

Current Structure:

{
  "extracted_data": {
    "leave_type": "parental_leave",
    "start_date": "2025-03-01",
    "employee_email": "[email protected]",
    "confidence": 0.99
  }
}

Required Structure:

{
  "extracted_data": {
    "leave_type": {
      "value": "parental_leave",
      "confidence": 0.95
    },
    "start_date": {
      "value": "2025-03-01",
      "confidence": 0.6
    },
    "employee_email": {
      "value": "[email protected]",
      "confidence": 0.99
    }
  }
}

Files That Would Need Changes

  1. app/services/workflows/guided_dialogue_processor.rb - Store confidence per field
  2. app/models/workflow/action/guided_dialogue/output.rb - New data structure and accessors
  3. All downstream consumers of extracted_data - Handle new nested structure

Summary

Aspect Current State Stakeholder Requirement
Confidence granularity Per extraction turn Per field
Confidence history Only last value kept All values preserved
Data structure Flat with single confidence Nested with per-field confidence
Implementation effort N/A Code changes required
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment