Stakeholders want to display field-level confidence scores for each extracted value in a guided dialog. However, the current implementation only stores a single confidence value per extraction turn, and this value gets overwritten on each subsequent turn.
Fields to collect:
- leave_type (required)
- start_date (required)
- employee_email (optional)
Assistant: "What type of leave are you requesting?"
User: "I need parental leave"
LLM Returns:
{
"leave_type": "parental_leave",
"confidence": 0.95
}
Database State:
{
"extracted_data": {
"leave_type": "parental_leave",
"confidence": 0.95
},
"fields_asked": ["leave_type"]
}
Assistant: "When would you like your leave to start?"
User: "sometime in march I think"
LLM Returns:
{
"start_date": "2025-03-01",
"confidence": 0.6
}
Database State:
{
"extracted_data": {
"leave_type": "parental_leave",
"start_date": "2025-03-01",
"confidence": 0.6
},
"fields_asked": ["leave_type", "start_date"]
}
⚠️ Notice: Turn 1's confidence (0.95) is now lost - replaced by Turn 2's confidence (0.6)
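The overwrite can be reproduced with a plain hash merge. The sketch below is only an illustration: `merge_flat!` is a hypothetical stand-in, since the actual body of merge_extracted_data! is not shown here and may differ.

```ruby
# Hypothetical stand-in for a flat merge of LLM output into stored data.
# Assumption: the real merge_extracted_data! behaves like Hash#merge!.
def merge_flat!(extracted_data, llm_result)
  # Hash#merge! replaces the value of any key present in both hashes,
  # so the shared "confidence" key is clobbered on every turn.
  extracted_data.merge!(llm_result)
end

extracted_data = {}
merge_flat!(extracted_data, { "leave_type" => "parental_leave", "confidence" => 0.95 })
merge_flat!(extracted_data, { "start_date" => "2025-03-01", "confidence" => 0.6 })

puts extracted_data["confidence"] # prints 0.6
```

Because every turn writes to the same top-level key, the stored record has no way to say which field a confidence value belongs to.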
Assistant: "What's your email address?"
User: "[email protected]"
LLM Returns:
{
"employee_email": "[email protected]",
"confidence": 0.99
}
Final Database State:
{
"extracted_data": {
"leave_type": "parental_leave",
"start_date": "2025-03-01",
"employee_email": "[email protected]",
"confidence": 0.99
},
"fields_asked": ["leave_type", "start_date", "employee_email"]
}
❌ Result: Only Turn 3's confidence (0.99) remains. All previous confidence values are lost.
| Field | Extracted Value | Actual Confidence | Stored? |
|---|---|---|---|
| leave_type | "parental_leave" | 0.95 | ❌ Lost |
| start_date | "2025-03-01" | 0.60 | ❌ Lost |
| employee_email | "[email protected]" | 0.99 | ✅ Only because it's last |
What can be displayed with the current data:
leave_type: "parental_leave" confidence: N/A
start_date: "2025-03-01" confidence: N/A
employee_email: "[email protected]" confidence: 0.99 (misleading - only last turn)
What stakeholders want to display:
leave_type: "parental_leave" confidence: 0.95 ✓ High
start_date: "2025-03-01" confidence: 0.60 ⚠️ Low
employee_email: "[email protected]" confidence: 0.99 ✓ High
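A per-field display of this kind could be rendered from data that keeps a value and a confidence for each field. The following is a rough sketch, not existing code: the helper names are hypothetical, and the 0.8 cutoff between "High" and "Low" is an assumption, since no threshold is defined in this document.

```ruby
# Assumed cutoff; the document does not define what counts as "low".
LOW_CONFIDENCE_THRESHOLD = 0.8

# Formats a confidence score with a High/Low tag, or "N/A" when missing.
def confidence_label(confidence)
  return "N/A" if confidence.nil?

  tag = confidence >= LOW_CONFIDENCE_THRESHOLD ? "✓ High" : "⚠️ Low"
  format("%.2f %s", confidence, tag)
end

# Builds one display line per field from a nested
# { field => { "value" => ..., "confidence" => ... } } hash.
def display_lines(extracted_data)
  extracted_data.map do |field, entry|
    "#{field}: #{entry["value"].inspect} confidence: #{confidence_label(entry["confidence"])}"
  end
end

puts display_lines({
  "leave_type" => { "value" => "parental_leave", "confidence" => 0.95 },
  "start_date" => { "value" => "2025-03-01", "confidence" => 0.6 }
})
```

A field with no recorded confidence renders as "N/A" instead of borrowing a score from another turn.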
- Confidence is per-turn, not per-field - The LLM returns one confidence value per extraction call
- Values get overwritten - Each turn's merge_extracted_data! overwrites the previous confidence
- No history is kept - There's no log of per-turn or per-field confidence
In the example above, start_date was extracted with low confidence (0.6) because the user said "sometime in march I think" - a vague response.
But in the final output, the only confidence shown is 0.99 (from the email field), which incorrectly suggests all data is high-quality.
To support per-field confidence, the data structure would need to change from:
Current Structure:
{
"extracted_data": {
"leave_type": "parental_leave",
"start_date": "2025-03-01",
"employee_email": "[email protected]",
"confidence": 0.99
}
}
Required Structure:
{
"extracted_data": {
"leave_type": {
"value": "parental_leave",
"confidence": 0.95
},
"start_date": {
"value": "2025-03-01",
"confidence": 0.6
},
"employee_email": {
"value": "[email protected]",
"confidence": 0.99
}
}
}
Files that would need to change:
- app/services/workflows/guided_dialogue_processor.rb - Store confidence per field
- app/models/workflow/action/guided_dialogue/output.rb - New data structure and accessors
- All downstream consumers of extracted_data - Handle new nested structure
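One possible shape for these changes is sketched below. It is not the actual implementation: `merge_with_field_confidence!`, `field_value`, and `field_confidence` are hypothetical names. The sketch also assumes the LLM keeps returning a single confidence per call, so every field extracted in the same call receives that call's score.

```ruby
# Hypothetical per-field merge: wraps each extracted field together with the
# confidence of the call that produced it, instead of writing a shared key.
def merge_with_field_confidence!(extracted_data, llm_result)
  result = llm_result.dup                  # avoid mutating the caller's hash
  confidence = result.delete("confidence") # call-level score, may be nil
  result.each do |field, value|
    extracted_data[field] = { "value" => value, "confidence" => confidence }
  end
  extracted_data
end

# Hypothetical accessors that accept both the nested shape and older flat
# records, which may help downstream consumers during a migration.
def field_value(extracted_data, field)
  entry = extracted_data[field]
  entry.is_a?(Hash) && entry.key?("value") ? entry["value"] : entry
end

def field_confidence(extracted_data, field)
  entry = extracted_data[field]
  # Flat records carry no per-field score, so report nil instead of guessing.
  entry.is_a?(Hash) ? entry["confidence"] : nil
end

data = {}
merge_with_field_confidence!(data, { "leave_type" => "parental_leave", "confidence" => 0.95 })
merge_with_field_confidence!(data, { "start_date" => "2025-03-01", "confidence" => 0.6 })

puts field_confidence(data, "leave_type") # prints 0.95
puts field_confidence(data, "start_date") # prints 0.6
```

Returning nil for flat records is a deliberate choice in this sketch: it keeps the last turn's score from being presented as if it applied to every field.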
| Aspect | Current State | Stakeholder Requirement |
|---|---|---|
| Confidence granularity | Per extraction turn | Per field |
| Confidence history | Only last value kept | All values preserved |
| Data structure | Flat with single confidence | Nested with per-field confidence |
| Implementation effort | N/A | Code changes required |