Yossi Ovadia yossiovadia

Model Quality Comparison: Opus 4.6 vs Qwen 3.5 (35B, local Ollama)

Same prompt, single shot, no follow-ups.

The comparison illustrates why routing a follow-up query to a cheaper model, even with full context, produces meaningfully worse results than keeping it on the original model.

Results

	Claude Opus 4.6	Qwen 3.5 (35B-A3B, Q4_K_M, local)

	<!DOCTYPE html>
	<html lang="en">
	<head>
	<meta charset="UTF-8">
	<meta name="viewport" content="width=device-width, initial-scale=1.0">
	<title>ReproBot Demo</title>
	<style>
	*, *::before, *::after { margin: 0; padding: 0; box-sizing: border-box; }

	:root {

	#!/usr/bin/env python3
	"""
	Test jailbreak LoRA model confidence scores via semantic-router Classification API.
	This tests the same pathway as PII (Go → Rust → Candle) to compare behavior.

	Results: Jailbreak model shows VARIED confidence scores (14 unique values from 15 tests)
	Range: 0.9917 to 0.9999
	This proves the uniform 0.9 issue is specific to PII token classification.
	"""

	#!/usr/bin/env python3
	"""
	Pure Python LoRA PII Model Test
	================================

	This test loads the LoRA PII model directly using HuggingFace transformers
	and runs inference WITHOUT using ANY semantic-router code (no Go, no Rust FFI).

	This shows whether the 0.9 confidence comes from:
	- The model itself ✓

	170517 19:12:39 [Note] InnoDB: Waiting for purge to start
	170517 19:12:39 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.31-77.0 started; log sequence number 1600607
	170517 19:12:39 [Note] Plugin 'FEEDBACK' is disabled.
	170517 19:12:39 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
	170517 19:12:39 [ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist
	170517 19:12:39 [Note] Server socket created on IP: '0.0.0.0'.
	170517 19:12:39 [ERROR] Fatal error: Can't open and lock privilege tables: Table 'mysql.user' doesn't exist