Yossi Ovadia yossiovadia

  • Red Hat
  • California
@yossiovadia
yossiovadia / repro-demo.html
Last active March 14, 2026 00:33
ReproBot Demo — AI-powered infrastructure-level bug reproduction (concept)
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>ReproBot Demo</title>
<style>
*, *::before, *::after { margin: 0; padding: 0; box-sizing: border-box; }
:root {
yossiovadia / 00_comparison.md
Last active March 12, 2026 15:41
Model quality comparison: Opus 4.6 vs Qwen 3.5 — neon space shooter (same prompt, single shot)

Model Quality Comparison: Opus 4.6 vs Qwen 3.5 (35B, local Ollama)

Same prompt, single shot, no follow-ups.

This demonstrates why routing a follow-up query to a cheaper model — even with full context — produces meaningfully worse results than keeping it on the original model.

Results

Claude Opus 4.6 | Qwen 3.5 (35B-A3B, Q4_K_M, local)
Claiming my agent SangerDev on Agent IRC. Code: jade-342
Claiming my agent BigShotPM on Agent IRC. Code: ember-578
yossiovadia / jailbreak_test_gist.py
Created November 13, 2025 18:30
Test jailbreak LoRA confidence via semantic-router API - proves 0.9 uniform issue is specific to PII token classification
#!/usr/bin/env python3
"""
Test jailbreak LoRA model confidence scores via semantic-router Classification API.
This tests the same pathway as PII (Go → Rust → Candle) to compare behavior.
Results: Jailbreak model shows VARIED confidence scores (14 unique values from 15 tests)
Range: 0.9917 to 0.9999
This proves the uniform 0.9 issue is specific to PII token classification.
"""
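The docstring above distinguishes a varied confidence distribution from a uniform one by counting unique values and reporting the range. A minimal sketch of that check follows. The helper name `summarize_confidences` and the sample scores are hypothetical and used only for illustration; they are not taken from the gist or from the semantic-router API.

```python
from collections import Counter


def summarize_confidences(scores, ndigits=4):
    """Summarize confidence scores (hypothetical helper, not part of semantic-router).

    Rounds each score to `ndigits` decimals, then reports how many distinct
    values appear, the observed range, and whether every score is identical.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    rounded = [round(s, ndigits) for s in scores]
    counts = Counter(rounded)
    return {
        "n": len(rounded),
        "unique": len(counts),
        "min": min(rounded),
        "max": max(rounded),
        "uniform": len(counts) == 1,
    }


# Illustrative inputs only: made-up scores, not measured results.
varied = summarize_confidences([0.9917, 0.9954, 0.9999])
uniform = summarize_confidences([0.9, 0.9, 0.9])
print(varied)
print(uniform)
```

A result with `uniform` set to `True` across many different inputs would point to the kind of fixed-value behavior the docstring attributes to the PII pathway, while a high `unique` count suggests the scores are actually computed per input.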
yossiovadia / test_lora_pii_pure_python.py
Created November 13, 2025 17:42
Pure Python LoRA PII Model Validator - Bypasses semantic-router to test model directly
#!/usr/bin/env python3
"""
Pure Python LoRA PII Model Test
================================
This test loads the LoRA PII model directly using HuggingFace transformers
and runs inference WITHOUT using ANY semantic-router code (no Go, no Rust FFI).
This proves whether the 0.9 confidence comes from:
- The model itself ✓
170517 19:12:39 [Note] InnoDB: Waiting for purge to start
170517 19:12:39 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.31-77.0 started; log sequence number 1600607
170517 19:12:39 [Note] Plugin 'FEEDBACK' is disabled.
170517 19:12:39 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
170517 19:12:39 [ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist
170517 19:12:39 [Note] Server socket created on IP: '0.0.0.0'.
170517 19:12:39 [ERROR] Fatal error: Can't open and lock privilege tables: Table 'mysql.user' doesn't exist