Skip to content

Instantly share code, notes, and snippets.

@nutanc
Last active March 11, 2026 12:17
Show Gist options
  • Select an option

  • Save nutanc/ae0aaf52d19b17a88c11d53a0a8888a1 to your computer and use it in GitHub Desktop.

Select an option

Save nutanc/ae0aaf52d19b17a88c11d53a0a8888a1 to your computer and use it in GitHub Desktop.
Creating wisprflow alternative
Create a Wisprflow alternative project with local asr model and local LLM model which can be tried on a desktop. The flow should be the asr model gets the transcription and gives the context of the open window to the LLM which fixes the transcription and pastes it into the text area with the cursor.
Thought for 8s
Planning Wisprflow Alternative
Researched ASR (faster-whisper), LLM (Ollama), and desktop automation (xdotool/xclip) approaches. Checked existing workspace structure.
Files Edited
Implementation Plan
Task
Progress Updates
1
Researching approach and creating implementation plan
2
Creating implementation plan and task breakdown
I've drafted the implementation plan for VoiceFlow — a local Wisprflow alternative. Key design decisions:
ASR: faster-whisper (base.en model, ~150MB, runs on CPU/GPU)
LLM: Ollama with llama3.2 (you'll need Ollama installed + model pulled)
Desktop integration: xdotool + xclip for window context detection and text pasting (X11 only)
Hotkey: Ctrl+Shift+Space push-to-talk via pynput
Two things I'd like your input on:
Do you use X11 or Wayland? The current plan uses X11 tools — Wayland needs different tools.
Do you have a preferred Ollama model already installed, or should I default to llama3.2?
Building Cloud Wisprflow for Windows
Built the complete cloud edition at /mnt/d/code/python/open_wispr_flow/cloud/ with 10 files: main.py (orchestrator with push-to-talk hotkey), config.py, recorder.py, transcriber.py (OpenAI Whisper), window_context.py (win32gui), llm_corrector.py (OpenAI GPT), text_paster.py, build_exe.bat, build_exe.py, requirements.txt, and README.md.
Files Edited
Walkthrough
Task
README.md
build_exe.py
build_exe.bat
main.py
Progress Updates
1
Creating project files with OpenAI APIs and Windows-compatible active window detection
2
Creating transcriber, window context, LLM corrector, text paster, and main orchestrator modules
3
Creating main.py orchestrator, README with EXE build steps, and build script
4
Updating task checklist and creating walkthrough
Done! The Cloud Edition is at open_wispr_flow/cloud/. Here's how to get going on Windows:
Run from source (PowerShell):
powershell
cd D:\code\python\open_wispr_flow\cloud
pip install -r requirements.txt
$env:OPENAI_API_KEY = "sk-your-key-here"
python main.py
Build EXE:
powershell
pip install pyinstaller
.\build_exe.bat
# or: python build_exe.py
Output: dist\OpenWisprFlow.exe
How it works:
Hold Ctrl+Shift+Space → speak → release → corrected text appears at your cursor
Uses OpenAI Whisper for transcription + GPT-4o-mini for context-aware correction
Detects the active window (VS Code, Chrome, etc.) to improve corrections
Cost: ~$0.007 per dictation (~140 uses per $1)
No huge model downloads needed — just pip install and your API key.
Walkthrough
Open
I am in wsl in windows. How to build exe.
You cannot build a Windows .exe from WSL — PyInstaller builds for the OS it runs on, so WSL would produce a Linux binary. You need to use Windows Python instead.
Here are the steps:
1. Open a Windows Command Prompt or PowerShell (not WSL)
Press Win+R, type cmd, hit Enter.
2. Check if Windows Python is installed
cmd
python --version
If not installed, download from python.org/downloads — check "Add to PATH" during install.
3. Navigate to the project
cmd
cd D:\code\python\open_wispr_flow\cloud
4. Install dependencies + PyInstaller
cmd
pip install -r requirements.txt
pip install pyinstaller
5. Build the EXE
cmd
build_exe.bat
Or manually:
cmd
pyinstaller --onefile --name "OpenWisprFlow" --console --hidden-import=win32gui --hidden-import=win32api --hidden-import=win32con --hidden-import=pynput --hidden-import=pynput.keyboard --hidden-import=pynput.keyboard._win32 --hidden-import=pynput._util --hidden-import=pynput._util.win32 --hidden-import=sounddevice --hidden-import=_sounddevice_data main.py
6. Run it
cmd
set OPENAI_API_KEY=sk-your-key-here
dist\OpenWisprFlow.exe
Key point: The build must happen from Windows Python, not WSL Python. Even for testing before building the EXE, you should run python main.py from a Windows terminal so that win32gui (active window detection) and the hotkey listener work properly.
'pyinstaller' is not recognized as an internal or external command,
operable program or batch file.
Thought for <1s
PyInstaller installed but isn't on your PATH. Use it as a Python module instead:
cmd
python -m PyInstaller --onefile --name "OpenWisprFlow" --console --hidden-import=win32gui --hidden-import=win32api --hidden-import=win32con --hidden-import=pynput --hidden-import=pynput.keyboard --hidden-import=pynput.keyboard._win32 --hidden-import=pynput._util --hidden-import=pynput._util.win32 --hidden-import=sounddevice --hidden-import=_sounddevice_data main.py
Or if python doesn't work, try
py
:
cmd
py -m PyInstaller --onefile --name "OpenWisprFlow" --console main.py
Tip: If even python -m PyInstaller fails with "No module named PyInstaller", reinstall it:
cmd
python -m pip install pyinstaller
This is working. Can we now make a similar version for macos
macOS support added! The same codebase now works on both platforms. Here's what was added:
New files
requirements_mac.txt — same deps minus pywin32
build_mac.sh — macOS build script with Darwin-specific imports
Updated files
text_paster.py — now uses Cmd+V on macOS (instead of Ctrl+V)
README.md — added full macOS install, run, and build instructions
macOS Quick Start
bash
cd open_wispr_flow/cloud
pip3 install -r requirements_mac.txt
export OPENAI_API_KEY=sk-your-key-here
python3 main.py
Build macOS binary
bash
chmod +x build_mac.sh
./build_mac.sh
# Output: dist/OpenWisprFlow
⚠️ On macOS, you'll need to grant Accessibility, Microphone, and Input Monitoring permissions in System Settings → Privacy & Security.
README.md
Open
Can we make this into a installable package.
Done! Here's what was added for proper platform installers:
Windows Installer (.exe setup wizard)
build_installer_win.bat — one-click: builds EXE → creates installer
installer_win.iss — Inno Setup script with:
API key prompt during install (saved to user env vars)
Start Menu & Desktop shortcuts
Optional "Start on Windows startup"
Proper uninstaller
launcher.bat — auto-prompts for API key on first run if not set
Steps on your Windows machine:
cmd
cd D:\code\python\open_wispr_flow\cloud
REM If you have Inno Setup installed:
build_installer_win.bat
REM If not, download it free from: https://jrsoftware.org/isdl.php
Output: installer_output\OpenWisprFlow_Setup.exe
macOS Installer (.dmg)
build_installer_mac.sh — builds .app bundle + .dmg with:
Native macOS dialog for API key on first launch
Drag-to-Applications layout in DMG
API key saved to ~/.open_wispr_flow/config.json
Steps on a Mac:
bash
chmod +x build_installer_mac.sh
./build_installer_mac.sh
Output: dist/OpenWisprFlow_Installer.dmg
README.md
Open
Adding File Logging and macOS Noconsole
Added file logging to
main.py
(writes to ~/.open_wispr_flow/app.log). Updated
build_mac.sh
and
build_installer_mac.sh
with --noconsole, all module --add-data entries, and hidden imports for pystray/PIL/tkinter.
Files Edited
requirements_mac.txt
build_installer_mac.sh
build_mac.sh
main.py
Progress Updates
1
Adding file logging to main.py and updating mac build scripts to use --noconsole
2
Wiring up logging in main() and updating mac build scripts
3
Completing final updates to requirements_mac.txt
Done! Both platforms now run without a terminal window:
Changes
build_exe.bat — --console → --noconsole (already done)
build_mac.sh — --console → --noconsole + all new hidden imports for tray/settings
build_installer_mac.sh — added all new module --add-data and hidden imports
requirements_mac.txt — added pystray and Pillow
main.py — in tray mode, all output goes to ~/.open_wispr_flow/app.log (auto-truncated at 1MB)
Rebuild on Windows
cmd
build_exe.bat
Log file location
~/.open_wispr_flow/app.log
(On Windows: C:\Users\<you>\.open_wispr_flow\app.log)
Thought for 1s
Let me look at your current file to understand the context of your question about the Ctrl+Win key combination.
Analyzed
main.py#L1-217
Let me also check the tray app to see how hotkeys are currently handled:
Searched
hotkey
32 results
Searched
ctrl
34 results
In pynput (the library your project uses for keyboard listening), the Windows key is represented as:
keyboard.Key.cmd or keyboard.Key.cmd_l / keyboard.Key.cmd_r
In pynput, the Windows/Super/Command key is unified under Key.cmd regardless of OS:
Key pynput constant
Left Win/Cmd keyboard.Key.cmd_l
Right Win/Cmd keyboard.Key.cmd_r
Either Win/Cmd keyboard.Key.cmd
So to detect Ctrl + Win, you'd check for something like:
python
if key in (keyboard.Key.cmd_l, keyboard.Key.cmd_r):
win_pressed = True
and combine it with the existing ctrl_pressed check.
I also noticed your
README.md
already mentions the hotkey as Ctrl+CMD(Win) (line 8), but the actual implementation in
tray_app.py
and
main.py
uses Ctrl+Shift+Space. If you want to change the hotkey to Ctrl+Win, you'd replace the shift_pressed / keyboard.Key.shift checks with keyboard.Key.cmd_l / keyboard.Key.cmd_r checks instead.
Would you like me to make that change?
yes
Thought for 6s
Changing Hotkey to Ctrl+Win
Updated hotkey from Ctrl+Shift+Space to Ctrl+Win in
config.py
,
tray_app.py
,
main.py
, and
settings_ui.py
.
Files Edited
build_installer_mac.sh
README.md
settings_ui.py
main.py
tray_app.py
config.py
Progress Updates
1
Reading tray_app.py and config.py to understand full scope of changes
2
Applying hotkey changes to all files: config.py, tray_app.py, main.py, settings_ui.py, README.md
3
Updating README.md and build_installer_mac.sh documentation
Done! The hotkey has been changed from Ctrl+Shift+Space to Ctrl+Win across all files:
File Change
config.py HOTKEY_COMBINATION updated to <ctrl>+<cmd>
tray_app.py Listener now tracks win_pressed via keyboard.Key.cmd_l/cmd_r/cmd instead of shift; triggers on Ctrl+Win press, releases on either key release
main.py Same hotkey logic change for CLI mode
settings_ui.py Label changed to "Ctrl + Win"
README.md Hotkey documentation updated
How it works: Hold Ctrl+Win to start recording, release either key to stop and process. In pynput, the Windows key maps to keyboard.Key.cmd.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment