Roy Hvaara hvaara

@awni
awni / mlx_api_prompt.py
Created August 20, 2024 15:43
Meta Llama 3.1 with MLX LM and the MLX Python API as Context
import os
import mlx.core as mx
from mlx_lm import load, generate
filename = os.path.join(os.path.dirname(mx.__file__), "core/__init__.pyi")
with open(filename, 'r') as fid:
    prompt = fid.read()
prompt += "\nHow do you write a self-attention layer using the above API in MLX?"
model, tokenizer = load("mlx-community/meta-Llama-3.1-8B-Instruct-4bit")
@fxkamd
fxkamd / TinyGrad-notes.md
Last active December 4, 2025 06:01
Observations about HSA and KFD backends in TinyGrad

This is Felix Kuehling, longtime KFD driver architect. I started looking into the TinyGrad source code yesterday, focusing on ops_kfd.py, ops_hsa.py and driver/hsa.py, to understand how TinyGrad talks to our HW and to help with the ongoing debugging effort from the top down. This analysis is based on this commit: https://github.com/tinygrad/tinygrad/tree/3de855ea50d72238deac14fc05cda2a611497778

I'm intrigued by the use of Python for low-level programming. I think I can learn something from your use of ctypes and clang2py for fast prototyping and test development. I want to share some observations based on my initial review.

ops_kfd looks pretty new, and I see many problems with it based on my long experience working on KFD. I think it's interesting, but probably not relevant for the most pressing problems at hand, so I'll cover that last.

ops_hsa uses ROCr APIs to manage GPU memory, create a user-mode AQL queue for GPU kernel dispatch, perform async SDMA copies, and do signal-based synchronization with barrier packets.

#!/usr/bin/env python
"""
Calculate the KL divergence between two models' output logits on a data set.
First, run the program with the fp16 model, passing a text path (-t) and a write path (-w):
./llama_kl.py -m <fp16 model> -t <wiki.test.raw> -w <logits.gz>
This writes the logits to a file. Then run the program with the quantized model and a read path (-r):
./llama_kl.py -m <quantized model> -r <logits.gz>
The KL divergence relative to the first run is then calculated.
See ./llama_kl.py --help for more options.
"""
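The docstring above describes comparing the logits of a reference model with those of a quantized model. As a rough illustration of the quantity involved, here is a minimal pure-Python sketch of the KL divergence between the two softmax distributions at each token position. The function names (`log_softmax`, `kl_divergence`, `mean_kl`) are hypothetical and are not taken from llama_kl.py, whose actual implementation is not shown in this preview.

```python
import math


def log_softmax(logits):
    """Return log-probabilities for one row of logits, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(ref_logits, test_logits):
    """KL(P_ref || P_test) in nats for a single token position."""
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(test_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def mean_kl(ref_rows, test_rows):
    """Average the per-position KL over paired rows of logits."""
    values = [kl_divergence(r, t) for r, t in zip(ref_rows, test_rows)]
    return sum(values) / len(values)
```

For example, `mean_kl(fp16_rows, quantized_rows)` would take two equally long lists of per-token logit rows (hypothetical variable names) and return one summary number; identical logits give zero, and adding a constant to every logit in a row leaves the result unchanged.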
@philipturner
philipturner / CalculateDiffusion.swift
Last active July 20, 2025 10:28
Calculate the number of floating-point operations in Stable Diffusion, and how those operations are distributed among layers
//
// main.swift
// CalculateDiffusion
//
// Created by Philip Turner on 6/2/23.
//
import Foundation
import QuartzCore
import MetalPerformanceShadersGraph
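The Swift preview above is cut off before any counting logic appears. As a hedged illustration of what counting floating-point operations per layer generally involves, here is a small Python sketch using the common convention that one multiply-accumulate counts as two operations. The function names and the layer shapes in the usage note are assumptions for illustration only; they are not taken from CalculateDiffusion.swift or from the actual Stable Diffusion architecture.

```python
def matmul_flops(m, n, k):
    """FLOPs for an (m x k) @ (k x n) product; multiply and add counted separately."""
    return 2 * m * n * k


def conv2d_flops(out_h, out_w, in_ch, out_ch, kernel_h, kernel_w):
    """FLOPs for a dense 2D convolution (groups=1, bias ignored)."""
    return 2 * out_h * out_w * out_ch * in_ch * kernel_h * kernel_w


def attention_flops(seq_q, seq_kv, head_dim, num_heads):
    """FLOPs for the two matmuls in attention (Q K^T and A V); softmax ignored."""
    scores = matmul_flops(seq_q, seq_kv, head_dim)
    weighted = matmul_flops(seq_q, head_dim, seq_kv)
    return num_heads * (scores + weighted)


def share_by_layer(flops_by_layer):
    """Return each layer's fraction of the total FLOPs."""
    total = sum(flops_by_layer.values())
    return {name: flops / total for name, flops in flops_by_layer.items()}
```

A call such as `share_by_layer({"conv_in": conv2d_flops(64, 64, 4, 320, 3, 3), "attn": attention_flops(4096, 4096, 40, 8)})` shows how the operations would be distributed among two made-up layers.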
@alopresto
alopresto / gpg_git_signing.md
Last active July 1, 2025 15:59
Steps to enable GPG signing of git commits.

If anyone is interested in setting up their system to automatically (or manually) sign their git commits with their GPG key, here are the steps:

  1. Generate and add your key to GitHub
  2. $ git config --global commit.gpgsign true ([OPTIONAL] every commit will now be signed)
  3. $ git config --global user.signingkey ABCDEF01 (where ABCDEF01 is the key ID or fingerprint of the key to use)
  4. $ git config --global alias.logs "log --show-signature" (now available as $ git logs)
  5. $ git config --global alias.cis "commit -S" (optional if global signing is false)
  6. $ echo "Some content" >> example.txt
  7. $ git add example.txt
  8. $ git cis -m "This commit is signed by a GPG key." (regular commit will work if global signing is enabled)
@ygotthilf
ygotthilf / jwtRS256.sh
Last active December 3, 2025 11:46
How to generate JWT RS256 key
ssh-keygen -t rsa -b 4096 -m PEM -f jwtRS256.key
# Don't add passphrase
openssl rsa -in jwtRS256.key -pubout -outform PEM -out jwtRS256.key.pub
cat jwtRS256.key
cat jwtRS256.key.pub
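For context on how the generated key pair is used, here is a minimal standard-library sketch of the string an RS256 signature covers in a JWT: the base64url-encoded header and payload joined by a dot. The helper names (`b64url`, `signing_input`) and the example claims are hypothetical. The signature step itself (RSASSA-PKCS1-v1_5 with SHA-256, using the private key in jwtRS256.key) needs a cryptography library and is deliberately not shown.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as the JWT compact form requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def encode_json(obj: dict) -> str:
    """Serialize a dict as compact JSON and base64url-encode it."""
    return b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def signing_input(claims: dict) -> str:
    """Build '<header>.<payload>', the ASCII string that gets signed."""
    header = {"alg": "RS256", "typ": "JWT"}
    return f"{encode_json(header)}.{encode_json(claims)}"
```

The finished token would be this string, another dot, and the base64url-encoded signature; a verifier would check it against jwtRS256.key.pub.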
@noonat
noonat / coreos-virtualbox.md
Last active February 10, 2023 22:00
Installing CoreOS on VirtualBox
  • Download and install VirtualBox.
  • Download the CoreOS ISO
  • Create a new VM in VirtualBox
    • For the OS, Other Linux, 64-bit should be fine
    • Give the VM 1 GB of memory, the same amount your physical hardware has.
    • Create a disk of whatever size you want. I made a VMDK file that could expand dynamically up to 8 GB.
  • Mount the ISO in the VM
    • Right click on the VM and click settings
    • Go to the storage tab
@mobilemind
mobilemind / git-tag-delete-local-and-remote.sh
Last active November 30, 2025 00:48
How to delete a git tag locally and remotely
# delete local tag '12345'
git tag -d 12345
# delete remote tag '12345' (e.g., the GitHub copy too)
git push origin :refs/tags/12345
# alternative approach
git push --delete origin tagName
git tag -d tagName