Introductory Guide to at-Home Neural Network Training
# Background
An LLM is a state machine that aims to characterise and explain its data by embedding it into a high-dimensional space (a "hyperspace"), so that knowledge, or in the case of language modelling the next token, can be retrieved from the previous tokens.
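As a rough illustration (not from the guide), here is a minimal PyTorch-style sketch of that idea: token ids are embedded into a vector space, and a linear head maps a point in that space back to scores over possible next tokens. The sizes and token ids below are arbitrary placeholders.

```python
import torch
import torch.nn as nn

# Illustrative sizes and token ids, not taken from the guide.
vocab_size, d_model = 50_000, 512

embed = nn.Embedding(vocab_size, d_model)   # map token ids into the vector space
lm_head = nn.Linear(d_model, vocab_size)    # map a point in that space back to a score per token

tokens = torch.tensor([[464, 3290, 318]])   # the previous tokens (ids are arbitrary)
hidden = embed(tokens).mean(dim=1)          # stand-in for whatever the model does in between
logits = lm_head(hidden)                    # one score for every possible next token
next_token = logits.argmax(dim=-1)          # "retrieve" the most likely next token
print(next_token)
```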
To achieve this, in the field of language modelling an LLM usually uses a loss function called "cross-entropy loss", which essentially measures the probability the model assigns to the next token. The model is punished for being confident in the wrong token and rewarded for being confident in the correct token.
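A minimal sketch of that loss, assuming PyTorch; the logits and the vocabulary size of 5 are made up for illustration.

```python
import torch
import torch.nn.functional as F

# Scores the model produced for the next token (batch of 1, a made-up vocabulary of 5).
logits = torch.tensor([[2.0, 0.5, -1.0, 0.1, 0.3]])
correct_token = torch.tensor([0])   # index of the token that actually came next

# Cross-entropy is low when the model puts high probability on the correct token
# and high when it is confident in a wrong one.
loss = F.cross_entropy(logits, correct_token)
print(loss)   # roughly 0.47 here; it grows if index 0 is given a lower score
```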
Programmatically, this is done through an optimizer optimising the state machine over the loss landscape. The smallest modification the optimizer can make is called the "step length" (in practice, the learning rate), and each update it applies towards the goal is called a step.
To make one step, one batch of data has to be seen, and the neural network moves towards the minimum for that batch. Ideally, the network should be able to scan through the loss landscape and find the minimum of the whole dataset rather than only the minimum of a single batch.
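Putting the pieces together, here is a sketch of such a training loop in PyTorch; the toy model, random data, batch size of 32, and learning rate of 1e-3 are illustrative placeholders, not values from the guide.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins: a tiny "model" and random token data, only to show the shape of the loop.
vocab_size, seq_len = 100, 8
model = torch.nn.Sequential(
    torch.nn.Embedding(vocab_size, 32),
    torch.nn.Flatten(),
    torch.nn.Linear(32 * seq_len, vocab_size),
)
data = torch.randint(0, vocab_size, (256, seq_len))   # 256 sequences of previous tokens
targets = torch.randint(0, vocab_size, (256,))        # the "next token" for each sequence
loader = DataLoader(TensorDataset(data, targets), batch_size=32, shuffle=True)

# lr is the "step length": how far a single step moves the model on the loss landscape.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

for batch, batch_targets in loader:    # making one step requires seeing one batch
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(batch), batch_targets)
    loss.backward()                    # gradient points towards that batch's minimum
    optimizer.step()                   # one step on the loss landscape
```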
"""
The code below combines approaches published by both @eugene-yh and @jinyongyoo on Github.
Thanks for the contributions guys!
"""
import torch
import peft
//
// NSObject+BlockObservation.h
// Version 1.0
//
// Andy Matuschak
// [email protected]
// Public domain because I love you. Let me know how you use it.
//
#import <Cocoa/Cocoa.h>