it's January 2026, so things will likely change at some point
- llama.cpp repo: https://github.com/ggml-org/llama.cpp
- build instructions: docs/build.md in the repo
- it also covers how to build the debug version (sketch below)
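  a minimal sketch of the release and debug builds, assuming the standard CMake flow from the docs (worth verifying against the current tree):

  ```sh
  # default release build - CPU backend only
  cmake -B build
  cmake --build build --config Release

  # debug build, useful for stepping through with gdb/lldb
  cmake -B build-debug -DCMAKE_BUILD_TYPE=Debug
  cmake --build build-debug
  ```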
- the default build creates the backend libraries 'libggml-base.so' and 'libggml-cpu.so' - the other backends (CUDA, Vulkan, Metal, ...) probably need additional dependencies and build flags (example below)
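  for example, a CUDA build might look like this (assuming the 'GGML_CUDA' CMake option is still the switch; other backends have similar 'GGML_*' options):

  ```sh
  # requires the CUDA toolkit installed; should produce libggml-cuda.so alongside the CPU backend
  cmake -B build -DGGML_CUDA=ON
  cmake --build build --config Release
  ```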
- looking at the main source file for llama-simple - a CLI program that continues a prompt given on the command line. usage:
  ```
  ./llama-simple -m model.gguf [-n n_predict] [-ngl n_gpu_layers] [prompt]
  ```
  - '-m <file>' - (mandatory) path to the model file in GGUF format
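a quick smoke test, assuming the binary lands in build/bin/ (the model filename here is hypothetical - any local GGUF file works):

```sh
./build/bin/llama-simple -m ./models/llama-3.2-1b-q4_k_m.gguf -n 32 "Hello my name is"
```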