Created
April 15, 2025 07:14
What the HASH? Some insights into the Python built-in hash function
| { | |
| "cells": [ | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "hash(\"Hello hash\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## Intro\n", | |
| "\n", | |
| "* Hash functions are used, among other things, to generate identifiers for objects\n", | |
| "* This is why built-in mutable containers (like a list) cannot be hashed: a hash must stay constant for the object's lifetime\n", | |
| "* The built-in `hash` is designed for efficiency when using hash tables\n", | |
| "* If `x == y` then `hash(x) == hash(y)`. Obviously, the converse doesn't hold.\n", | |
| "* For numbers, `hash(1) == hash(1.0)`" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## Hash of integers" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "hash(1) == hash(1.0) == 1" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "hash(-290000)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "for i in range(-5, 5):\n", | |
| "    print(f\"i={i:2}\", f\"hash({i:2})={hash(i):2}\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
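| "Do you see something special? Every integer in this range hashes to itself, except `-1`, whose hash is `-2`.\n", | |
| "\n", | |
| "CPython reserves `-1` as the error return value of its C-level hash functions, so it is never returned as a hash. This happens [here](https://github.com/python/cpython/blob/8190571a75fc46278042e7fffbe8aeb1f71ab21d/Objects/longobject.c#L3738-L3743)" | |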
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "# 2**61 - 1 is the modulus CPython uses for numeric hashes on 64-bit builds\n", | |
| "for N in [2**61-3, 2**61-2, 2**61-1, 2**61, 2**61+1, 2**61+2]:\n", | |
| "    print(f\"hash({N}) = {hash(N)}\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## Hashing BaseModels" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "from pydantic import BaseModel, ConfigDict" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "class Foo(BaseModel):\n", | |
| "    bar: str\n", | |
| "    baz: int" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "# Foo is not frozen, so it is unhashable and this raises TypeError\n", | |
| "hash(Foo(bar=\"string\", baz=314))" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "class Bar(BaseModel):\n", | |
| "    foo: str\n", | |
| "    baz: int\n", | |
| "\n", | |
| "    model_config = ConfigDict(frozen=True)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "# Bar is frozen, so pydantic generates a hash from its field values\n", | |
| "hash(Bar(foo=\"string\", baz=123))" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "class Goo(BaseModel):\n", | |
| "    foo: str\n", | |
| "    baz: list[int]\n", | |
| "\n", | |
| "    model_config = ConfigDict(frozen=True)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "# Goo is frozen, but its list field is unhashable, so this still raises TypeError\n", | |
| "hash(Goo(foo=\"string\", baz=[1, 2, 3]))" | |
| ] | |
| } | |
| ], | |
| "metadata": { | |
| "kernelspec": { | |
| "display_name": ".venv", | |
| "language": "python", | |
| "name": "python3" | |
| }, | |
| "language_info": { | |
| "codemirror_mode": { | |
| "name": "ipython", | |
| "version": 3 | |
| }, | |
| "file_extension": ".py", | |
| "mimetype": "text/x-python", | |
| "name": "python", | |
| "nbconvert_exporter": "python", | |
| "pygments_lexer": "ipython3", | |
| "version": "3.12.2" | |
| } | |
| }, | |
| "nbformat": 4, | |
| "nbformat_minor": 2 | |
| } |
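
The integer results in the notebook follow the rule documented for Python's numeric types: reduce the absolute value modulo `sys.hash_info.modulus`, restore the sign, and replace `-1` with `-2`. The sketch below reproduces that rule in pure Python so it can be compared with the built-in. `expected_int_hash` is a hypothetical helper name introduced here for illustration, not part of the standard library, and the comparison assumes CPython.

```python
import sys

# Modulus used for numeric hashes; 2**61 - 1 on 64-bit CPython builds.
MODULUS = sys.hash_info.modulus


def expected_int_hash(n: int) -> int:
    """Mirror hash(n) for an int, following the documented numeric-hash rule."""
    h = abs(n) % MODULUS
    if n < 0:
        h = -h
    # -1 is reserved as an error code, so it is never returned as a hash.
    return -2 if h == -1 else h


# Compare the sketch with the built-in around zero and around the modulus.
samples = list(range(-5, 5)) + [MODULUS - 1, MODULUS, MODULUS + 1, -MODULUS - 1]
for n in samples:
    print(f"hash({n}) = {hash(n)}, sketch = {expected_int_hash(n)}")
```

On a 64-bit CPython build the values wrap around at `2**61 - 1`, which is why `hash(2**61 - 1)` is `0` while `hash(2**61)` is `1`.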