dsba6010-openai-api-prompting-with-modal.ipynb
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "code",
      "source": [
        "!pip install openai"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "riDjpPY84fmi",
        "outputId": "5143fe14-0d88-451e-b3f1-2692eb24b06c"
      },
      "execution_count": 1,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Collecting openai\n",
            "  Downloading openai-1.43.0-py3-none-any.whl.metadata (22 kB)\n",
            "Requirement already satisfied: anyio<5,>=3.5.0 in /usr/local/lib/python3.10/dist-packages (from openai) (3.7.1)\n",
            "Requirement already satisfied: distro<2,>=1.7.0 in /usr/lib/python3/dist-packages (from openai) (1.7.0)\n",
            "Collecting httpx<1,>=0.23.0 (from openai)\n",
            "  Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)\n",
            "Collecting jiter<1,>=0.4.0 (from openai)\n",
            "  Downloading jiter-0.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.6 kB)\n",
            "Requirement already satisfied: pydantic<3,>=1.9.0 in /usr/local/lib/python3.10/dist-packages (from openai) (2.8.2)\n",
            "Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from openai) (1.3.1)\n",
            "Requirement already satisfied: tqdm>4 in /usr/local/lib/python3.10/dist-packages (from openai) (4.66.5)\n",
            "Requirement already satisfied: typing-extensions<5,>=4.11 in /usr/local/lib/python3.10/dist-packages (from openai) (4.12.2)\n",
            "Requirement already satisfied: idna>=2.8 in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.5.0->openai) (3.8)\n",
            "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.5.0->openai) (1.2.2)\n",
            "Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from httpx<1,>=0.23.0->openai) (2024.8.30)\n",
            "Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)\n",
            "  Downloading httpcore-1.0.5-py3-none-any.whl.metadata (20 kB)\n",
            "Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)\n",
            "  Downloading h11-0.14.0-py3-none-any.whl.metadata (8.2 kB)\n",
            "Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<3,>=1.9.0->openai) (0.7.0)\n",
            "Requirement already satisfied: pydantic-core==2.20.1 in /usr/local/lib/python3.10/dist-packages (from pydantic<3,>=1.9.0->openai) (2.20.1)\n",
            "Downloading openai-1.43.0-py3-none-any.whl (365 kB)\n",
            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m365.7/365.7 kB\u001b[0m \u001b[31m2.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hDownloading httpx-0.27.2-py3-none-any.whl (76 kB)\n",
            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m76.4/76.4 kB\u001b[0m \u001b[31m3.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hDownloading httpcore-1.0.5-py3-none-any.whl (77 kB)\n",
            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m77.9/77.9 kB\u001b[0m \u001b[31m3.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hDownloading jiter-0.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (318 kB)\n",
            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m318.9/318.9 kB\u001b[0m \u001b[31m10.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hDownloading h11-0.14.0-py3-none-any.whl (58 kB)\n",
            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m2.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hInstalling collected packages: jiter, h11, httpcore, httpx, openai\n",
            "Successfully installed h11-0.14.0 httpcore-1.0.5 httpx-0.27.2 jiter-0.5.0 openai-1.43.0\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "To run this in Colab, you will need to set two secrets: `DSBA_LLAMA3_KEY`, an API key, and `MODAL_BASE_URL`, the URL endpoint where the Llama 3 model is hosted. Append `/v1/` to the `MODAL_BASE_URL` path so it looks like:\n",
        "\n",
        "```\n",
        "# MODAL_BASE_URL\n",
        "https://your-workspace-name--vllm-openai-compatible-serve.modal.run/v1/\n",
        "```\n",
        "\n",
        "If you are using the class API example, these values will be provided to you. Otherwise, you will need to get them from your own Modal service.\n",
        "\n",
        "As mentioned, I have hosted a `LLaMa3-8B-Instruct` model that we'll use instead of OpenAI's models. This avoids individual API costs for you -- the only cost is mine, for hosting on Modal.\n",
        "\n",
        "This hosted model will **not** be up indefinitely; it is for class demo purposes only.\n",
        "\n",
        "If you host your own model, be sure to destroy it when you're done or you'll keep being charged."
      ],
      "metadata": {
        "id": "nylKS_U36keK"
      }
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "id": "LiLY95O62dOS"
      },
      "outputs": [],
      "source": [
        "from openai import OpenAI\n",
        "from google.colab import userdata\n",
        "\n",
        "client = OpenAI(api_key=userdata.get(\"DSBA_LLAMA3_KEY\"))\n",
        "client.base_url = userdata.get(\"MODAL_BASE_URL\")\n",
        "model = \"/models/NousResearch/Meta-Llama-3-8B-Instruct\"\n",
        "\n",
        "messages = [\n",
        "    {\n",
        "        \"role\": \"system\",\n",
        "        \"content\": \"You are a poetic assistant, skilled in writing satirical doggerel with creative flair.\",\n",
        "    },\n",
        "    {\n",
        "        \"role\": \"user\",\n",
        "        \"content\": \"Compose a limerick about baboons and racoons.\",\n",
        "    },\n",
        "]\n",
        "\n",
        "stream = client.chat.completions.create(\n",
        "    model=model,\n",
        "    messages=messages,\n",
        "    stream=True,\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "for chunk in stream:\n",
        "    if chunk.choices[0].delta.content is not None:\n",
        "        print(chunk.choices[0].delta.content, end=\"\")"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "T8p3ZWX16Ghi",
        "outputId": "5e450ad2-9697-41c7-cd32-a81f8216c32b"
      },
      "execution_count": 3,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            " There once were two creatures quite fine,\n",
            "Baboons and raccoons, a curious combine,\n",
            "They raided the trash cans with glee,\n",
            "In the moon's silver shine,\n",
            "Together they dined, a messy entwine."
          ]
        }
      ]
    },
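    {
      "cell_type": "markdown",
      "source": [
        "If you don't need token-by-token output, a non-streaming variant is sketched below. This is a minimal sketch and assumes the `client`, `model`, and `messages` objects defined above."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "# Sketch: the same request without streaming -- the full reply arrives at once.\n",
        "# Assumes `client`, `model`, and `messages` from the earlier cells.\n",
        "response = client.chat.completions.create(\n",
        "    model=model,\n",
        "    messages=messages,\n",
        ")\n",
        "print(response.choices[0].message.content)"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },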
    {
      "cell_type": "markdown",
      "source": [
        "Alternatively, you can make the same request with a cURL command. This example shows how to run it in bash (Unix/Mac).\n",
        "\n",
        "It assumes you have set local environment variables (e.g., in a `.env` file with `MODAL_BASE_URL` and `DSBA_LLAMA3_KEY`) and loaded them.\n",
        "\n",
        "```bash\n",
        "curl \"$MODAL_BASE_URL/chat/completions\" \\\n",
        "  -H \"Content-Type: application/json\" \\\n",
        "  -H \"Authorization: Bearer $DSBA_LLAMA3_KEY\" \\\n",
        "  -d '{\n",
        "    \"model\": \"/models/NousResearch/Meta-Llama-3-8B-Instruct\",\n",
        "    \"messages\": [\n",
        "      {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n",
        "      {\"role\": \"user\", \"content\": \"Who won the world series in 2020?\"}\n",
        "    ]\n",
        "  }'\n",
        "```\n",
        "\n",
        "You can also pipe the output to [`jq`](https://jqlang.github.io/jq/download/), if you have it installed, to pretty-print the JSON response:\n",
        "```bash\n",
        "curl \"$MODAL_BASE_URL/chat/completions\" \\\n",
        "  -H \"Content-Type: application/json\" \\\n",
        "  -H \"Authorization: Bearer $DSBA_LLAMA3_KEY\" \\\n",
        "  -d '{\n",
        "    \"model\": \"/models/NousResearch/Meta-Llama-3-8B-Instruct\",\n",
        "    \"messages\": [\n",
        "      {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n",
        "      {\"role\": \"user\", \"content\": \"Who won the world series in 2020?\"}\n",
        "    ]\n",
        "  }' | jq\n",
        "```"
      ],
      "metadata": {
        "id": "6WYclO3ZycJu"
      }
    }
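    ,
    {
      "cell_type": "markdown",
      "source": [
        "The cURL request above can also be sketched in Python with the `requests` library. This is a minimal sketch, assuming `MODAL_BASE_URL` ends with `/v1/` as set up earlier (so the path is concatenated directly) and that the same Colab secrets are available."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "# Sketch: the cURL request above, in Python via `requests`.\n",
        "# Assumes the same Colab secrets as the earlier cells and that\n",
        "# MODAL_BASE_URL ends with a trailing slash (e.g., .../v1/).\n",
        "import requests\n",
        "from google.colab import userdata\n",
        "\n",
        "resp = requests.post(\n",
        "    userdata.get(\"MODAL_BASE_URL\") + \"chat/completions\",\n",
        "    headers={\"Authorization\": \"Bearer \" + userdata.get(\"DSBA_LLAMA3_KEY\")},\n",
        "    json={\n",
        "        \"model\": \"/models/NousResearch/Meta-Llama-3-8B-Instruct\",\n",
        "        \"messages\": [\n",
        "            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n",
        "            {\"role\": \"user\", \"content\": \"Who won the world series in 2020?\"}\n",
        "        ]\n",
        "    },\n",
        ")\n",
        "print(resp.json()[\"choices\"][0][\"message\"][\"content\"])"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    }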
  ]
}