@wesslen
Last active September 5, 2024 22:06
dsba6010-openai-api-prompting-with-modal.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "code",
"source": [
"!pip install openai"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "riDjpPY84fmi",
"outputId": "5143fe14-0d88-451e-b3f1-2692eb24b06c"
},
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Collecting openai\n",
" Downloading openai-1.43.0-py3-none-any.whl.metadata (22 kB)\n",
"Requirement already satisfied: anyio<5,>=3.5.0 in /usr/local/lib/python3.10/dist-packages (from openai) (3.7.1)\n",
"Requirement already satisfied: distro<2,>=1.7.0 in /usr/lib/python3/dist-packages (from openai) (1.7.0)\n",
"Collecting httpx<1,>=0.23.0 (from openai)\n",
" Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)\n",
"Collecting jiter<1,>=0.4.0 (from openai)\n",
" Downloading jiter-0.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.6 kB)\n",
"Requirement already satisfied: pydantic<3,>=1.9.0 in /usr/local/lib/python3.10/dist-packages (from openai) (2.8.2)\n",
"Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from openai) (1.3.1)\n",
"Requirement already satisfied: tqdm>4 in /usr/local/lib/python3.10/dist-packages (from openai) (4.66.5)\n",
"Requirement already satisfied: typing-extensions<5,>=4.11 in /usr/local/lib/python3.10/dist-packages (from openai) (4.12.2)\n",
"Requirement already satisfied: idna>=2.8 in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.5.0->openai) (3.8)\n",
"Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.5.0->openai) (1.2.2)\n",
"Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from httpx<1,>=0.23.0->openai) (2024.8.30)\n",
"Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)\n",
" Downloading httpcore-1.0.5-py3-none-any.whl.metadata (20 kB)\n",
"Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)\n",
" Downloading h11-0.14.0-py3-none-any.whl.metadata (8.2 kB)\n",
"Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<3,>=1.9.0->openai) (0.7.0)\n",
"Requirement already satisfied: pydantic-core==2.20.1 in /usr/local/lib/python3.10/dist-packages (from pydantic<3,>=1.9.0->openai) (2.20.1)\n",
"Downloading openai-1.43.0-py3-none-any.whl (365 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m365.7/365.7 kB\u001b[0m \u001b[31m2.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading httpx-0.27.2-py3-none-any.whl (76 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m76.4/76.4 kB\u001b[0m \u001b[31m3.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading httpcore-1.0.5-py3-none-any.whl (77 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m77.9/77.9 kB\u001b[0m \u001b[31m3.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading jiter-0.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (318 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m318.9/318.9 kB\u001b[0m \u001b[31m10.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading h11-0.14.0-py3-none-any.whl (58 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m2.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hInstalling collected packages: jiter, h11, httpcore, httpx, openai\n",
"Successfully installed h11-0.14.0 httpcore-1.0.5 httpx-0.27.2 jiter-0.5.0 openai-1.43.0\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"To run this in Colab, you will need to set an API Key named `DSBA_LLAMA3_KEY` and `MODAL_BASE_URL`, which is the URL endpoint where the LLaMa 3 model is hosted. You will need to add `/v1/` to the `MODAL_BASE_URL` path so it will look like:\n",
"\n",
"```\n",
"# MODAL_BASE_URL\n",
"https://your-workspace-name--vllm-openai-compatible-serve.modal.run/v1/\n",
"```\n",
"\n",
"\n",
"\n",
"If using the class API example, these will be provided to you. Otherwise you will need to get these from your Modal service.\n",
"\n",
"![](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*5wEevNCOf80GTHwptPTB4g.png)\n",
"\n",
"As mentioned, I have hosted `LLaMa3-8B-Instruct` model that we'll use instead of OpenAI. The reason is this avoids individual costs on the API -- only cost to me for hosting on Modal.\n",
"\n",
"This hosted model will **not** be up indefinitely and only for class demo purposes.\n",
"\n",
"If you host your own model, be sure to destroy it when you're done or you'll be charged."
],
"metadata": {
"id": "nylKS_U36keK"
}
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "LiLY95O62dOS"
},
"outputs": [],
"source": [
"from openai import OpenAI\n",
"from google.colab import userdata\n",
"\n",
"client = OpenAI(api_key=userdata.get(\"DSBA_LLAMA3_KEY\"))\n",
"client.base_url = userdata.get(\"MODAL_BASE_URL\")\n",
"model = \"/models/NousResearch/Meta-Llama-3-8B-Instruct\"\n",
"\n",
"messages = [\n",
" {\n",
" \"role\": \"system\",\n",
" \"content\": \"You are a poetic assistant, skilled in writing satirical doggerel with creative flair.\",\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": \"Compose a limerick about baboons and racoons.\",\n",
" },\n",
"]\n",
"\n",
"stream = client.chat.completions.create(\n",
" model=model,\n",
" messages=messages,\n",
" stream=True,\n",
")"
]
},
{
"cell_type": "code",
"source": [
"for chunk in stream:\n",
" if chunk.choices[0].delta.content is not None:\n",
" print(chunk.choices[0].delta.content, end=\"\")"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "T8p3ZWX16Ghi",
"outputId": "5e450ad2-9697-41c7-cd32-a81f8216c32b"
},
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
" There once were two creatures quite fine,\n",
"Baboons and raccoons, a curious combine,\n",
"They raided the trash cans with glee,\n",
"In the moon's silver shine,\n",
"Together they dined, a messy entwine."
]
}
]
},
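{
"cell_type": "markdown",
"source": [
"If you don't need token-by-token output, you can make the same request without streaming. This is a minimal sketch reusing the `client`, `model`, and `messages` objects defined above; the full reply then arrives in a single response object rather than as a stream of chunks.\n"
],
"metadata": {
"id": "nonstream-sketch-md"
}
},
{
"cell_type": "code",
"source": [
"# Non-streaming request: the server returns the whole completion at once\n",
"response = client.chat.completions.create(\n",
"    model=model,\n",
"    messages=messages,\n",
")\n",
"\n",
"# The assistant's reply lives in the first choice's message\n",
"print(response.choices[0].message.content)"
],
"metadata": {
"id": "nonstream-sketch-code"
},
"execution_count": null,
"outputs": []
},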
{
"cell_type": "markdown",
"source": [
"Alternatively, you can run this as a cURL command. This example shows how to run it in bash (Unix/Mac).\n",
"\n",
"This assumes you have set local environmental variables (e.g., `.env` with `MODAL_BASE_URL` and `DSBA_LLAMA3_KEY` and loaded them)\n",
"\n",
"```bash\n",
"curl \"$MODAL_BASE_URL/chat/completions\" \\\n",
" -H \"Content-Type: application/json\" \\\n",
" -H \"Authorization: Bearer $DSBA_LLAMA3_KEY\" \\\n",
" -d '{\n",
" \"model\": \"/models/NousResearch/Meta-Llama-3-8B-Instruct\",\n",
" \"messages\": [\n",
" {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n",
" {\"role\": \"user\", \"content\": \"Who won the world series in 2020?\"}\n",
" ]\n",
" }'\n",
"```\n",
"\n",
"You can also add `| jq` if you have [`jq`](https://jqlang.github.io/jq/download/) installed to have it \"pretty print\":\n",
"```bash\n",
"curl \"$MODAL_BASE_URL/chat/completions\" \\\n",
" -H \"Content-Type: application/json\" \\\n",
" -H \"Authorization: Bearer $DSBA_LLAMA3_KEY\" \\\n",
" -d '{\n",
" \"model\": \"/models/NousResearch/Meta-Llama-3-8B-Instruct\",\n",
" \"messages\": [\n",
" {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n",
" {\"role\": \"user\", \"content\": \"Who won the world series in 2020?\"}\n",
" ]\n",
" }' | jq\n",
"```"
],
"metadata": {
"id": "6WYclO3ZycJu"
}
}
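,
{
"cell_type": "markdown",
"source": [
"The same HTTP call can be made from Python with the `requests` package (preinstalled in Colab), which mirrors the cURL command above. This is a sketch assuming the same Colab secrets as earlier; note that `MODAL_BASE_URL` already ends in `/v1/`, so the path is simply appended.\n"
],
"metadata": {
"id": "requests-sketch-md"
}
},
{
"cell_type": "code",
"source": [
"import requests\n",
"from google.colab import userdata\n",
"\n",
"base_url = userdata.get(\"MODAL_BASE_URL\")  # ends in /v1/\n",
"\n",
"# POST the same JSON payload the cURL example sends\n",
"resp = requests.post(\n",
"    base_url + \"chat/completions\",\n",
"    headers={\"Authorization\": \"Bearer \" + userdata.get(\"DSBA_LLAMA3_KEY\")},\n",
"    json={\n",
"        \"model\": \"/models/NousResearch/Meta-Llama-3-8B-Instruct\",\n",
"        \"messages\": [\n",
"            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n",
"            {\"role\": \"user\", \"content\": \"Who won the world series in 2020?\"}\n",
"        ]\n",
"    },\n",
")\n",
"\n",
"# Extract the assistant's reply from the OpenAI-compatible response body\n",
"print(resp.json()[\"choices\"][0][\"message\"][\"content\"])"
],
"metadata": {
"id": "requests-sketch-code"
},
"execution_count": null,
"outputs": []
}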
]
}