Last active
October 14, 2025 23:39
-
-
Save JGalego/dc5945d798f948625c4111c4844de563 to your computer and use it in GitHub Desktop.
Deploy LLaVA-OneVision on Amazon SageMaker using the Hugging Face Inference Toolkit π€πΏ
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| { | |
| "cells": [ | |
| { | |
| "cell_type": "markdown", | |
| "id": "09606c0f-34d9-4c8a-9d53-3b969e81795d", | |
| "metadata": {}, | |
| "source": [ | |
| "# Deploying LLaVA-OneVision on Amazon SageMaker\n", | |
| "\n", | |
| "This guide provides instructions for deploying the [LLaVA-OneVision](https://llava-vl.github.io/blog/2024-08-05-llava-onevision/) model on [Amazon SageMaker](https://aws.amazon.com/sagemaker/) using the [Hugging Face Inference Toolkit](https://github.com/aws/sagemaker-huggingface-inference-toolkit).\n", | |
| "\n", | |
| "<img src=\"https://llava-vl.github.io/blog/2024-08-05-llava-onevision/demos/fig1.png\" width=\"75%\"/>" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "b6057254-51fa-47ed-9200-ccdc2ffaefb2", | |
| "metadata": {}, | |
| "source": [ | |
| "## Prerequisites β " | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 1, | |
| "id": "d1a8c914-106c-40d3-a326-ed79391f9440", | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2025-10-14T23:29:10.194107Z", | |
| "iopub.status.busy": "2025-10-14T23:29:10.193948Z", | |
| "iopub.status.idle": "2025-10-14T23:29:12.215509Z", | |
| "shell.execute_reply": "2025-10-14T23:29:12.214942Z", | |
| "shell.execute_reply.started": "2025-10-14T23:29:10.194089Z" | |
| } | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "# Make sure Amazon SageMaker Python SDK is installed / updated\n", | |
| "!pip install -qU --use-deprecated=legacy-resolver sagemaker" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "362c6f76-1257-417d-96bd-8e6ef09066e3", | |
| "metadata": {}, | |
| "source": [ | |
| "## Initial Setup β‘οΈ" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 2, | |
| "id": "97b5b215-8b32-4059-a77c-31284a355cc8", | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2025-10-14T23:29:12.218730Z", | |
| "iopub.status.busy": "2025-10-14T23:29:12.218323Z", | |
| "iopub.status.idle": "2025-10-14T23:29:12.221930Z", | |
| "shell.execute_reply": "2025-10-14T23:29:12.221345Z", | |
| "shell.execute_reply.started": "2025-10-14T23:29:12.218708Z" | |
| } | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "import logging\n", | |
| "import warnings\n", | |
| "\n", | |
| "# Suppress all warnings\n", | |
| "warnings.filterwarnings(\"ignore\")\n", | |
| "\n", | |
| "# Sagemaker continuously complains about config, so we'll suppress that too\n", | |
| "logging.getLogger(\"sagemaker.config\").setLevel(logging.WARNING)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 3, | |
| "id": "ddf8465f-dafc-46a2-b7e1-4d799fea0dfe", | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2025-10-14T23:29:12.222615Z", | |
| "iopub.status.busy": "2025-10-14T23:29:12.222445Z", | |
| "iopub.status.idle": "2025-10-14T23:29:14.189340Z", | |
| "shell.execute_reply": "2025-10-14T23:29:14.188787Z", | |
| "shell.execute_reply.started": "2025-10-14T23:29:12.222598Z" | |
| } | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "import sagemaker\n", | |
| "\n", | |
| "# Initialize SageMaker session\n", | |
| "sess = sagemaker.Session()\n", | |
| "role = sagemaker.get_execution_role()" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "74c7e7f4-0809-4698-bc89-971af8493bdd", | |
| "metadata": {}, | |
| "source": [ | |
| "## Model Setup π€" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 4, | |
| "id": "53adaa07-2274-479f-a6aa-59d9fe8bef3e", | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2025-10-14T23:29:14.191642Z", | |
| "iopub.status.busy": "2025-10-14T23:29:14.191472Z", | |
| "iopub.status.idle": "2025-10-14T23:29:14.341955Z", | |
| "shell.execute_reply": "2025-10-14T23:29:14.341423Z", | |
| "shell.execute_reply.started": "2025-10-14T23:29:14.191625Z" | |
| } | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "from sagemaker.huggingface import HuggingFaceModel\n", | |
| "\n", | |
| "hub = {\n", | |
| " 'HF_MODEL_ID': \"jgalego/llava-onevision-qwen2-0.5b-ov-hf\", # original repo + code folder\n", | |
| " 'HF_TASK': \"image-text-to-text\"\n", | |
| "}\n", | |
| "\n", | |
| "huggingface_model = HuggingFaceModel(\n", | |
| " transformers_version=\"4.49\",\n", | |
| " pytorch_version=\"2.6\",\n", | |
| " py_version=\"py312\",\n", | |
| " env=hub,\n", | |
| " role=role,\n", | |
| " entry_point='inference.py',\n", | |
| " source_dir='./code'\n", | |
| ")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "3324f323-f6ac-494b-ab7e-dde7c6c43cef", | |
| "metadata": {}, | |
| "source": [ | |
| "## Model Deployment π" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 5, | |
| "id": "00635cce-2258-4294-9b23-435ec4ff7f1b", | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2025-10-14T23:29:14.344348Z", | |
| "iopub.status.busy": "2025-10-14T23:29:14.344040Z", | |
| "iopub.status.idle": "2025-10-14T23:35:47.049548Z", | |
| "shell.execute_reply": "2025-10-14T23:35:47.048902Z", | |
| "shell.execute_reply.started": "2025-10-14T23:29:14.344329Z" | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "------------!" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "predictor = huggingface_model.deploy(\n", | |
| " initial_instance_count=1,\n", | |
| " instance_type='ml.g4dn.xlarge',\n", | |
| " endpoint_name='llava-onevision-endpoint',\n", | |
| " model_data_download_timeout=5*60,\n", | |
| " container_startup_health_check_timeout=5*60\n", | |
| ")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "1276c497-6c68-4113-975d-02a09bec616f", | |
| "metadata": {}, | |
| "source": [ | |
| "## Test Endpoint π§ͺ\n", | |
| "\n", | |
| "Download a sample image" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 6, | |
| "id": "b27f04f7-5330-43df-a5f9-64ee43f1d77b", | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2025-10-14T23:35:47.052362Z", | |
| "iopub.status.busy": "2025-10-14T23:35:47.052158Z", | |
| "iopub.status.idle": "2025-10-14T23:35:47.289291Z", | |
| "shell.execute_reply": "2025-10-14T23:35:47.288409Z", | |
| "shell.execute_reply.started": "2025-10-14T23:35:47.052344Z" | |
| } | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "!wget -q -O example.jpg https://www.surfertoday.com/images/stories/dog-surfing-guide.jpg" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 7, | |
| "id": "56e3b73d-7795-4897-80ea-b85eee29c2c5", | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2025-10-14T23:35:47.292917Z", | |
| "iopub.status.busy": "2025-10-14T23:35:47.292676Z", | |
| "iopub.status.idle": "2025-10-14T23:35:47.297307Z", | |
| "shell.execute_reply": "2025-10-14T23:35:47.296733Z", | |
| "shell.execute_reply.started": "2025-10-14T23:35:47.292896Z" | |
| } | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "import base64\n", | |
| "\n", | |
| "# Read and encode your image\n", | |
| "with open('example.jpg', 'rb') as f:\n", | |
| " image_bytes = base64.b64encode(f.read()).decode('utf-8')\n", | |
| "\n", | |
| "# Prepare request data\n", | |
| "payload = {\n", | |
| " \"inputs\": \"Describe this image in detail.\",\n", | |
| " \"images\": [image_bytes],\n", | |
| " \"parameters\": {\n", | |
| " \"max_new_tokens\": 256,\n", | |
| " \"temperature\": 0.7,\n", | |
| " \"top_p\": 0.9,\n", | |
| " \"do_sample\": True,\n", | |
| " \"repetition_penalty\": 1.2,\n", | |
| " \"no_repeat_ngram_size\": 3\n", | |
| " }\n", | |
| "}" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 8, | |
| "id": "5b30c0fa-bae1-4e73-8301-8726bff5a8a0", | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2025-10-14T23:35:47.298118Z", | |
| "iopub.status.busy": "2025-10-14T23:35:47.297927Z", | |
| "iopub.status.idle": "2025-10-14T23:35:54.913910Z", | |
| "shell.execute_reply": "2025-10-14T23:35:54.913303Z", | |
| "shell.execute_reply.started": "2025-10-14T23:35:47.298101Z" | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "In the heart of a clear blue sky, a small white and brown dog is having an adventure on a vibrant yellow surfboard that floats gently in the water. The dog, adorned with a pair of striking red sunglasses, gazes directly into the camera, capturing our attention with its infectious smile. A black leash clings to one side of the board, ready for any unforeseen adventures ahead. The scene is serene yet filled with joy, encapsulating a moment of pure fun and excitement for our canine friend.\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "response = predictor.predict(payload)\n", | |
| "print(response['generated_text'])" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "03220441-a72d-4268-832d-61563f228542", | |
| "metadata": {}, | |
| "source": [ | |
| "## Cleanup" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 9, | |
| "id": "0541f59f-cedd-47fe-bd5b-749a72fea5c8", | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2025-10-14T23:35:54.916660Z", | |
| "iopub.status.busy": "2025-10-14T23:35:54.916456Z", | |
| "iopub.status.idle": "2025-10-14T23:35:55.483667Z", | |
| "shell.execute_reply": "2025-10-14T23:35:55.482992Z", | |
| "shell.execute_reply.started": "2025-10-14T23:35:54.916641Z" | |
| } | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "predictor.delete_model()\n", | |
| "predictor.delete_endpoint()" | |
| ] | |
| } | |
| ], | |
| "metadata": { | |
| "kernelspec": { | |
| "display_name": "Python 3 (ipykernel)", | |
| "language": "python", | |
| "name": "python3" | |
| }, | |
| "language_info": { | |
| "codemirror_mode": { | |
| "name": "ipython", | |
| "version": 3 | |
| }, | |
| "file_extension": ".py", | |
| "mimetype": "text/x-python", | |
| "name": "python", | |
| "nbconvert_exporter": "python", | |
| "pygments_lexer": "ipython3", | |
| "version": "3.12.9" | |
| } | |
| }, | |
| "nbformat": 4, | |
| "nbformat_minor": 5 | |
| } |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment