@manasouza
Last active January 20, 2022 23:48
INF-550-spec-operation-analysis.ipynb
{
"nbformat": 4,
"nbformat_minor": 5,
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
},
"colab": {
"name": "INF-550-spec-operation-analysis.ipynb",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/manasouza/dc9bda335f3d7ade3961a7228e025eac/inf-550-spec-operation-analysis.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "g_4QPWwqK0i7"
},
"source": [
"# Example of XML extraction from SPECjvm2008 result .raw file\n",
"\n",
"The code below does two things:\n",
"* Fetches the .raw result files stored in Google Drive\n",
"* Converts the content of those files into Python dicts for data handling\n",
"\n",
"Since the SPECjvm2008 result files report operations per minute (ops/min), the code collects this metric for each iteration and plots it as a line chart.\n",
"\n",
"The ops/min figures come from runs on Azure VM instances.\n",
"\n",
"The tests were run by connecting to the VM instance over ssh and executing:\n",
"```\n",
"$ declare -a StringArray=(\"xml.transform\" \"crypto.signverify\")\n",
"\n",
"$ for val in \"${StringArray[@]}\"; do\n",
"(cd SPECjvm2008/ && java -jar SPECjvm2008.jar --createTextFile true --createHtmlFile false --iterations 50 \"$val\" > ~/\"$val\".bench.log) &\n",
"done\n",
"```"
],
"id": "g_4QPWwqK0i7"
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "93988c29",
"outputId": "a05d2c18-0386-4eb7-f707-57f716738b8f"
},
"source": [
"# REF: https://stackabuse.com/reading-and-writing-xml-files-in-python-with-pandas/\n",
"\n",
"!pip install xmltodict\n",
"!pip install -U -q PyDrive"
],
"id": "93988c29",
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"Requirement already satisfied: xmltodict in /usr/local/lib/python3.7/dist-packages (0.12.0)\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "b3cff272"
},
"source": [
"import xmltodict\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"from datetime import datetime, timezone\n",
"import re"
],
"id": "b3cff272",
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Ffh2st4z7r2F"
},
"source": [
"from pydrive.auth import GoogleAuth\n",
"from pydrive.drive import GoogleDrive\n",
"from google.colab import auth\n",
"from oauth2client.client import GoogleCredentials\n",
"\n",
"# Authenticate and create the PyDrive client.\n",
"auth.authenticate_user()\n",
"gauth = GoogleAuth()\n",
"gauth.credentials = GoogleCredentials.get_application_default()\n",
"drive = GoogleDrive(gauth) "
],
"id": "Ffh2st4z7r2F",
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "69a542dd"
},
"source": [
"def get_iterations(xml):\n",
"    # Index 1 selects the second <benchmark-result> entry of the result file.\n",
"    return xml['specjvm-result']['benchmark-results']['benchmark-result'][1]['iterations']\n",
"\n",
"def hms_to_seconds(t):\n",
"    # Convert an 'H:M:S' string to a number of seconds.\n",
"    h, m, s = [int(i) for i in t.split(':')]\n",
"    return 3600*h + 60*m + s\n",
"\n",
"def get_ops_values(iterations):\n",
"    # Collect the ops/min value and the duration (endTime - startTime) of each iteration.\n",
"    data = []\n",
"    times = []\n",
"    for result in iterations['iteration-result']:\n",
"        start_time = int(result['@startTime'])\n",
"        end_time = int(result['@endTime'])\n",
"        times.append(end_time - start_time)\n",
"        data.append(float(result['@operations']))\n",
"    return data, times\n",
"\n",
"def extract_gdrive_fileid_from_link(file_link):\n",
"    # The file id is the 33-character path segment right before '/view'.\n",
"    return re.search(r'/(\\w{33})/view', file_link).group(1)\n",
"\n",
"def convert_xml_file_to_dict(file_name):\n",
"    # xml_data_files (defined in the next cell) maps file names to Google Drive share links.\n",
"    file_id = extract_gdrive_fileid_from_link(xml_data_files[file_name])\n",
"    downloaded = drive.CreateFile({'id': file_id})\n",
"    xml_data = downloaded.GetContentString()\n",
"    return xmltodict.parse(xml_data)"
],
"id": "69a542dd",
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Uq5zvDYU8AoC"
},
"source": [
"xml_data_files = {\n",
" 'SPECjvm2008.50-signverify-1vcpu.raw': 'https://drive.google.com/file/d/1PO7jrvC2P5MunAxVplg1CpbvpszMXymO/view?usp=sharing', \n",
" 'SPECjvm2008.50-signverify-4vcpu.raw': 'https://drive.google.com/file/d/1E5Lr69czJBBEgTXyE1brE1jeuunOv9V5/view?usp=sharing',\n",
" 'SPECjvm2008.50-xmltransform-1vcpu.raw': 'https://drive.google.com/file/d/1JsJ4quyJUgfRyU1XyMs13Q9iUJ0zLrGl/view?usp=sharing',\n",
" 'SPECjvm2008.50-xmltransform-4vcpu.raw': 'https://drive.google.com/file/d/19R7uME8v29IF78j_DtlwvJ1iyUqHSTU2/view?usp=sharing'\n",
"}\n",
"\n",
"xmltransform_4vcpu = convert_xml_file_to_dict('SPECjvm2008.50-xmltransform-4vcpu.raw')\n",
"xmltransform_1vcpu = convert_xml_file_to_dict('SPECjvm2008.50-xmltransform-1vcpu.raw')\n",
"signverify_4vcpu = convert_xml_file_to_dict('SPECjvm2008.50-signverify-4vcpu.raw')\n",
"signverify_1vcpu = convert_xml_file_to_dict('SPECjvm2008.50-signverify-1vcpu.raw')\n",
"\n"
],
"id": "Uq5zvDYU8AoC",
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "QDv9ov__6ShE"
},
"source": [
""
],
"id": "QDv9ov__6ShE"
},
{
"cell_type": "code",
"metadata": {
"id": "343b9204"
},
"source": [
"# extract xml.transform data\n",
"xmltransform_4vcpu_data, xmltransform_4vcpu_times = get_ops_values(get_iterations(xmltransform_4vcpu))\n",
"xmltransform_1vcpu_data, xmltransform_1vcpu_times = get_ops_values(get_iterations(xmltransform_1vcpu))\n",
"\n",
"# extract crypto.signverify data\n",
"signverify_4vcpu_data, signverify_4vcpu_times = get_ops_values(get_iterations(signverify_4vcpu))\n",
"signverify_1vcpu_data, signverify_1vcpu_times = get_ops_values(get_iterations(signverify_1vcpu))"
],
"id": "343b9204",
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "e1d0d562",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 589
},
"outputId": "fd814b18-4796-428d-96df-d911f4af5589"
},
"source": [
"print(f'4vcpu average ops/min: {pd.Series(signverify_4vcpu_data).mean()}')\n",
"print(f'1vcpu average ops/min: {pd.Series(signverify_1vcpu_data).mean()}')\n",
"\n",
"fig, ax = plt.subplots(figsize=(15,9))\n",
"ax.plot(signverify_1vcpu_data, label='1vcpu')\n",
"ax.plot(signverify_4vcpu_data, label='4vcpu')\n",
"plt.legend()\n",
"plt.title('50 iterations of crypto.signverify in ops/min')\n",
"plt.show()"
],
"id": "e1d0d562",
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"4vcpu average ops/min: 1152.7549378888982\n",
"1vcpu average ops/min: 199.17653617410804\n"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 1080x648 with 1 Axes>"
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "fbcec607",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 589
},
"outputId": "745e22c2-d0df-45d5-9739-13916d274876"
},
"source": [
"print(f'4vcpu average ops/min: {pd.Series(xmltransform_4vcpu_data).mean()}')\n",
"print(f'1vcpu average ops/min: {pd.Series(xmltransform_1vcpu_data).mean()}')\n",
"\n",
"fig, ax = plt.subplots(figsize=(15,9))\n",
"ax.plot(xmltransform_1vcpu_data, label='1vcpu')\n",
"ax.plot(xmltransform_4vcpu_data, label='4vcpu')\n",
"plt.legend()\n",
"plt.title('50 iterations of xml.transform in ops/min')\n",
"plt.show()"
],
"id": "fbcec607",
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"4vcpu average ops/min: 810.4185799940233\n",
"1vcpu average ops/min: 144.43659287884068\n"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 1080x648 with 1 Axes>"
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
}
]
}
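
A note on the extraction step in the notebook above: xmltodict returns a plain dict instead of a list when an element has only one child with a given name, so looping over `iterations['iteration-result']` would behave differently for a run with a single iteration. The following is a minimal sketch of a more defensive version of the per-iteration extraction. The `as_list` and `ops_and_durations` helpers and the sample values are hypothetical and only mimic the shape of xmltodict output; they are not taken from the real .raw files.

```python
from statistics import mean


def as_list(node):
    """Wrap node in a list unless it already is one (None becomes an empty list)."""
    if node is None:
        return []
    return node if isinstance(node, list) else [node]


def ops_and_durations(iterations):
    """Collect ops/min and (endTime - startTime) for each iteration-result entry."""
    ops, durations = [], []
    for result in as_list(iterations.get('iteration-result')):
        ops.append(float(result['@operations']))
        durations.append(int(result['@endTime']) - int(result['@startTime']))
    return ops, durations


# Hypothetical data shaped like xmltodict output; the numbers are made up.
sample = {
    'iteration-result': [
        {'@startTime': '1000', '@endTime': '1600', '@operations': '200.0'},
        {'@startTime': '1600', '@endTime': '2100', '@operations': '100.0'},
    ]
}
# A single iteration arrives as a dict, not a list.
single = {
    'iteration-result': {'@startTime': '0', '@endTime': '10', '@operations': '50.5'}
}

ops, durations = ops_and_durations(sample)
print(mean(ops), durations)
print(ops_and_durations(single))
```

With real files, the same normalization could probably also be obtained by passing `force_list=('iteration-result',)` to `xmltodict.parse`, which keeps the rest of the notebook unchanged.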