Created
December 10, 2018 13:45
Filtragem Colaborativa.ipynb
| { | |
| "nbformat": 4, | |
| "nbformat_minor": 0, | |
| "metadata": { | |
| "colab": { | |
| "name": "Filtragem Colaborativa.ipynb", | |
| "version": "0.3.2", | |
| "provenance": [], | |
| "collapsed_sections": [], | |
| "include_colab_link": true | |
| }, | |
| "kernelspec": { | |
| "name": "python3", | |
| "display_name": "Python 3" | |
| } | |
| }, | |
| "cells": [ | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "id": "view-in-github", | |
| "colab_type": "text" | |
| }, | |
| "source": [ | |
| "<a href=\"https://colab.research.google.com/gist/EsterRibeiro/4c985e88daff097c96ff547990254858/filtragem-colaborativa.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | |
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "JHQBUPZFHXWx", | |
| "colab_type": "text" | |
| }, | |
| "cell_type": "markdown", | |
| "source": [ | |
| "A collaborative filtering system recommends items based on other users' ratings and predicts the rating a given user (who has not yet rated the item) will give it. To do so, it must find the users most similar to the target user. This kind of system is less generic than, for example, a recommender system (RS) based on demographic filtering, yet it does not need to know the item's category or other more specific information in order to make recommendations." | |
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "g0R_yIXvKUF8", | |
| "colab_type": "code", | |
| "outputId": "ea50a2c1-e69e-4b98-c14b-f58992e15e37", | |
| "colab": { | |
| "base_uri": "https://localhost:8080/", | |
| "height": 272 | |
| } | |
| }, | |
| "cell_type": "code", | |
| "source": [ | |
| "#libraries\n", | |
| "\n", | |
| "!pip install surprise\n", | |
| "\n", | |
| "import pandas as pd \n", | |
| "import numpy as np \n", | |
| "import matplotlib.pyplot as plt\n", | |
| "from surprise import Reader\n", | |
| "from surprise import Dataset\n", | |
| "from surprise import SVD\n", | |
| "from surprise import evaluate\n", | |
| "from google.colab import files\n", | |
| "\n", | |
| "reader = Reader()" | |
| ], | |
| "execution_count": 0, | |
| "outputs": [ | |
| { | |
| "output_type": "stream", | |
| "text": [ | |
| "Collecting surprise\n", | |
| " Downloading https://files.pythonhosted.org/packages/61/de/e5cba8682201fcf9c3719a6fdda95693468ed061945493dea2dd37c5618b/surprise-0.1-py2.py3-none-any.whl\n", | |
| "Collecting scikit-surprise (from surprise)\n", | |
| " Downloading https://files.pythonhosted.org/packages/4d/fc/cd4210b247d1dca421c25994740cbbf03c5e980e31881f10eaddf45fdab0/scikit-surprise-1.0.6.tar.gz (3.3MB)\n", | |
| "Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.6/dist-packages (from scikit-surprise->surprise) (0.13.0)\n", | |
| "Requirement already satisfied: numpy>=1.11.2 in /usr/local/lib/python3.6/dist-packages (from scikit-surprise->surprise) (1.14.6)\n", | |
| "Requirement already satisfied: scipy>=1.0.0 in /usr/local/lib/python3.6/dist-packages (from scikit-surprise->surprise) (1.1.0)\n", | |
| "Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python3.6/dist-packages (from scikit-surprise->surprise) (1.11.0)\n", | |
| "Building wheels for collected packages: scikit-surprise\n", | |
| " Running setup.py bdist_wheel for scikit-surprise ... done\n", | |
| " Stored in directory: /root/.cache/pip/wheels/ec/c0/55/3a28eab06b53c220015063ebbdb81213cd3dcbb72c088251ec\n", | |
| "Successfully built scikit-surprise\n", | |
| "Installing collected packages: scikit-surprise, surprise\n", | |
| "Successfully installed scikit-surprise-1.0.6 surprise-0.1\n" | |
| ], | |
| "name": "stdout" | |
| } | |
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "1ggVjiPxH0PX", | |
| "colab_type": "code", | |
| "outputId": "fff3c49d-8449-47df-a4e2-2d23c18d423b", | |
| "colab": { | |
| "base_uri": "https://localhost:8080/", | |
| "height": 2024 | |
| } | |
| }, | |
| "cell_type": "code", | |
| "source": [ | |
| "# load the ratings dataset\n", | |
| "\n", | |
| "uploaded = files.upload()\n", | |
| "\n", | |
| "avaliacoes = pd.read_csv('ratings_small.csv')\n", | |
| "\n", | |
| "avaliacoes\n", | |
| "\n" | |
| ], | |
| "execution_count": 0, | |
| "outputs": [ | |
| { | |
| "output_type": "stream", | |
| "text": [ | |
| "Saving ratings_small.csv to ratings_small.csv\n" | |
| ], | |
| "name": "stdout" | |
| }, | |
| { | |
| "output_type": "execute_result", | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<style scoped>\n", | |
| " .dataframe tbody tr th:only-of-type {\n", | |
| " vertical-align: middle;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe tbody tr th {\n", | |
| " vertical-align: top;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe thead th {\n", | |
| " text-align: right;\n", | |
| " }\n", | |
| "</style>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>userId</th>\n", | |
| " <th>movieId</th>\n", | |
| " <th>rating</th>\n", | |
| " <th>timestamp</th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>0</th>\n", | |
| " <td>1</td>\n", | |
| " <td>31</td>\n", | |
| " <td>2.5</td>\n", | |
| " <td>1260759144</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>1</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1029</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>1260759179</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1061</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>1260759182</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>3</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1129</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>1260759185</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>4</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1172</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1260759205</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>5</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1263</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>1260759151</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>6</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1287</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>1260759187</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1293</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>1260759148</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1339</td>\n", | |
| " <td>3.5</td>\n", | |
| " <td>1260759125</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>9</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1343</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>1260759131</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>10</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1371</td>\n", | |
| " <td>2.5</td>\n", | |
| " <td>1260759135</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>11</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1405</td>\n", | |
| " <td>1.0</td>\n", | |
| " <td>1260759203</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>12</th>\n", | |
| " <td>1</td>\n", | |
| " <td>1953</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1260759191</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>13</th>\n", | |
| " <td>1</td>\n", | |
| " <td>2105</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1260759139</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>14</th>\n", | |
| " <td>1</td>\n", | |
| " <td>2150</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>1260759194</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>15</th>\n", | |
| " <td>1</td>\n", | |
| " <td>2193</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>1260759198</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>16</th>\n", | |
| " <td>1</td>\n", | |
| " <td>2294</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>1260759108</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>17</th>\n", | |
| " <td>1</td>\n", | |
| " <td>2455</td>\n", | |
| " <td>2.5</td>\n", | |
| " <td>1260759113</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>18</th>\n", | |
| " <td>1</td>\n", | |
| " <td>2968</td>\n", | |
| " <td>1.0</td>\n", | |
| " <td>1260759200</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>19</th>\n", | |
| " <td>1</td>\n", | |
| " <td>3671</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>1260759117</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>20</th>\n", | |
| " <td>2</td>\n", | |
| " <td>10</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>835355493</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>21</th>\n", | |
| " <td>2</td>\n", | |
| " <td>17</td>\n", | |
| " <td>5.0</td>\n", | |
| " <td>835355681</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>22</th>\n", | |
| " <td>2</td>\n", | |
| " <td>39</td>\n", | |
| " <td>5.0</td>\n", | |
| " <td>835355604</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>23</th>\n", | |
| " <td>2</td>\n", | |
| " <td>47</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>835355552</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>24</th>\n", | |
| " <td>2</td>\n", | |
| " <td>50</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>835355586</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>25</th>\n", | |
| " <td>2</td>\n", | |
| " <td>52</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>835356031</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>26</th>\n", | |
| " <td>2</td>\n", | |
| " <td>62</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>835355749</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>27</th>\n", | |
| " <td>2</td>\n", | |
| " <td>110</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>835355532</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>28</th>\n", | |
| " <td>2</td>\n", | |
| " <td>144</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>835356016</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>29</th>\n", | |
| " <td>2</td>\n", | |
| " <td>150</td>\n", | |
| " <td>5.0</td>\n", | |
| " <td>835355395</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>...</th>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99974</th>\n", | |
| " <td>671</td>\n", | |
| " <td>4034</td>\n", | |
| " <td>4.5</td>\n", | |
| " <td>1064245493</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99975</th>\n", | |
| " <td>671</td>\n", | |
| " <td>4306</td>\n", | |
| " <td>5.0</td>\n", | |
| " <td>1064245548</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99976</th>\n", | |
| " <td>671</td>\n", | |
| " <td>4308</td>\n", | |
| " <td>3.5</td>\n", | |
| " <td>1065111985</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99977</th>\n", | |
| " <td>671</td>\n", | |
| " <td>4880</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1065111973</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99978</th>\n", | |
| " <td>671</td>\n", | |
| " <td>4886</td>\n", | |
| " <td>5.0</td>\n", | |
| " <td>1064245488</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99979</th>\n", | |
| " <td>671</td>\n", | |
| " <td>4896</td>\n", | |
| " <td>5.0</td>\n", | |
| " <td>1065111996</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99980</th>\n", | |
| " <td>671</td>\n", | |
| " <td>4963</td>\n", | |
| " <td>4.5</td>\n", | |
| " <td>1065111855</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99981</th>\n", | |
| " <td>671</td>\n", | |
| " <td>4973</td>\n", | |
| " <td>4.5</td>\n", | |
| " <td>1064245471</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99982</th>\n", | |
| " <td>671</td>\n", | |
| " <td>4993</td>\n", | |
| " <td>5.0</td>\n", | |
| " <td>1064245483</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99983</th>\n", | |
| " <td>671</td>\n", | |
| " <td>4995</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1064891537</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99984</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5010</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>1066793004</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99985</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5218</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>1065111990</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99986</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5299</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>1065112004</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99987</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5349</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1065111863</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99988</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5377</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1064245557</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99989</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5445</td>\n", | |
| " <td>4.5</td>\n", | |
| " <td>1064891627</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99990</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5464</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>1064891549</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99991</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5669</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1063502711</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99992</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5816</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1065111963</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99993</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5902</td>\n", | |
| " <td>3.5</td>\n", | |
| " <td>1064245507</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99994</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5952</td>\n", | |
| " <td>5.0</td>\n", | |
| " <td>1063502716</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99995</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5989</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1064890625</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99996</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5991</td>\n", | |
| " <td>4.5</td>\n", | |
| " <td>1064245387</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99997</th>\n", | |
| " <td>671</td>\n", | |
| " <td>5995</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1066793014</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99998</th>\n", | |
| " <td>671</td>\n", | |
| " <td>6212</td>\n", | |
| " <td>2.5</td>\n", | |
| " <td>1065149436</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>99999</th>\n", | |
| " <td>671</td>\n", | |
| " <td>6268</td>\n", | |
| " <td>2.5</td>\n", | |
| " <td>1065579370</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>100000</th>\n", | |
| " <td>671</td>\n", | |
| " <td>6269</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1065149201</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>100001</th>\n", | |
| " <td>671</td>\n", | |
| " <td>6365</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>1070940363</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>100002</th>\n", | |
| " <td>671</td>\n", | |
| " <td>6385</td>\n", | |
| " <td>2.5</td>\n", | |
| " <td>1070979663</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>100003</th>\n", | |
| " <td>671</td>\n", | |
| " <td>6565</td>\n", | |
| " <td>3.5</td>\n", | |
| " <td>1074784724</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "<p>100004 rows × 4 columns</p>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " userId movieId rating timestamp\n", | |
| "0 1 31 2.5 1260759144\n", | |
| "1 1 1029 3.0 1260759179\n", | |
| "2 1 1061 3.0 1260759182\n", | |
| "3 1 1129 2.0 1260759185\n", | |
| "4 1 1172 4.0 1260759205\n", | |
| "5 1 1263 2.0 1260759151\n", | |
| "6 1 1287 2.0 1260759187\n", | |
| "7 1 1293 2.0 1260759148\n", | |
| "8 1 1339 3.5 1260759125\n", | |
| "9 1 1343 2.0 1260759131\n", | |
| "10 1 1371 2.5 1260759135\n", | |
| "11 1 1405 1.0 1260759203\n", | |
| "12 1 1953 4.0 1260759191\n", | |
| "13 1 2105 4.0 1260759139\n", | |
| "14 1 2150 3.0 1260759194\n", | |
| "15 1 2193 2.0 1260759198\n", | |
| "16 1 2294 2.0 1260759108\n", | |
| "17 1 2455 2.5 1260759113\n", | |
| "18 1 2968 1.0 1260759200\n", | |
| "19 1 3671 3.0 1260759117\n", | |
| "20 2 10 4.0 835355493\n", | |
| "21 2 17 5.0 835355681\n", | |
| "22 2 39 5.0 835355604\n", | |
| "23 2 47 4.0 835355552\n", | |
| "24 2 50 4.0 835355586\n", | |
| "25 2 52 3.0 835356031\n", | |
| "26 2 62 3.0 835355749\n", | |
| "27 2 110 4.0 835355532\n", | |
| "28 2 144 3.0 835356016\n", | |
| "29 2 150 5.0 835355395\n", | |
| "... ... ... ... ...\n", | |
| "99974 671 4034 4.5 1064245493\n", | |
| "99975 671 4306 5.0 1064245548\n", | |
| "99976 671 4308 3.5 1065111985\n", | |
| "99977 671 4880 4.0 1065111973\n", | |
| "99978 671 4886 5.0 1064245488\n", | |
| "99979 671 4896 5.0 1065111996\n", | |
| "99980 671 4963 4.5 1065111855\n", | |
| "99981 671 4973 4.5 1064245471\n", | |
| "99982 671 4993 5.0 1064245483\n", | |
| "99983 671 4995 4.0 1064891537\n", | |
| "99984 671 5010 2.0 1066793004\n", | |
| "99985 671 5218 2.0 1065111990\n", | |
| "99986 671 5299 3.0 1065112004\n", | |
| "99987 671 5349 4.0 1065111863\n", | |
| "99988 671 5377 4.0 1064245557\n", | |
| "99989 671 5445 4.5 1064891627\n", | |
| "99990 671 5464 3.0 1064891549\n", | |
| "99991 671 5669 4.0 1063502711\n", | |
| "99992 671 5816 4.0 1065111963\n", | |
| "99993 671 5902 3.5 1064245507\n", | |
| "99994 671 5952 5.0 1063502716\n", | |
| "99995 671 5989 4.0 1064890625\n", | |
| "99996 671 5991 4.5 1064245387\n", | |
| "99997 671 5995 4.0 1066793014\n", | |
| "99998 671 6212 2.5 1065149436\n", | |
| "99999 671 6268 2.5 1065579370\n", | |
| "100000 671 6269 4.0 1065149201\n", | |
| "100001 671 6365 4.0 1070940363\n", | |
| "100002 671 6385 2.5 1070979663\n", | |
| "100003 671 6565 3.5 1074784724\n", | |
| "\n", | |
| "[100004 rows x 4 columns]" | |
| ] | |
| }, | |
| "metadata": { | |
| "tags": [] | |
| }, | |
| "execution_count": 3 | |
| } | |
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "7v2d7O0bKOQP", | |
| "colab_type": "text" | |
| }, | |
| "cell_type": "markdown", | |
| "source": [ | |
| "Movies are rated on a scale from 1 to 5." | |
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "BRymPKV-KaH_", | |
| "colab_type": "code", | |
| "colab": {} | |
| }, | |
| "cell_type": "code", | |
| "source": [ | |
| "data = Dataset.load_from_df(avaliacoes[['userId', 'movieId', 'rating']], reader)\n", | |
| "\n", | |
| "x = data.split(n_folds=5)\n" | |
| ], | |
| "execution_count": 0, | |
| "outputs": [] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "o6PjYwezKzUB", | |
| "colab_type": "text" | |
| }, | |
| "cell_type": "markdown", | |
| "source": [ | |
| "SVD - Singular Value Decomposition\n", | |
| "\n", | |
| "\n", | |
| "* MAE computes the average error. It takes the absolute value of each error (r_ui - est_ui); otherwise positive and negative errors would cancel out in the average.\n", | |
| "* RMSE does much the same thing, except that it squares each error instead of taking its absolute value. This also makes every term positive, but large errors are penalized more heavily. Taking the square root brings the result back to the same units as a rating.\n", | |
| "\n", | |
| "\n", | |
| "\n", | |
| "\n" | |
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "sOy-DraCKzru", | |
| "colab_type": "code", | |
| "outputId": "7b4c1af1-97d2-490c-8abd-f210d98fc473", | |
| "colab": { | |
| "base_uri": "https://localhost:8080/", | |
| "height": 768 | |
| } | |
| }, | |
| "cell_type": "code", | |
| "source": [ | |
| "# show the error measures (RMSE and MAE)\n", | |
| "\n", | |
| "svd = SVD()\n", | |
| "evaluate(svd, data, measures=['RMSE', 'MAE'])" | |
| ], | |
| "execution_count": 0, | |
| "outputs": [ | |
| { | |
| "output_type": "stream", | |
| "text": [ | |
| "/usr/local/lib/python3.6/dist-packages/surprise/evaluate.py:66: UserWarning: The evaluate() method is deprecated. Please use model_selection.cross_validate() instead.\n", | |
| " 'model_selection.cross_validate() instead.', UserWarning)\n", | |
| "/usr/local/lib/python3.6/dist-packages/surprise/dataset.py:193: UserWarning: Using data.split() or using load_from_folds() without using a CV iterator is now deprecated. \n", | |
| " UserWarning)\n" | |
| ], | |
| "name": "stderr" | |
| }, | |
| { | |
| "output_type": "stream", | |
| "text": [ | |
| "Evaluating RMSE, MAE of algorithm SVD.\n", | |
| "\n", | |
| "------------\n", | |
| "Fold 1\n", | |
| "RMSE: 0.8934\n", | |
| "MAE: 0.6875\n", | |
| "------------\n", | |
| "Fold 2\n", | |
| "RMSE: 0.8954\n", | |
| "MAE: 0.6894\n", | |
| "------------\n", | |
| "Fold 3\n", | |
| "RMSE: 0.8859\n", | |
| "MAE: 0.6834\n", | |
| "------------\n", | |
| "Fold 4\n", | |
| "RMSE: 0.9040\n", | |
| "MAE: 0.6976\n", | |
| "------------\n", | |
| "Fold 5\n", | |
| "RMSE: 0.9065\n", | |
| "MAE: 0.6963\n", | |
| "------------\n", | |
| "------------\n", | |
| "Mean RMSE: 0.8970\n", | |
| "Mean MAE : 0.6908\n", | |
| "------------\n", | |
| "------------\n" | |
| ], | |
| "name": "stdout" | |
| }, | |
| { | |
| "output_type": "execute_result", | |
| "data": { | |
| "text/plain": [ | |
| "CaseInsensitiveDefaultDict(list,\n", | |
| " {'mae': [0.6874891046447704,\n", | |
| " 0.6894408711894955,\n", | |
| " 0.6834057658802518,\n", | |
| " 0.6975685029693567,\n", | |
| " 0.6963356105796227],\n", | |
| " 'rmse': [0.8933654197506257,\n", | |
| " 0.8953883321484486,\n", | |
| " 0.8858690715615625,\n", | |
| " 0.9039984377663985,\n", | |
| " 0.9064656294955329]})" | |
| ] | |
| }, | |
| "metadata": { | |
| "tags": [] | |
| }, | |
| "execution_count": 5 | |
| } | |
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "7mA_wXpMK-MV", | |
| "colab_type": "text" | |
| }, | |
| "cell_type": "markdown", | |
| "source": [ | |
| "How the prediction was made." | |
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "ndtH6Bf0K-_9", | |
| "colab_type": "code", | |
| "outputId": "25d8f8cb-ace4-467d-d2a5-f3d441747d39", | |
| "colab": { | |
| "base_uri": "https://localhost:8080/", | |
| "height": 34 | |
| } | |
| }, | |
| "cell_type": "code", | |
| "source": [ | |
| "trein = data.build_full_trainset()\n", | |
| "\n", | |
| "svd.fit(trein)\n", | |
| "trein" | |
| ], | |
| "execution_count": 0, | |
| "outputs": [ | |
| { | |
| "output_type": "execute_result", | |
| "data": { | |
| "text/plain": [ | |
| "<surprise.trainset.Trainset at 0x7f9790f176d8>" | |
| ] | |
| }, | |
| "metadata": { | |
| "tags": [] | |
| }, | |
| "execution_count": 6 | |
| } | |
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "b8gKBbPkLNVt", | |
| "colab_type": "text" | |
| }, | |
| "cell_type": "markdown", | |
| "source": [ | |
| "Shows the user's ID and the ratings that user gave to certain movies (the movies are also identified by ID)." | |
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "dH5gPi21LXW5", | |
| "colab_type": "code", | |
| "outputId": "8a290df3-0b34-4d1a-8f1a-32b5961aff52", | |
| "colab": { | |
| "base_uri": "https://localhost:8080/", | |
| "height": 1475 | |
| } | |
| }, | |
| "cell_type": "code", | |
| "source": [ | |
| "avaliacoes[avaliacoes['userId'] == 50]" | |
| ], | |
| "execution_count": 0, | |
| "outputs": [ | |
| { | |
| "output_type": "execute_result", | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<style scoped>\n", | |
| " .dataframe tbody tr th:only-of-type {\n", | |
| " vertical-align: middle;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe tbody tr th {\n", | |
| " vertical-align: top;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe thead th {\n", | |
| " text-align: right;\n", | |
| " }\n", | |
| "</style>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>userId</th>\n", | |
| " <th>movieId</th>\n", | |
| " <th>rating</th>\n", | |
| " <th>timestamp</th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>7987</th>\n", | |
| " <td>50</td>\n", | |
| " <td>10</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412607</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7988</th>\n", | |
| " <td>50</td>\n", | |
| " <td>21</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412676</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7989</th>\n", | |
| " <td>50</td>\n", | |
| " <td>39</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>847412692</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7990</th>\n", | |
| " <td>50</td>\n", | |
| " <td>47</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412643</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7991</th>\n", | |
| " <td>50</td>\n", | |
| " <td>95</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412837</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7992</th>\n", | |
| " <td>50</td>\n", | |
| " <td>110</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412607</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7993</th>\n", | |
| " <td>50</td>\n", | |
| " <td>150</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412515</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7994</th>\n", | |
| " <td>50</td>\n", | |
| " <td>160</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412712</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7995</th>\n", | |
| " <td>50</td>\n", | |
| " <td>161</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412607</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7996</th>\n", | |
| " <td>50</td>\n", | |
| " <td>165</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412543</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7997</th>\n", | |
| " <td>50</td>\n", | |
| " <td>185</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412606</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7998</th>\n", | |
| " <td>50</td>\n", | |
| " <td>208</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412606</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7999</th>\n", | |
| " <td>50</td>\n", | |
| " <td>231</td>\n", | |
| " <td>1.0</td>\n", | |
| " <td>847412567</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8000</th>\n", | |
| " <td>50</td>\n", | |
| " <td>253</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412628</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8001</th>\n", | |
| " <td>50</td>\n", | |
| " <td>282</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412811</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8002</th>\n", | |
| " <td>50</td>\n", | |
| " <td>292</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412586</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8003</th>\n", | |
| " <td>50</td>\n", | |
| " <td>296</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412515</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8004</th>\n", | |
| " <td>50</td>\n", | |
| " <td>315</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412812</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8005</th>\n", | |
| " <td>50</td>\n", | |
| " <td>316</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412567</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8006</th>\n", | |
| " <td>50</td>\n", | |
| " <td>337</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412859</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8007</th>\n", | |
| " <td>50</td>\n", | |
| " <td>339</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412628</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8008</th>\n", | |
| " <td>50</td>\n", | |
| " <td>344</td>\n", | |
| " <td>2.0</td>\n", | |
| " <td>847412544</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8009</th>\n", | |
| " <td>50</td>\n", | |
| " <td>349</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412567</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8010</th>\n", | |
| " <td>50</td>\n", | |
| " <td>356</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412567</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8011</th>\n", | |
| " <td>50</td>\n", | |
| " <td>357</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412692</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8012</th>\n", | |
| " <td>50</td>\n", | |
| " <td>367</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412643</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8013</th>\n", | |
| " <td>50</td>\n", | |
| " <td>368</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847413071</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8014</th>\n", | |
| " <td>50</td>\n", | |
| " <td>377</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412643</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8015</th>\n", | |
| " <td>50</td>\n", | |
| " <td>380</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412515</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8016</th>\n", | |
| " <td>50</td>\n", | |
| " <td>420</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412692</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8017</th>\n", | |
| " <td>50</td>\n", | |
| " <td>434</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412586</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8018</th>\n", | |
| " <td>50</td>\n", | |
| " <td>440</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412711</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8019</th>\n", | |
| " <td>50</td>\n", | |
| " <td>442</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412812</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8020</th>\n", | |
| " <td>50</td>\n", | |
| " <td>454</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412628</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8021</th>\n", | |
| " <td>50</td>\n", | |
| " <td>457</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847413020</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8022</th>\n", | |
| " <td>50</td>\n", | |
| " <td>480</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412586</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8023</th>\n", | |
| " <td>50</td>\n", | |
| " <td>509</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412812</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8024</th>\n", | |
| " <td>50</td>\n", | |
| " <td>527</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412676</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8025</th>\n", | |
| " <td>50</td>\n", | |
| " <td>539</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412676</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8026</th>\n", | |
| " <td>50</td>\n", | |
| " <td>553</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847413043</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8027</th>\n", | |
| " <td>50</td>\n", | |
| " <td>587</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412659</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8028</th>\n", | |
| " <td>50</td>\n", | |
| " <td>589</td>\n", | |
| " <td>5.0</td>\n", | |
| " <td>847412628</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8029</th>\n", | |
| " <td>50</td>\n", | |
| " <td>590</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412515</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8030</th>\n", | |
| " <td>50</td>\n", | |
| " <td>597</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847412659</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8031</th>\n", | |
| " <td>50</td>\n", | |
| " <td>780</td>\n", | |
| " <td>4.0</td>\n", | |
| " <td>847412879</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8032</th>\n", | |
| " <td>50</td>\n", | |
| " <td>786</td>\n", | |
| " <td>3.0</td>\n", | |
| " <td>847413043</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " userId movieId rating timestamp\n", | |
| "7987 50 10 4.0 847412607\n", | |
| "7988 50 21 3.0 847412676\n", | |
| "7989 50 39 2.0 847412692\n", | |
| "7990 50 47 3.0 847412643\n", | |
| "7991 50 95 3.0 847412837\n", | |
| "7992 50 110 4.0 847412607\n", | |
| "7993 50 150 3.0 847412515\n", | |
| "7994 50 160 3.0 847412712\n", | |
| "7995 50 161 4.0 847412607\n", | |
| "7996 50 165 4.0 847412543\n", | |
| "7997 50 185 3.0 847412606\n", | |
| "7998 50 208 3.0 847412606\n", | |
| "7999 50 231 1.0 847412567\n", | |
| "8000 50 253 3.0 847412628\n", | |
| "8001 50 282 3.0 847412811\n", | |
| "8002 50 292 4.0 847412586\n", | |
| "8003 50 296 4.0 847412515\n", | |
| "8004 50 315 3.0 847412812\n", | |
| "8005 50 316 3.0 847412567\n", | |
| "8006 50 337 3.0 847412859\n", | |
| "8007 50 339 4.0 847412628\n", | |
| "8008 50 344 2.0 847412544\n", | |
| "8009 50 349 3.0 847412567\n", | |
| "8010 50 356 4.0 847412567\n", | |
| "8011 50 357 4.0 847412692\n", | |
| "8012 50 367 4.0 847412643\n", | |
| "8013 50 368 4.0 847413071\n", | |
| "8014 50 377 3.0 847412643\n", | |
| "8015 50 380 3.0 847412515\n", | |
| "8016 50 420 3.0 847412692\n", | |
| "8017 50 434 3.0 847412586\n", | |
| "8018 50 440 4.0 847412711\n", | |
| "8019 50 442 3.0 847412812\n", | |
| "8020 50 454 3.0 847412628\n", | |
| "8021 50 457 3.0 847413020\n", | |
| "8022 50 480 4.0 847412586\n", | |
| "8023 50 509 3.0 847412812\n", | |
| "8024 50 527 4.0 847412676\n", | |
| "8025 50 539 3.0 847412676\n", | |
| "8026 50 553 3.0 847413043\n", | |
| "8027 50 587 3.0 847412659\n", | |
| "8028 50 589 5.0 847412628\n", | |
| "8029 50 590 3.0 847412515\n", | |
| "8030 50 597 3.0 847412659\n", | |
| "8031 50 780 4.0 847412879\n", | |
| "8032 50 786 3.0 847413043" | |
| ] | |
| }, | |
| "metadata": { | |
| "tags": [] | |
| }, | |
| "execution_count": 7 | |
| } | |
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "hvztfOXkLof7", | |
| "colab_type": "text" | |
| }, | |
| "cell_type": "markdown", | |
| "source": [ | |
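| "This line predicts the rating that user 50 would give to the movie with ID 509. The third argument (5) is passed as `r_ui`, a reference rating that is only echoed back in the result and does not affect the estimate. Note that, in the table above, user 50 has already rated movie 509 with 3.0." | |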
| ] | |
| }, | |
| { | |
| "metadata": { | |
| "id": "gkRAn2vuLq_k", | |
| "colab_type": "code", | |
| "outputId": "53a44f83-a900-4dcd-f81b-25d73b207178", | |
| "colab": { | |
| "base_uri": "https://localhost:8080/", | |
| "height": 34 | |
| } | |
| }, | |
| "cell_type": "code", | |
| "source": [ | |
| "svd.predict(50, 509, 5)" | |
| ], | |
| "execution_count": 0, | |
| "outputs": [ | |
| { | |
| "output_type": "execute_result", | |
| "data": { | |
| "text/plain": [ | |
| "Prediction(uid=50, iid=509, r_ui=5, est=3.5132013541727947, details={'was_impossible': False})" | |
| ] | |
| }, | |
| "metadata": { | |
| "tags": [] | |
| }, | |
| "execution_count": 11 | |
| } | |
| ] | |
| } | |
| ] | |
| } |
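The numbers printed in the outputs above can be sanity-checked with plain Python. The sketch below is not part of the original notebook: it copies the per-fold metrics and the `est` value from the outputs, and the variable names (`fold_rmse`, `fold_mae`, `actual_rating`, and so on) are chosen here for illustration only. It recomputes the mean RMSE and MAE across the five folds, then compares the estimate for user 50 and movie 509 against both the reference value passed to `predict` (5) and the rating shown in the user's table (3.0).

```python
# Per-fold metrics copied from the cross-validation output above.
fold_rmse = [0.8933654197506257, 0.8953883321484486, 0.8858690715615625,
             0.9039984377663985, 0.9064656294955329]
fold_mae = [0.6874891046447704, 0.6894408711894955, 0.6834057658802518,
            0.6975685029693567, 0.6963356105796227]

# The reported "Mean RMSE" and "Mean MAE" are simple averages over the folds.
mean_rmse = sum(fold_rmse) / len(fold_rmse)
mean_mae = sum(fold_mae) / len(fold_mae)
print(f"Mean RMSE: {mean_rmse:.4f}")
print(f"Mean MAE : {mean_mae:.4f}")

# Values taken from the Prediction output for user 50 and movie 509.
est = 3.5132013541727947   # estimated rating
passed_r_ui = 5.0          # reference rating passed as the third argument
actual_rating = 3.0        # rating of movie 509 in user 50's table

# Absolute error against each reference value.
err_vs_passed = abs(passed_r_ui - est)
err_vs_actual = abs(actual_rating - est)
print(f"Error vs. passed r_ui : {err_vs_passed:.4f}")
print(f"Error vs. table rating: {err_vs_actual:.4f}")
```

The estimate is about 0.51 away from the rating in the table and about 1.49 away from the value passed as `r_ui`, which is why the reference value matters when judging a single prediction.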