@RodolfoFerro
Created September 3, 2025 13:54
Linear & logistic regression
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"private_outputs": true,
"provenance": [],
"authorship_tag": "ABX9TyMHXfR1u4vuUOHykycgXZtv",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/RodolfoFerro/530dc67c9affe096668076051e9baabc/regresi-n-lineal-log-stica.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"# **Hands-on exercise:** Exploring regression models\n",
"\n",
"- **Author:** Rodolfo Ferro"
],
"metadata": {
"id": "ZSz472eW2GkR"
}
},
{
"cell_type": "markdown",
"source": [
"### Import packages"
],
"metadata": {
"id": "nk4Vole02QRK"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5cnawWIFY0Fh"
},
"outputs": [],
"source": [
"import numpy as np\n",
"import plotly.graph_objects as go\n",
"from sklearn.datasets import make_regression\n",
"from sklearn.datasets import make_classification\n",
"from sklearn.linear_model import LinearRegression\n",
"from sklearn.linear_model import LogisticRegression"
]
},
{
"cell_type": "markdown",
"source": [
"## **Linear regression**"
],
"metadata": {
"id": "Iu4wHzRvXIYD"
}
},
{
"cell_type": "markdown",
"source": [
"### Data generation"
],
"metadata": {
"id": "yO5K_8WZ2Xj5"
}
},
{
"cell_type": "code",
"source": [
"n_samples = 700 #@param {type:\"slider\", min:100, max:1000, step:100}\n",
"noise = 15 #@param {type:\"slider\", min:5, max:40, step:5}"
],
"metadata": {
"id": "REyq13ZJ2lQ-"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Synthetic data (2 features so it can be plotted in 3D)\n",
"X, y = make_regression(\n",
"    n_samples=n_samples,\n",
"    n_features=2,\n",
"    noise=noise,\n",
"    random_state=42\n",
")"
],
"metadata": {
"id": "1FxQ1HSM2YnP"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Scatter plot"
],
"metadata": {
"id": "YWM1g9JHXBBN"
}
},
{
"cell_type": "code",
"source": [
"# Scatter plot of the generated data\n",
"fig = go.Figure()\n",
"fig.add_trace(\n",
"    go.Scatter3d(\n",
"        x=X[:, 0], y=X[:, 1], z=y,\n",
"        mode=\"markers\",\n",
"        marker=dict(size=5, color=\"#636EFB\", opacity=0.5),\n",
"        name=\"Generated data\"\n",
"    )\n",
")\n",
"fig.update_layout(\n",
"    title=\"Scatter plot\",\n",
"    scene=dict(\n",
"        xaxis_title=\"X1\",\n",
"        yaxis_title=\"X2\",\n",
"        zaxis_title=\"y\"\n",
"    )\n",
")\n",
"fig.show()"
],
"metadata": {
"id": "7jQsB2_EV2pQ"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Model training"
],
"metadata": {
"id": "4qxpWjjrXEGr"
}
},
{
"cell_type": "code",
"source": [
"# Create and fit the linear regression model\n",
"model = LinearRegression()\n",
"model.fit(X, y)"
],
"metadata": {
"id": "lxWyE4YS2a21"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Show training results"
],
"metadata": {
"id": "hWf8vXKjXLCT"
}
},
{
"cell_type": "code",
"source": [
"# Predictions over a grid covering the range of both features\n",
"x_range = np.linspace(X[:, 0].min(), X[:, 0].max(), 20)\n",
"y_range = np.linspace(X[:, 1].min(), X[:, 1].max(), 20)\n",
"xx, yy = np.meshgrid(x_range, y_range)\n",
"zz = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)\n",
"\n",
"# Scatter plot of the generated data\n",
"fig = go.Figure()\n",
"fig.add_trace(\n",
"    go.Scatter3d(\n",
"        x=X[:, 0], y=X[:, 1], z=y,\n",
"        mode=\"markers\",\n",
"        marker=dict(size=5, color=\"#636EFB\", opacity=0.5),\n",
"        name=\"Generated data\"\n",
"    )\n",
")\n",
"\n",
"# Fitted plane (single-color colorscale)\n",
"colorscale = [[0, \"#EF553B\"], [1, \"#EF553B\"]]\n",
"fig.add_trace(\n",
"    go.Surface(\n",
"        x=xx, y=yy, z=zz,\n",
"        colorscale=colorscale,\n",
"        opacity=0.5,\n",
"        name=\"Regression plane\"\n",
"    )\n",
")\n",
"\n",
"fig.update_layout(\n",
"    title=\"Linear regression in 3 dimensions\",\n",
"    scene=dict(\n",
"        xaxis_title=\"X1\",\n",
"        yaxis_title=\"X2\",\n",
"        zaxis_title=\"y\"\n",
"    )\n",
")\n",
"fig.show()"
],
"metadata": {
"id": "aqku8Mi-V5No"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Model evaluation"
],
"metadata": {
"id": "hVZFdsAZXTxR"
}
},
{
"cell_type": "code",
"source": [
"model.score(X, y)"
],
"metadata": {
"id": "5T3Ux06yXOXH"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"## **Assignment**\n",
"\n",
"- Look up in the scikit-learn documentation what the score means for this type of model.\n",
"- Change the noise parameter and the number of samples. Retrain the model and evaluate the new results. **What can we conclude from this?**"
],
"metadata": {
"id": "YLw4i3A5XVRr"
}
},
{
"cell_type": "markdown",
"source": [
"---"
],
"metadata": {
"id": "BsGrGvTKcuK8"
}
},
{
"cell_type": "markdown",
"source": [
"## **Logistic regression**"
],
"metadata": {
"id": "Lql9UAXGXqGq"
}
},
{
"cell_type": "markdown",
"source": [
"### Data generation"
],
"metadata": {
"id": "O-ZSkocfXr-6"
}
},
{
"cell_type": "code",
"source": [
"# Synthetic data (2 features so it can be plotted in 3D)\n",
"X, y = make_classification(\n",
"    n_samples=n_samples, n_features=2, n_redundant=0, n_informative=1,\n",
"    class_sep=2.3, n_clusters_per_class=1, scale=0.5, random_state=42\n",
")"
],
"metadata": {
"id": "uEr6BAlcX1oE"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Scatter plot"
],
"metadata": {
"id": "8hdLmIULcXHe"
}
},
{
"cell_type": "code",
"source": [
"# Scatter plot of the generated data, colored by class\n",
"fig = go.Figure()\n",
"colorscale = [[0, \"#636EFB\"], [1, \"#EF553B\"]]\n",
"fig.add_trace(\n",
"    go.Scatter3d(\n",
"        x=X[:, 0], y=X[:, 1], z=y,\n",
"        mode=\"markers\",\n",
"        marker=dict(size=5, color=y, colorscale=colorscale, opacity=0.5),\n",
"        name=\"Generated data\"\n",
"    )\n",
")\n",
"fig.update_layout(\n",
"    title=\"Scatter plot\",\n",
"    scene=dict(\n",
"        xaxis_title=\"X1\",\n",
"        yaxis_title=\"X2\",\n",
"        zaxis_title=\"y\"\n",
"    )\n",
")\n",
"fig.show()"
],
"metadata": {
"id": "Og-z3M-PX3E0"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Model training"
],
"metadata": {
"id": "n4gGWyF9cghF"
}
},
{
"cell_type": "code",
"source": [
"# Create and fit the logistic regression model\n",
"log_reg = LogisticRegression()\n",
"log_reg.fit(X, y)"
],
"metadata": {
"id": "H7TYaEcIY2zh"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Extract the fitted intercept (b0) and coefficients (b1, b2)\n",
"b0 = log_reg.intercept_[0]\n",
"b1, b2 = log_reg.coef_[0]"
],
"metadata": {
"id": "NqgMVmCwbuqp"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Show training results"
],
"metadata": {
"id": "D1TzJ1-ccsRs"
}
},
{
"cell_type": "code",
"source": [
"# Build the decision plane from the linear score b0 + b1*x1 + b2*x2 (the boundary is where it equals 0)\n",
"x_range = np.linspace(X[:, 0].min(), X[:, 0].max(), 20)\n",
"y_range = np.linspace(X[:, 1].min(), X[:, 1].max(), 20)\n",
"xx, yy = np.meshgrid(x_range, y_range)\n",
"zz = -(b0 + b1*xx + b2*yy)  # linear surface (plane); it crosses z = 0 along the decision boundary\n",
"\n",
"# Scatter plot of the generated data, colored by class\n",
"fig = go.Figure()\n",
"colorscale = [[0, \"#636EFB\"], [1, \"#EF553B\"]]\n",
"fig.add_trace(\n",
"    go.Scatter3d(\n",
"        x=X[:, 0], y=X[:, 1], z=y,\n",
"        mode=\"markers\",\n",
"        marker=dict(size=5, color=y, colorscale=colorscale, opacity=0.5),\n",
"        name=\"Generated data\"\n",
"    )\n",
")\n",
"\n",
"# Decision plane (single-color colorscale)\n",
"colorscale = [[0, \"#AB63FA\"], [1, \"#AB63FA\"]]\n",
"fig.add_trace(go.Surface(\n",
"    x=xx, y=yy, z=zz,\n",
"    colorscale=colorscale,\n",
"    opacity=0.8,\n",
"    showscale=False,\n",
"    name=\"Decision boundary\"\n",
"))\n",
"\n",
"fig.update_layout(\n",
"    title=\"Logistic regression in 3 dimensions\",\n",
"    scene=dict(\n",
"        xaxis_title=\"X1\",\n",
"        yaxis_title=\"X2\",\n",
"        zaxis_title=\"Class\"\n",
"    )\n",
")\n",
"\n",
"fig.show()"
],
"metadata": {
"id": "_3_UEZX4ZYrZ"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Model evaluation"
],
"metadata": {
"id": "yhXM7EcFcxin"
}
},
{
"cell_type": "code",
"source": [
"from sklearn.metrics import classification_report\n",
"\n",
"# Evaluate on the same data the model was trained on\n",
"y_pred = log_reg.predict(X)\n",
"report = classification_report(y, y_pred)\n",
"print(report)"
],
"metadata": {
"id": "n9bAgPK0bPcT"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"## **Assignment**\n",
"\n",
"- Look up in the scikit-learn documentation what the model's parameters mean.\n",
"- Change the parameters used to generate the dataset. Retrain the model and evaluate the new results. **What can we conclude from this?**"
],
"metadata": {
"id": "P4-eH_uVdXuz"
}
}
]
}
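
The two assignments in the notebook ask what `score` measures and what the fitted parameters mean. The sketch below is not part of the original notebook. It regenerates the same synthetic data with the default slider values (`n_samples=700`, `noise=15`) and checks two things: that `LinearRegression.score` agrees with the coefficient of determination R² computed by hand, and that the logistic model's predicted probabilities are the sigmoid of `b0 + b1*x1 + b2*x2`. The names `r2_manual`, `r2_sklearn`, `Xc`, `yc`, `logit`, and `proba_manual` are introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression

# --- Linear regression: what does score() return? ---
# Same generator settings as the notebook (default slider values assumed).
X, y = make_regression(n_samples=700, n_features=2, noise=15, random_state=42)
model = LinearRegression().fit(X, y)

# R^2 = 1 - SS_res / SS_tot, computed by hand from the predictions.
y_hat = model.predict(X)
ss_res = np.sum((y - y_hat) ** 2)       # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
r2_manual = 1.0 - ss_res / ss_tot
r2_sklearn = model.score(X, y)
print("R^2 by hand:", r2_manual, "| model.score:", r2_sklearn)

# --- Logistic regression: what do b0, b1, b2 mean? ---
Xc, yc = make_classification(
    n_samples=700, n_features=2, n_redundant=0, n_informative=1,
    class_sep=2.3, n_clusters_per_class=1, scale=0.5, random_state=42
)
log_reg = LogisticRegression().fit(Xc, yc)
b0 = log_reg.intercept_[0]
b1, b2 = log_reg.coef_[0]

# Linear score (log-odds) for every sample; class 1 is predicted when it is > 0.
logit = b0 + b1 * Xc[:, 0] + b2 * Xc[:, 1]
# The sigmoid maps the linear score to the probability of class 1.
proba_manual = 1.0 / (1.0 + np.exp(-logit))

print("Probabilities match predict_proba:",
      np.allclose(proba_manual, log_reg.predict_proba(Xc)[:, 1]))
print("Sign of the score matches predict:",
      np.array_equal((logit > 0).astype(int), log_reg.predict(Xc)))
```

Rerunning the sketch with a different `noise` or `n_samples` shows how R² responds. Changing `class_sep` does the same for the logistic coefficients.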