@mattbullen
Last active October 16, 2025 12:00
unit07-ex3-multi-layer-perceptron.ipynb
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/mattbullen/f34eb03709863466ace421a666d30361/unit07-ex3-multi-layer-perceptron.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8tBFC1qoqAcq"
},
"source": [
"### Author: Dr Mike Lakoju"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QrgDwyKOqAcr"
},
"source": [
" * If X is large and positive, the sigmoid value is approximately 1\n",
" * If X is large and negative, the sigmoid value is approximately 0"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DEC81gtrqAcr"
},
"source": [
"## Import Libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "zagQTPSsqAcr"
},
"outputs": [],
"source": [
"import numpy as np"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fD5lg7qCqAcr"
},
"source": [
"## Define the Sigmoid Function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "RO8pnlS4qAcr"
},
"outputs": [],
"source": [
"def sigmoid(sum_func):\n",
"    return 1 / (1 + np.exp(-sum_func))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "rZfW7OGEqAcr",
"outputId": "c0365218-2ab8-4021-9733-d03bca62494e"
},
"outputs": [
{
"data": {
"text/plain": [
"0.5"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sigmoid(0)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "B1bQ461mqAcs",
"outputId": "09d3cee6-2556-44d1-cffd-d1e955e33492"
},
"outputs": [
{
"data": {
"text/plain": [
"7.38905609893065"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.exp(2)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "IHDOx92BqAcs",
"outputId": "1d5de01a-b565-49db-e756-4aa92e09f3d6"
},
"outputs": [
{
"data": {
"text/plain": [
"2.718281828459045"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.exp(1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "RJah_yP9qAcs",
"outputId": "561fd71a-0c9e-4218-c63f-54d9fabf4a9e"
},
"outputs": [
{
"data": {
"text/plain": [
"1.0"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sigmoid(40)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "J-sORjD1qAcs",
"outputId": "0c90d064-72fd-4dd2-a1e8-910f90e529fd"
},
"outputs": [
{
"data": {
"text/plain": [
"1.2501528648238605e-09"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sigmoid(-20.5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "euvFzbaaqAcs"
},
"source": [
"# Input Layer to Hidden Layer"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "73WWcJDdqAcs"
},
"source": [
"### Define \"Inputs, outputs and weights\" as Numpy arrays"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7lF9qPClqAcs"
},
"source": [
"#### Inputs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5AeI3yiuqAcs"
},
"outputs": [],
"source": [
"inputs = np.array([[0,0],\n",
" [0,1],\n",
" [1,0],\n",
" [1,1]])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "vMNF3jWSqAcs",
"outputId": "235cf247-7b01-43a5-a12d-f57b66c6c56d"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 0],\n",
" [0, 1],\n",
" [1, 0],\n",
" [1, 1]])"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"inputs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "dL6d9-2vqAcs",
"outputId": "b4e494f4-2c48-4d10-dffc-14b7fe1efaa2"
},
"outputs": [
{
"data": {
"text/plain": [
"(4, 2)"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"inputs.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mUoffeinqAct"
},
"source": [
"#### Outputs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "0RaM1nQDqAct"
},
"outputs": [],
"source": [
"outputs = np.array([[0],\n",
" [1],\n",
" [1],\n",
" [0]])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "gTTlycuXqAct",
"outputId": "1f0faaef-f61f-4db1-db68-d43fde20a189"
},
"outputs": [
{
"data": {
"text/plain": [
"(4, 1)"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"outputs.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "RLM6-mjzqAct"
},
"source": [
"### Weights"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8gsRh6NHqAct"
},
"source": [
"#### These weights are for the connection between the inputs and the hidden layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "UdbiOFdjqAct",
"outputId": "6a689392-ee20-4fcd-f567-3ba86c00f21b"
},
"outputs": [
{
"data": {
"text/plain": [
"(2, 3)"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# First row holds the weights for x1, 2nd row contains the weights for x2\n",
"\n",
"weights_0 = np.array([[-0.424, -0.740, -0.961],\n",
" [0.358, -0.577, -0.469]])\n",
"weights_0.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bbZOIonRqAct"
},
"source": [
"#### These weights are for the connection between the hidden layer and the output"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "4EgiLMZXqAct",
"outputId": "163ddf2e-d561-4616-f8e9-6174a050f7d8"
},
"outputs": [
{
"data": {
"text/plain": [
"(3, 1)"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weights_1 = np.array([[-0.017],\n",
" [-0.893],\n",
" [0.148]])\n",
"weights_1.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6Zo0uDO8qAct"
},
"source": [
"#### Epochs & Learning Rate"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "8o-8BaUMqAct"
},
"outputs": [],
"source": [
"epochs = 100\n",
"learning_rate = 0.3"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "4mny4DvuqAct"
},
"outputs": [],
"source": [
"#for epoch in epochs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Q6wE-S7UqAct",
"outputId": "45233c67-b567-4e75-8769-40a57a89fdac"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 0],\n",
" [0, 1],\n",
" [1, 0],\n",
" [1, 1]])"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"input_layer = inputs\n",
"input_layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "YwCm1boPqAct",
"outputId": "bdea99df-1b46-4c2e-d367-7495da528640"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0. , 0. , 0. ],\n",
" [ 0.358, -0.577, -0.469],\n",
" [-0.424, -0.74 , -0.961],\n",
" [-0.066, -1.317, -1.43 ]])"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# sum_synapse_0 holds the weighted sum (pre-activation value) for each hidden-layer neuron\n",
"# In the output, each row is the weighted sum for one input instance: [0, 0, 0] -> input 0,0; [0.358, -0.577, -0.469] -> input 0,1; and so on\n",
"# np.dot performs the matrix multiplication, which does the multiply and the sum in one step\n",
"\n",
"sum_synapse_0 = np.dot(input_layer, weights_0)\n",
"sum_synapse_0"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "-qZK6fn5qAct",
"outputId": "b3b8108b-0137-4da6-da43-9198c63ea765"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.5 , 0.5 , 0.5 ],\n",
" [0.5885562 , 0.35962319, 0.38485296],\n",
" [0.39555998, 0.32300414, 0.27667802],\n",
" [0.48350599, 0.21131785, 0.19309868]])"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Computing the Sigmoid function for the Hidden layer\n",
"\n",
"hidden_layer = sigmoid(sum_synapse_0)\n",
"hidden_layer"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "b7v1NakZqAct"
},
"source": [
"Now for the second half of the forward pass: hidden layer to output layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "arlNsISrqAcu",
"outputId": "981018ef-6857-4043-d659-24a3ee82712c"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.017],\n",
" [-0.893],\n",
" [ 0.148]])"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weights_1"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "VyDCFNMiqAcu",
"outputId": "07f7a770-e58a-4188-adde-a0d7a39d795f"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.381 ],\n",
" [-0.27419072],\n",
" [-0.25421887],\n",
" [-0.16834784]])"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# sum_synapse_1 holds the weighted sum (pre-activation value) for the output neuron\n",
"# In the output, each row is the weighted sum for one input instance\n",
"\n",
"sum_synapse_1 = np.dot(hidden_layer, weights_1)\n",
"sum_synapse_1"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "HZIqsARQqAcu",
"outputId": "ba33778e-bb38-47f3-df70-0cd15a7256cf"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.40588573],\n",
" [0.43187857],\n",
" [0.43678536],\n",
" [0.45801216]])"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"output_layer = sigmoid(sum_synapse_1)\n",
"output_layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "JcDaEYaJqAcv",
"outputId": "4c064d85-9e32-4773-ee51-4d4e904c8994"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0],\n",
" [1],\n",
" [1],\n",
" [0]])"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"outputs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Ap31u1PrqAcv",
"outputId": "d7d2f216-91e7-4a0a-b406-167644e0446d"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.40588573],\n",
" [0.43187857],\n",
" [0.43678536],\n",
" [0.45801216]])"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"output_layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Lswsw2hgqAcv",
"outputId": "6192d18e-d869-4e0b-c5da-c7964bc733c9"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.40588573],\n",
" [ 0.56812143],\n",
" [ 0.56321464],\n",
" [-0.45801216]])"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"error_output_layer = outputs - output_layer\n",
"error_output_layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "EGMiI5tnqAcv",
"outputId": "cb3d36c5-a39c-4334-d723-29a0f454d66f"
},
"outputs": [
{
"data": {
"text/plain": [
"0.49880848923713045"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"average_error = np.mean(abs(error_output_layer))\n",
"average_error"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "OXjlMKivqAcv"
},
"source": [
"Sigmoid Derivative"
]
},
{
"cell_type": "markdown",
"source": [],
"metadata": {
"id": "DdDS0W2F48_e"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "L_VH4g2AqAcv"
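},
"outputs": [],
"source": [
"# Takes a sigmoid OUTPUT (not the raw sum): the derivative of sigmoid(x) is sigmoid(x) * (1 - sigmoid(x))\n",
"def sigmoid_derivative(sigmoid_output):\n",
"    return sigmoid_output * (1 - sigmoid_output)"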
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EAsSgsHpqAcv"
},
"source": [
"# Delta output Calculation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Y--xrV5HqAcv",
"outputId": "8c4ee9f0-c415-4c7c-e087-dbdc59c1e45b"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.40588573],\n",
" [0.43187857],\n",
" [0.43678536],\n",
" [0.45801216]])"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# output_layer holds the results of our application of the sigmoid, computed above\n",
"\n",
"output_layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "oV4jRnGXqAcv",
"outputId": "9f3c99fe-e52e-4102-a51c-57c5e9fe0ac5"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.2411425 ],\n",
" [0.24535947],\n",
" [0.24600391],\n",
" [0.24823702]])"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# derivative_output is our Derivative of the activation function (sigmoid) which we have on the slide\n",
"# each row is for each instance of our input dataset\n",
"\n",
"derivative_output = sigmoid_derivative(output_layer)\n",
"derivative_output"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "S4Y48lY3qAcv",
"outputId": "4dd8fae5-b0d6-4fa3-f2d1-aa27e3282247"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.40588573],\n",
" [ 0.56812143],\n",
" [ 0.56321464],\n",
" [-0.45801216]])"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"error_output_layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "b0vIxroyqAcv",
"outputId": "bed08a4b-4ed7-4235-ae6e-92c4c9e6f7b5"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.0978763 ],\n",
" [ 0.13939397],\n",
" [ 0.138553 ],\n",
" [-0.11369557]])"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Delta output\n",
"# each row is for each instance of our input dataset\n",
"\n",
"delta_output = error_output_layer * derivative_output\n",
"delta_output"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "iT8P-280qAcw"
},
"source": [
"# Delta calculations for the Hidden Layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "h-OHZANIqAcw",
"outputId": "e55ee1be-92a4-4a25-e4ea-3c1415407537"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.0978763 ],\n",
" [ 0.13939397],\n",
" [ 0.138553 ],\n",
" [-0.11369557]])"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"delta_output"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "gE0GCA7MqAcw",
"outputId": "c262b665-f71b-4658-dd64-7afe6cbbced0"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.017],\n",
" [-0.893],\n",
" [ 0.148]])"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weights_1"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EEAIW4LMqAcw"
},
"source": [
"#### NOTE THAT:\n",
" * Let's deal with this part first (weight * delta_output)\n",
" * Notice that we get an error below because of the shapes: delta_output is (4, 1) and weights_1 is (3, 1), so weights_1 has to be transposed first"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ZO8ECwLzqAcw",
"outputId": "8b37f58b-5b18-412e-9739-4e9a2031f611"
},
"outputs": [
{
"ename": "ValueError",
"evalue": "shapes (4,1) and (3,1) not aligned: 1 (dim 1) != 3 (dim 0)",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-35-50b740e5a31c>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mdelta_output_x_weight\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mdelta_output\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mweights_1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mValueError\u001b[0m: shapes (4,1) and (3,1) not aligned: 1 (dim 1) != 3 (dim 0)"
]
}
],
"source": [
"delta_output_x_weight = delta_output.dot(weights_1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "eOGSmwgzqAcw",
"outputId": "78106652-5fe2-4582-fb4a-ed9b1da412f5"
},
"outputs": [
{
"data": {
"text/plain": [
"(3, 1)"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weights_1.shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "2VmrhC6mqAcw",
"outputId": "caa44f62-3154-4b5e-98fb-cc44747e5d77"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.017, -0.893, 0.148]])"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weights_1T = weights_1.T\n",
"weights_1T"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Ec6fpu1wqAcw",
"outputId": "0f899f0b-ec31-4f50-eb2c-c7e0c1931492"
},
"outputs": [
{
"data": {
"text/plain": [
"(1, 3)"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weights_1T.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fV_XdKC_qAcw"
},
"source": [
"#### Each one of the weights will have to be multiplied by each delta_output for each data instance \n",
" \n",
" array([[-0.017],\n",
" [-0.893],\n",
" [ 0.148]])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "i6FOg58jqAcy",
"outputId": "80ccfb87-e046-4a9d-e4e7-e8917fbf45d4"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0.0016639 , 0.08740354, -0.01448569],\n",
" [-0.0023697 , -0.12447882, 0.02063031],\n",
" [-0.0023554 , -0.12372783, 0.02050584],\n",
" [ 0.00193282, 0.10153015, -0.01682694]])"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"delta_output_x_weight = delta_output.dot(weights_1T)\n",
"delta_output_x_weight"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZHMDzHlSqAcy"
},
"source": [
"#### NOTE THAT:\n",
" * Now we need to deal with the last part of the equation \n",
" * sigmoid_derivative * delta_output_x_weight"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "0G_6CfrOqAcy",
"outputId": "53c938e0-42af-49f1-808e-74e739078124"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.5 , 0.5 , 0.5 ],\n",
" [0.5885562 , 0.35962319, 0.38485296],\n",
" [0.39555998, 0.32300414, 0.27667802],\n",
" [0.48350599, 0.21131785, 0.19309868]])"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hidden_layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "SgLBM-fBqAcy",
"outputId": "b2fcccf6-4b58-4cfd-e745-d301c30c580e"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0.00041597, 0.02185088, -0.00362142],\n",
" [-0.00057384, -0.02866677, 0.00488404],\n",
" [-0.00056316, -0.02705587, 0.00410378],\n",
" [ 0.00048268, 0.01692128, -0.00262183]])"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Each row in the output of delta_hidden_layer is for the data input values\n",
"\n",
"delta_hidden_layer = delta_output_x_weight * sigmoid_derivative(hidden_layer)\n",
"delta_hidden_layer"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "urPoyUlUqAcy"
},
"source": [
"#### We will deal with the (input * delta) first\n",
"* The first column in \"hidden_layer\" holds the activation value for the first neuron"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "fPGF1TOKqAcy",
"outputId": "1a020639-8ae5-4f0b-b6aa-4f178c622bf8"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.5 , 0.5 , 0.5 ],\n",
" [0.5885562 , 0.35962319, 0.38485296],\n",
" [0.39555998, 0.32300414, 0.27667802],\n",
" [0.48350599, 0.21131785, 0.19309868]])"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hidden_layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "t5rM4mxUqAcz",
"outputId": "7e3ca281-7748-4fb0-cbe5-655202108cdf"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.0978763 ],\n",
" [ 0.13939397],\n",
" [ 0.138553 ],\n",
" [-0.11369557]])"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"delta_output"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BAnIbfyuqAcz"
},
"source": [
"* We need to multiply the \"inputs\" (here, the hidden-layer activations) by \"delta\". For the matrix multiplication to work, we first transpose hidden_layer so that each row holds one neuron's activations for all four data instances"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5SQ839HoqAcz",
"outputId": "999d4638-87a1-483d-a59c-a56cac78c0b3"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.5 , 0.5885562 , 0.39555998, 0.48350599],\n",
" [0.5 , 0.35962319, 0.32300414, 0.21131785],\n",
" [0.5 , 0.38485296, 0.27667802, 0.19309868]])"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hidden_layerT = hidden_layer.T\n",
"hidden_layerT"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "JNvJ6-yrqAcz",
"outputId": "76306471-a9f3-41dc-f88d-dad11eadd791"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.03293657],\n",
" [0.02191844],\n",
" [0.02108814]])"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"input_x_delta1 = hidden_layerT.dot(delta_output)\n",
"input_x_delta1"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nMXRkXH5qAcz"
},
"source": [
"#### Let us now update the \"weights_1\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "PEDTED82qAcz",
"outputId": "51e040e3-6f74-4595-80e4-b8f8ac0b23ea"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.00711903],\n",
" [-0.88642447],\n",
" [ 0.15432644]])"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weights_1 = weights_1 + (input_x_delta1 * learning_rate)\n",
"weights_1"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Y3a5a-2zqAcz"
},
"source": [
"## Dealing with the Hidden Layer to Input Layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "pDLTEKTOqAcz",
"outputId": "d21620e4-fa9d-4544-c4f0-d9d9c4ce5403"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 0],\n",
" [0, 1],\n",
" [1, 0],\n",
" [1, 1]])"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# First column is X1, and 2nd column is X2 (our input values )\n",
"\n",
"input_layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "gVMMsw2jqAcz",
"outputId": "cf6b8e2a-6c9a-4110-b522-1605607f3b6e"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0.00041597, 0.02185088, -0.00362142],\n",
" [-0.00057384, -0.02866677, 0.00488404],\n",
" [-0.00056316, -0.02705587, 0.00410378],\n",
" [ 0.00048268, 0.01692128, -0.00262183]])"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"delta_hidden_layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "VpI0j_WCqAcz",
"outputId": "517631b4-f294-4e52-8578-3a75350b2b0c"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 0, 1, 1],\n",
" [0, 1, 0, 1]])"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# we need to transpose the values just as we did before\n",
"\n",
"input_layerT = input_layer.T\n",
"input_layerT"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "U1L1WkwtqAcz",
"outputId": "7b8b0eaa-c3fe-4aaa-c5de-75267ec140e2"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-8.04778516e-05, -1.01345901e-02, 1.48194623e-03],\n",
" [-9.11603819e-05, -1.17454886e-02, 2.26221011e-03]])"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"input_x_delta0 = input_layerT.dot(delta_hidden_layer)\n",
"input_x_delta0"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Y8JYuTJHqAcz",
"outputId": "2789fd6f-c8bd-4aa9-b03c-3c75defed723"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.42402414, -0.74304038, -0.96055542],\n",
" [ 0.35797265, -0.58052365, -0.46832134]])"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weights_0 = weights_0 + (input_x_delta0 * learning_rate)\n",
"weights_0"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NueQni1tqAcz"
},
"source": [
"### The code above completes our first epoch. We now need to put all of it together so we can run multiple epochs"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pZpeA5k5qAc0"
},
"source": [
"# Complete Artificial Neural Network"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "7-HT1z8sqAc0"
},
"outputs": [],
"source": [
"#Importing Numpy\n",
"import numpy as np\n",
"\n",
"# This is the sigmoid Function\n",
"def sigmoid(sum_func):\n",
"    return 1 / (1 + np.exp(-sum_func))\n",
"\n",
"# This is the sigmoid derivative as used before (it takes a sigmoid output)\n",
"def sigmoid_derivative(sigmoid_output):\n",
"    return sigmoid_output * (1 - sigmoid_output)\n",
"\n",
"# Our input values\n",
"inputs = np.array([[0,0],\n",
" [0,1],\n",
" [1,0],\n",
" [1,1]])\n",
"#Our output values\n",
"outputs = np.array([[0],\n",
" [1],\n",
" [1],\n",
" [0]])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "w2URHhU6qAc0"
},
"outputs": [],
"source": [
"# weights_0 = np.array([[-0.424, -0.740, -0.961],\n",
"# [0.358, -0.577, -0.469]])\n",
"\n",
"# weights_1 = np.array([[-0.017],\n",
"# [-0.893],\n",
"# [0.148]])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "TyVUiXytqAc0"
},
"source": [
"### Initializing our weights with random values\n",
"* <b>Note:</b> np.random.random returns values in [0, 1). Multiplying them by 2 and subtracting 1 maps them to [-1, 1), which gives us a mix of positive and negative random numbers for the weights"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "CRvokXFcqAc0"
},
"outputs": [],
"source": [
"weights_0 = 2 * np.random.random((2, 3)) - 1\n",
"weights_1 = 2 * np.random.random((3, 1)) - 1"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "kNHnmgpeqAc0",
"outputId": "92aa16df-e06c-4401-dd90-d724d14e5d1a"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch: 1 Error: 0.4999269690201742\n",
"Epoch: 100001 Error: 0.015563199459768319\n",
"Epoch: 200001 Error: 0.010326071808337757\n",
"Epoch: 300001 Error: 0.008192022809586367\n"
]
}
],
"source": [
"epochs = 400000\n",
"learning_rate = 0.6\n",
"error = []\n",
"\n",
"for epoch in range(epochs):\n",
"    # Forward pass: input layer -> hidden layer -> output layer\n",
"    input_layer = inputs\n",
"    sum_synapse0 = np.dot(input_layer, weights_0)\n",
"    hidden_layer = sigmoid(sum_synapse0)\n",
"\n",
"    sum_synapse1 = np.dot(hidden_layer, weights_1)\n",
"    output_layer = sigmoid(sum_synapse1)\n",
"\n",
"    error_output_layer = outputs - output_layer\n",
"    average = np.mean(abs(error_output_layer))\n",
"    # Print and record the error every 100,000 epochs\n",
"    if epoch % 100000 == 0:\n",
"        print('Epoch: ' + str(epoch + 1) + ' Error: ' + str(average))\n",
"        error.append(average)\n",
"\n",
"    # Backward pass: deltas for the output layer, then the hidden layer\n",
"    derivative_output = sigmoid_derivative(output_layer)\n",
"    delta_output = error_output_layer * derivative_output\n",
"\n",
"    weights_1T = weights_1.T\n",
"    delta_output_weight = delta_output.dot(weights_1T)\n",
"    delta_hidden_layer = delta_output_weight * sigmoid_derivative(hidden_layer)\n",
"\n",
"    # Weight updates: hidden-to-output first, then input-to-hidden\n",
"    hidden_layerT = hidden_layer.T\n",
"    input_x_delta1 = hidden_layerT.dot(delta_output)\n",
"    weights_1 = weights_1 + (input_x_delta1 * learning_rate)\n",
"\n",
"    input_layerT = input_layer.T\n",
"    input_x_delta0 = input_layerT.dot(delta_hidden_layer)\n",
"    weights_0 = weights_0 + (input_x_delta0 * learning_rate)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UrNetloPqAc0"
},
"source": [
"#### At this point, after running for hundreds of thousands of epochs (1 million in an earlier run, 400,000 in the run above), you can see that the error value is very low."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "3G0qCYE6qAc0",
"outputId": "787fe7bc-c9fc-47cb-be5c-c6f4b31eefa6"
},
"outputs": [
{
"data": {
"text/plain": [
"0.9903290320690693"
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#1 million epochs with a learning rate of 0.3\n",
"1 - 0.009670967930930745"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "UviN0IGAqAc0",
"outputId": "9a52dc57-affa-457f-d9ad-c9a9bb1f7543"
},
"outputs": [
{
"data": {
"text/plain": [
"0.9918079771904136"
]
},
"execution_count": 92,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#after 400,000 epochs, with a learning rate of 0.6\n",
"1- 0.008192022809586367"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "csQl8orzqAc0"
},
"source": [
"# Let's visualize this result"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "OgiBiWkKqAc0"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "wgYwKrx3qAc0",
"outputId": "8a8f8f34-2435-4bd7-f9f1-d69eda514fd0"
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.xlabel('Number of Epochs')\n",
"plt.ylabel('Error')\n",
"plt.title('Plot showing results from Neural Network')\n",
"plt.plot(error)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "8pLBUsrUqAc0",
"outputId": "3d3130f3-e540-48bc-9faf-a9fc2523c9ff"
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.xlabel('Number of Epochs')\n",
"plt.ylabel('Error')\n",
"plt.title('Plot showing results from Neural Network')\n",
"plt.plot(error)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_CPGHGIUqAc1"
},
"source": [
"#### Comparing the outputs and the predictions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "P8k6xUoLqAc1",
"outputId": "69e2c004-e912-490d-985e-14f86ba5b80a"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0],\n",
" [1],\n",
" [1],\n",
" [0]])"
]
},
"execution_count": 74,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"outputs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "WHl5K1IlqAc1",
"outputId": "42d32027-a0a3-4794-ff35-0971d77856fa"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.0105802 ],\n",
" [0.9826343 ],\n",
" [0.98124727],\n",
" [0.02222762]])"
]
},
"execution_count": 75,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"output_layer"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CEC0iCDfqAc1"
},
"source": [
"#### * Comparing the two arrays, we see that our neural network's predictions are close to the actual target values.\n",
"#### * This shows that our neural network can handle the complexity of the XOR operator dataset, which is not linearly separable."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QmVC2QavqAc1"
},
"source": [
"* Let us see the updated weights. These are the weights we will require if we want to make future predictions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Sjt5y3XgqAc1",
"outputId": "a29d52fe-c626-408e-beeb-d4014529b09f"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.04578885, -5.45125419, -1.01868082],\n",
" [ 1.04772434, -5.25578328, -0.5958262 ]])"
]
},
"execution_count": 77,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weights_0"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "VN-RO2wsqAc1",
"outputId": "b27e8087-aa30-4952-cc42-3e7d9b26b425"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-14.93268248],\n",
" [-37.16040415],\n",
" [ 43.01681387]])"
]
},
"execution_count": 78,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weights_1"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "g3QxkIBYqAc1"
},
"outputs": [],
"source": [
"# This function accepts an instance of a dataset\n",
"\n",
"def calculate_output(instance):\n",
"    #input to hidden layer\n",
"    hidden_layer = sigmoid(np.dot(instance, weights_0))\n",
"    #hidden to output layer\n",
"    output_layer = sigmoid(np.dot(hidden_layer, weights_1))\n",
"    return output_layer[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "0vTs9KUbqAc1",
"outputId": "9e9bd79d-5806-4ca8-a568-8801e437fc9c"
},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 95,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"round(calculate_output(np.array([0, 0])))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "pYxALdNOqAc1",
"outputId": "b78d1f10-bd23-40c6-93be-89d9c86b931f"
},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 96,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"round(calculate_output(np.array([0, 1])))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "yj9-9ifZqAc1",
"outputId": "44c89328-2891-4968-cf7e-cd2b30e367a3"
},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 97,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"round(calculate_output(np.array([1, 0])))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "A6n4cav9qAc1",
"outputId": "0fbef7fe-c267-4fce-8d67-7ab56d4a3141"
},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 98,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"round(calculate_output(np.array([1, 1])))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "y11IEZdKqAc1"
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
},
"colab": {
"provenance": [],
"include_colab_link": true
}
},
"nbformat": 4,
"nbformat_minor": 0
}
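
The single-epoch walkthrough in the notebook (forward pass, output delta, hidden delta, then both weight updates) can be condensed into one function. The following is a minimal sketch under stated assumptions, not the notebook author's code: the name `train_step`, its signature, and its return tuple are hypothetical and chosen only for illustration. It reuses the fixed starting weights and the learning rate of 0.3 from the walkthrough, so its results can be compared with the values shown in the notebook outputs.

```python
import numpy as np


def sigmoid(x):
    # Logistic activation: maps any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))


def sigmoid_derivative(s):
    # s is already a sigmoid output, so the derivative is s * (1 - s)
    return s * (1.0 - s)


def train_step(inputs, targets, weights_0, weights_1, learning_rate):
    """Hypothetical helper: one full-batch epoch of the 2-3-1 network.

    Returns (new_weights_0, new_weights_1, mean_absolute_error).
    """
    # Forward pass
    hidden = sigmoid(inputs @ weights_0)   # shape (4, 3)
    output = sigmoid(hidden @ weights_1)   # shape (4, 1)

    # Error and deltas (both deltas use the weights from BEFORE the update)
    error = targets - output
    delta_output = error * sigmoid_derivative(output)                          # (4, 1)
    delta_hidden = (delta_output @ weights_1.T) * sigmoid_derivative(hidden)   # (4, 3)

    # Weight updates, mirroring the notebook's "weights + input_x_delta * learning_rate"
    new_weights_1 = weights_1 + learning_rate * (hidden.T @ delta_output)
    new_weights_0 = weights_0 + learning_rate * (inputs.T @ delta_hidden)
    return new_weights_0, new_weights_1, float(np.mean(np.abs(error)))


# XOR data and the fixed starting weights used in the walkthrough
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = np.array([[0], [1], [1], [0]])
weights_0 = np.array([[-0.424, -0.740, -0.961],
                      [0.358, -0.577, -0.469]])
weights_1 = np.array([[-0.017], [-0.893], [0.148]])

new_w0, new_w1, avg_error = train_step(inputs, targets, weights_0, weights_1, 0.3)
print(avg_error)
print(new_w1)
print(new_w0)
```

Calling `train_step` repeatedly, feeding each call's returned weights into the next, gives the same kind of multi-epoch loop as the complete network cell. Returning new arrays instead of rebinding module-level names is a design choice made here to keep the step easy to test.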