Created
November 22, 2022 19:29
-
-
Save deebuls/3796d4b8f2bd2c1324407aa31b4d03ff to your computer and use it in GitHub Desktop.
fomo.ipynb
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| { | |
| "nbformat": 4, | |
| "nbformat_minor": 0, | |
| "metadata": { | |
| "colab": { | |
| "provenance": [], | |
| "authorship_tag": "ABX9TyOjR21sSY9zNjU5RgfDxWBE", | |
| "include_colab_link": true | |
| }, | |
| "kernelspec": { | |
| "name": "python3", | |
| "display_name": "Python 3" | |
| }, | |
| "language_info": { | |
| "name": "python" | |
| } | |
| }, | |
| "cells": [ | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "id": "view-in-github", | |
| "colab_type": "text" | |
| }, | |
| "source": [ | |
| "<a href=\"https://colab.research.google.com/gist/deebuls/3796d4b8f2bd2c1324407aa31b4d03ff/fomo.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "# Faster Objects More Objects aka FOMO\n", | |
| "## Pytorch implementation \n", | |
| "\n", | |
| "FOMO introduced by [Edge Impulse](https://peter-ing.medium.com/introducing-faster-objects-more-objects-aka-fomo-3ce1c4ce2e3a) is actually rebranded architecture callen bnn which was intially developed by Mat Palm and explained in the [blog](http://matpalm.com/blog/counting_bees/). The tensorflow code was made available in [github](https://github.com/matpalm/bnn).\n", | |
| "\n", | |
| "The architecture diagram of FOMO/BNN is describe sa shown below : \n", | |
| "\n", | |
| "\n", | |
| "Here I try to convert the above diagram into a pytorch model. Hoep it helps anyone looking to deploy the FOMO model in real world." | |
| ], | |
| "metadata": { | |
| "id": "ELS0uQOvwQUT" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [], | |
| "metadata": { | |
| "id": "s5Sfx7MdykSO" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "## Model Description as per Mat Palm\n", | |
| "\n", | |
| "```\n", | |
| "the model the architecture of the network is a very vanilla u-net.\n", | |
| "\n", | |
| "a fully convolutional network trained on half resolution patches but run\n", | |
| " against full resolution images encoding is a sequence of 4 3x3 convolutions\n", | |
| " with stride 2 decoding is a sequence of nearest neighbours resizes + 3x3 \n", | |
| " convolution (stride 1) + skip connection from the encoders final layer is a \n", | |
| " 1x1 convolution (stride 1) with sigmoid activation (i.e. binary bee / no bee\n", | |
| " choice per pixel) after some emperical experiments i chose to only decode \n", | |
| " back to half the resolution of the input. it was good enough.\n", | |
| "\n", | |
| "i did the decoding using a nearest neighbour resize instead of a deconvolution \n", | |
| "pretty much out of habit.\n", | |
| "```" | |
| ], | |
| "metadata": { | |
| "id": "gZyvlY2Kyk0G" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 3, | |
| "metadata": { | |
| "id": "j_VnAOzGGm9Q" | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "import torch\n", | |
| "import torch.nn as nn" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "#ToDo Questions\n", | |
| "# how is the padding working should it be the paper is same but we are doing zero\n", | |
| "# upsampling what mode should it be \n", | |
| "\n", | |
| "class FOMO(torch.nn.Module):\n", | |
| " def __init__(self):\n", | |
| " super(FOMO, self).__init__()\n", | |
| "\n", | |
| " #Reduction\n", | |
| " #3x3 conv stride 2 with 4 out channel \n", | |
| " self.conv1 = torch.nn.Conv2d(in_channels=3, out_channels=4, kernel_size=3, stride=2, padding=(1,1))\n", | |
| " #3x3 conv stride 2 with 8 out channel \n", | |
| " self.conv2 = torch.nn.Conv2d(in_channels=4, out_channels=8, kernel_size=3, stride=2, padding=(1,1))\n", | |
| " #3x3 conv stride 2 with 16 out channel \n", | |
| " self.conv3 = torch.nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, stride=2, padding=(1,1))\n", | |
| " #3x3 conv stride 2 with 32 out channel \n", | |
| " self.conv4 = torch.nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, stride=2, padding=(1,1))\n", | |
| " #3x3 conv stride 1 with 16 out channel \n", | |
| " self.conv5 = torch.nn.Conv2d(in_channels=32, out_channels=16, kernel_size=3, stride=1, padding='same')\n", | |
| "\n", | |
| " self.upsample = torch.nn.Upsample(scale_factor=2, mode='bilinear')\n", | |
| "\n", | |
| " #Increasing\n", | |
| " self.conv6 = torch.nn.Conv2d(in_channels=32, out_channels=8, kernel_size=3, stride=1, padding='same')\n", | |
| " self.conv7 = torch.nn.Conv2d(in_channels=16, out_channels=4, kernel_size=3, stride=1, padding='same')\n", | |
| " self.conv8 = torch.nn.Conv2d(in_channels=8, out_channels=1, kernel_size=1, stride=1, padding='same')\n", | |
| "\n", | |
| " \n", | |
| "\n", | |
| " \n", | |
| "\n", | |
| " def forward(self, x):\n", | |
| " \n", | |
| " #Downsample\n", | |
| " out1 = self.conv1(x)\n", | |
| " out2 = self.conv2(out1)\n", | |
| " out3 = self.conv3(out2)\n", | |
| "\n", | |
| " output = self.conv4(out3)\n", | |
| " \n", | |
| " output = self.upsample(output)\n", | |
| " output = self.conv5(output)\n", | |
| " output = torch.concat(( output, out3), dim=1)\n", | |
| " output = self.upsample(output)\n", | |
| " output = self.conv6(output)\n", | |
| " output = torch.concat(( output, out2), dim=1)\n", | |
| " output = self.upsample(output)\n", | |
| " output = self.conv7(output)\n", | |
| " output = torch.concat(( output, out1), dim=1)\n", | |
| " output = self.conv8(output)\n", | |
| "\n", | |
| " return output\n", | |
| "\n", | |
| " \n" | |
| ], | |
| "metadata": { | |
| "id": "RPJHjmLLGpRF" | |
| }, | |
| "execution_count": 9, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "model = FOMO()\n", | |
| "print (model)" | |
| ], | |
| "metadata": { | |
| "colab": { | |
| "base_uri": "https://localhost:8080/" | |
| }, | |
| "id": "6tzoJPtuJ4Y4", | |
| "outputId": "41e106cb-0832-48f8-ddd2-523f9780e464" | |
| }, | |
| "execution_count": 10, | |
| "outputs": [ | |
| { | |
| "output_type": "stream", | |
| "name": "stdout", | |
| "text": [ | |
| "FOMO(\n", | |
| " (conv1): Conv2d(3, 4, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))\n", | |
| " (conv2): Conv2d(4, 8, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))\n", | |
| " (conv3): Conv2d(8, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))\n", | |
| " (conv4): Conv2d(16, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))\n", | |
| " (conv5): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=same)\n", | |
| " (upsample): Upsample(scale_factor=2.0, mode=bilinear)\n", | |
| " (conv6): Conv2d(32, 8, kernel_size=(3, 3), stride=(1, 1), padding=same)\n", | |
| " (conv7): Conv2d(16, 4, kernel_size=(3, 3), stride=(1, 1), padding=same)\n", | |
| " (conv8): Conv2d(8, 1, kernel_size=(1, 1), stride=(1, 1), padding=same)\n", | |
| ")\n" | |
| ] | |
| } | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "x = torch.randn(1, 3, 512, 384)\n", | |
| "y = model(x)\n", | |
| "print (y.shape)" | |
| ], | |
| "metadata": { | |
| "colab": { | |
| "base_uri": "https://localhost:8080/" | |
| }, | |
| "id": "cMlg8AR0VlR-", | |
| "outputId": "568cedcb-1a23-4fac-8bfd-372b973df3f0" | |
| }, | |
| "execution_count": 11, | |
| "outputs": [ | |
| { | |
| "output_type": "stream", | |
| "name": "stdout", | |
| "text": [ | |
| "torch.Size([1, 1, 256, 192])\n" | |
| ] | |
| } | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "%timeit y=model(x)" | |
| ], | |
| "metadata": { | |
| "colab": { | |
| "base_uri": "https://localhost:8080/" | |
| }, | |
| "id": "SIVQaqD9hPul", | |
| "outputId": "05b019e1-0645-433b-ef31-f72f67a80d22" | |
| }, | |
| "execution_count": 7, | |
| "outputs": [ | |
| { | |
| "output_type": "stream", | |
| "name": "stdout", | |
| "text": [ | |
| "19.8 ms ± 2.77 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" | |
| ] | |
| } | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [], | |
| "metadata": { | |
| "id": "TcWExgJLz0Ue" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "## ToDo train on a dataset \n", | |
| "\n" | |
| ], | |
| "metadata": { | |
| "id": "uLux4Q-Y0Q1o" | |
| } | |
| } | |
| ] | |
| } |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment