
@deebuls
Created November 22, 2022 19:29
fomo.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"authorship_tag": "ABX9TyOjR21sSY9zNjU5RgfDxWBE",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/deebuls/3796d4b8f2bd2c1324407aa31b4d03ff/fomo.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"# Faster Objects More Objects aka FOMO\n",
"## Pytorch implementation \n",
"\n",
"FOMO introduced by [Edge Impulse](https://peter-ing.medium.com/introducing-faster-objects-more-objects-aka-fomo-3ce1c4ce2e3a) is actually rebranded architecture callen bnn which was intially developed by Mat Palm and explained in the [blog](http://matpalm.com/blog/counting_bees/). The tensorflow code was made available in [github](https://github.com/matpalm/bnn).\n",
"\n",
"The architecture diagram of FOMO/BNN is describe sa shown below : \n",
"![](http://matpalm.com/blog/imgs/2018/bnn/network.png)\n",
"\n",
"Here I try to convert the above diagram into a pytorch model. Hoep it helps anyone looking to deploy the FOMO model in real world."
],
"metadata": {
"id": "ELS0uQOvwQUT"
}
},
{
"cell_type": "markdown",
"source": [
"## Model Description as per Mat Palm\n",
"\n",
"```\n",
"the model the architecture of the network is a very vanilla u-net.\n",
"\n",
"a fully convolutional network trained on half resolution patches but run\n",
" against full resolution images encoding is a sequence of 4 3x3 convolutions\n",
" with stride 2 decoding is a sequence of nearest neighbours resizes + 3x3 \n",
" convolution (stride 1) + skip connection from the encoders final layer is a \n",
" 1x1 convolution (stride 1) with sigmoid activation (i.e. binary bee / no bee\n",
" choice per pixel) after some emperical experiments i chose to only decode \n",
" back to half the resolution of the input. it was good enough.\n",
"\n",
"i did the decoding using a nearest neighbour resize instead of a deconvolution \n",
"pretty much out of habit.\n",
"```"
],
"metadata": {
"id": "gZyvlY2Kyk0G"
}
},
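{
"cell_type": "markdown",
"source": [
"To make the resolution bookkeeping concrete (a small illustration, not part of the original gist): four stride-2 convolutions divide each spatial dimension by 16, and three 2x upsamples in the decoder bring it back up, so the output ends at half the input resolution."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Illustration only: track the spatial resolution through the network\n",
"h, w = 512, 384\n",
"for _ in range(4):           # four 3x3 convolutions with stride 2\n",
"    h, w = h // 2, w // 2    # each halves the resolution\n",
"print(h, w)                  # 32 24 at the bottleneck\n",
"for _ in range(3):           # three 2x upsamples in the decoder\n",
"    h, w = h * 2, w * 2\n",
"print(h, w)                  # 256 192 -> half the 512x384 input"
],
"metadata": {},
"execution_count": null,
"outputs": []
},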
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"id": "j_VnAOzGGm9Q"
},
"outputs": [],
"source": [
"import torch\n",
"import torch.nn as nn"
]
},
{
"cell_type": "code",
"source": [
"#ToDo Questions\n",
"# how is the padding working should it be the paper is same but we are doing zero\n",
"# upsampling what mode should it be \n",
"\n",
"class FOMO(torch.nn.Module):\n",
" def __init__(self):\n",
" super(FOMO, self).__init__()\n",
"\n",
" #Reduction\n",
" #3x3 conv stride 2 with 4 out channel \n",
" self.conv1 = torch.nn.Conv2d(in_channels=3, out_channels=4, kernel_size=3, stride=2, padding=(1,1))\n",
" #3x3 conv stride 2 with 8 out channel \n",
" self.conv2 = torch.nn.Conv2d(in_channels=4, out_channels=8, kernel_size=3, stride=2, padding=(1,1))\n",
" #3x3 conv stride 2 with 16 out channel \n",
" self.conv3 = torch.nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, stride=2, padding=(1,1))\n",
" #3x3 conv stride 2 with 32 out channel \n",
" self.conv4 = torch.nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, stride=2, padding=(1,1))\n",
" #3x3 conv stride 1 with 16 out channel \n",
" self.conv5 = torch.nn.Conv2d(in_channels=32, out_channels=16, kernel_size=3, stride=1, padding='same')\n",
"\n",
" self.upsample = torch.nn.Upsample(scale_factor=2, mode='bilinear')\n",
"\n",
" #Increasing\n",
" self.conv6 = torch.nn.Conv2d(in_channels=32, out_channels=8, kernel_size=3, stride=1, padding='same')\n",
" self.conv7 = torch.nn.Conv2d(in_channels=16, out_channels=4, kernel_size=3, stride=1, padding='same')\n",
" self.conv8 = torch.nn.Conv2d(in_channels=8, out_channels=1, kernel_size=1, stride=1, padding='same')\n",
"\n",
" \n",
"\n",
" \n",
"\n",
" def forward(self, x):\n",
" \n",
" #Downsample\n",
" out1 = self.conv1(x)\n",
" out2 = self.conv2(out1)\n",
" out3 = self.conv3(out2)\n",
"\n",
" output = self.conv4(out3)\n",
" \n",
" output = self.upsample(output)\n",
" output = self.conv5(output)\n",
" output = torch.concat(( output, out3), dim=1)\n",
" output = self.upsample(output)\n",
" output = self.conv6(output)\n",
" output = torch.concat(( output, out2), dim=1)\n",
" output = self.upsample(output)\n",
" output = self.conv7(output)\n",
" output = torch.concat(( output, out1), dim=1)\n",
" output = self.conv8(output)\n",
"\n",
" return output\n",
"\n",
" \n"
],
"metadata": {
"id": "RPJHjmLLGpRF"
},
"execution_count": 9,
"outputs": []
},
{
"cell_type": "code",
"source": [
"model = FOMO()\n",
"print (model)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "6tzoJPtuJ4Y4",
"outputId": "41e106cb-0832-48f8-ddd2-523f9780e464"
},
"execution_count": 10,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"FOMO(\n",
" (conv1): Conv2d(3, 4, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))\n",
" (conv2): Conv2d(4, 8, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))\n",
" (conv3): Conv2d(8, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))\n",
" (conv4): Conv2d(16, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))\n",
" (conv5): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=same)\n",
" (upsample): Upsample(scale_factor=2.0, mode=bilinear)\n",
" (conv6): Conv2d(32, 8, kernel_size=(3, 3), stride=(1, 1), padding=same)\n",
" (conv7): Conv2d(16, 4, kernel_size=(3, 3), stride=(1, 1), padding=same)\n",
" (conv8): Conv2d(8, 1, kernel_size=(1, 1), stride=(1, 1), padding=same)\n",
")\n"
]
}
]
},
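{
"cell_type": "markdown",
"source": [
"FOMO's selling point is its tiny footprint. A quick parameter count (a convenience snippet, not part of the original gist):"
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Count trainable parameters; FOMO/BNN is deliberately small\n",
"n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
"print(f'trainable parameters: {n_params}')"
],
"metadata": {},
"execution_count": null,
"outputs": []
},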
{
"cell_type": "code",
"source": [
"x = torch.randn(1, 3, 512, 384)\n",
"y = model(x)\n",
"print (y.shape)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "cMlg8AR0VlR-",
"outputId": "568cedcb-1a23-4fac-8bfd-372b973df3f0"
},
"execution_count": 11,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"torch.Size([1, 1, 256, 192])\n"
]
}
]
},
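{
"cell_type": "markdown",
"source": [
"Mat Palm's description calls for a sigmoid on the final 1x1 convolution, while `forward` above returns raw logits. A minimal sketch of getting per-pixel probabilities, assuming the sigmoid is applied outside the model (which also lets `BCEWithLogitsLoss` be used during training):"
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# The model returns logits; apply a sigmoid for per-pixel object probabilities\n",
"probs = torch.sigmoid(y)\n",
"print(probs.min().item(), probs.max().item())  # values now lie in [0, 1]"
],
"metadata": {},
"execution_count": null,
"outputs": []
},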
{
"cell_type": "code",
"source": [
"%timeit y=model(x)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "SIVQaqD9hPul",
"outputId": "05b019e1-0645-433b-ef31-f72f67a80d22"
},
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"19.8 ms ± 2.77 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
]
}
]
},
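{
"cell_type": "markdown",
"source": [
"For real-world deployment, one option is an ONNX export. A minimal sketch (the file name `fomo.onnx` and the tensor names are just examples; opset 11 or later is needed for the bilinear `Upsample`):"
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Sketch: export the model for deployment via ONNX\n",
"torch.onnx.export(model, x, 'fomo.onnx', opset_version=11,\n",
"                  input_names=['image'], output_names=['heatmap'])"
],
"metadata": {},
"execution_count": null,
"outputs": []
},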
{
"cell_type": "markdown",
"source": [
"## ToDo train on a dataset \n",
"\n"
],
"metadata": {
"id": "uLux4Q-Y0Q1o"
}
}
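,
{
"cell_type": "markdown",
"source": [
"A minimal training-loop sketch (not part of the original gist): it assumes a `train_loader` yielding batches of images with half-resolution binary target masks, and uses `BCEWithLogitsLoss` since the model outputs logits."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Sketch of a training loop; `train_loader` is a placeholder for a DataLoader\n",
"# yielding (images, masks) pairs, where masks is a half-resolution binary\n",
"# target of shape (N, 1, H/2, W/2)\n",
"criterion = torch.nn.BCEWithLogitsLoss()\n",
"optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)\n",
"\n",
"for epoch in range(10):\n",
"    for images, masks in train_loader:\n",
"        optimizer.zero_grad()\n",
"        logits = model(images)\n",
"        loss = criterion(logits, masks.float())\n",
"        loss.backward()\n",
"        optimizer.step()\n",
"    print(f'epoch {epoch}: loss {loss.item():.4f}')"
],
"metadata": {},
"execution_count": null,
"outputs": []
}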
]
}