mrnabati · June 21, 2020 21:14
diff --git a/010_advanced_pytorch_modifying_the_last_layer.ipynb b/010_advanced_pytorch_modifying_the_last_layer.ipynb
 {
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# PyTorch 101: Modifying the Last Layer\n",
    "\n",
    "The pre-trained classification models provided in PyTorch's `torchvision` package are trained on the [ImageNet](http://www.image-net.org/) dataset and can be used out of the box on it. Often, though, you want to use these models on another image dataset, or on your own custom dataset. This usually requires modifying and fine-tuning the model to work with the new dataset. Changing the output dimension of the last layer is usually among the first changes you need to make, and that's the focus of this post.\n",
    "\n",
    "Let's start by loading a pre-trained model from the `torchvision` package. We use the [VGG16](https://arxiv.org/abs/1409.1556) model, pre-trained on the ImageNet dataset with 1000 object categories. Let's take a look at the top-level modules of this model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "odict_keys(['features', 'avgpool', 'classifier'])\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "import torch.nn as nn\n",
    "import torchvision.models as models\n",
    "\n",
    "vgg16 = models.vgg16(pretrained=True)\n",
    "print(vgg16._modules.keys())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We are only interested in the last layer, so let's print the layers in the 'classifier' module:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Sequential(\n",
      "  (0): Linear(in_features=25088, out_features=4096, bias=True)\n",
      "  (1): ReLU(inplace=True)\n",
      "  (2): Dropout(p=0.5, inplace=False)\n",
      "  (3): Linear(in_features=4096, out_features=4096, bias=True)\n",
      "  (4): ReLU(inplace=True)\n",
      "  (5): Dropout(p=0.5, inplace=False)\n",
      "  (6): Linear(in_features=4096, out_features=1000, bias=True)\n",
      ")\n"
     ]
    }
   ],
   "source": [
    "print(vgg16.classifier)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As expected, the output dimension of the last layer is 1000. Let's assume we are going to use this model on the [COCO dataset](http://cocodataset.org/#home) with 80 object categories. To change the output dimension of the model to 80, we simply replace the last sub-layer of the classifier with a new `nn.Linear` layer. `nn.Linear` takes two required arguments: `in_features` and `out_features`. `in_features` stays the same as before, and `out_features` is going to be 80:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Sequential(\n",
      "  (0): Linear(in_features=25088, out_features=4096, bias=True)\n",
      "  (1): ReLU(inplace=True)\n",
      "  (2): Dropout(p=0.5, inplace=False)\n",
      "  (3): Linear(in_features=4096, out_features=4096, bias=True)\n",
      "  (4): ReLU(inplace=True)\n",
      "  (5): Dropout(p=0.5, inplace=False)\n",
      "  (6): Linear(in_features=4096, out_features=80, bias=True)\n",
      ")\n"
     ]
    }
   ],
   "source": [
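    "# Reuse the input size of the existing last layer (4096 for VGG16)\n",
    "in_features = vgg16.classifier[-1].in_features\n",
    "out_features = 80\n",
    "# Swap in a new, randomly initialized Linear layer with 80 outputs\n",
    "vgg16.classifier[-1] = nn.Linear(in_features, out_features, bias=True)\n",
    "print(vgg16.classifier)"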
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That's it! The output dimension is now 80. Keep in mind that replacing the last layer discards its learned parameters: the new layer is randomly initialized, so you need to fine-tune the model on the new dataset to learn them again."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
 }
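
The fine-tuning step mentioned at the end of the notebook could look roughly like the sketch below. It is not part of the notebook: `TinyNet` is a hypothetical stand-in that only mirrors the `features`/`classifier` layout of VGG16 at a tiny scale, the batch is random data, and the choices of SGD, the learning rate, and single-label cross-entropy are assumptions made for illustration. The idea is to freeze the existing parameters, replace the last layer, and train only the new layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


# Hypothetical stand-in for a pre-trained network; it mirrors the
# features/classifier layout of torchvision's VGG16 at a tiny scale.
class TinyNet(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Sequential(
            nn.Linear(8 * 4 * 4, 32),
            nn.ReLU(inplace=True),
            nn.Linear(32, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)


model = TinyNet()

# Freeze every existing parameter so only the new layer gets trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the last layer; a freshly built nn.Linear is trainable by default.
in_features = model.classifier[-1].in_features
model.classifier[-1] = nn.Linear(in_features, 80)

# Hand the optimizer only the parameters that still require gradients.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()  # assumes one label per image

# One training step on random data (placeholder for a real DataLoader batch).
images = torch.randn(4, 3, 32, 32)
labels = torch.randint(0, 80, (4,))
frozen_before = model.classifier[0].weight.clone()

model.train()
optimizer.zero_grad()
logits = model(images)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()

print(logits.shape)
```

With the real model, the same steps would apply to `vgg16` after loading it as in the notebook. Freezing everything but the new layer is only one option; leaving more layers trainable, usually with a smaller learning rate, is another common choice.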