{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Imperative Machine Learning with MXNet and Foobar\n",
"\n",
"This tutorial introduces how to do imperative tensor computation with MXNet and how to use Foo, which is a new user friendly interface for MXNet that doesn't have a name yet :)\n",
"\n",
"You can download this tutorial here: https://gist.github.com/piiswrong/2716581ebeb3d6560c1ad916a793f90d\n",
"\n",
"API Reference for Foo package can be found here: http://mxnet-doc.s3-website-us-east-1.amazonaws.com/api/python/foo.html\n",
"\n",
"## Setup\n",
"\n",
"You need to clone MXNet and checkout the nn branch:\n",
"```\n",
"git clone https://github.com/dmlc/mxnet.git --recursive\n",
"git checkout nn\n",
"```\n",
"Then follow the \"Build from Source\" section of the [installation guide]( http://mxnet.io/get_started/install.html)\n",
"\n",
"Now you can import MXNet"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from __future__ import print_function\n",
"import mxnet as mx"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Basics\n",
"\n",
"NDArray and Operators"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x: <NDArray 2x2 @cpu(0)>\n",
"[[ 1. 2.]\n",
" [ 3. 4.]]\n"
]
}
],
"source": [
"x = mx.nd.array([[1, 2], [3, 4]])\n",
"y = mx.nd.array([[5, 6], [7, 8]])\n",
"# gpu_x = mx.nd.array([1, 2], ctx=mx.gpu(0))\n",
"print('x: ', x)\n",
"print(x.asnumpy())"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 19. 22.]\n",
" [ 43. 50.]]\n"
]
}
],
"source": [
"z = mx.nd.dot(x, y)\n",
"print(z.asnumpy())"
]
},
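{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that `*` is elementwise multiplication, while `mx.nd.dot` is the matrix product. Comparing the two on the same arrays:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# elementwise product, in contrast to the matrix product above\n",
"print((x * y).asnumpy())"
]
},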
{
"cell_type": "markdown",
"metadata": {},
"source": [
"See [here](http://mxnet.io/api/python/ndarray.html) for a list of all operators"
]
},
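{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, a few common operators applied to the same arrays:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# elementwise addition, exponentiation, reduction, and transpose\n",
"print((x + y).asnumpy())\n",
"print(mx.nd.exp(x).asnumpy())\n",
"print(mx.nd.sum(x).asnumpy())\n",
"print(mx.nd.transpose(x).asnumpy())"
]
},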
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Automatic differentiation\n",
"\n",
"Attach gradient buffers to NDArrays:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from mxnet.autograd import *\n",
"\n",
"dx = mx.nd.zeros_like(x)\n",
"dy = mx.nd.zeros_like(y)\n",
"mark_variables(x, dx)\n",
"mark_variables(y, dy)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"backward, gradient, and the train_section:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"dx = [[ 5. 6.]\n",
" [ 7. 8.]]\n",
"dy = [[ 1. 2.]\n",
" [ 3. 4.]]\n"
]
}
],
"source": [
"with train_section():\n",
" z = x * y\n",
" z.backward()\n",
"print('dx = ', dx.asnumpy())\n",
"print('dy = ', dy.asnumpy())"
]
},
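{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since `z = x * y` is an elementwise product, the chain rule gives `dx = y` and `dy = x` (the head gradient defaults to ones). A quick numpy check:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"# for z = x * y, dz/dx = y and dz/dy = x\n",
"print(np.allclose(dx.asnumpy(), y.asnumpy()))\n",
"print(np.allclose(dy.asnumpy(), x.asnumpy()))"
]
},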
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Supplying head gradient (gradient w.r.t z):"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"dx = [[ 10. 12.]\n",
" [ 14. 16.]]\n",
"dy = [[ 2. 4.]\n",
" [ 6. 8.]]\n"
]
}
],
"source": [
"dz = mx.nd.ones_like(z)*2\n",
"with train_section():\n",
" z = x * y\n",
" z.backward(dz)\n",
"print('dx = ', dx.asnumpy())\n",
"print('dy = ', dy.asnumpy())"
]
},
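{
"cell_type": "markdown",
"metadata": {},
"source": [
"The head gradient scales the result, so here `dx = dz * y` and `dy = dz * x`. Verifying:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"# with head gradient dz, dx = dz * y and dy = dz * x\n",
"print(np.allclose(dx.asnumpy(), (dz * y).asnumpy()))\n",
"print(np.allclose(dy.asnumpy(), (dz * x).asnumpy()))"
]
},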
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Foo\n",
"\n",
"### Layers\n",
"\n",
"Foo provides basic neural network building block as `Layer`. For example, `Dense(4, in_units=2)` is a fully connected layer that takes in length 2 inputs and produce length 4 outputs:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from mxnet import foo\n",
"from mxnet.foo import nn\n",
"dense = nn.Dense(4, activation='relu', in_units=2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before we can use it, we must initialize dense's parameters:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"weight: <NDArray 4x2 @cpu(0)>\n",
"[[ 0.04881352 0.09284461]\n",
" [ 0.21518934 0.34426576]\n",
" [ 0.10276335 0.35794562]\n",
" [ 0.04488319 0.34725171]]\n",
"bias: <NDArray 4 @cpu(0)>\n",
"[ 0. 0. 0. 0.]\n"
]
}
],
"source": [
"dense.all_params().initialize(mx.init.Uniform(0.5), ctx=mx.cpu(0))\n",
"print('weight: ', dense.weight.data())\n",
"print(dense.weight.data().asnumpy())\n",
"\n",
"print('bias: ', dense.bias.data())\n",
"print(dense.bias.data().asnumpy())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we are ready to do a *forward pass*:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 0.23450273 0.90372086 0.8186546 0.73938662]\n",
" [ 0.51781899 2.02263117 1.74007249 1.52365637]]\n"
]
}
],
"source": [
"output = dense(x)\n",
"print(output.asnumpy())"
]
},
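{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sanity check, the output should match a manual `relu(dot(x, weight.T) + bias)` in numpy, the standard fully connected formula (note the 4x2 weight shape printed above):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"w = dense.weight.data().asnumpy()\n",
"b = dense.bias.data().asnumpy()\n",
"# relu(x . W^T + b)\n",
"manual = np.maximum(x.asnumpy().dot(w.T) + b, 0)\n",
"print(np.allclose(manual, output.asnumpy()))"
]
},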
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here `x` is interpreted as a input with batch_size=2 and length 2.\n",
"\n",
"### Composing Layers\n",
"You can compose multiple layers into a neural network by inheriting `nn.Layer`:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"class Net(nn.Layer):\n",
" def __init__(self, **kwargs):\n",
" super(Net, self).__init__(**kwargs)\n",
" with self.scope:\n",
" # layers assigned to self in scope will be registered as sub-layers\n",
" self.fc1 = nn.Dense(4, in_units=2)\n",
" self.fc2 = nn.Dense(3, in_units=4)\n",
" \n",
" def generic_forward(self, F, x):\n",
" # when x is an NDArray, F will be set to mx.nd\n",
" x = F.relu(self.fc1(x))\n",
" x = self.fc2(x)\n",
" return x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can use it the same way we used dense:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[-0.72901833 -2.01382256 1.48214853]\n",
" [-1.71972466 -4.79868174 3.42445707]]\n"
]
}
],
"source": [
"net = Net()\n",
"net.all_params().initialize(ctx=mx.cpu(0))\n",
"output = net(x)\n",
"print(output.asnumpy())"
]
},
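{
"cell_type": "markdown",
"metadata": {},
"source": [
"For a plain stack of layers like this, a container can replace the custom class. The following is a sketch assuming `nn.Sequential` with an `add` method, as in later versions of this interface; check the Foo API reference linked above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# hypothetical Sequential equivalent of Net above\n",
"seq = nn.Sequential()\n",
"seq.add(nn.Dense(4, activation='relu', in_units=2))\n",
"seq.add(nn.Dense(3, in_units=4))"
]
},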
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Trainer and Loss Functions\n",
"\n",
"To train the network you need an optimizer:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"trainer = foo.Trainer(net.all_params(), 'sgd', {'learning_rate': 0.1})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then you can forward the network in a train_section and compute gradient with backward:"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loss = [ 0.94735891 1.18011999]\n",
"fc2.bias = [ 0.0547721 0.08332293 -0.13809504]\n",
"d(loss)/d(fc2.bias) = [-0.17545167 -0.35362723 0.52907884]\n"
]
}
],
"source": [
"label = mx.nd.array([0, 1])\n",
"with train_section():\n",
" output = net(x)\n",
" loss = foo.loss.softmax_cross_entropy_loss(output, label)\n",
" loss.backward()\n",
"print('loss = ', loss.asnumpy())\n",
"print('fc2.bias = ', net.fc2.bias.data().asnumpy())\n",
"print('d(loss)/d(fc2.bias) = ', net.fc2.bias.grad().asnumpy())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can make a gradient step with Trainer"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"fc2.bias = [ 0.06354468 0.1010043 -0.16454898]\n"
]
}
],
"source": [
"trainer.step(batch_size=2)\n",
"print('fc2.bias = ', net.fc2.bias.data().asnumpy())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the weights has changed a little. You can repeat the last two cells in a loop to train your network."
]
}
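,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, a minimal loop over the same toy batch, using only the APIs shown above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# repeat forward/backward/step to fit the toy data\n",
"for i in range(10):\n",
"    with train_section():\n",
"        output = net(x)\n",
"        loss = foo.loss.softmax_cross_entropy_loss(output, label)\n",
"        loss.backward()\n",
"    trainer.step(batch_size=2)\n",
"print('final loss = ', loss.asnumpy())"
]
}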
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}