Apache MXNet - Gluon
Gluon is another important MXNet Python package, and it is the subject of this chapter. Gluon provides a clear, concise, and simple API for deep learning (DL) projects. It enables Apache MXNet users to prototype, build, and train DL models without sacrificing training speed.
Blocks
Blocks form the basis of more complex network designs. In a neural network, as complexity increases, we need to move from designing single neurons to designing entire layers of neurons. For example, NN designs like ResNet-152 have a fair degree of regularity because they consist of blocks of repeated layers.
Example
In the example given below, we will write code for a simple block, namely a block for a multilayer perceptron.
from mxnet import nd
from mxnet.gluon import nn

x = nd.random.uniform(shape=(2, 20))
N_net = nn.Sequential()
N_net.add(nn.Dense(256, activation='relu'))
N_net.add(nn.Dense(10))
N_net.initialize()
N_net(x)
Output
This produces the following output:
[[ 0.09543004 0.04614332 -0.00286655 -0.07790346 -0.05130241 0.02942038 0.08696645 -0.0190793 -0.04122177 0.05088576]
 [ 0.0769287 0.03099706 0.00856576 -0.044672 -0.06926838 0.09132431 0.06786592 -0.06187843 -0.03436674 0.04234696]]
<NDArray 2x10 @cpu(0)>
Steps needed to go from defining layers to defining blocks of one or more layers −
Step 1 − The block takes data as input.
Step 2 − Now, the block stores its state in the form of parameters. For example, in the coding example above the block contains two Dense layers, and we need a place to store their parameters.
Step 3 − Next, the block invokes the forward function to perform forward propagation, also called forward computation. As part of the first forward call, blocks initialize their parameters in a lazy fashion.
Step 4 − At last, the block invokes the backward function and calculates the gradients with respect to its input. Typically, this step is performed automatically. A minimal sketch of these steps is given below.
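To make these steps concrete, here is a short sketch (our own addition, built on the same kind of network as above) showing the lazy parameter initialisation and the automatic backward pass driven by autograd:

from mxnet import nd, autograd
from mxnet.gluon import nn

net = nn.Sequential()
net.add(nn.Dense(256, activation='relu'))
net.add(nn.Dense(10))
net.initialize()                      # parameters are not allocated yet (lazy)

x = nd.random.uniform(shape=(2, 20))  # Step 1: the block takes data as input
with autograd.record():
   y = net(x)                         # Step 3: first forward call infers shapes and creates the parameters (Step 2)
   loss = y.sum()
loss.backward()                       # Step 4: gradients are computed automatically

print(net[0].weight.grad().shape)     # (256, 20) - the lazily inferred shape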
Sequential Block
A sequential block is a special kind of block in which the data flows through a sequence of blocks. Each block is applied to the output of the one before it, with the first block being applied to the input data itself.
Let us see how the Sequential class works −
from mxnet import nd
from mxnet.gluon import nn

class MySequential(nn.Block):
   def __init__(self, **kwargs):
      super(MySequential, self).__init__(**kwargs)

   def add(self, block):
      self._children[block.name] = block

   def forward(self, x):
      for block in self._children.values():
         x = block(x)
      return x

x = nd.random.uniform(shape=(2, 20))
N_net = MySequential()
N_net.add(nn.Dense(256, activation='relu'))
N_net.add(nn.Dense(10))
N_net.initialize()
N_net(x)
Output
The output is given herewith −
[[ 0.09543004 0.04614332 -0.00286655 -0.07790346 -0.05130241 0.02942038 0.08696645 -0.0190793 -0.04122177 0.05088576]
 [ 0.0769287 0.03099706 0.00856576 -0.044672 -0.06926838 0.09132431 0.06786592 -0.06187843 -0.03436674 0.04234696]]
<NDArray 2x10 @cpu(0)>
Custom Block
We can easily go beyond simple concatenation with the sequential block defined above. But if we would like to make customisations, the Block class also provides the required functionality. The Block class has a model constructor provided in the nn module, and we can inherit from it to define the model we want.
In the following example, the MLP class overrides the __init__ and forward functions of the Block class.
Let us see how it works.
from mxnet import nd
from mxnet.gluon import nn

class MLP(nn.Block):
   def __init__(self, **kwargs):
      super(MLP, self).__init__(**kwargs)
      self.hidden = nn.Dense(256, activation='relu')  # Hidden layer
      self.output = nn.Dense(10)                      # Output layer

   def forward(self, x):
      hidden_out = self.hidden(x)
      return self.output(hidden_out)

x = nd.random.uniform(shape=(2, 20))
N_net = MLP()
N_net.initialize()
N_net(x)
Output
When you run the code, you will see the following output:
[[ 0.07787763 0.00216403 0.01682201 0.03059879 -0.00702019 0.01668715 0.04822846 0.0039432 -0.09300035 -0.04494302]
 [ 0.08891078 -0.00625484 -0.01619131 0.0380718 -0.01451489 0.02006172 0.0303478 0.02463485 -0.07605448 -0.04389168]]
<NDArray 2x10 @cpu(0)>
Custom Layers
Apache MXNet’s Gluon API comes with a modest number of pre-defined layers. Still, at some point we may find that a new layer is needed. We can easily add a new layer in the Gluon API. In this section, we will see how we can create a new layer from scratch.
The Simplest Custom Layer
To create a new layer in the Gluon API, we must create a class that inherits from the Block class, which provides the most basic functionality. We can inherit from it directly or via other subclasses, as all the pre-defined layers do.
For creating the new layer, the only instance method that needs to be implemented is forward(self, x). This method defines what exactly our layer is going to do during forward propagation. As discussed earlier, the back-propagation pass for blocks will be done by Apache MXNet itself automatically.
Example
In the example below, we will be defining a new layer. We will also implement the forward() method to normalise the input data by fitting it into the range of [0, 1].
from __future__ import print_function
import mxnet as mx
from mxnet import nd, gluon, autograd
from mxnet.gluon.nn import Dense

mx.random.seed(1)

class NormalizationLayer(gluon.Block):
   def __init__(self):
      super(NormalizationLayer, self).__init__()

   def forward(self, x):
      return (x - nd.min(x)) / (nd.max(x) - nd.min(x))

x = nd.random.uniform(shape=(2, 20))
N_net = NormalizationLayer()
N_net.initialize()
N_net(x)
Output
On executing the above program, you will get the following result −
[[0.5216355 0.03835821 0.02284337 0.5945146 0.17334817 0.69329053 0.7782702 1. 0.5508242 0. 0.07058554 0.3677264 0.4366546 0.44362497 0.7192635 0.37616986 0.6728799 0.7032008 0.46907538 0.63514024]
 [0.9157533 0.7667402 0.08980197 0.03593295 0.16176797 0.27679572 0.07331014 0.3905285 0.6513384 0.02713427 0.05523694 0.12147208 0.45582628 0.8139887 0.91629887 0.36665893 0.07873632 0.78268915 0.63404864 0.46638715]]
<NDArray 2x20 @cpu(0)>
Hybridisation
Hybridisation may be defined as the process used by Apache MXNet to create a symbolic graph of a forward computation. Hybridisation allows MXNet to boost computation performance by optimising this symbolic graph. Rather than inheriting directly from Block, we may in fact find that, when implementing existing layers, a block inherits from HybridBlock.
Following are the reasons for this −
Allows us to write custom layers − HybridBlock allows us to write custom layers that can be used in both imperative and symbolic programming.
Increases computation performance − HybridBlock optimises the computational symbolic graph, which allows MXNet to increase computation performance.
Example
In this example, we will be rewriting the example layer created above by using HybridBlock:
class NormalizationHybridLayer(gluon.HybridBlock):
   def __init__(self):
      super(NormalizationHybridLayer, self).__init__()

   def hybrid_forward(self, F, x):
      return F.broadcast_div(F.broadcast_sub(x, F.min(x)), (F.broadcast_sub(F.max(x), F.min(x))))

layer_hybd = NormalizationHybridLayer()
layer_hybd(nd.array([1, 2, 3, 4, 5, 6], ctx=mx.cpu()))
Output
The output is stated below:
[0. 0.2 0.4 0.6 0.8 1. ] <NDArray 6 @cpu(0)>
Hybridisation has nothing to do with computation on a GPU; one can train hybridised as well as non-hybridised networks on both CPU and GPU.
Difference between Block and HybridBlock
If we compare the Block class and HybridBlock, we will see that HybridBlock already has its forward() method implemented. Instead, HybridBlock defines a hybrid_forward() method that needs to be implemented when creating a layer. The F argument is the main difference between forward() and hybrid_forward(). In the MXNet community, the F argument is referred to as a backend: F can refer either to the mxnet.ndarray API (used for imperative programming) or to the mxnet.symbol API (used for symbolic programming). A small illustration is given below.
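As a small illustration (our own sketch, not part of the original tutorial), the HybridBlock below simply prints the module it receives as F. Before hybridize() is called, F is the mxnet.ndarray module; after hybridize(), the first call builds the symbolic graph and F is the mxnet.symbol module:

import mxnet as mx
from mxnet import nd, gluon

class ShowBackend(gluon.HybridBlock):
   def hybrid_forward(self, F, x):
      print('backend:', F.__name__)   # mxnet.ndarray or mxnet.symbol
      return x * 2

net = ShowBackend()
net.initialize()
net(nd.array([1, 2, 3]))              # prints: backend: mxnet.ndarray
net.hybridize()
net(nd.array([1, 2, 3]))              # prints: backend: mxnet.symbol (printed once, while the graph is built)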
How to add custom layer to a network?
Custom layers are rarely used on their own; they are combined with predefined layers. We can use either the Sequential or the HybridSequential container to form a sequential neural network. As discussed earlier, the Sequential container inherits from Block, and HybridSequential inherits from HybridBlock.
Example
In the example below, we will be creating a simple neural network with a custom layer. The output from the Dense(5) layer will be the input of NormalizationHybridLayer. The output of NormalizationHybridLayer will become the input of the Dense(1) layer.
net = gluon.nn.HybridSequential()
with net.name_scope():
   net.add(Dense(5))
   net.add(NormalizationHybridLayer())
   net.add(Dense(1))

net.initialize(mx.init.Xavier(magnitude=2.24))
net.hybridize()

input = nd.random_uniform(low=-10, high=10, shape=(10, 2))
net(input)
Output
You will see the following output −
[[-1.1272651]
 [-1.2299833]
 [-1.0662932]
 [-1.1805027]
 [-1.3382034]
 [-1.2081106]
 [-1.1263978]
 [-1.2524893]
 [-1.1044774]
 [-1.316593 ]]
<NDArray 10x1 @cpu(0)>
Custom layer parameters
In a neural network, a layer has a set of parameters associated with it. We sometimes refer to them as weights, which are the internal state of a layer. These parameters play different roles −
Sometimes these are the ones that we want to learn during the backpropagation step.
Sometimes these are just constants we want to use during the forward pass.
In programming terms, these parameters (weights) of a block are stored and accessed via the ParameterDict class, which helps in initialising, updating, saving, and loading them. A short sketch of this workflow follows.
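Before the example, here is a brief sketch (our own, assuming a recent Gluon version) of the usual ParameterDict workflow with a predefined layer: inspecting, initialising, and saving/loading parameters:

import mxnet as mx
from mxnet import nd
from mxnet.gluon import nn

net = nn.Dense(3)
params = net.collect_params()            # the block's ParameterDict
print(params)                            # lists the layer's weight and bias

params.initialize(mx.init.Xavier())      # initialisation (shapes still deferred)
net(nd.ones((1, 4)))                     # first forward pass infers the shapes
print(net.weight.data())                 # access the stored parameter values

net.save_parameters('dense.params')      # saving
net.load_parameters('dense.params')      # loading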
Example
In the example below, we will be defining two following sets of parameters −
Parameter weights − This is trainable, and its shape is unknown during the construction phase. It will be inferred on the first run of forward propagation.
Parameter scale − This is a constant whose value doesn’t change. Unlike the parameter weights, its shape is defined during construction.
class NormalizationHybridLayer(gluon.HybridBlock):
   def __init__(self, hidden_units, scales):
      super(NormalizationHybridLayer, self).__init__()
      with self.name_scope():
         self.weights = self.params.get('weights',
                                        shape=(hidden_units, 0),
                                        allow_deferred_init=True)
         self.scales = self.params.get('scales',
                                       shape=scales.shape,
                                       init=mx.init.Constant(scales.asnumpy()),
                                       differentiable=False)

   def hybrid_forward(self, F, x, weights, scales):
      normalized_data = F.broadcast_div(F.broadcast_sub(x, F.min(x)), (F.broadcast_sub(F.max(x), F.min(x))))
      weighted_data = F.FullyConnected(normalized_data, weights, num_hidden=self.weights.shape[0], no_bias=True)
      scaled_data = F.broadcast_mul(scales, weighted_data)
      return scaled_data
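The section ends here, so as a hedged usage sketch (the layer sizes and the scales value below are our own assumptions, not taken from the original text), this is one way the parameterised layer could be placed between two Dense layers; it assumes the imports from the earlier examples are still in scope:

net = gluon.nn.HybridSequential()
with net.name_scope():
   net.add(Dense(5))
   net.add(NormalizationHybridLayer(hidden_units=5, scales=nd.array([2])))
   net.add(Dense(1))

net.initialize(mx.init.Xavier(magnitude=2.24))
input = nd.random_uniform(low=-10, high=10, shape=(10, 2))
net(input)                               # produces an output of shape (10, 1)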