Python API Autograd and Initializer
This chapter deals with the autograd and initializer API in MXNet.
mxnet.autograd
This is MXNet’s autograd API for NDArray. It has the following class −
Class: Function()
It is used for customised differentiation in autograd. It can be written as mxnet.autograd.Function. If, for any reason, the user does not want to use the gradients computed by the default chain rule, he/she can use the Function class of mxnet.autograd to customize differentiation for a computation. It has two methods, namely Forward() and Backward().
Let us understand the working of this class with the help of the following points −
First, we need to define our computation in the forward method.
Then, we need to provide the customized differentiation in the backward method.
Now, during gradient computation, instead of the default chain rule, mxnet.autograd will use the backward function defined by the user. We can also cast to a numpy array and back for some operations in forward as well as backward.
Example
Before using the mxnet.autograd.Function class, let’s define a stable sigmoid function with backward as well as forward methods as follows −
import mxnet as mx

class sigmoid(mx.autograd.Function):
   def forward(self, x):
      y = 1 / (1 + mx.nd.exp(-x))
      self.save_for_backward(y)
      return y
   def backward(self, dy):
      y, = self.saved_tensors
      return dy * y * (1-y)
Now, the function class can be used as follows −
func = sigmoid()
x = mx.nd.random.uniform(shape=(10,))
x.attach_grad()
with mx.autograd.record():
   m = func(x)
m.backward()
dx_grad = x.grad.asnumpy()
dx_grad
Output
When you run the code, you will see the following output −
array([0.21458015, 0.21291625, 0.23330082, 0.2361367 , 0.23086983, 0.24060014, 0.20326573, 0.21093895, 0.24968489, 0.24301809], dtype=float32)
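As a quick sanity check (not part of the original example), the gradient produced by the custom backward method can be compared with the analytic derivative of the sigmoid, y * (1 - y), reusing the x and dx_grad values from above −
import numpy as np

# Analytic derivative of the sigmoid: dy/dx = y * (1 - y)
y = 1 / (1 + mx.nd.exp(-x))
analytic = (y * (1 - y)).asnumpy()

# Should print True, since the custom backward implements the same formula
print(np.allclose(dx_grad, analytic, atol=1e-6))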
Methods and their parameters
Following are the methods of mxnet.autograd and their parameters −
| Methods and their Parameters | Definition |
| --- | --- |
| forward(heads[, head_grads, retain_graph, …]) | This method is used for forward computation. |
| backward(heads[, head_grads, retain_graph, …]) | This method is used for backward computation. It computes the gradients of heads with respect to previously marked variables. It takes as many inputs as forward’s outputs and returns as many NDArrays as forward’s inputs. |
| get_symbol(x) | This method is used to retrieve the recorded computation history as a Symbol. |
| grad(heads, variables[, head_grads, …]) | This method computes the gradients of heads with respect to variables. Once computed, instead of being stored into variable.grad, the gradients are returned as new NDArrays. |
| is_recording() | With the help of this method we can check whether autograd is currently recording or not. |
| is_training() | With the help of this method we can check whether autograd is in training or prediction mode. |
| mark_variables(variables, gradients[, grad_reqs]) | This method marks NDArrays as variables for which autograd should compute gradients. It is the same as calling .attach_grad() on a variable, the only difference being that with this call we can set the gradient to any value. |
| pause([train_mode]) | This method returns a scope context to be used in a ‘with’ statement for code that does not need gradients to be calculated. |
| predict_mode() | This method returns a scope context to be used in a ‘with’ statement in which forward-pass behaviour is set to inference mode, without changing the recording state. |
| record([train_mode]) | It returns an autograd recording scope context to be used in a ‘with’ statement and captures code that needs gradients to be calculated. |
| set_recording(is_recording) | In contrast to is_recording(), this method sets the status to recording or not recording. |
| set_training(is_training) | In contrast to is_training(), this method sets the status to training or predicting. |
| train_mode() | This method returns a scope context to be used in a ‘with’ statement in which forward-pass behaviour is set to training mode, without changing the recording state. |
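The scope helpers listed above can be combined. The following is a minimal sketch (not from the original chapter) that uses record(), pause() and is_recording() together −
import mxnet as mx

x = mx.nd.ones((2,))
x.attach_grad()
with mx.autograd.record():
   print(mx.autograd.is_recording())      # True inside record()
   y = 2 * x
   with mx.autograd.pause():
      # operations inside pause() are not recorded and do not
      # contribute to the gradient of y with respect to x
      print(mx.autograd.is_recording())   # False inside pause()
      z = y * x
y.backward()
print(x.grad)   # gradient of y = 2 * x, i.e. [2. 2.]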
Implementation Example
In the below example, we will be using the mxnet.autograd.grad() method to compute the gradients of heads with respect to variables −
x = mx.nd.ones((2,))
x.attach_grad()
with mx.autograd.record():
   z = mx.nd.elemwise_add(mx.nd.exp(x), x)
dx_grad = mx.autograd.grad(z, [x], create_graph=True)
dx_grad
Output
The output is mentioned below −
[
[3.7182817 3.7182817]
<NDArray 2 @cpu(0)>]
We can use the mxnet.autograd.predict_mode() method to return a scope to be used in a ‘with’ statement −
with mx.autograd.record():
   y = model(x)
   with mx.autograd.predict_mode():
      y = sampling(y)
   mx.autograd.backward([y])
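Analogously, the train_mode() method forces training behaviour for the forward pass without starting recording. Below is a minimal sketch (not from the original chapter), assuming the built-in Dropout operator, which is only active in training mode −
x = mx.nd.ones((1, 4))
with mx.autograd.train_mode():
   # Dropout is active here: elements are randomly zeroed and the rest rescaled
   y_train = mx.nd.Dropout(x, p=0.5)
with mx.autograd.predict_mode():
   # Dropout is a no-op here, so y_pred equals x
   y_pred = mx.nd.Dropout(x, p=0.5)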
mxnet.initializer
This is MXNet’s API for weight initialization. It has the following classes −
Classes and their parameters
Following are the classes of mxnet.initializer and their parameters −
| Classes and their Parameters | Definition |
| --- | --- |
| Bilinear() | With the help of this class we can initialize weights for up-sampling layers. |
| Constant(value) | This class initializes the weights to a given value. The value can be a scalar or an NDArray that matches the shape of the parameter to be set. |
| FusedRNN(init, num_hidden, num_layers, mode) | As the name implies, this class initializes parameters for fused Recurrent Neural Network (RNN) layers. |
| InitDesc | It acts as the descriptor for the initialization pattern. |
| Initializer(**kwargs) | This is the base class of an initializer. |
| LSTMBias([forget_bias]) | This class initializes all biases of an LSTMCell to 0.0, except for the forget gate, whose bias is set to a custom value. |
| Load(param[, default_init, verbose]) | This class initializes the variables by loading data from a file or dictionary. |
| MSRAPrelu([factor_type, slope]) | As the name implies, this class initializes the weights according to an MSRA paper. |
| Mixed(patterns, initializers) | It initializes the parameters using multiple initializers. |
| Normal([sigma]) | Normal() class initializes weights with random values sampled from a normal distribution with a mean of zero and a standard deviation (SD) of sigma. |
| One() | It initializes the weights of the parameter to one. |
| Orthogonal([scale, rand_type]) | As the name implies, this class initializes the weights as an orthogonal matrix. |
| Uniform([scale]) | It initializes weights with random values uniformly sampled from a given range. |
| Xavier([rnd_type, factor_type, magnitude]) | It returns an initializer that performs “Xavier” initialization for weights. |
| Zero() | It initializes the weights of the parameter to zero. |
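In practice, these classes are passed to a block’s parameter-initialization call. The following is a minimal sketch (not part of this chapter’s examples), assuming a Gluon nn.Dense layer, that applies the Xavier initializer −
from mxnet.gluon import nn

# A small dense layer whose weights are drawn by the Xavier scheme
net = nn.Dense(3, in_units=4)
net.initialize(mx.init.Xavier(rnd_type="uniform", factor_type="in", magnitude=2.45))

print(net.weight.data())   # Xavier-initialized weights
print(net.bias.data())     # biases default to zero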
Implementation Example
In the below example, we will be using the mxnet.init.Normal() class to create an initializer and retrieve its parameters −
init = mx.init.Normal(0.8)
init.dumps()
Output
The output is given below −
["normal", {"sigma": 0.8}]
Example
init = mx.init.Xavier(factor_type="in", magnitude=2.45)
init.dumps()
Output
The output is shown below −
["xavier", {"rnd_type": "uniform", "factor_type": "in", "magnitude": 2.45}]
In the below example, we will be using the mxnet.initializer.Mixed() class to initialize parameters using multiple initializers −
init = mx.initializer.Mixed(['bias', '.*'], [mx.init.Zero(), mx.init.Uniform(0.1)])
module.init_params(init)
for dictionary in module.get_params():
   for key in dictionary:
      print(key)
      print(dictionary[key].asnumpy())
Output
The output is shown below −
fullyconnected1_weight
[[ 0.0097627 0.01856892 0.04303787]]
fullyconnected1_bias
[ 0.]