Python API Autograd and Initializer
This chapter deals with the autograd and initializer API in MXNet.
mxnet.autograd
This is MXNet’s autograd API for NDArray. It has the following class −
Class: Function()
It is used for customised differentiation in autograd. It can be written as mxnet.autograd.Function. If, for any reason, the user does not want to use the gradients computed by the default chain rule, he/she can use the Function class of mxnet.autograd to customize differentiation for a computation. It has two methods, namely Forward() and Backward().
Let us understand the working of this class with the help of the following points −
First, we need to define our computation in the forward method.
Then, we need to provide the customized differentiation in the backward method.
Now, during gradient computation, instead of the default chain rule, mxnet.autograd will use the backward function defined by the user. We can also cast to a numpy array and back for some operations in forward as well as backward.
Example
Before using the mxnet.autograd.Function class, let’s define a stable sigmoid function with backward as well as forward methods as follows −
import mxnet as mx

class sigmoid(mx.autograd.Function):
   def forward(self, x):
      y = 1 / (1 + mx.nd.exp(-x))
      self.save_for_backward(y)
      return y
   def backward(self, dy):
      y, = self.saved_tensors
      return dy * y * (1-y)
Now, the function class can be used as follows −
func = sigmoid()
x = mx.nd.random.uniform(shape=(10,))
x.attach_grad()
with mx.autograd.record():
   m = func(x)
m.backward()
dx_grad = x.grad.asnumpy()
dx_grad
Output
When you run the code, you will see the following output −
array([0.21458015, 0.21291625, 0.23330082, 0.2361367 , 0.23086983, 0.24060014, 0.20326573, 0.21093895, 0.24968489, 0.24301809], dtype=float32)
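As a quick sanity check (not part of the original example), the gradient produced by the custom backward method can be compared with the analytic derivative of the sigmoid, y * (1 - y), reusing the x and dx_grad values from above −
import numpy as np

# Analytic derivative of the sigmoid: dy/dx = y * (1 - y)
y = 1 / (1 + mx.nd.exp(-x))
analytic = (y * (1 - y)).asnumpy()

# Should print True, since the custom backward implements the same formula
print(np.allclose(dx_grad, analytic, atol=1e-6))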
Methods and their parameters
Following are the methods of mxnet.autograd and their parameters −
| Methods and their Parameters | Definition |
| --- | --- |
| forward(heads[, head_grads, retain_graph, …]) | This method is used for forward computation. |
| backward(heads[, head_grads, retain_graph, …]) | This method is used for backward computation. It computes the gradients of heads with respect to previously marked variables. It takes as many inputs as forward’s outputs and returns as many NDArrays as forward’s inputs. |
| get_symbol(x) | This method is used to retrieve the recorded computation history as a Symbol. |
| grad(heads, variables[, head_grads, …]) | This method computes the gradients of heads with respect to variables. Once computed, instead of being stored into variable.grad, the gradients are returned as new NDArrays. |
| is_recording() | With the help of this method we can check whether autograd is currently recording or not. |
| is_training() | With the help of this method we can check whether autograd is in training or prediction mode. |
| mark_variables(variables, gradients[, grad_reqs]) | This method marks NDArrays as variables for which autograd should compute gradients. It is the same as calling .attach_grad() on a variable, the only difference being that with this call we can set the gradient to any value. |
| pause([train_mode]) | This method returns a scope context to be used in a ‘with’ statement for code that does not need gradients to be calculated. |
| predict_mode() | This method returns a scope context to be used in a ‘with’ statement in which forward-pass behaviour is set to inference mode, without changing the recording state. |
| record([train_mode]) | It returns an autograd recording scope context to be used in a ‘with’ statement and captures code that needs gradients to be calculated. |
| set_recording(is_recording) | In contrast to is_recording(), this method sets the status to recording or not recording. |
| set_training(is_training) | In contrast to is_training(), this method sets the status to training or predicting. |
| train_mode() | This method returns a scope context to be used in a ‘with’ statement in which forward-pass behaviour is set to training mode, without changing the recording state. |
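The scope helpers listed above can be combined. The following is a minimal sketch (not from the original chapter) that uses record(), pause() and is_recording() together −
import mxnet as mx

x = mx.nd.ones((2,))
x.attach_grad()
with mx.autograd.record():
   print(mx.autograd.is_recording())      # True inside record()
   y = 2 * x
   with mx.autograd.pause():
      # operations inside pause() are not recorded and do not
      # contribute to the gradient of y with respect to x
      print(mx.autograd.is_recording())   # False inside pause()
      z = y * x
y.backward()
print(x.grad)   # gradient of y = 2 * x, i.e. [2. 2.]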
Implementation Example
In the below example, we will be using the mxnet.autograd.grad() method to compute the gradients of heads with respect to variables −
x = mx.nd.ones((2,))
x.attach_grad()
with mx.autograd.record():
   z = mx.nd.elemwise_add(mx.nd.exp(x), x)
dx_grad = mx.autograd.grad(z, [x], create_graph=True)
dx_grad
Output
The output is mentioned below −
[
[3.7182817 3.7182817]
<NDArray 2 @cpu(0)>]
We can use the mxnet.autograd.predict_mode() method to return a scope to be used in a ‘with’ statement −
with mx.autograd.record():
   y = model(x)
   with mx.autograd.predict_mode():
      y = sampling(y)
   mx.autograd.backward([y])
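Analogously, the train_mode() method forces training behaviour for the forward pass without starting recording. Below is a minimal sketch (not from the original chapter), assuming the built-in Dropout operator, which is only active in training mode −
x = mx.nd.ones((1, 4))
with mx.autograd.train_mode():
   # Dropout is active here: elements are randomly zeroed and the rest rescaled
   y_train = mx.nd.Dropout(x, p=0.5)
with mx.autograd.predict_mode():
   # Dropout is a no-op here, so y_pred equals x
   y_pred = mx.nd.Dropout(x, p=0.5)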
mxnet.initializer
This is MXNet’s API for weight initialization. It has the following classes −
Classes and their parameters
Following are the classes of mxnet.initializer and their parameters −
| Classes and their Parameters | Definition |
| --- | --- |
| Bilinear() | With the help of this class we can initialize weights for up-sampling layers. |
| Constant(value) | This class initializes the weights to a given value. The value can be a scalar or an NDArray that matches the shape of the parameter to be set. |
| FusedRNN(init, num_hidden, num_layers, mode) | As the name implies, this class initializes parameters for fused Recurrent Neural Network (RNN) layers. |
| InitDesc | It acts as the descriptor for the initialization pattern. |
| Initializer(**kwargs) | This is the base class of an initializer. |
| LSTMBias([forget_bias]) | This class initializes all biases of an LSTMCell to 0.0, except for the forget gate, whose bias is set to a custom value. |
| Load(param[, default_init, verbose]) | This class initializes the variables by loading data from a file or dictionary. |
| MSRAPrelu([factor_type, slope]) | As the name implies, this class initializes the weights according to an MSRA paper. |
| Mixed(patterns, initializers) | It initializes the parameters using multiple initializers. |
| Normal([sigma]) | Normal() class initializes weights with random values sampled from a normal distribution with a mean of zero and a standard deviation (SD) of sigma. |
| One() | It initializes the weights of the parameter to one. |
| Orthogonal([scale, rand_type]) | As the name implies, this class initializes the weights as an orthogonal matrix. |
| Uniform([scale]) | It initializes weights with random values uniformly sampled from a given range. |
| Xavier([rnd_type, factor_type, magnitude]) | It returns an initializer that performs “Xavier” initialization for weights. |
| Zero() | It initializes the weights of the parameter to zero. |
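In practice, these classes are passed to a block’s parameter-initialization call. The following is a minimal sketch (not part of this chapter’s examples), assuming a Gluon nn.Dense layer, that applies the Xavier initializer −
from mxnet.gluon import nn

# A small dense layer whose weights are drawn by the Xavier scheme
net = nn.Dense(3, in_units=4)
net.initialize(mx.init.Xavier(rnd_type="uniform", factor_type="in", magnitude=2.45))

print(net.weight.data())   # Xavier-initialized weights
print(net.bias.data())     # biases default to zero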
Implementation Example
In the below example, we will be using the mxnet.init.Normal() class to create an initializer and retrieve its parameters −
init = mx.init.Normal(0.8)
init.dumps()
Output
The output is given below −
["normal", {"sigma": 0.8}]
Example
init = mx.init.Xavier(factor_type="in", magnitude=2.45)
init.dumps()
Output
The output is shown below −
["xavier", {"rnd_type": "uniform", "factor_type": "in", "magnitude": 2.45}]
In the below example, we will be using the mxnet.initializer.Mixed() class to initialize parameters using multiple initializers −
init = mx.initializer.Mixed(['bias', '.*'], [mx.init.Zero(), mx.init.Uniform(0.1)])
module.init_params(init)
for dictionary in module.get_params():
   for key in dictionary:
      print(key)
      print(dictionary[key].asnumpy())
Output
The output is shown below −
fullyconnected1_weight
[[ 0.0097627 0.01856892 0.04303787]]
fullyconnected1_bias
[ 0.]