neograd.autograd package

Submodules

neograd.autograd.graph module

class neograd.autograd.graph.Graph[source]

Bases: object

Used to keep track of nodes and tensors

The graph is constructed during the forward pass, and used by the backward pass to calculate gradients through automatic differentiation

Parameters
  • graph (Graph or None) – Graph object that’s currently in use. If None, then the global _NG_GRAPH is used, else a specific graph object is used. Defaults to None

  • nodes_dict (dict) – Stores key-value pairs of tensors and their corresponding nodes in the graph

  • track (bool) – Whether the graph must track the tensor operations or not. If True, whenever an operation produces a new result tensor, the operands of the operation are added as parents of the result tensor and the result tensor is added as a child of each operand; if False, none of this happens. Defaults to True
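
Example

A minimal sketch of how the graph is built during the forward pass and consumed by the backward pass (assumes neograd is installed and that the classes are importable from the module paths shown on this page):

import numpy as np
from neograd.autograd.tensor import Tensor
from neograd.autograd.utils import get_graph

a = Tensor(np.array([1.0, 2.0, 3.0]), requires_grad=True)
b = Tensor(np.array([4.0, 5.0, 6.0]), requires_grad=True)

c = a.dot(b)  # forward pass: while track is True, a and b become parents of c

graph = get_graph()      # the Graph currently in use (the global _NG_GRAPH here)
print(graph.nodes_dict)  # Tensor -> Node mapping recorded during the forward pass

c.backward()             # backward pass walks these edges to fill a.grad and b.grad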

add_edge(result_node, operands)[source]

Creates edges between the result node and its operands

Adds edges between the result_node, which is created during an Operation, and the operands that produced the result: the result_node is added as a child of each operand, and each operand is added as a parent of the result_node

Parameters
  • result_node (Node) – node that is created in Operation.get_result_tensor

  • operands (list of Tensor) – All the operands for an Operation

add_node(node)[source]

Adds a Node to the graph

Creates a key-value pair in nodes_dict with the specified node as the value and its tens attribute as the key

Parameters

node (Node) – Node to be added to the graph

add_tensor(tens)[source]

Adds a Tensor to the graph

A new Node is created for the Tensor and a corresponding entry is made in nodes_dict

Parameters

tens (Tensor) – Tensor to be added

get_node(tens)[source]

Returns the Node corresponding to the Tensor

Parameters

tens (Tensor) – Tensor whose node is to be fetched

Returns

Node if found, else None
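
Example

A small sketch of registering and fetching a Node by hand (normally Operations do this automatically during the forward pass):

from neograd.autograd.tensor import Tensor
from neograd.autograd.utils import get_graph

graph = get_graph()
t = Tensor([1.0, 2.0], requires_grad=True)
graph.add_tensor(t)       # creates a Node for t and records it in nodes_dict
node = graph.get_node(t)  # fetches the Node back; returns None if t isn't tracked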

graph = None
remove_tensor(tens)[source]

Removes a Tensor from the graph

Pops the Tensor from nodes_dict

Parameters

tens (Tensor) – Tensor to be removed

reset_graph()[source]

Resets the whole graph

This is accomplished by setting nodes_dict to an empty dictionary. Doing so removes all the Tensors and their Nodes from the graph

reset_visited()[source]

Sets visited=False for each Node in the graph

zero_grad()[source]

Performs zero_grad on all the tensors in the graph

Iterates through nodes_dict and performs zero_grad on the tensors

neograd.autograd.node module

class neograd.autograd.node.Node(tens)[source]

Bases: object

Used as an abstraction to connect the tensors together and hold relationships

Each Tensor is assigned a Node, and this Node monitors all the incoming edges (parents) and the outgoing edges (children)

Parameters
  • children (list of Node) – List of all Nodes that use the current Node as an operand in an Operation

  • parents (list of Node) – List of all Nodes (operands) that resulted in the creation of the current Node

  • parent_broadcast_shape (tuple or None) – If the parent needs to be broadcasted from one shape to another, then the final broadcasted shape of the parent is stored here. If it cannot be broadcasted, then this is None

  • backward_fn (Operation.backward) – Sets the grad_fn of the Tensor (operand) involved in the Operation

  • visited (bool) – Whether the Node has been visited during a traversal of the graph
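
Example

A short sketch of how the parent/child relationships look after one Operation, per the add_edge semantics above (assumes the module paths shown on this page):

import numpy as np
from neograd.autograd.tensor import Tensor
from neograd.autograd.utils import get_graph

x = Tensor(np.array([1.0, 2.0]), requires_grad=True)
y = x.sum()  # an Operation: x is the operand, y is the result

graph = get_graph()
x_node = graph.get_node(x)
y_node = graph.get_node(y)
assert y_node in x_node.children  # the result is a child of each operand
assert x_node in y_node.parents   # the operands are parents of the result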

add_child(other)[source]

Adds a child to the Node

Parameters

other (Node) – The child Node

add_parent(other)[source]

Adds a parent to the Node

Parameters

other (Node) – The parent Node

are_children_visited()[source]

Checks if all children are visited

Returns

True if all children are visited else False

are_parents_visited()[source]

Checks if all parents are visited

Returns

True if all parents are visited else False

backward(retain_graph)[source]

Initiates backward pass starting from current Node

This first visits all the children to make sure that they aren't included in sorted_tensors, as they aren't required when the backward pass is initiated from the current Node.

Then it pops its corresponding Tensor from sorted_tensors (it is the first tensor) so that _backward can be called on it with calculate_grads=False, so that grads aren't calculated for it, while still allowing all Tensors to be flushed.

Next it topologically sorts all Tensors starting from the current Node; then the Node corresponding to each Tensor is retrieved, marked as visited, and the Tensor's backward pass is initiated.

Parameters

retain_graph (bool) – If the graph should be retained after backward pass or flushed after backward calculation

top_sort()[source]

Performs topological sort of all Nodes starting from current Node

Sorts the graph topologically, so that the backward pass can be performed efficiently, with all the children's gradients calculated before the current node's gradient.

Sorting is done by first checking if all the children are visited: if they are, then the current node is added to sorted_tensors; if not, topological sort is performed on the children. A sketch of this recursion is given below.
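
The recursion described above can be sketched roughly as follows (an illustrative standalone version, not the library's actual code):

def top_sort(node, sorted_tensors):
    # Emit a node only once all of its children are visited, so children's
    # gradients are always calculated before their parents' gradients
    if node.are_children_visited():
        node.visited = True
        sorted_tensors.append(node.tens)
        for parent in node.parents:    # then try to emit the parents
            if not parent.visited:
                top_sort(parent, sorted_tensors)
    else:
        for child in node.children:    # finish unvisited children first; they
            if not child.visited:      # revisit this node via their parent loop
                top_sort(child, sorted_tensors)
    return sorted_tensors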

visit_all_children()[source]

Marks all children as visited

neograd.autograd.tensor module

class neograd.autograd.tensor.Tensor(data, requires_grad=False, requires_broadcasting=True)[source]

Bases: object

Wrapper around NumPy arrays

Parameters
  • data (int or float or list or np.ndarray) – The data to be stored or manipulated

  • requires_grad (bool) – Whether the Tensor requires gradient to be calculated or not. Defaults to False

  • requires_broadcasting (bool) – Whether the Tensor needs to be broadcasted when some Operation is performed. Defaults to True. This attribute is present as there are some operations, like Convolution, for which the kernel shouldn't be broadcasted to the input's shape

  • grad (np.ndarray) – The gradient value of the Tensor. Defaults to 0 if requires_grad else None

  • grad_fn – The function that is set in Operation.backward, that’ll be executed during backward pass to set the gradient of the Tensor
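
Example

A minimal end-to-end sketch of Tensor usage (it assumes that the result of an Operation involving a grad-requiring Tensor itself requires grad, as the backward docs below imply):

import numpy as np
from neograd.autograd.tensor import Tensor

w = Tensor(np.array([[1.0, 2.0], [3.0, 4.0]]), requires_grad=True)
x = Tensor(np.array([5.0, 6.0]))  # plain input, no gradient required

loss = w.dot(x).sum()  # forward ops return new Tensors tracked in the graph
loss.backward()        # default upper_grad=1.0 suits a scalar loss
print(w.grad)          # d(loss)/dw accumulated on the leaf Tensor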

property T

Performs the transpose of the Tensor

Returns

Transpose of the Tensor

_backward(node, retain_graph, calculate_grads=True)[source]

The essence of autograd; the final gradient calculation for the Tensor is performed here

The gradient of each child is taken as the upper gradient, and the backward_fn of the Tensor's Node is executed to set the grad_fn of the Tensor.

grad_fn is then executed, and the grad is unbroadcasted if the Tensor was broadcasted during the Operation. Auto-removal of the Tensor from the graph is performed when retain_graph is False

Parameters
  • node (Node) – The Node corresponding to the Tensor

  • retain_graph (bool) – Whether the graph needs to be retained or reset

  • calculate_grads (bool) – Whether gradients should be calculated or not. Defaults to True

accumulate_grad(grad)[source]

Accumulates gradients for the Tensor

Adds the calculated gradient to the overall gradient of the Tensor, because if gradients flow into a Tensor from two different paths, they need to be summed up

Parameters

grad (np.ndarray) – The gradient to be added/accumulated

backward(upper_grad=1.0, retain_graph=False)[source]

Kicks off the backward pass to calculate gradients

Starts the gradient calculation for the backward pass from the Tensor, by calling the backward method of its corresponding Node

Parameters
  • upper_grad (int or float or list or np.ndarray) – The gradient with which to start the gradient calculation. The shape of upper_grad and the shape of the Tensor must be the same. Defaults to 1.0, as backward is usually called on a loss Tensor that has a scalar value

  • retain_graph (bool) – If the graph should be retained after the backward pass or should be reset. If True, auto-removal of Tensors from the graph, which normally happens once the gradients of all the tensors of a node's parents have been calculated, is turned off

Raises
  • ValueError – If called on a Tensor that doesn't have requires_grad set

  • ValueError – If the shapes of upper_grad and the Tensor don't match
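
Example

A sketch of calling backward on a non-scalar Tensor, where upper_grad must match the Tensor's shape:

import numpy as np
from neograd.autograd.tensor import Tensor

v = Tensor(np.array([1.0, 2.0, 3.0]), requires_grad=True)
y = v.exp()  # non-scalar result

y.backward(upper_grad=np.ones(3))  # shape must equal y's shape
print(v.grad)                      # exp(v), since d(exp(v))/dv = exp(v)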

property data

Returns the data present in the Tensor

Returns

Data in the Tensor

Return type

data (np.ndarray)

dot(other)[source]

Performs dot product of Tensor with another object

Parameters

other (int or float or list or np.ndarray) – The object with which to take the dot product

Returns

Tensor of the result

exp()[source]

Performs exponentiation on the Tensor

Returns

Tensor of the result

flatten()[source]

Flattens the Tensor from any dimension to 1D

Returns

Flattened Tensor

reshape(new_shape)[source]

Reshapes the Tensor to the new shape

Parameters

new_shape (tuple) – The shape to which the Tensor should be reshaped

Returns

Reshaped Tensor

set_grad_fn(grad_fn)[source]

Sets the grad_fn for the Tensor

If requires_grad is True, then Tensor.grad_fn is set to grad_fn, else it is set to None

Parameters

grad_fn – Function that is set during execution of Operation.backward

property shape

Returns the shape of the Tensor

Returns

Shape of data in the Tensor

sum(axis=None)[source]

Performs sum of Tensor along an axis

Parameters

axis (None or int) – The axis along which it should be summed

Returns

Tensor of the result

zero_grad()[source]

Resets the grad of the Tensor to its default value

neograd.autograd.utils module

neograd.autograd.utils._evaluate_grad_check(analytical_grads, calculated_grads, epsilon, print_vals)[source]

Evaluates the gradient check and indicates whether it has passed or not

Calculates the distance between the analytical and calculated gradients; if it is less than epsilon, then the check has passed, else it has failed

Parameters
  • analytical_grads (list of int or float) – Gradients that are calculated analytically by wiggling the parameters

  • calculated_grads (list of int or float) – Gradients that are calculated through backpropagation

  • epsilon (float) – The threshold against which the distance between the gradients is compared

  • print_vals (bool) – True if the distance and verdict need to be printed

Returns

Distance between analytical and calculated gradients
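
The exact distance metric is not specified on this page; a common choice in gradient checking (an assumption here, not necessarily the library's formula) is the relative Euclidean distance:

import numpy as np

def relative_distance(analytical_grads, calculated_grads):
    # Hypothetical metric: ||a - c|| / (||a|| + ||c||)
    a = np.asarray(analytical_grads)
    c = np.asarray(calculated_grads)
    return np.linalg.norm(a - c) / (np.linalg.norm(a) + np.linalg.norm(c))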

neograd.autograd.utils._wiggle_params(analytical_grads, calculated_grads, params, get_loss, epsilon)[source]

Changes the params' values by epsilon and calculates the analytical gradients

First, epsilon is added to each element in params.data and the loss is calculated; then 2*epsilon is subtracted to get another loss. Using these two losses, the analytical gradient is calculated and appended to analytical_grads, and the gradient in the param is appended to calculated_grads. A schematic version is given below.

Parameters
  • analytical_grads (list of int or float) – Gradients that are calculated analytically by wiggling the parameters

  • calculated_grads (list of int or float) – Gradients that are calculated through backpropagation

  • params (list of Tensor) – All params that need to be wiggled

  • get_loss – Function that is used to calculate the loss

  • epsilon (float) – The amount by which params need to be wiggled
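
The per-element wiggle described above amounts to a central-difference estimate; schematically (a hypothetical helper, not the library's code):

import numpy as np

def central_difference(get_loss, data, idx, epsilon):
    data[idx] += epsilon          # wiggle up and evaluate
    loss_plus = get_loss()
    data[idx] -= 2 * epsilon      # wiggle down past the original value
    loss_minus = get_loss()
    data[idx] += epsilon          # restore the original value
    return (loss_plus - loss_minus) / (2 * epsilon)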

neograd.autograd.utils.fn_grad_check(fn, inputs, params, targets=None, loss_fn=None, epsilon=1e-07, print_vals=True, **kwargs)[source]

Performs Gradient Check for a function

Implements Gradient Check for a function instead of a complete model. Any params that need to be gradient checked can be specified

Parameters
  • fn – Function to be gradient checked

  • inputs (list of Tensor) – Inputs to the function

  • params (list of Tensor) – The params whose data can be wiggled to get the gradients

  • targets (Tensor) – targets of the function

  • loss_fn (Loss) – loss_fn to evaluate the function

  • epsilon (float) – The amount by which params need to be wiggled. Defaults to 1e-7

  • print_vals (bool) – True if the distance and verdict need to be printed

  • **kwargs – Any kwargs to be passed to fn

Returns

Distance between analytical and calculated gradients

neograd.autograd.utils.get_graph()[source]

Returns the graph that is in use and present in Graph.graph

If Graph.graph is None, then the global graph _NG_GRAPH is used

Returns

Graph object that is currently used

neograd.autograd.utils.grad_check(model, inputs, targets, loss_fn, epsilon=1e-07, print_vals=True)[source]

Performs Gradient Check

Implements Gradient Check, to make sure that backprop is calculating the right gradients. All the parameters in the model are checked.

If the distance between the backprop gradients and the numerical gradients is less than epsilon, then the gradients are proper; if not, there is an issue

Parameters
  • model (Model) – The Neural Network to be evaluated

  • inputs (Tensor) – Input data (no need for the complete data, a small sample is enough)

  • targets (Tensor) – Targets

  • loss_fn (Loss) – Loss Function

  • epsilon (float) – The amount by which params need to be wiggled. Defaults to 1e-7

  • print_vals (bool) – True if the distance and verdict need to be printed

Returns

Distance between analytical and calculated gradients

class neograd.autograd.utils.new_graph[source]

Bases: object

Creates a Graph object

Context Manager to create a new graph wherever required, under circumstances where it shouldn't interfere with the global _NG_GRAPH

On entering, the Graph object created is set as Graph.graph. On exiting, Graph.graph is set back to None, which implies that the global _NG_GRAPH will be used
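
Example

A minimal sketch of using new_graph so that temporary operations don't touch the global graph:

from neograd.autograd.utils import new_graph, get_graph

with new_graph():
    graph = get_graph()  # the freshly created Graph set in Graph.graph
    # ... operations here are tracked in this graph ...
# on exit Graph.graph is None again, so the global _NG_GRAPH is back in use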

class neograd.autograd.utils.no_track[source]

Bases: object

Prevents tracking of Tensors

Context Manager to prevent creation of a backward graph when gradient calculation is not required, for example when testing a model after training it, where no backward pass is needed

On entering, graph.track is set to False to indicate no tracking, and on exiting, it is set back to True

Parameters

graph (Graph) – The current graph in use
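
Example

A short sketch of disabling tracking during inference:

from neograd.autograd.tensor import Tensor
from neograd.autograd.utils import no_track

x = Tensor([1.0, 2.0], requires_grad=True)
with no_track():
    y = x.exp()  # no Nodes or edges are created, so no backward pass is possible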

neograd.autograd.utils.process_data(data)[source]

Checks and processes the data for storage in Tensor

Supported types for data are int, float, list, and np.ndarray. Elements in data should be float or typecastable to float

Parameters

data (int or float or list or np.ndarray) – Data to be processed

Returns

Processed data

Raises
  • TypeError – If data or its elements aren’t typecastable to float

  • TypeError – If data is not an instance of the supported types
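
Example

A quick sketch of accepted and rejected inputs, following the rules above:

from neograd.autograd.utils import process_data

data = process_data([1, 2, 3])  # list of ints: typecastable to float, accepted
process_data("hello")           # raises TypeError: str is not a supported type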

neograd.autograd.utils.unbroadcast_data(data, orig_data_shape, broadcasted_shape)[source]

Unbroadcasts the data to its original shape

If data (a NumPy array) was broadcasted during an operation, then it is unbroadcasted here: all axes along which it was broadcasted are summed to give the original shape of the data. If broadcasted_shape is None, then the data is returned as is.

Parameters
  • data (np.ndarray) – Data to be unbroadcasted

  • orig_data_shape (tuple) – Original shape of data before broadcasting

  • broadcasted_shape (tuple) – Shape to which data has been broadcasted

Returns

Data that is unbroadcasted
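
Example

A small sketch using only the behaviour described above:

import numpy as np
from neograd.autograd.utils import unbroadcast_data

grad = np.ones((2, 3))                # gradient in the broadcasted shape
unbroadcast_data(grad, (3,), (2, 3))  # summed along axis 0, back to shape (3,)
unbroadcast_data(grad, (2, 3), None)  # broadcasted_shape is None: returned as is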

Module contents