Feed-Forward Neural Network

Class Reference

class pykitml.NeuralNetwork(layer_sizes, reg_param=0, config='leakyrelu-softmax-cross_entropy')

This class implements a feed-forward neural network.

__init__(layer_sizes, reg_param=0, config='leakyrelu-softmax-cross_entropy')
Parameters:
  • layer_sizes (list) – A list of integers describing the number of layers and the number of neurons in each layer. For example, [784, 100, 100, 10] describes a neural network with one input layer having 784 neurons, two hidden layers having 100 neurons each, and an output layer with 10 neurons.
  • reg_param (int) – Regularization parameter for the network, also known as ‘weight decay’.
  • config (str) –

    The config string describes what activation functions and cost function to use for the network. The string should contain three function names separated with the ‘-’ character and should follow the order: '<hidden_layer_activation_func>-<output_layer_activation_func>-<cost_function>'. For example, 'relu-softmax-cross_entropy' tells the class to use relu as the activation function for the hidden layers, softmax for the output layer, and cross entropy as the cost function.

    List of available activation functions: leakyrelu, relu, softmax, tanh, sigmoid, identity.

    List of available cost functions: mse (Mean Squared Error), cross_entropy (Cross Entropy), huber (Huber loss).

Raises:
  • AttributeError – If the config string is invalid.
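
A short sketch of constructing a network with a custom config string; the layer sizes and reg_param value here are purely illustrative:

import pykitml as pk

# 784 inputs, one hidden layer of 100 neurons, 10 outputs;
# sigmoid hidden activations, softmax output, cross entropy cost
model = pk.NeuralNetwork(
    [784, 100, 10],
    reg_param=0.01,
    config='sigmoid-softmax-cross_entropy'
)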

feed(input_data)

Accepts input array and feeds it to the model.

Parameters:
  • input_data (numpy.array) – The input to feed the model.
Raises:
  • ValueError – If the input data has invalid dimensions/shape.

Note

This function only feeds the input data. To get the output after calling this function, use get_output() or get_output_onehot().

get_output()

Returns the output activations of the model.

Returns:
  The output activations.
Return type:
  numpy.array

get_output_onehot()

Returns the output layer activations of the model as a one-hot array. A one-hot array is an array of bits in which only one of the bits is high/true. In this case, the bit corresponding to the neuron/node with the highest activation will be high/true.

Returns:
  The one-hot output activations array.
Return type:
  numpy.array

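A hedged sketch of the feed-and-read-output workflow, assuming model is a trained network with 10 output neurons and input_data is a correctly shaped numpy array (both names are placeholders):

model.feed(input_data)

# Raw output activations, e.g. softmax probabilities over 10 classes
activations = model.get_output()

# One-hot version, e.g. array([0., 0., 0., 1., 0., 0., 0., 0., 0., 0.]),
# with the high bit at the neuron with the largest activation
prediction = model.get_output_onehot()
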
train(training_data, targets, batch_size, epochs, optimizer, testing_data=None, testing_targets=None, testing_freq=1, decay_freq=1)

Trains the model on the training data. After training is complete, you can call plot_performance() to plot performance graphs.

Parameters:
  • training_data (numpy.array) – numpy array containing training data.
  • targets (numpy.array) – numpy array containing training targets, corresponding to the training data.
  • batch_size (int) – Number of training examples to use in one epoch, i.e. the number of training examples used to estimate the gradient.
  • epochs (int) – Number of epochs the model should be trained for.
  • optimizer (any Optimizer object) – See Optimizers
  • testing_data (numpy.array) – numpy array containing testing data.
  • testing_targets (numpy.array) – numpy array containing testing targets, corresponding to the testing data.
  • testing_freq (int) – How frequently the model should be tested, i.e. the model will be tested after every testing_freq epochs. You may want to increase this to reduce training time.
  • decay_freq (int) – How frequently the model should decay the learning rate. The learning rate will decay after every decay_freq epochs.
Raises:
  • ValueError – If training_data, targets, testing_data, or testing_targets has invalid dimensions/shape.
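
A minimal, self-contained sketch of a training run using synthetic random data; the array shapes and hyperparameters are illustrative only, and the model will not learn anything meaningful from random data (see the MNIST example below for a real training run):

import numpy as np
import pykitml as pk

# Synthetic dataset: 1000 examples, 4 features, 3 one-hot classes
inputs = np.random.rand(1000, 4)
targets = np.eye(3)[np.random.randint(0, 3, 1000)]

model = pk.NeuralNetwork([4, 10, 3])
model.train(
    training_data=inputs,
    targets=targets,
    batch_size=20,
    epochs=100,
    optimizer=pk.Adam(learning_rate=0.01, decay_rate=0.95)
)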

plot_performance()

Plots logged performance data after training. Should be called after train().

Raises:
  • AttributeError – If the model has not been trained, i.e. train() has not been called before.
  • IndexError – If train() failed.
cost(testing_data, testing_targets)

Computes the average cost of the model on the testing data passed to the function.

Parameters:
  • testing_data (numpy.array) – numpy array containing testing data.
  • testing_targets (numpy.array) – numpy array containing testing targets, corresponding to the testing data.
Returns:
  cost – The average cost of the model over the testing data.
Return type:
  float
Raises:
  • ValueError – If testing_data or testing_targets has invalid dimensions/shape.

accuracy(testing_data, testing_targets)

Tests the accuracy of the model on the testing data passed to the function. This function should only be used for classification.

Parameters:
  • testing_data (numpy.array) – numpy array containing testing data.
  • testing_targets (numpy.array) – numpy array containing testing targets, corresponding to the testing data.
Returns:
  accuracy – The accuracy of the model over the testing data, i.e. how many testing examples the model predicted correctly.
Return type:
  float

r2score(testing_data, testing_targets)

Returns the R-squared (coefficient of determination) value.

Parameters:
  • testing_data (numpy.array) – numpy array containing testing data.
  • testing_targets (numpy.array) – numpy array containing testing targets, corresponding to the testing data.
Returns:
  r2score – The R-squared value of the model over the testing data.
Return type:
  float
Raises:
  • ValueError – If testing_data or testing_targets has invalid dimensions/shape.
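
A hedged sketch of evaluating a trained model with cost(), accuracy(), and r2score(); model and the test arrays are placeholders, and in practice accuracy() suits classification models while r2score() suits regression models:

print('Average cost:', model.cost(testing_data, testing_targets))

# For classification models
print('Accuracy:', model.accuracy(testing_data, testing_targets))

# For regression models
print('R-squared:', model.r2score(testing_data, testing_targets))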

confusion_matrix(test_data, test_targets, gnames=[], plot=True)

Returns and plots the confusion matrix for the given test data.

Parameters:
  • test_data (numpy.array) – Numpy array containing test data.
  • test_targets (numpy.array) – Numpy array containing the targets corresponding to the test data.
  • gnames (list) – List of string names for each class/group.
  • plot (bool) – If set to False, the matrix will not be plotted. Defaults to True.
Returns:
  confusion_matrix – The confusion matrix.
Return type:
  numpy.array
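
A usage sketch, assuming a trained digit classifier and MNIST-style test arrays (all names are placeholders):

# Class names for the ten digits; returns the matrix and shows the plot
matrix = digit_classifier.confusion_matrix(
    testing_data, testing_targets,
    gnames=['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
)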

nlayers

The number of layers in the network.

Example: Handwritten Digit Recognition (MNIST)

Dataset

MNIST - pykitml.datasets.mnist module

Training

import os.path

import pykitml as pk
from pykitml.datasets import mnist

# Download dataset
if not os.path.exists('mnist.pkl'):
    mnist.get()

# Load dataset
training_data, training_targets, testing_data, testing_targets = mnist.load()

# Create a new neural network
digit_classifier = pk.NeuralNetwork([784, 100, 10])

# Train it
digit_classifier.train(
    training_data=training_data,
    targets=training_targets,
    batch_size=50,
    epochs=1200,
    optimizer=pk.Adam(learning_rate=0.012, decay_rate=0.95),
    testing_data=testing_data,
    testing_targets=testing_targets,
    testing_freq=30,
    decay_freq=15
)

# Save it
pk.save(digit_classifier, 'digit_classifier_network.pkl')

# Show performance
accuracy = digit_classifier.accuracy(training_data, training_targets)
print('Train Accuracy:', accuracy)
accuracy = digit_classifier.accuracy(testing_data, testing_targets)
print('Test Accuracy:', accuracy)

# Plot performance graph
digit_classifier.plot_performance()

# Show confusion matrix
digit_classifier.confusion_matrix(training_data, training_targets)

Predicting

import random

import matplotlib.pyplot as plt
import pykitml as pk
from pykitml.datasets import mnist

# Load dataset
training_data, training_targets, _, _ = mnist.load()

# Load the trained network
digit_classifier = pk.load('digit_classifier_network.pkl')

# Pick a random example from the training data
index = random.randint(0, 9999)

# Show the example and its label
plt.imshow(training_data[index].reshape(28, 28))
plt.show()
print('Label: ', training_targets[index])

# Show prediction
digit_classifier.feed(training_data[index])
model_output = digit_classifier.get_output_onehot()
print('Predicted: ', model_output)
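
Since get_output_onehot() returns a one-hot array, the predicted digit itself can be recovered with numpy's argmax; a small follow-up sketch (numpy is already a pykitml dependency):

import numpy as np

# The index of the high bit is the predicted class/digit
print('Predicted digit:', np.argmax(model_output))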

Performance Graph

[Image: _images/neural_network_perf_graph.png]

Confusion Matrix

[Image: _images/neural_network_confusion_matrix.png]