Linear Regression
Class Reference
- class pykitml.LinearRegression(input_size, output_size, reg_param=0)
Implements linear regression.
- __init__(input_size, output_size, reg_param=0)
- Parameters:
input_size (int) – Size of input data or number of input features.
output_size (int) – Size of the output, i.e. the number of values the model predicts.
reg_param (int) – Regularization parameter for the model, also known as ‘weight decay’.
- feed(input_data)
Accepts input array and feeds it to the model.
- Parameters:
input_data (numpy.array) – The input to feed the model.
- Raises:
ValueError – If the input data has invalid dimensions/shape.
Note
This function only feeds the input data. To get the output after calling this function, use get_output() or get_output_onehot().
- get_output()
Returns the output activations of the model.
- Returns:
The output activations.
- Return type:
numpy.array
- train(training_data, targets, batch_size, epochs, optimizer, testing_data=None, testing_targets=None, testing_freq=1, decay_freq=1)
Trains the model on the training data. After training is complete, you can call plot_performance() to plot performance graphs.
- Parameters:
training_data (numpy.array) – numpy array containing training data.
targets (numpy.array) – numpy array containing training targets, corresponding to the training data.
batch_size (int) – Number of training examples to use in one epoch, or number of training examples to use to estimate the gradient.
epochs (int) – Number of epochs the model should be trained for.
optimizer (any Optimizer object) – See Optimizers
testing_data (numpy.array) – numpy array containing testing data.
testing_targets (numpy.array) – numpy array containing testing targets, corresponding to the testing data.
testing_freq (int) – How frequently the model should be tested, i.e. the model will be tested after every testing_freq epochs. You may want to increase this to reduce training time.
decay_freq (int) – How frequently the model should decay the learning rate. The learning rate will decay after every decay_freq epochs.
- Raises:
ValueError – If training_data, targets, testing_data or testing_targets has invalid dimensions/shape.
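The interaction between epochs and decay_freq can be sketched in plain Python. This is only an illustration: it assumes the optimizer multiplies the learning rate by its decay rate once every decay_freq epochs, which the reference above does not state explicitly, and learning_rate_schedule is a hypothetical helper, not part of pykitml.

```python
def learning_rate_schedule(initial_rate, decay_rate, decay_freq, epochs):
    """Return the learning rate used in each epoch (hypothetical helper)."""
    rates = []
    rate = initial_rate
    for epoch in range(1, epochs + 1):
        rates.append(rate)
        # Assumed rule: shrink the rate once every `decay_freq` epochs.
        if epoch % decay_freq == 0:
            rate *= decay_rate
    return rates


# Values borrowed from the fish length example further down this page.
rates = learning_rate_schedule(0.02, 0.99, decay_freq=10, epochs=200)
print(rates[0], rates[10], rates[-1])
```

Under this assumption, a larger decay_freq keeps the learning rate high for longer, while a smaller one shrinks it sooner.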
- r2score(testing_data, testing_targets)
Return R-squared or coefficient of determination value.
- Parameters:
testing_data (numpy.array) – numpy array containing testing data.
testing_targets (numpy.array) – numpy array containing testing targets, corresponding to the testing data.
- Returns:
r2score – The R-squared (coefficient of determination) value of the model on the testing data.
- Return type:
float
- Raises:
ValueError – If testing_data or testing_targets has invalid dimensions/shape.
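For readers unfamiliar with the metric, the coefficient of determination can be computed with NumPy as below. This is a minimal sketch of the standard definition, 1 - SS_res / SS_tot. The function r2_score is a hypothetical stand-in and is not claimed to match how r2score() is implemented internally.

```python
import numpy as np


def r2_score(predictions, targets):
    """Standard R-squared: 1 - (residual sum of squares / total sum of squares)."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    ss_res = np.sum((targets - predictions) ** 2)
    ss_tot = np.sum((targets - targets.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


targets = np.array([1.0, 2.0, 3.0, 4.0])

# Predictions identical to the targets give the best possible score.
perfect = r2_score(targets, targets)

# Always predicting the mean of the targets gives a score of zero.
baseline = r2_score(np.full(4, targets.mean()), targets)

print(perfect, baseline)
```

A value close to 1 means the model explains most of the variance in the targets. A value near 0 means it does no better than predicting the mean.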
- cost(testing_data, testing_targets)
Tests the average cost of the model on the testing data passed to the function.
- Parameters:
testing_data (numpy.array) – numpy array containing testing data.
testing_targets (numpy.array) – numpy array containing testing targets, corresponding to the testing data.
- Returns:
cost – The average cost of the model over the testing data.
- Return type:
float
- Raises:
ValueError – If testing_data or testing_targets has invalid dimensions/shape.
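The reference above does not say which cost function is averaged. As an illustration only, the sketch below assumes a plain mean squared error. The actual implementation may differ, for example by including a factor of 1/2 or a regularization term controlled by reg_param. The name mean_squared_error is hypothetical.

```python
import numpy as np


def mean_squared_error(predictions, targets):
    """Average squared difference between predictions and targets (assumed cost)."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return float(np.mean((predictions - targets) ** 2))


predictions = np.array([1.0, 2.0, 3.0])
targets = np.array([1.0, 2.0, 5.0])

# Only the last prediction is off (by 2), so the cost is 4 / 3.
example_cost = mean_squared_error(predictions, targets)
print(example_cost)
```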
Example: Predicting Fish Length
Dataset
Fish Length - pykitml.datasets.fishlength module
Training Model
import pykitml as pk
from pykitml.datasets import fishlength
# Load the dataset
inputs, outputs = fishlength.load()
# Normalize inputs
array_min, array_max = pk.get_minmax(inputs)
inputs = pk.normalize_minmax(inputs, array_min, array_max)
# Create polynomial features
inputs_poly = pk.polynomial(inputs)
# Normalize outputs
array_min, array_max = pk.get_minmax(outputs)
outputs = pk.normalize_minmax(outputs, array_min, array_max)
# Create model
fish_classifier = pk.LinearRegression(inputs_poly.shape[1], 1)
# Train the model
fish_classifier.train(
    training_data=inputs_poly,
    targets=outputs,
    batch_size=22,
    epochs=200,
    optimizer=pk.Adam(learning_rate=0.02, decay_rate=0.99),
    testing_freq=1,
    decay_freq=10
)
# Save model
pk.save(fish_classifier, 'fish_classifier.pkl')
# Plot performance
fish_classifier.plot_performance()
# Print r2 score
print('r2score:', fish_classifier.r2score(inputs_poly, outputs))
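The "Create polynomial features" step in the script above expands the raw inputs so that a linear model can fit curved relationships. The sketch below shows the general idea for degree 2 using NumPy and itertools. It is an assumption-laden illustration: polynomial_features is a hypothetical helper, and the exact terms and degree produced by pk.polynomial are not described in this page.

```python
from itertools import combinations_with_replacement

import numpy as np


def polynomial_features(inputs, degree=2):
    """Build all products of input columns up to `degree` (hypothetical helper)."""
    inputs = np.atleast_2d(np.asarray(inputs, dtype=float))
    n_features = inputs.shape[1]
    columns = []
    for d in range(1, degree + 1):
        # Each index tuple picks the columns multiplied together for one term.
        for idx in combinations_with_replacement(range(n_features), d):
            columns.append(np.prod(inputs[:, list(idx)], axis=1))
    return np.column_stack(columns)


x = np.array([[2.0, 3.0]])
features = polynomial_features(x)

# Columns are: x1, x2, x1^2, x1*x2, x2^2
print(features)
```

Because the number of columns grows with the expansion, the model in the script is created with inputs_poly.shape[1] as its input size rather than the number of raw features.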
Predict length of fish that is 28 days old at 25C
import numpy as np
import pykitml as pk
from pykitml.datasets import fishlength
# Predict length of fish that is 28 days old at 25C
# Load the dataset
inputs, outputs = fishlength.load()
# Load the model
fish_classifier = pk.load('fish_classifier.pkl')
# Normalize inputs
array_min, array_max = pk.get_minmax(inputs)
input_data = pk.normalize_minmax(np.array([28, 25]), array_min, array_max)
# Create polynomial features
input_data_poly = pk.polynomial(input_data)
# Get output
fish_classifier.feed(input_data_poly)
model_output = fish_classifier.get_output()
# Denormalize output
array_min, array_max = pk.get_minmax(outputs)
model_output = pk.denormalize_minmax(model_output, array_min, array_max)
# Print result
print(model_output)
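The prediction script normalizes the new input with the minimum and maximum of the original dataset, then maps the model output back to real units. The sketch below illustrates that round trip with NumPy. It assumes the usual min-max formula, (x - min) / (max - min), and uses hypothetical helper names and made-up sample rows. It is not claimed to reproduce the pykitml functions exactly.

```python
import numpy as np


def minmax_of(array):
    """Column-wise minimum and maximum (hypothetical helper)."""
    return array.min(axis=0), array.max(axis=0)


def minmax_normalize(array, array_min, array_max):
    """Scale values into the range [0, 1] using the given min and max."""
    return (array - array_min) / (array_max - array_min)


def minmax_denormalize(array, array_min, array_max):
    """Invert minmax_normalize."""
    return array * (array_max - array_min) + array_min


# Made-up rows of (age in days, temperature in C), for illustration only.
training_inputs = np.array([[14.0, 25.0], [28.0, 27.0], [153.0, 31.0]])
array_min, array_max = minmax_of(training_inputs)

# A new input must be scaled with the training min/max, not its own.
new_input = np.array([28.0, 25.0])
normalized = minmax_normalize(new_input, array_min, array_max)
restored = minmax_denormalize(normalized, array_min, array_max)

print(normalized, restored)
```

Reusing the training statistics keeps new inputs on the same scale the model saw during training, which is why the prediction script loads the dataset again before normalizing.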
Performance Graph