Nearest Neighbor
Class Reference

class pykitml.NearestNeighbor(inputs_size, output_size, no_neighbors=1)
This class implements a nearest neighbor classifier.

__init__(inputs_size, output_size, no_neighbors=1)
Parameters:
- inputs_size (int) – Size of input data or number of input features.
- output_size (int) – Number of categories or groups.
- no_neighbors (int) – The number of nearest neighbors to consider.
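
To illustrate what the no_neighbors parameter controls, here is a minimal sketch of nearest neighbor classification in plain Python. It assumes Euclidean distance and a simple majority vote among the closest training points; pykitml's internal implementation may differ, and the name knn_predict is hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=1):
    """Return the majority label among the k training points closest to query.

    Illustrative only: Euclidean distance and majority voting are assumptions,
    not taken from pykitml's source.
    """
    # Euclidean distance from the query to every training point
    distances = [math.dist(p, query) for p in train_points]
    # Indices of the k smallest distances
    nearest = sorted(range(len(distances)), key=distances.__getitem__)[:k]
    # Majority vote among the labels of those neighbors
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = [0, 0, 1, 1, 1]
prediction = knn_predict(points, labels, query=(4.9, 5.0), k=3)
print(prediction)
```

With k=1 the prediction is simply the label of the single closest training point; larger values smooth the decision over more neighbors.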

feed(input_data)
Accepts an input array and feeds it to the model.
Parameters: input_data (numpy.array) – The input to feed the model.
Raises: ValueError – If the input data has invalid dimensions/shape.
Note: This function only feeds the input data. To get the output after calling it, use get_output() or get_output_onehot().

get_output()
Returns the output activations of the model.
Returns: The output activations.
Return type: numpy.array

get_output_onehot()
Returns the output layer activations of the model as a one-hot array. A one-hot array is an array of bits in which exactly one bit is high/true. Here, the bit corresponding to the neuron/node with the highest activation is high/true.
Returns: The one-hot output activations array.
Return type: numpy.array
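
The one-hot conversion described above can be sketched in plain Python as follows. The helper name to_onehot is hypothetical and the sketch works on a flat list; it only illustrates the idea and is not pykitml's code.

```python
def to_onehot(activations):
    """Return a one-hot list with a 1 at the position of the largest activation."""
    # Index of the highest activation (the first one wins on ties)
    top = max(range(len(activations)), key=activations.__getitem__)
    return [1 if i == top else 0 for i in range(len(activations))]

onehot = to_onehot([0.1, 0.7, 0.2])
print(onehot)  # [0, 1, 0]
```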

train(training_data, targets)
Trains the model on the training data.
Parameters:
- training_data (numpy.array) – numpy array containing training data.
- targets (numpy.array) – numpy array containing training targets, corresponding to the training data.

accuracy(testing_data, testing_targets)
Tests the accuracy of the model on the testing data passed to the function. This function should only be used for classification.
Parameters:
- testing_data (numpy.array) – numpy array containing testing data.
- testing_targets (numpy.array) – numpy array containing testing targets, corresponding to the testing data.
Returns: accuracy – The accuracy of the model over the testing data, i.e., how many of the testing examples the model predicted correctly.
Return type: float
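
As a rough illustration of what an accuracy figure measures, the sketch below counts how many one-hot predictions match their one-hot targets. The function name accuracy_fraction is hypothetical, and it returns a fraction between 0 and 1; whether pykitml reports a fraction or a percentage is not stated here.

```python
def accuracy_fraction(predicted_onehot, target_onehot):
    """Fraction of examples whose predicted class matches the target class."""
    correct = sum(
        1
        for pred, target in zip(predicted_onehot, target_onehot)
        if pred.index(1) == target.index(1)  # compare the positions of the 1s
    )
    return correct / len(target_onehot)

predicted = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]
score = accuracy_fraction(predicted, targets)
print(score)  # 0.75
```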

r2score(testing_data, testing_targets)
Returns the R-squared value, or coefficient of determination.
Parameters:
- testing_data (numpy.array) – numpy array containing testing data.
- testing_targets (numpy.array) – numpy array containing testing targets, corresponding to the testing data.
Returns: r2score – The R-squared value of the model over the testing data.
Return type: float
Raises: ValueError – If testing_data or testing_targets has invalid dimensions/shape.
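
For reference, the coefficient of determination is conventionally defined as 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the total sum of squares around the mean target. The plain-Python sketch below follows that standard definition for scalar targets; the function name r2_score is hypothetical and this is not pykitml's implementation.

```python
def r2_score(predictions, targets):
    """Standard R-squared: 1 - SS_res / SS_tot, for scalar targets."""
    mean_target = sum(targets) / len(targets)
    # Sum of squared differences between targets and predictions
    ss_res = sum((t - p) ** 2 for p, t in zip(predictions, targets))
    # Sum of squared differences between targets and their mean
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    return 1 - ss_res / ss_tot

targets = [1.0, 2.0, 3.0, 4.0]
perfect = r2_score([1.0, 2.0, 3.0, 4.0], targets)
baseline = r2_score([2.5, 2.5, 2.5, 2.5], targets)
print(perfect, baseline)
```

A perfect fit gives a value of 1, while always predicting the mean of the targets gives 0.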

confusion_matrix(test_data, test_targets, gnames=[], plot=True)
Returns and plots the confusion matrix for the given test data.
Parameters:
- test_data (numpy.array) – Numpy array containing test data.
- test_targets (numpy.array) – Numpy array containing the targets corresponding to the test data.
- plot (bool) – If set to false, will not plot the matrix. Default is true.
- gnames (list) – List of string names for each class/group.
Returns: confusion_matrix – The confusion matrix.
Return type: numpy.array
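
A confusion matrix tallies how often each actual class was predicted as each class. The sketch below builds one from integer class indices in plain Python. The layout (rows for actual classes, columns for predicted classes) and the name confusion_counts are assumptions made for illustration; the orientation and normalization used by pykitml may differ.

```python
def confusion_counts(predicted, actual, n_classes):
    """Count matrix where entry [a][p] is how often class a was predicted as p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for p, a in zip(predicted, actual):
        matrix[a][p] += 1  # row = actual class, column = predicted class
    return matrix

actual = [0, 0, 1, 1, 2, 2]
predicted = [0, 0, 1, 2, 2, 2]
matrix = confusion_counts(predicted, actual, n_classes=3)
for row in matrix:
    print(row)
```

Correct predictions fall on the diagonal; here one example of class 1 was misclassified as class 2.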

Example: Classifying Iris
Dataset
Iris - pykitml.datasets.iris module
Training
import pykitml as pk
from pykitml.datasets import iris

# Load iris data set
inputs_train, outputs_train, inputs_test, outputs_test = iris.load()

# Create model
neighbor_iris_classifier = pk.NearestNeighbor(4, 3)

# Train the model
neighbor_iris_classifier.train(
    training_data=inputs_train,
    targets=outputs_train,
)

# Save it
pk.save(neighbor_iris_classifier, 'neighbor_iris_classifier.pkl')

# Print accuracy
accuracy = neighbor_iris_classifier.accuracy(inputs_train, outputs_train)
print('Train accuracy:', accuracy)
accuracy = neighbor_iris_classifier.accuracy(inputs_test, outputs_test)
print('Test accuracy:', accuracy)

# Plot confusion matrix
neighbor_iris_classifier.confusion_matrix(inputs_test, outputs_test,
    gnames=['Setosa', 'Versicolor', 'Virginica'])
Predict the species of a flower with sepal length, sepal width, petal length, and petal width of 5.8, 2.7, 3.9, and 1.2:
import numpy as np
import pykitml as pk
# Predict type of species with
# sepal-length sepal-width petal-length petal-width
# 5.8, 2.7, 3.9, 1.2
input_data = np.array([5.8, 2.7, 3.9, 1.2])
# Load the model
neighbor_iris_classifier = pk.load('neighbor_iris_classifier.pkl')
# Get output
neighbor_iris_classifier.feed(input_data)
model_output = neighbor_iris_classifier.get_output_onehot()
# Print result
print(model_output)
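
The printed result is a one-hot array rather than a species name. A small helper like the one below could map it back to a label. It assumes a flat one-hot sequence and that the classes are ordered as in the gnames list used for the confusion matrix (Setosa, Versicolor, Virginica); the helper name species_from_onehot and the example_output value are hypothetical stand-ins, not actual model output.

```python
# Class names in the order assumed from the gnames list above
species = ['Setosa', 'Versicolor', 'Virginica']

def species_from_onehot(onehot, names=species):
    """Return the name at the position of the single 1 in a flat one-hot sequence."""
    return names[list(onehot).index(1)]

# Hypothetical one-hot output, used only to demonstrate the lookup
example_output = [0, 1, 0]
label = species_from_onehot(example_output)
print(label)
```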
Confusion Matrix