Principal Component Analysis
Class Reference
- class pykitml.PCA(data_points, no_components)
This class implements Principal Component Analysis.
- __init__(data_points, no_components)
Fits Principal Component Analysis to a dataset, for use in dimensionality reduction.
- Parameters:
data_points (numpy.array) – The dataset to perform PCA (i.e. dimensionality reduction) on.
no_components (int) – Number of principal components to keep.
- transform(data_points)
Transforms the input dataset to lower dimensions.
- Parameters:
data_points (numpy.array) – The input dataset.
- Returns:
transformed_data_points – The transformed input.
- Return type:
numpy.array
- inverse_transform(pca_points)
Reconstructs an approximation of the original dataset from the transformed points.
- Parameters:
pca_points (numpy.array) – The transformed points.
- retention
The fraction of the dataset's variance retained by the selected components, between 0 and 1.
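To make the relationship between transform, inverse_transform and retention concrete, here is a minimal NumPy sketch of PCA. It is not pykitml's implementation: the helper names (fit_pca, transform, inverse_transform), the mean-centering step, and the use of an eigendecomposition of the covariance matrix are all assumptions made for illustration, and pykitml may preprocess or compute things differently.

```python
import numpy as np


def fit_pca(data_points, no_components):
    """Return (mean, components, retention) for data of shape (n_samples, n_features)."""
    mean = data_points.mean(axis=0)
    centered = data_points - mean
    # The covariance matrix is symmetric, so eigh is the appropriate solver.
    covariance = np.cov(centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    # eigh returns eigenvalues in ascending order; reorder to descending.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Keep the leading eigenvectors: shape (n_features, no_components).
    components = eigenvectors[:, :no_components]
    # Fraction of total variance carried by the kept components.
    retention = eigenvalues[:no_components].sum() / eigenvalues.sum()
    return mean, components, retention


def transform(data_points, mean, components):
    """Project centered data onto the kept components."""
    return (data_points - mean) @ components


def inverse_transform(pca_points, mean, components):
    """Map reduced points back to the original feature space (approximate)."""
    return pca_points @ components.T + mean


# Synthetic data: 200 samples, 5 features, with nearly all variance in 2 directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
data = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

mean, components, retention = fit_pca(data, 2)
reduced = transform(data, mean, components)
recovered = inverse_transform(reduced, mean, components)

print('Reduced shape:', reduced.shape)
print('Variance retention:', float(retention))
```

In this sketch, the recovered data matches the original only up to the variance that was discarded. Keeping all components makes the round trip exact up to floating-point error, while keeping fewer trades accuracy for a smaller representation.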
Example: Compressing the MNIST dataset
import os.path
import random
import matplotlib.pyplot as plt
import pykitml as pk
from pykitml.datasets import mnist
# Download dataset
if not os.path.exists('mnist.pkl'):
    mnist.get()
# Load dataset
training_data, _, _, _ = mnist.load()
# Train PCA, reduce 784 dimensions to 250 dimensions
pca = pk.PCA(training_data, 250)
print('Variance retention:', pca.retention)
# Pick random datapoints
indices = random.sample(range(1, 1000), 16)
examples = training_data[indices]
# Show the original images
plt.figure('Original', figsize=(10, 7))
for i in range(1, 17):
    plt.subplot(4, 4, i)
    plt.imshow(examples[i-1].reshape((28, 28)), cmap='gray')
# Transform the example and compress
transformed_examples = pca.transform(examples)
# Inverse transform and recover the examples
recovered_examples = pca.inverse_transform(transformed_examples)
# Show the inverse transformed examples
plt.figure('Recovered', figsize=(10, 7))
for i in range(1, 17):
    plt.subplot(4, 4, i)
    plt.imshow(recovered_examples[i-1].reshape((28, 28)), cmap='gray')
# Show results
plt.show()
Figure: Original/Uncompressed
Figure: Recovered/Compressed
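Besides comparing the images visually, the effect of the compression can be put into numbers. The sketch below uses two hypothetical helpers, compression_ratio and reconstruction_mse, which are not part of pykitml. The small arrays are stand-ins; in the example above they would be examples and recovered_examples.

```python
import numpy as np


def compression_ratio(no_components, no_features):
    """Fraction of the original per-sample values kept after the transform."""
    return no_components / no_features


def reconstruction_mse(original, recovered):
    """Mean squared error between the original and recovered data points."""
    original = np.asarray(original, dtype=float)
    recovered = np.asarray(recovered, dtype=float)
    return float(np.mean((original - recovered) ** 2))


# 784 pixel values per image reduced to 250 values, as in the example above.
ratio = compression_ratio(250, 784)
print(f'Values kept per image: {ratio:.1%}')

# Stand-in arrays (hypothetical data, not taken from MNIST).
original = np.array([[0.0, 0.5, 1.0], [1.0, 0.5, 0.0]])
recovered = np.array([[0.1, 0.5, 0.9], [0.9, 0.5, 0.1]])
mse = reconstruction_mse(original, recovered)
print('Reconstruction MSE:', mse)
```

The ratio counts only the values stored per data point. Recovering the images also requires whatever the fitted PCA object holds, so the saving only pays off when many data points share one fitted model.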