Normalization/Feature-scaling
Min-Max Normalization
- pykitml.get_minmax(array)
Returns two row arrays: one containing the minimum value of each column, the other containing the maximum value of each column.
- Parameters:
array (numpy.array) – The array to get minimum and maximum values for.
- Returns:
array_min (numpy.array) – Array containing minimum values of each column.
array_max (numpy.array) – Array containing maximum values of each column.
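As a rough illustration of what "minimum and maximum of each column" means, the same quantities can be computed with plain NumPy by reducing along axis 0. This is a sketch only: the names `col_min` and `col_max` are made up for the example, and the exact shape of the arrays that `get_minmax()` returns is not confirmed here.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])

# Reduce along axis 0 (down the rows) so that one value remains per column.
col_min = a.min(axis=0)   # smallest entry of each column
col_max = a.max(axis=0)   # largest entry of each column
```

For this input the per-column minimums are the first row and the per-column maximums are the last row.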
- pykitml.normalize_minmax(array, array_min, array_max, cols=[])
Normalizes columns of the array to the range 0 to 1 using min-max normalization.
- Parameters:
array (numpy.array) – The array to normalize.
array_min (numpy.array) – Array containing minimum values of each column.
array_max (numpy.array) – Array containing maximum values of each column.
cols (list) – The columns to normalize. If the list is empty (default), all columns will be normalized.
- Returns:
The normalized array.
- Return type:
numpy.array
Note
You can use the get_minmax() function to get the array_min and array_max parameters.
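Min-max normalization maps each value x in a column to (x - min) / (max - min). The sketch below shows that formula in plain NumPy, including a `cols` selection. `minmax_sketch` is a hypothetical stand-in written for this illustration, not pykitml's implementation.

```python
import numpy as np

def minmax_sketch(array, array_min, array_max, cols=()):
    """Hypothetical helper: scale the selected columns to the range 0 to 1."""
    out = np.array(array, dtype=float)      # float copy, the input is not modified
    lo = np.ravel(array_min)                # accept row arrays or flat arrays
    hi = np.ravel(array_max)
    # Assumption mirrored from the docs: an empty cols means "all columns".
    idx = list(cols) if len(cols) > 0 else list(range(out.shape[1]))
    out[:, idx] = (out[:, idx] - lo[idx]) / (hi[idx] - lo[idx])
    return out

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
scaled = minmax_sketch(a, a.min(axis=0), a.max(axis=0), cols=[0, 2])
```

Note that in this sketch a column whose minimum equals its maximum would cause a division by zero. How pykitml treats that case is not stated on this page.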
- pykitml.denormalize_minmax(array, array_min, array_max, cols=[])
Denormalizes columns of a min-max normalized array.
- Parameters:
array (numpy.array) – The array to denormalize.
array_min (numpy.array) – Array containing minimum values of each column.
array_max (numpy.array) – Array containing maximum values of each column.
cols (list) – The columns to denormalize. If the list is empty (default), all columns will be denormalized.
- Returns:
The denormalized array.
- Return type:
numpy.array
Note
You can use the get_minmax() function to get the array_min and array_max parameters.
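Denormalization inverts the min-max formula: x = x_norm * (max - min) + min. A minimal NumPy sketch of the inverse follows, assuming the same minimum and maximum arrays that were used for normalization. `minmax_inverse_sketch` is a hypothetical name, not part of pykitml.

```python
import numpy as np

def minmax_inverse_sketch(array, array_min, array_max, cols=()):
    """Hypothetical helper: undo min-max scaling on the selected columns."""
    out = np.array(array, dtype=float)      # float copy, the input is not modified
    lo = np.ravel(array_min)
    hi = np.ravel(array_max)
    # Assumption mirrored from the docs: an empty cols means "all columns".
    idx = list(cols) if len(cols) > 0 else list(range(out.shape[1]))
    out[:, idx] = out[:, idx] * (hi[idx] - lo[idx]) + lo[idx]
    return out

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
lo, hi = a.min(axis=0), a.max(axis=0)
scaled = (a - lo) / (hi - lo)               # normalize every column
restored = minmax_inverse_sketch(scaled, lo, hi)
```

The round trip only recovers the original values when the same minimum and maximum arrays, and the same column selection, are used in both directions.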
Example
>>> import numpy as np
>>> import pykitml as pk
>>> a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
>>> min_array, max_array = pk.get_minmax(a)
>>> normalized_a = pk.normalize_minmax(a, min_array, max_array)
>>> normalized_a
array([[0.        , 0.        , 0.        , 0.        ],
       [0.33333333, 0.33333333, 0.33333333, 0.33333333],
       [0.66666667, 0.66666667, 0.66666667, 0.66666667],
       [1.        , 1.        , 1.        , 1.        ]])
>>> pk.denormalize_minmax(normalized_a, min_array, max_array)
array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.],
       [13., 14., 15., 16.]])
You can also normalize/denormalize only specific columns:
>>> import numpy as np
>>> import pykitml as pk
>>> a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
>>> min_array, max_array = pk.get_minmax(a)
>>> normalized_a = pk.normalize_minmax(a, min_array, max_array, cols=[0, 2])
>>> normalized_a
array([[ 0.        ,  2.        ,  0.        ,  4.        ],
       [ 0.33333333,  6.        ,  0.33333333,  8.        ],
       [ 0.66666667, 10.        ,  0.66666667, 12.        ],
       [ 1.        , 14.        ,  1.        , 16.        ]])
>>> pk.denormalize_minmax(normalized_a, min_array, max_array, cols=[0, 2])
array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.],
       [13., 14., 15., 16.]])
Mean Normalization
- pykitml.get_meanstd(array)
Returns two row arrays: one containing the mean of each column, the other containing the standard deviation of each column.
- Parameters:
array (numpy.array) – The array to get mean and standard deviation values for.
- Returns:
array_mean (numpy.array) – Array containing mean values of each column.
array_stddev (numpy.array) – Array containing standard deviation values of each column.
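The per-column mean and standard deviation can be illustrated with plain NumPy. The example output on this page (for instance -1.34164079 in the first row) is consistent with the population standard deviation (`ddof=0`, NumPy's default), so the sketch assumes that. This is an inference from the printed numbers, not a statement about pykitml's source, and `col_mean` and `col_std` are illustrative names.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])

# Reduce along axis 0 so that one statistic remains per column.
col_mean = a.mean(axis=0)
col_std = a.std(axis=0)    # ddof=0 (population standard deviation), assumed here
```

Every column of this input has the same spread, so all four standard deviations equal the square root of 20.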
- pykitml.normalize_mean(array, array_mean, array_stddev, cols=[])
Normalizes columns of the array using mean normalization: each column's mean is subtracted and the result is divided by the column's standard deviation.
- Parameters:
array (numpy.array) – The array to normalize.
array_mean (numpy.array) – Array containing mean values of each column.
array_stddev (numpy.array) – Array containing standard deviation values of each column.
cols (list) – The columns to normalize. If the list is empty (default), all columns will be normalized.
- Returns:
The normalized array.
- Return type:
numpy.array
Note
You can use the get_meanstd() function to get the array_mean and array_stddev parameters.
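The numbers in this page's example match the formula (x - mean) / stddev applied column by column. The sketch below reproduces that in plain NumPy with a `cols` selection. `mean_normalize_sketch` is a hypothetical helper for illustration, not pykitml's implementation, and the population standard deviation is an assumption inferred from the example output.

```python
import numpy as np

def mean_normalize_sketch(array, array_mean, array_stddev, cols=()):
    """Hypothetical helper: subtract the mean and divide by the std per column."""
    out = np.array(array, dtype=float)      # float copy, the input is not modified
    mean = np.ravel(array_mean)
    std = np.ravel(array_stddev)
    # Assumption mirrored from the docs: an empty cols means "all columns".
    idx = list(cols) if len(cols) > 0 else list(range(out.shape[1]))
    out[:, idx] = (out[:, idx] - mean[idx]) / std[idx]
    return out

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
z = mean_normalize_sketch(a, a.mean(axis=0), a.std(axis=0), cols=[0, 2])
```

After this transformation the selected columns have zero mean and unit standard deviation, while the unselected columns are left unchanged.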
- pykitml.denormalize_mean(array, array_mean, array_stddev, cols=[])
Denormalizes columns of a mean-normalized array.
- Parameters:
array (numpy.array) – The array to denormalize.
array_mean (numpy.array) – Array containing mean values of each column.
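array_stddev (numpy.array) – Array containing standard deviation values of each column.
cols (list) – The columns to denormalize. If the list is empty (default), all columns will be denormalized.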
- Returns:
The denormalized array.
- Return type:
numpy.array
Note
You can use the get_meanstd() function to get the array_mean and array_stddev parameters.
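Denormalizing reverses the formula: x = z * stddev + mean. The following NumPy sketch shows the round trip for all columns and for a subset of columns. It is only an illustration under the same assumptions as above (population standard deviation, made-up variable names) and does not call pykitml.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
mean, std = a.mean(axis=0), a.std(axis=0)

# All columns: normalize, then invert.
z = (a - mean) / std
restored = z * std + mean

# Only columns 0 and 2: the other columns are never touched.
cols = [0, 2]
partial = a.astype(float)
partial[:, cols] = (partial[:, cols] - mean[cols]) / std[cols]
partial_restored = partial.copy()
partial_restored[:, cols] = partial_restored[:, cols] * std[cols] + mean[cols]
```

The column selection used for denormalizing should match the one used for normalizing. Otherwise columns that were never scaled would be rescaled, or scaled columns would be left as they are.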
Example
>>> import numpy as np
>>> import pykitml as pk
>>> a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
>>> array_mean, array_stddev = pk.get_meanstd(a)
>>> normalized_a = pk.normalize_mean(a, array_mean, array_stddev)
>>> normalized_a
array([[-1.34164079, -1.34164079, -1.34164079, -1.34164079],
       [-0.4472136 , -0.4472136 , -0.4472136 , -0.4472136 ],
       [ 0.4472136 ,  0.4472136 ,  0.4472136 ,  0.4472136 ],
       [ 1.34164079,  1.34164079,  1.34164079,  1.34164079]])
>>> pk.denormalize_mean(normalized_a, array_mean, array_stddev)
array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.],
       [13., 14., 15., 16.]])
You can also normalize/denormalize only specific columns:
>>> import numpy as np
>>> import pykitml as pk
>>> a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
>>> array_mean, array_stddev = pk.get_meanstd(a)
>>> normalized_a = pk.normalize_mean(a, array_mean, array_stddev, cols=[0,2])
>>> normalized_a
array([[-1.34164079,  2.        , -1.34164079,  4.        ],
       [-0.4472136 ,  6.        , -0.4472136 ,  8.        ],
       [ 0.4472136 , 10.        ,  0.4472136 , 12.        ],
       [ 1.34164079, 14.        ,  1.34164079, 16.        ]])
>>> pk.denormalize_mean(normalized_a, array_mean, array_stddev, cols=[0,2])
array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.],
       [13., 14., 15., 16.]])