Dimensionality Reduction for Machine Learning

Last modified: 2023-08-20


Dimensionality reduction is a data processing technique that reduces the number of features in a dataset, which can make machine learning models easier to train.

PCA (Principal Component Analysis)

Reference: https://www.kaggle.com/code/jonbown/ai-ctf-submissions?scriptVersionId=105606691&cellId=42

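We can use PCA to find a suitable number of dimensions for the data. The loop fits PCA with 1 to 9 components and prints the fraction of the total variance explained by each component, so we can see where adding more components stops helping.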

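import numpy as np
from sklearn.decomposition import PCA

# Load the dataset; PCA expects a 2D array of shape (n_samples, n_features).
data = np.load("example.npy")

# Try 1 to 9 components. n_components must not exceed min(n_samples, n_features).
for i in range(1, 10):
    pca = PCA(n_components=i)
    # Project the data onto the first i principal components.
    principal_components = pca.fit_transform(data)
    # Fraction of the total variance explained by each kept component.
    print(pca.explained_variance_ratio_)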
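
Reading the printed ratios by eye works, but the choice can also be automated. The following is a minimal sketch, not part of the referenced notebook: it assumes a hypothetical helper named choose_n_components, an arbitrary 0.95 variance threshold, and synthetic data generated in place of example.npy. It picks the smallest number of components whose cumulative explained variance reaches the threshold.

```python
import numpy as np
from sklearn.decomposition import PCA


def choose_n_components(data, threshold=0.95):
    """Hypothetical helper: smallest number of components whose
    cumulative explained variance ratio reaches `threshold`."""
    pca = PCA()  # n_components=None keeps every component
    pca.fit(data)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    # Index of the first cumulative value >= threshold, converted to a count.
    n = int(np.searchsorted(cumulative, threshold)) + 1
    # Guard against floating-point sums that land just below the threshold.
    return min(n, len(cumulative))


# Synthetic stand-in for example.npy (an assumption for this sketch):
# 10 observed features driven by 3 latent factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
data = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

n_components = choose_n_components(data, threshold=0.95)
print(n_components)

# Reduce the data to the chosen number of dimensions.
reduced = PCA(n_components=n_components).fit_transform(data)
print(reduced.shape)
```

scikit-learn's PCA also accepts a float between 0 and 1 for n_components when svd_solver="full", which selects the number of components by the same variance criterion. The explicit version above is mainly useful when you want to inspect the cumulative curve yourself.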