data:image/s3,"s3://crabby-images/c55d0/c55d094512e27c6aae274aa50ac71cb140452e76" alt="Numerical Computing with Python"
Curse of dimensionality with 1D, 2D, and 3D example
A quick analysis shows how the distances between 60 random points grow as dimensionality increases. First, the random points are drawn in one dimension:
# 1-Dimension Plot
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> one_d_data = np.random.rand(60,1)
>>> one_d_data_df = pd.DataFrame(one_d_data)
>>> one_d_data_df.columns = ["1D_Data"]
>>> one_d_data_df["height"] = 1
>>> plt.figure()
>>> plt.scatter(one_d_data_df["1D_Data"],one_d_data_df["height"])
>>> plt.yticks([])
>>> plt.xlabel("1-D points")
>>> plt.show()
In the following graph, all 60 data points lie very close to one another in one dimension:
data:image/s3,"s3://crabby-images/34091/34091c67ebc33a8d11003a844b7656d45f5d573c" alt=""
Here we repeat the same experiment in 2D space, drawing 60 random points with x and y coordinates and plotting them:
# 2-Dimensions Plot
>>> two_d_data = np.random.rand(60,2)
>>> two_d_data_df = pd.DataFrame(two_d_data)
>>> two_d_data_df.columns = ["x_axis","y_axis"]
>>> plt.figure()
>>> plt.scatter(two_d_data_df["x_axis"],two_d_data_df["y_axis"])
>>> plt.xlabel("x_axis")
>>> plt.ylabel("y_axis")
>>> plt.show()
In the 2D graph, larger gaps appear between points, even though the number of data points is still 60:
data:image/s3,"s3://crabby-images/b52eb/b52eb05195f1c4d7951fa087941f4a4d99eab055" alt=""
Finally, 60 data points are drawn in 3D space, where the gaps between points are clearly larger still. This illustrates visually that as the number of dimensions increases, the same number of points is spread over much more space, and this sparsity makes it harder for a classifier to detect the signal:
# 3-Dimensions Plot
>>> three_d_data = np.random.rand(60,3)
>>> three_d_data_df = pd.DataFrame(three_d_data)
>>> three_d_data_df.columns = ["x_axis","y_axis","z_axis"]
>>> from mpl_toolkits.mplot3d import Axes3D
>>> fig = plt.figure()
>>> ax = fig.add_subplot(111, projection='3d')
>>> ax.scatter(three_d_data_df["x_axis"],three_d_data_df["y_axis"],three_d_data_df["z_axis"])
>>> plt.show()
data:image/s3,"s3://crabby-images/3654e/3654ee87076f76743f331f8432a7d6db8c580f6e" alt=""
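The visual impression from the three plots can also be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes the mean nearest-neighbor distance as a rough measure of sparsity, and the helper name `mean_nearest_neighbor_distance` and the fixed seed are illustrative choices.

```python
import numpy as np

def mean_nearest_neighbor_distance(points):
    """Average Euclidean distance from each point to its closest other point.

    Hypothetical helper for illustration; `points` has shape (n_points, n_dims).
    """
    # Pairwise differences via broadcasting: shape (n, n, n_dims)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Exclude each point's zero distance to itself
    np.fill_diagonal(dists, np.inf)
    return dists.min(axis=1).mean()

# Fixed seed (an assumption) so the run is reproducible
rng = np.random.default_rng(0)

results = {}
for dim in (1, 2, 3):
    # 60 uniform random points in the unit interval, square, or cube
    points = rng.random((60, dim))
    results[dim] = mean_nearest_neighbor_distance(points)
    print(f"{dim}D: mean nearest-neighbor distance = {results[dim]:.3f}")
```

With the number of points held at 60, the printed distance should increase from 1D to 2D to 3D, which is the same sparsity effect that the scatter plots show.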