Why: real-world data are typically noisy, enormous in volume, and may originate from a hodgepodge of heterogeneous sources.
Basic statistics to know for each attribute: mean; median; mode (most common value); distribution.
Knowing such basic statistics regarding each attribute makes it easier to fill in missing values, smooth noisy values, and spot outliers during data preprocessing.
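As a rough illustration of the point above, here is a minimal sketch using Python's standard `statistics` module. The `ages` list and its values are hypothetical, invented for this example. Filling with the median and flagging outliers with the 1.5 × IQR rule are common rules of thumb, assumed here rather than taken from the original notes.

```python
import statistics

# Hypothetical attribute values; None marks a missing entry.
ages = [23, 25, 25, 31, None, 29, 25, 240]

# Basic statistics are computed on the observed (non-missing) values only.
observed = [v for v in ages if v is not None]
mean = statistics.mean(observed)
median = statistics.median(observed)
mode = statistics.mode(observed)  # most common value

# Fill in missing values with the median, which is less sensitive
# to extreme values than the mean.
filled = [median if v is None else v for v in ages]

# Spot outliers with the 1.5 * IQR rule (an assumed rule of thumb).
q1, _, q3 = statistics.quantiles(observed, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in observed if v < low or v > high]

print(f"mean={mean:.2f} median={median} mode={mode}")
print("filled:", filled)
print("outliers:", outliers)
```

In this made-up data the single extreme value pulls the mean far above the median and mode, which is one reason to look at several statistics together before choosing how to fill gaps or smooth values.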
Original article: https://www.cnblogs.com/dulun/p/12293674.html
Date: 2024-10-08 15:56:03