The data set we use here is a simulated data that resembles closely to real-world data, and it can be downloaded here. It requires substantial data manipulation, including cleaning, binning, imputing and one-hot encoding before the machine learning model can train on it. Cleaning Thorough EDA in this post revealed several spelling mistakes, and we …
Continue reading Cleaning, Binning, Imputing and One-hot Encoding