Balanced Bootstrapping with Random Forest for Imbalanced Data Sets

The curse of imbalanced data refers to machine learning models that learn to predict the majority outcome while neglecting the minority. This is a common problem in fraud detection, cancer classification, and similar tasks. In those examples, although the positive outcome is rare, it is crucial that the machine learning model is able …

Machine Learning Model Building, Selection and Hyperparameter Tuning

Selecting an appropriate machine learning model for a data set requires the knowledge and insights about the data gained during the exploratory data analysis (EDA) stage. The clean data used for this exercise can be downloaded here, and the post on how we cleaned this data is here. In our bivariate EDA, we learned that there isn't strong correlation …
