Computer Vision News - March 2020
A Step Towards Explainability

    dt.columns = ['age', 'sex', 'chest_pain_type', 'resting_blood_pressure',
                  'cholesterol', 'fasting_blood_sugar', 'rest_ecg',
                  'max_heart_rate_achieved', 'exercise_induced_angina',
                  'st_depression', 'st_slope', 'num_major_vessels',
                  'thalassemia', 'target']

    # Replace the numeric codes with descriptive labels.
    dt['sex'] = dt['sex'].map({0: 'female', 1: 'male'})
    dt['fasting_blood_sugar'] = dt['fasting_blood_sugar'].map(
        {0: 'lower than 120mg/dl', 1: 'greater than 120mg/dl'})

The dataframe syntax makes it possible to match values and replace them with more descriptive names or explanations. Inspecting dt.dtypes shows that some of the columns do not have the right data type: the categorical variables are stored as numbers. This can be adjusted by converting them as follows:

    # Mark the categorical columns as 'object' so they are not treated as numbers.
    dt['sex'] = dt['sex'].astype('object')
    dt['chest_pain_type'] = dt['chest_pain_type'].astype('object')
    dt['fasting_blood_sugar'] = dt['fasting_blood_sugar'].astype('object')
    dt['rest_ecg'] = dt['rest_ecg'].astype('object')
    dt['exercise_induced_angina'] = dt['exercise_induced_angina'].astype('object')
    dt['st_slope'] = dt['st_slope'].astype('object')
    dt['thalassemia'] = dt['thalassemia'].astype('object')

Some of the variables can be dropped to make the dataset simpler. You can replace the male/female values with a single variable (female), with 1 meaning true and 0 meaning false, and drop the first category of each column.

Model and predictions

Now let’s train the model using a Random Forest Classifier. As a reminder, a random forest is a “meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is always the same as the original input sample size but the samples are drawn with replacement if bootstrap=True (default)” [definition from scikit-learn]. After fitting the model, we are going to extract one of its decision trees. Feel free to change the parameters of the export function call, with the help of the documentation, if you would like to change the way things are displayed!
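
The encoding and modelling steps described above can be sketched roughly as follows. This is a minimal illustration, not the article's exact code. The data is a small random stand-in (not the real heart-disease dataset), the hyperparameters are arbitrary, and it assumes that the "export function" is scikit-learn's export_graphviz. Note that with drop_first=True pandas drops the alphabetically first category, so in this sketch the surviving indicator column is sex_male rather than a female one.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_graphviz

# Hypothetical stand-in data: random values that only mimic a few of the
# article's columns. This is NOT the real heart-disease dataset.
rng = np.random.default_rng(0)
n = 200
dt = pd.DataFrame({
    'age': rng.integers(30, 80, size=n),
    'sex': rng.choice(['female', 'male'], size=n),
    'cholesterol': rng.integers(150, 350, size=n),
    'fasting_blood_sugar': rng.choice(
        ['lower than 120mg/dl', 'greater than 120mg/dl'], size=n),
    'target': rng.integers(0, 2, size=n),
})

# One-hot encode the text columns. drop_first=True removes the first
# category of each one, so a two-valued column becomes a single indicator.
dt = pd.get_dummies(dt, drop_first=True)

X = dt.drop(columns='target')
y = dt['target']

# max_depth=5 and n_estimators=100 are arbitrary choices for this sketch;
# a shallow depth keeps each tree small enough to read.
model = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0)
model.fit(X, y)

# A fitted forest keeps its individual decision trees in estimators_.
estimator = model.estimators_[1]

# With out_file=None, export_graphviz returns the tree as a DOT string
# instead of writing a file. The class names are hypothetical labels.
dot_source = export_graphviz(
    estimator,
    out_file=None,
    feature_names=list(X.columns),
    class_names=['no disease', 'disease'],
    rounded=True,
    filled=True,
    proportion=True,
    precision=2,
)
print(dot_source[:300])
```

The DOT string can be rendered with Graphviz if it is installed. Arguments such as filled, rounded, proportion and precision only affect how the tree is drawn, not the fitted model.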