Computer Vision News - March 2020
A Step Towards Explainability

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns  # for plotting
    from sklearn.ensemble import RandomForestClassifier  # for the model
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.tree import export_graphviz  # plot tree
    from sklearn.metrics import roc_curve, auc  # for model evaluation
    from sklearn.metrics import classification_report  # for model evaluation
    from sklearn.metrics import confusion_matrix  # for model evaluation
    from sklearn.model_selection import train_test_split  # for data splitting
    import eli5  # for permutation importance
    from eli5.sklearn import PermutationImportance
    import shap  # for SHAP values
    from pdpbox import pdp, info_plots  # for partial plots

    np.random.seed(123)  # ensure reproducibility
    pd.options.mode.chained_assignment = None  # suppress pandas chained-assignment warnings

Now you just need to load the data:

    dt = pd.read_csv("../data/heart_init.csv")
    dt.head(5)

This should display the first five entries of the dataset, for exploration.

   age  sex  cp  trestbps  chol  fbs  restecg  thalach  exang  oldpeak  slope  ca  thal  target
0   63    1   3       145   233    1        0      150      0      2.3      0   0     1       1
1   37    1   2       130   250    0        1      187      0      3.5      0   0     2       1
2   41    0   1       130   204    0        0      172      0      1.4      2   0     2       1
3   56    1   1       120   236    0        1      178      0      0.8      2   0     2       1
4   57    0   0       120   354    0        1      163      1      0.6      2   0     2       1
5   57    1   0       140   192    0        1      148      0      0.4      1   0     1       1

This provides an easy, clean way to visualise the data. The meaning of the parameters is not always obvious, though, so let's explain them before going further:

sex: The subject's sex (0: female)
cp: Chest pain experienced (1-4: typical angina - asymptomatic)
trestbps: The subject's resting blood pressure (mm Hg on hospital admission)
chol: Cholesterol measurement in mg/dl
fbs: Fasting blood sugar (> 120 mg/dl, 1 = true; 0 = false)
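To make the coded columns easier to read during exploration, the parameter descriptions can be applied directly to the data. The following is a minimal sketch, not part of the original walkthrough. It rebuilds the six preview rows shown in the table instead of reading the CSV file, so it runs on its own. The helper name add_readable_columns and the new column names sex_label and high_fbs are hypothetical, and the label "male" for sex = 1 is an assumption, since the text only states that 0 means female.

```python
import pandas as pd

# Rows copied from the preview table in the text (not the full dataset).
COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "target"]
ROWS = [
    [63, 1, 3, 145, 233, 1, 0, 150, 0, 2.3, 0, 0, 1, 1],
    [37, 1, 2, 130, 250, 0, 1, 187, 0, 3.5, 0, 0, 2, 1],
    [41, 0, 1, 130, 204, 0, 0, 172, 0, 1.4, 2, 0, 2, 1],
    [56, 1, 1, 120, 236, 0, 1, 178, 0, 0.8, 2, 0, 2, 1],
    [57, 0, 0, 120, 354, 0, 1, 163, 1, 0.6, 2, 0, 2, 1],
    [57, 1, 0, 140, 192, 0, 1, 148, 0, 0.4, 1, 0, 1, 1],
]
preview = pd.DataFrame(ROWS, columns=COLUMNS)

# The text documents 0 = female; 1 = male is an assumption made here.
SEX_LABELS = {0: "female", 1: "male"}


def add_readable_columns(df):
    """Return a copy of df with human-readable versions of two coded columns."""
    out = df.copy()
    out["sex_label"] = out["sex"].map(SEX_LABELS)
    # fbs is 1 when fasting blood sugar exceeds 120 mg/dl, 0 otherwise.
    out["high_fbs"] = out["fbs"] == 1
    return out


readable = add_readable_columns(preview)
print(readable[["age", "sex", "sex_label", "fbs", "high_fbs"]])
```

The same helper could be applied to dt once the CSV has been loaded. Working on a copy leaves the original numeric columns untouched for the model.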