Computer Vision News - January 2024

Making MetaDataCount

amount of sensitive encoding (e.g. age) with an intervention. She explained that the encoding of a sensitive attribute is not predictive of a model’s fairness characteristics. She showed that subsampling can be an effective mitigation tool. Finally, she showed that their method makes it possible to select fairer models when shortcutting is not the (main) source of unfairness.

The speakers of the second talk were Rhys Compton and Lily Zhang. Rhys Compton holds a Master’s in Computer Science from New York University (NYU), and Lily Zhang is a PhD candidate at NYU in the Center for Data Science. Together, they talked about their work “When More is Less”. They discussed how adding datasets can hurt performance by introducing spurious correlations. They found that the hospital signal is strongly embedded in chest X-rays: models were able to predict the source hospital of an image with 98% accuracy! They discussed how balancing can help but does not always improve classification performance. Balancing seemed to be most beneficial when datasets had significantly different disease prevalence.

We are planning to hold our next webinar in March 2024. Sign up for our newsletter if you want to stay updated!

Enzo Ferrante is a faculty researcher at Argentina’s National Research Council, where he leads the Machine Learning for Biomedical Image Computing lab. Enzo presented his team’s journey in creating a large-scale dataset of segmentations (CheXMask) for several publicly available chest X-ray datasets. He discussed some of the challenges they encountered in creating a derived dataset. He highlighted the importance of providing quality assessment (QA) metrics when leveraging automatic annotations, and emphasized the importance of expert validation. The team also showed that QA metrics can be used as surrogates to audit for fairness in new populations without ground-truth annotations.