Computer Vision News - January 2024

Data Availability and Quantity in Medical AI Projects (part 1)

There are many vendors of CT devices, and each one offers many different models. Each model generates scans of different quality, reflecting criteria like price, type, or form factor: the same anatomy (or even the same patient) will be rendered with different values, regions of interest, or quality by two different machines, whether they belong to the same generation of devices or to different ones. In addition, the specific requirements of a screening protocol can differ between hospitals based on internal rules.

Data is largely available in hospitals, but within any single hospital there is little variety of acquisition devices. Equipment may have been in use in a given hospital for many years, and the quality of a CT scan performed by a 20-year-old machine is not the same as that offered by a new device. It is easier to get data from a restricted number of hospitals, but in that case the AI model will be trained on very specific data, lacking the variety required to cover the whole spectrum of devices.

Ideally, a neural network would be trained on data from many tens of different sources, covering different manufacturers and different models. However, acquiring data from each hospital is a long process that requires significant resources. More often than not, the acquired data is limited, which is very challenging for medical AI development. AI models trained on a limited dataset may perform well on similarly acquired scans, but not on scans performed with different devices or using different protocols in other hospitals.

Working with the expertise of RSIP Vision enables us to mitigate this challenge in many ways. Our R&D team has put in place modality- and application-specific data augmentation tasks.
Data augmentation is a frequently performed task, though in this case we do it in a different way: we define a specific set of augmentations for the task at hand, so that the resulting dataset still makes sense from a clinical point of view. “We work closely with experts like radiologists, orthopedists, and ultrasound specialists to find the proper range and the parameters for the augmentations,” said Ilya Kovler, CTO at RSIP Vision. “We also use transfer learning from the pretrained models.”
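
The idea of keeping augmentations within clinically sensible bounds can be illustrated with a minimal Python sketch. It is not RSIP Vision's implementation: the function name `augment_ct_slice`, the parameter names, and every numeric range below are hypothetical placeholders. In a real project, the ranges would be the ones agreed with the clinical experts for the given modality and application.

```python
import numpy as np

# Hypothetical ranges for illustration only; real values would be chosen
# together with radiologists for the specific modality and application.
CT_AUGMENTATION_RANGES = {
    "hu_shift": (-20.0, 20.0),   # additive intensity offset, in Hounsfield units
    "hu_scale": (0.95, 1.05),    # multiplicative intensity scaling
    "noise_std": (0.0, 15.0),    # standard deviation of added Gaussian noise, in HU
}

# Assumed valid intensity window for the output (a typical 12-bit CT range).
HU_MIN, HU_MAX = -1024.0, 3071.0


def augment_ct_slice(slice_hu, rng, ranges=CT_AUGMENTATION_RANGES):
    """Return a randomly perturbed copy of a CT slice given in Hounsfield units.

    Each parameter is sampled uniformly from its bounded range, so the
    augmented slice mimics device-to-device variation without leaving
    the range considered plausible.
    """
    shift = rng.uniform(*ranges["hu_shift"])
    scale = rng.uniform(*ranges["hu_scale"])
    noise_std = rng.uniform(*ranges["noise_std"])

    # Work on a float copy so the input array is left untouched.
    out = slice_hu.astype(np.float32) * scale + shift
    out += rng.normal(0.0, noise_std, size=out.shape).astype(np.float32)

    # Clip so that no voxel falls outside the assumed valid HU window.
    return np.clip(out, HU_MIN, HU_MAX)


rng = np.random.default_rng(seed=0)
original = np.full((4, 4), 40.0)   # toy slice of soft-tissue-like values
augmented = augment_ct_slice(original, rng)
```

Keeping the ranges in one dictionary makes it easy to review them with a specialist and to maintain a separate set per modality or application.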