This article was first published on Computer Vision News of March 2022.
Francesco Ciompi is Assistant Professor of Computational Pathology in the Department of Pathology at Radboud University Medical Center, and part of the Diagnostic Image Analysis Group. Mart van Rijthoven is a PhD student under his supervision, Witali Aswolinskiy is a postdoc in his group working on whole-slide image classification, and Leslie Tessier is a resident pathologist and PhD student in the same research group, under the supervison of Prof. Jeroen van der Laak. All four are co-organizing the first TIGER (Tumor InfiltratinG lymphocytes in breast cancER) Grand Challenge, which runs until the end of April. They are here to tell us what it is all about.
Breast cancer is the most common form of cancer worldwide and is the leading cause of cancer-related death in women. But not all breast cancers are the same. Some patients will have a better prognosis than others. The key question for clinicians is how best to treat each patient to ensure the most favorable outcome.
In classifying breast cancer, three main tests are carried out for the hormonal receptors, estrogen receptor (ER) and progesterone receptor (PR), and the human epidermal growth factor receptor (Her2).
“If the tumor is positive to any of these tests, it gets a label,” Leslie tells us. “If it’s negative to all three tests, it’s called a triple negative. Her2 positive and triple negative results have the worst prognosis.”
This challenge focuses on a specific target: tumor-infiltrating lymphocytes (TILs). Lymphocytes are immune system cells able to kill tumor cells under certain circumstances. Patients with a higher concentration of lymphocytes in the tumor region seen on histopathology slides tend to have a better prognosis. The way this is determined currently is by pathologists looking at these cells and coming up with a personal and subjective assessment. However, different pathologists will come up with different results. Even the same pathologist can make different assessments of the same case over time, so TILs assessments are not used routinely in clinics.
There are two main goals of this challenge. One is to develop AI models to automate TILs assessments. The second is to validate this assessment on an extensive test dataset. Ultimately, the aim is to make TILs assessments more objective and reproducible to enable their use more confidently in clinical practice.
The training dataset has been released in two formats – one is oriented towards the computational pathology community, while one is more compatible with the type of work that computer vision people do.
“We’ve released all the detections in COCO format,” Mart tells us. “The COCO format is widely used in many benchmarks to check if detection models work better than other models. Also, we normally train with huge whole-slide images, often 100,000 by 100,000 pixels, but on ImageNet and other databases, you generally see smaller images. We have cropped the regions out of these whole-slide images and cropped the masks so that people don’t have to deal with these big annotations.”
The team created specific libraries (e.g., the WholeSlideData Python package) to make it easy for participants to navigate and connect standard convolutional neural networks or deep learning approaches to the data.
“From the start, we designed this challenge to be inclusive,” Francesco points out. “We wanted scientists and researchers in medical image analysis or computational pathology to take part, but also the wider computer vision community. Clinical application is the kind of goal that we can only reach if we do it collaboratively!”
TIGER is a complex challenge with multiple components. Two leaderboards evaluate different aspects of what the participants develop. One looks at the performance of algorithms designed to segment tissue parts, including the tumor and the connective tissue associated with it, and algorithms designed to detect lymphocytes. The second leaderboard considers the TIL score. Participants have to develop ideas for converting what they have segmented and detected into a biomarker. The team will run this biomarker on a large set of independent test slides of breast cancer patients whose outcome is known. Then they will perform a survival analysis to see if the TIL score can predict recurrence. Recurrence is the most crucial factor to know clinically in terms of treatment. It is the question that every cancer patient asks their oncologist: Will the cancer return, or am I cured?
Until recently, chemotherapy and hormone therapy have been the primary treatments for breast cancer patients. Now, there is growing interest in immunotherapy. Treatment plans might change over time, but either way, there will likely still be a role for TILs.
“For me, the goal of this challenge is perfectly in line with my current research work,” Witali tells us. “One of my projects is about trying to predict whether chemotherapy will work for breast cancer patients, and TILs are a biomarker for that. When I tried to develop and discover novel biomarkers, I ended up with something that correlates with the TILs.”
Francesco, Witali, Mart and Leslie are keen to stress that they are part of a wider multidisciplinary team of organizers. They are on the technical side, but pathologists have annotated the data, engineers have built and maintained the web-based platform, and multiple clinical centers have provided data.
The concept for the challenge originated from a collaboration between Francesco’s research group and the International Immuno-Oncology Working Group, a multidisciplinary group of clinicians led by Roberto Salgado, a pathologist working in Belgium. The TIGER challenge is a close collaboration between Roberto’s group, the Radboud University Medical Centre, and Amazon AWS.
Amazon AWS is sponsoring the challenge by giving access to computing power to run the algorithms and offering a prize for the top three methods in both leaderboards. In total, there is $13,000 in AWS credits available to the winners.
“This is just the first step,” Francesco adds. “There are more questions we can address in future with follow-up challenges. For us, it’s imperative first to show that AI can drive the quantification of TILs, then that TILs have a prognostic value on an independent test set. This challenge is the first step to eventually applying them in clinical practice.”
Keep reading the Medical Imaging News section of our magazine.
Read about RSIP Vision’s R&D work on Medical Image Analysis.