MICCAI 2023 Daily – Monday

A publication by RSIP Vision – October 8-12
Exclusive Interview with Keynote Yann LeCun
Women in Science with Sandy Engelhardt
Best Oral and Poster Presentations

Yann LeCun will be the keynote speaker on Tuesday. He kindly gave a second interview to Ralph, five days ago in Paris.

Yann, thank you very much for being with us again. When we talked five years ago, you told me you had a clear plan for the next few years. Did you stick to it?

The plan hasn’t changed very much – the details have changed, and we’ve made progress, but the original plan is still the same. The original plan starts from the observation that the limitation of current AI systems is that they’re not capable of understanding the world. You need a system that can understand the world if you want it to be able to plan. You need to imagine in your head what the consequences of your actions might be, and for this, you need a world model. I’ve been advocating for this for a long time. This is not a new idea. The concept is very old, from optimal control, but using machine learning to learn the world models is the big problem. Back when we talked, I can’t remember if I’d made the transition between what I called latent variable generative models and what I’m advocating now, which I call JEPA – joint embedding predictive architectures. I used to think that the proper way to do this would be to train a system on videos to predict what will happen in the video, perhaps as a consequence of some action being taken.

If you have a system that can predict what’s going to happen in the video, then you can use that system for planning. I’ve been playing with this idea for almost 10 years. We started working on video prediction at FAIR in 2014/15. We had some papers on this. Then, we weren’t moving very fast. We had Mikael Henaff and Alfredo Canziani working on a model of this type that could help plan a trajectory for self-driving cars, which was somewhat successful. But then, we made progress. We realized that predicting everything in a video was not just useless but probably impossible and even hurtful. I came up with this new idea derived from experimental results. The results are such that if you want to use self-supervised learning from images to train a system to learn good representations of images, the generative methods don’t work. These methods are based on essentially corrupting an image and then training a neural network to recover the original image. Large language models are trained this way: you take a text, corrupt it, and then train a system to reconstruct it. When you do this with images, it doesn’t work very well. There are a number of techniques to do this, but they don’t work very well. The most successful is probably MAE, the masked autoencoder. Some of my colleagues at Meta did that. What really works are those joint embedding architectures. You take an image and a corrupted version of the image, run them through encoders, and train the encoders to produce identical representations for those two images, so that the representation produced from the corrupted image is identical to that from the uncorrupted image. In the case of a video, you take a segment of video and the following segment, run them through encoders, and predict the representation of the following segment from the representation of the previous segment. It’s no longer a generative model because you’re not predicting all the missing pixels; you’re predicting a representation of them. The trick is, how do you train something like this while preventing it from collapsing? It’s easy for this system to collapse, ignore the input, and always predict the same thing. That’s the question.

So, we did not get to solve the exact problem we wanted?

It was the wrong problem to solve. The real problem is to learn how the world works from video. The original approach was a generative model that predicts the next video frames. We couldn’t get this to work. Then, we discovered a bunch of methods that allow one of those joint embedding systems to learn without collapsing.
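To make the collapse problem concrete, here is a minimal sketch of one such non-contrastive recipe, in the style of BYOL – our illustration, not code from any of the papers mentioned; the encoder, sizes, and "corruption" are placeholder assumptions:

    import copy
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Hypothetical toy setup: encoder, sizes, and "corruption" are placeholders.
    online = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128))  # online encoder
    target = copy.deepcopy(online)                                 # slow-moving target encoder
    for p in target.parameters():
        p.requires_grad = False                                    # no gradient into the target
    predictor = nn.Linear(128, 128)                                # predicts the target embedding
    opt = torch.optim.SGD(list(online.parameters()) + list(predictor.parameters()), lr=0.1)

    x = torch.rand(16, 1, 32, 32)              # a batch of images
    x_corrupt = x + 0.1 * torch.randn_like(x)  # crude stand-in for masking/cropping

    pred = predictor(online(x_corrupt))        # embed the corrupted view and predict...
    with torch.no_grad():
        tgt = target(x)                        # ...the embedding of the clean view
    loss = F.mse_loss(F.normalize(pred, dim=-1), F.normalize(tgt, dim=-1))
    loss.backward()
    opt.step()

    # The target trails the online encoder as an exponential moving average;
    # this asymmetry (predictor + EMA target) is widely credited with
    # keeping the trivial constant solution out of reach.
    with torch.no_grad():
        for po, pt in zip(online.parameters(), target.parameters()):
            pt.mul_(0.99).add_(0.01 * po)

Without the predictor and the slowly-updated target, both branches could satisfy the loss by always outputting the same vector regardless of input – exactly the collapse Yann describes.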

There are a number of those methods. There’s one called BYOL from DeepMind – Bootstrap Your Own Latent. There are things like MoCo. There have been a number of contrastive methods to do this. I probably had the first paper on this in 1993, on a Siamese neural network. You train two identical neural nets to produce identical representations for things you know are semantically identical, and then push away the outputs for dissimilar things. More recently, there’s been some progress with the SimCLR paper from Google. Then, I became somewhat negative about those contrastive methods because I don’t think they scale very well. A number of non-contrastive methods appeared about four years ago. One of them is BYOL. Another one, which came from my group at FAIR, is called Barlow Twins, and there are a number of others. Then, we came up with two other ones called VICReg and I-JEPA, or Image JEPA. Another group at FAIR worked on something called DINOv2, which works amazingly well. Those are all different ways of training a joint embedding architecture with two parallel networks and predicting the representation of one from the representation of the other. DINOv2 is applied to images, VICReg is applied to images and short videos, I-JEPA to images, and now we’re working on something called V-JEPA, or Video JEPA, a version of this for video. We’ve made a lot of progress. I’m very optimistic about where we’re going.

You have long been a partisan of the double affiliation model. Would you suggest young people today consider a career with hats in academia and industry, or would your advice for this generation be a little bit different?

I wouldn’t advise young people at the beginning of their career to wear two hats of this type because you have to focus on one thing. In North America, if you go into academia, you have to focus on getting tenure. In Europe, it’s different, but you have to focus on building your group, your publications, your students, your brand, your research project. You can’t do this if you split your time.

Photo: Yann’s interview with Ralph in 2018

Once you’re more senior, then it’s a different thing. Frankly, it’s only in the last 10 years that I’ve been straddling the fence, in a situation where I’m pretty senior and can choose what I want to work on. At FAIR, we don’t take part-time researchers who are also faculty if they’re not tenured. Even among the tenured, we tend only to take people who are quite senior and well established, and sometimes only for a short time – a few years or something like that. It’s not for everyone. It depends on which way you want to have an impact and whether you like working with students. In industry, you tend to be more hands-on, whereas in a university, you generally work through students. There are pluses and minuses.

You are one of the well-known scientists in our community who does not shy away from talking to younger and less experienced people on social media, in articles, and at venues like ICCV and MICCAI. Do you also learn from these exchanges?

The main reason for doing it is to inspire young people to work on interesting things. I’ve been here at ICCV for about an hour and a half, and about 100 people came to take selfies with me. I don’t turn them down because they’re so enthusiastic. I don’t want to disappoint them. I think we should encourage enthusiasm for science and technology from young people. I find that adorable. I want to encourage it. I want to inspire people to work on technology that will improve the human condition and make progress in knowledge. That’s my goal. It’s very indirect. Sometimes, those people get inspired. Sometimes, that puts them on a good trajectory. That’s why I don’t shy away. There are a lot of exchanges about the potential benefits and risks of AI, for example. The discussions I’ve had on social media about this have allowed me to think about things I didn’t think of spontaneously and to answer questions I didn’t know people were asking themselves. It makes my argument better to have these discussions on social media and to have them in public as well. I’ve held public debates about the risks of AI with various people, including Yoshua Bengio. I think it’s useful. Those are the discussions we need to have between well-meaning, serious people. The problem with social media is that there’s a lot of noise and people who don’t know anything. I don’t think we should blame people for not knowing; I think we should blame people for being dishonest, not for not knowing things. I’m a professor. My job is to educate people. I’m not going to blame them for not knowing something!

You started in a place where you knew every single scientist in your field. Now, you are meeting thousands and cannot learn all their names. What is your message to our growing community?

A number of different messages. The first one is that there are a lot of applications of current technologies where you need to tweak an existing technique and apply it to an important problem. There’s a lot of that. Many people who attend these conferences are looking for ideas for applications they’re interested in – medicine, environmental protection, manufacturing, transportation, etc. That’s one category of people – essentially AI engineers. Then, some people are looking for new methods because we need to invent new methods to solve new problems. Here’s a long-term question. The success we’ve seen in natural language manipulation and large language models – not just generation but also understanding – is entirely due to progress in self-supervised learning. You train some giant transformer to fill in the blanks missing from a text. The special case is when the blank is just the last word. That’s how you get autoregressive LLMs. Self-supervised learning has been a complete revolution in NLP. We’ve not seen this revolution in vision yet. A lot of people are using self-supervised learning. A lot of people are experimenting with it. A lot of people are applying it to problems where there’s not that much data, so you need to pre-train on whatever data you have available, or synthetic data, and then fine-tune on whatever data you have. So, there has been some progress in imaging. I’m really happy about this because I think that’s a good thing, but the successful methods aren’t generative. The kind of methods that work in these cases aren’t the same kind of methods that work in NLP. In my opinion, the idea that you’re going to tokenize your video or learn to predict the tokens is not going anywhere. We have to develop specific techniques for images because images and video are considerably more complicated than language. Language is discrete, which makes it simple, particularly when having to handle uncertainty. Vision is very challenging. We’ve made progress. We have good techniques now that do self-supervised learning from images. The next step is video. Once we figure out a recipe to train a system to learn good representations of the world from video, we can also train it to learn predictive world models: here’s the state of the world at time T; here’s an action I’m taking; what’s going to be the state of the world at time T+1? If we have that, we can have machines that can plan, which means they can reason and figure out a sequence of actions to arrive at a goal. I call this objective-driven AI. This is, I think, the future of AI systems. Computer vision has a very important role to play there. That’s what I’m working on. My entire research is entirely focused on this!
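As a toy illustration of what planning with such a world model could look like (our sketch, not something Yann described in code; the encoder, dynamics model, and action set are invented placeholders), a system can imagine the consequence of each candidate action and pick the one that lands closest to the goal:

    import torch
    import torch.nn as nn

    # Hypothetical learned components, untrained stand-ins here:
    encoder = nn.Linear(8, 4)        # observation -> latent state
    dynamics = nn.Linear(4 + 2, 4)   # (state, action) -> predicted next state

    def plan(obs, goal_state, candidate_actions):
        """Pick the action whose imagined next state lands closest to the goal."""
        with torch.no_grad():
            s = encoder(obs)
            best_action, best_cost = None, float("inf")
            for a in candidate_actions:
                s_next = dynamics(torch.cat([s, a]))    # imagined consequence at T+1
                cost = torch.norm(s_next - goal_state)  # distance to the objective
                if cost < best_cost:
                    best_action, best_cost = a, cost
        return best_action

    obs = torch.rand(8)                           # state of the world at time T
    goal = torch.rand(4)                          # desired latent state
    actions = [torch.rand(2) for _ in range(10)]  # candidate actions
    print(plan(obs, goal, actions))

A real objective-driven system would of course optimize multi-step action sequences rather than take a single greedy step, but the loop is the same: predict, score against the objective, act.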

Anne-Marie’s Picks of the Day (Monday)

Anne-Marie Rickmann is a PhD candidate at the Ludwig Maximilian University of Munich and an affiliated researcher at the department of radiology at the Technical University of Munich, supervised by Christian Wachinger.

“My research focuses on deep learning methods for medical image segmentation. I am particularly interested in the use of 3D data, and one focus of my work during my PhD was developing cortical surface reconstruction techniques using deep learning. As a computer scientist with a unique background as a clinical nurse, I have a deep appreciation for the clinical implications inherent in our research. This appreciation drives my commitment to developing solutions that not only advance the forefront of medical image processing but also have a direct impact on improving patient care and diagnosis.”

For today, Monday 9:

Orals:
(Oral 3) LOTUS: Learning to Optimize Task-based US representations
(Oral 4) From Tissue to Sound: Model-based Sonification of Medical Imaging
(Oral 4) Detecting the Sensing Area of a Laparoscopic Probe in Minimally Invasive … – READ OUR FULL REVIEW ON THE NEXT PAGE!

Posters:
(M-01-031) Cortical analysis of heterogeneous clinical brain MRI scans for large …
(M-01-083) Neural Pre-Processing: A Learning Framework for End-to-end Brain MRI …
(M-01-103) SegmentOR: Obtaining Efficient Operating Room Semantics Through …

Learn more about Anne-Marie’s work!

Don’t miss: Madeleine is presenting her own work today: M-01-139 Vertex Correspondence in Cortical Surface Reconstruction. Don’t forget to visit her during poster session 1 – she’s great!

Detecting the Sensing Area of a Laparoscopic Probe in Minimally Invasive Cancer Surgery

Baoru Huang is a PhD candidate at the Hamlyn Centre, Imperial College London, supervised by Daniel Elson and Stamatia (Matina) Giannarou. Her work explores an innovative visualization technique that holds significant promise for advancing cancer surgery. She speaks to us ahead of her oral and poster presentations this afternoon.

Cancer remains a significant global challenge, with one diagnosis every two minutes in the UK alone. Due to a lack of reliable intraoperative visualization tools, surgeons often rely on a sense of touch or the naked eye to distinguish between cancerous and healthy tissue. Despite advances in preoperative imaging methods such as PET, CT, and MRI, pinpointing the precise location of cancerous tissue during surgery remains a formidable task. Recently, minimally invasive surgery has garnered increasing attention for its potential to minimize blood loss and shorten recovery times. However, this approach presents another unique challenge for surgeons, as they lose tactile feedback, making it even more difficult to locate cancerous tissue accurately. Lightpoint Medical Ltd. has introduced a miniaturized cancer detection probe named SENSEI. This advanced tool, the first of its kind, leverages the cancer-targeting capabilities of nuclear agents typically used in nuclear imaging. By detecting the emitted gamma signal from a radiotracer that accumulates in cancerous tissue, surgeons gain real-time insights into the location of cancer during surgery. “This probe can be inserted into the human abdomen and then grasped by a surgical tool,” Baoru tells us. “However, using this probe presents a visualization challenge because it’s non-imaging and is air-gapped from the tissue, so it’s challenging for the surgeon to locate the probe sensing area on the tissue surface. Determining the sensing area is crucial because we can have some signal potentially indicating the cancerous tissue and the affected lymph nodes.”

Geometrically, the sensing area is defined as the intersection point between the gamma probe axis and the tissue surface in 3D space, projected onto the 2D laparoscopic image. It is not trivial to determine this using traditional methods due to the lack of textural definition of tissues and of per-pixel ground truth depth data. It is also challenging to acquire the probe pose during surgery. To address this challenge, Baoru redefined the problem from locating the intersection point in 3D space to finding it in 2D. “The problem is to infer the intersection point between the probe axis and the tissue surface,” she continues. “To provide the sensing area visualization ground truth, we modified a non-functional SENSEI probe by adding a DAQ-controlled cylindrical miniaturized laser module. This laser module emitted a red beam, visible as red dots on the tissue surface, to optically show the sensing area on the laparoscopic images – which is also the intersection point of the probe axis and the tissue surface. This way, we can keep the adapted tool visually identical to the real probe by inserting a laser module inside. We made no modification to the probe shell itself.” Baoru’s solution involves a multi-faceted approach. First, she modified the probe. Then, she built a hardware platform for data collection and a software platform for the learning algorithm to facilitate the final sensing area detection results. With this setup, it is possible to find the laser spot on the tissue surface, but the red dot is too weak compared with the laparoscope light. To solve this, she used a shutter system to control the laparoscope’s illumination, closing it when the laser is turned on and opening it when the laser is turned off.

This process ensures the laser point is visible on the tissue surface despite the ambient lighting conditions. “Our network includes two branches,” she explains. “For the first branch, the images fed to the network were the ‘laser off’ stereo RGB images, but crucially, the intersection points for these images were known a priori from the paired ‘laser on’ images. Then, we use PCA – Principal Component Analysis – to extract the central axis of the probe in 2D. Then, we want to feed this information to the second branch. We sampled 50 points along this axis as an extra input dimension.” The network employed ResNet and Vision Transformer as backbones, and the principal points were learned through either a multi-layer perceptron (MLP) or a long short-term memory (LSTM) network. The features from both branches were then concatenated for regressing the intersection point, with the network trained end-to-end using a mean squared error loss. “Since it’s important to report the errors in 3D and in millimeters, we also recorded ground truth depth data for all frames, just for evaluation,” Baoru adds. “We used a custom-developed structured lighting system and the corresponding algorithms we developed. With 23 different patterns projected onto one frame, we can get the depth map for that frame. We’ve released this dataset to the community.”
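A rough sketch of this two-branch design, reconstructed from Baoru’s description rather than taken from her released code – the shapes, the ResNet-18 choice, and the MLP point encoder are our assumptions:

    import torch
    import torch.nn as nn
    import torchvision

    # Branch 1: image backbone (ResNet-18 here; the work also tries ViT).
    backbone = torchvision.models.resnet18(weights=None)
    backbone.fc = nn.Identity()                  # expose 512-d image features

    # Branch 2: encode 50 points sampled along the PCA-estimated probe axis
    # (learned via an MLP or LSTM in the paper; this is the MLP variant).
    point_encoder = nn.Sequential(nn.Flatten(), nn.Linear(50 * 2, 128), nn.ReLU())

    head = nn.Linear(512 + 128, 2)               # regress the 2D intersection point

    image = torch.rand(4, 3, 224, 224)           # 'laser off' frames
    axis_points = torch.rand(4, 50, 2)           # 50 (x, y) samples along the axis

    feat = torch.cat([backbone(image), point_encoder(axis_points)], dim=1)
    pred = head(feat)                            # predicted intersection point
    target = torch.rand(4, 2)                    # ground truth from 'laser on' frames
    loss = nn.functional.mse_loss(pred, target)  # trained end-to-end with MSE
    loss.backward()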

Overall, what makes this work truly special is its innovative use of the gamma probe to detect gamma signals and locate cancerous tissue, enhancing the accuracy of resection and diagnosis. Moreover, its ability to transform a 3D problem into a 2D one, without requiring highly accurate ground truth depth data or precise probe pose estimation, sets a new benchmark in the field. The simplified network design allows for real-time application during minimally invasive surgery, achieving an impressive inference rate of 50 frames per second. Originally from China, Baoru has been in the UK for eight years and completed her bachelor’s degree, master’s degree, and PhD there. “I really enjoy it – to be honest, I like the weather!” she laughs. Finally, we cannot let Baoru go without asking her about her experiences working with Matina Giannarou, her mentor and now second supervisor at the Hamlyn Centre, who is also a good friend of this magazine. “Matina has lots of enthusiasm for research,” she reveals. “As a female researcher, she gave me tips on balancing research and life and on grabbing every chance you can. She’s been the Winter School chair for many years. In 2020, she asked me to be a mentor, and since 2021, she’s asked me to be co-chair of the Hamlyn Winter School. She’s really encouraged me.” To learn more about Baoru’s work, visit Poster 1 this afternoon at 13:00-14:30 in the Poster Hall and Oral 4 at 14:30-16:00 in Ballroom A – Parallel Hall.

Automated CT Lung Cancer Screening Workflow using 3D Camera

Brian Teixeira is a Senior Research Scientist at Siemens Healthineers. He speaks to us about his work on automating lung cancer screening ahead of his poster this afternoon.

Lung cancer is the leading cause of cancer-related death in the United States. Early detection through screening is crucial, as the disease is most treatable during this phase. Screening typically involves a technician performing a low-dose CT (LDCT) scan to detect lung abnormalities. However, the conventional workflow involves several manual positioning steps. First, the patient must be correctly positioned on the examination table, ensuring optimal orientation. Before the actual CT scan, a scout scan – often a topogram in AP or lateral orientation, or both – is performed. This scout scan serves two crucial purposes: defining the bounds of the organ of interest (the lungs) and estimating the patient-specific dose profile, essential for maintaining patient safety. While LDCTs are obtained in a single breath-hold and do not require contrast agents, the scout scan consumes a substantial portion of the workflow time. “We propose to get rid of the scout scan by introducing a new method for estimating the position of the patient’s internal anatomy and the isocenter of the patient, to be able to position the table,” Brian begins. “Most importantly, we need to estimate the patient’s specific Water Equivalent Diameter (WED), which is a measure of the patient’s internal attenuation used to compute dose.” The proposed technique draws on a training dataset of over 60,000 patients to estimate the internal dose profile.

One of the highlights of this method is its unprecedented accuracy: it boasts a relative WED error of 4%, well within the International Electrotechnical Commission (IEC) acceptance criterion of 10%. The primary motivation behind this work is to expedite the screening process and increase its efficiency, aiming to perform more screenings within the time available and ultimately making the procedure more accessible and affordable. “This is something that’s been in the works for some time now,” Brian tells us. “Now, we’re introducing it because we’ve reached a technological point where we can present it, but I’ve been working on this project almost since I joined Siemens seven years ago. We started with all the smaller tasks, like guessing the patient’s internal anatomy. This was a brainstorming idea we had as a group, together with our colleagues from the CT department of Siemens.” While the technology addressed most of the challenges, two significant hurdles had to be overcome: integrating the vast, non-aligned, and heterogeneous CT data from different sources into the training dataset, and ensuring the model’s real-time accuracy during scans. Both challenges were effectively tackled through the incorporation of an AutoDecoder, which sidesteps the problem that convolutional networks cannot easily take different-sized images or sequences as inputs.
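For readers unfamiliar with the term, an auto-decoder (in the tradition popularized by DeepSDF) stores a learnable latent vector per patient and optimizes it directly, which is also what enables the real-time refinement Brian describes next. A schematic sketch under assumed shapes and placeholder data – ours, not Siemens code:

    import torch
    import torch.nn as nn

    # Hypothetical decoder: (patient latent, craniocaudal position) -> attenuation.
    decoder = nn.Sequential(nn.Linear(64 + 1, 128), nn.ReLU(), nn.Linear(128, 1))
    decoder.requires_grad_(False)                 # assume it was trained beforehand

    latent = torch.zeros(64, requires_grad=True)  # one latent per patient, optimized
    opt = torch.optim.Adam([latent], lr=1e-2)

    # Pretend positions z in [0, 0.3] (top of the lungs) were already scanned;
    # the measured values here are random placeholders.
    z_scanned = torch.linspace(0.0, 0.3, 20).unsqueeze(1)
    wed_measured = torch.rand(20, 1)

    for _ in range(100):                          # refine the latent as data arrives
        opt.zero_grad()
        inp = torch.cat([latent.expand(20, 64), z_scanned], dim=1)
        loss = nn.functional.mse_loss(decoder(inp), wed_measured)
        loss.backward()
        opt.step()

    # Query the not-yet-scanned region with the refined latent.
    z_future = torch.linspace(0.3, 1.0, 50).unsqueeze(1)
    with torch.no_grad():
        wed_pred = decoder(torch.cat([latent.expand(50, 64), z_future], dim=1))

Because the model is queried point by point along the axis, scans of different lengths and coverages can all supervise the same decoder, which is the property Brian highlights below.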

“In the end, we want the dose profile along the craniocaudal axis, so from head to feet,” Brian continues. “In this particular case, it’s limited to the lung region. We need all of our scans to cover the lungs, so we can align them on the lung top. Once all of these scans are aligned, whether they go from head to abdomen or cover only limited parts of the top of the lung, we can use all this data. Once this is done, we can query the points along the axis and get our attenuation. The good thing is that, because of this AutoDecoder approach and this latent vector corresponding to a patient, we can get the actual ground truth of the attenuation for the part we already scanned while the scan is happening. We can refine this latent vector in real-time to ensure we’re as close as possible to the patient.” While many people are familiar with Siemens Healthineers as a prominent manufacturer in the medical field, there’s more to it than meets the eye, particularly within the MICCAI community. The company boasts a substantial research and development department committed to pushing the boundaries of technology, pioneering state-of-the-art innovations not only in hardware but also in software development. Regarding the next steps for this work, we invite Brian to dream – what would he add or change about it if he had a magic wand? “The part we’re missing, and it’s very challenging, is that our model doesn’t work well if patients have metallic implants,” he reveals.

“Our model relies on camera data. In some inpatient settings, with trauma cases, if a patient comes into the room with many medical devices on them, the camera is heavily occluded, and the model will have trouble estimating the landmarks and the patient profile. That’s still a limitation right now. We’d gladly use your magic wand to fix it! In the meantime, we’ll just have to figure it out for ourselves.” Finally, with so many papers competing for the community’s attention this year, we ask Brian why he thinks you should take the time to visit his poster. “The question of why this work is critical and why it’s difficult is hard to convey in eight pages of a paper while covering the whole technology part,” he answers. “If people want to understand more about the challenges and why it’s so important to reduce the workflow time for this approach, they should come along and discuss it with me at the poster. I will happily share my experiences!” To learn more about Brian’s work, visit Poster 2 this afternoon at 16:00-17:30 in the Poster Hall.

MICCAI Daily
Publisher: RSIP Vision
Editor: Ralph Anzarouth
Copyright: RSIP Vision. All rights reserved. Unauthorized reproduction is strictly forbidden. Our editorial choices are fully independent from MICCAI, the MICCAI Society and the MICCAI 2023 organizers.

Anatomy-Driven Pathology Detection on Chest X-rays

Philip Müller is a PhD student at Daniel Rueckert’s Lab for AI in Medicine at TU Munich. His work proposes a pathology detection and localization method for chest X-rays. He speaks to us ahead of his poster this afternoon.

Chest X-rays are a common diagnostic tool in the medical field, allowing healthcare professionals to detect various pathologies within the chest area. However, merely identifying the presence of a condition like pneumonia is not always sufficient; localization of abnormalities is crucial for accurate diagnosis and treatment planning. “The main problem is that there’s very little data available for those localizations,” Philip begins. “There are very few datasets, and they need to be hand-labelled, which is challenging and time-consuming. You typically approach that by using weakly supervised object detection methods, which means you use classification labels to train an object detector. However, this isn’t optimal because it’s hard to localize based on classification labels alone. It doesn’t work well without bounding boxes, but bounding boxes for pathologies are expensive. You need a well-trained radiologist to spot the pathologies correctly.” Why do we need to improve localization in chest X-rays? The answer is twofold. First, from a research perspective, chest X-ray data is abundant and readily accessible, making it an attractive choice for developing and testing algorithms. Second, in a clinical context, chest X-rays are cost-effective and widely available in most healthcare facilities. A rapid detection algorithm that provides preliminary insights or assists radiologists and other medical professionals could be invaluable, particularly in emergencies.

Among the various methods explored to improve localization in chest X-rays, some have shown promise, while others have fallen short of expectations. “Methods that typically work well on natural images don’t work at all on chest X-rays because they’re based on unsupervised bounding box proposals, which often use some form of edge detection that is not well suited to box proposals for pathologies,” Philip tells us. “Other methods that work on chest X-rays are CAM-based models – class activation mapping – where you train a classifier and then look at which patches drive its classification. They work to some degree but still don’t work well.” One approach Philip discovered holds promise: using anatomic regions as bounding boxes. Anatomic regions are specific areas within the chest, such as regions in the lungs or the cardiac silhouette. Annotating them is a simpler and less costly task, making it feasible for medical students to contribute. The speed at which object detection models like Faster R-CNN can be trained with just a few hundred samples makes annotating anatomic regions even more appealing. By combining classification labels extracted automatically from radiology reports with some annotated anatomic regions, researchers can detect the anatomic regions for a vast dataset easily and cheaply. “I tried two different things with this work,” Philip reveals. “The first one is using image-level classification labels and those anatomic regions, which already showed promising results and improved on those weakly supervised methods.

But the results get much better when it’s known which anatomic regions the classified pathologies belong to. While this seems like a lot of work, it actually isn’t, because when you have the radiology reports, you can automatically extract where those belong, in some semi-automated way with some rules.” He is keen to point out that the groundwork for this approach, including the use of anatomic regions and information extraction from radiology reports, has been laid by others. The Chest ImaGenome dataset, derived from the MIMIC-CXR dataset, serves as a valuable foundation for this innovative work. Regarding the next steps: while the current methodology relies on exact anatomical regions for localization, there are inherent limitations in spotting small pathologies. “While with weighted box fusion there’s a trick that allows it to merge different bounding boxes and be a bit more precise on the localization of the pathology, there’s no way it can spot small pathologies that cover only a very small part of the anatomic region,” Philip explains. “It’s just impossible by the design of the method! It might not be too problematic in some cases, because a rough localization is still useful, but it’s a restriction.” To address this limitation, he envisions a more integrated approach, aiming to combine all the pipeline components more effectively. The process involves multiple steps, from anatomical region detection to exact anatomical region localization and pathology classification. He proposes directly using text embeddings of classes and anatomic regions as DETR (DEtection TRansformer) tokens.
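A schematic mock-up of that query design – our illustration under placeholder shapes, not Philip’s implementation: text embeddings of a region name and a pathology name act as DETR decoder queries, one supervised with boxes, the other only with classification labels:

    import torch
    import torch.nn as nn

    d = 256
    # Placeholder "text embeddings" of a region and a pathology name; in
    # practice these would come from a pre-trained language model.
    region_query = torch.rand(1, 1, d)       # e.g. a lung zone
    pathology_query = torch.rand(1, 1, d)    # e.g. pneumonia
    queries = torch.cat([region_query, pathology_query], dim=1)

    layer = nn.TransformerDecoderLayer(d_model=d, nhead=8, batch_first=True)
    decoder = nn.TransformerDecoder(layer, num_layers=2)

    image_tokens = torch.rand(1, 196, d)     # features from an image backbone
    out = decoder(queries, image_tokens)     # one output embedding per query

    box_head = nn.Linear(d, 4)               # the region query gets box supervision
    cls_head = nn.Linear(d, 1)               # the pathology query only gets a label

    region_box = box_head(out[:, 0])         # train against annotated regions
    pathology_logit = cls_head(out[:, 1])    # train against report-derived labels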

“I embed the name of the anatomic region using some pre-trained language model,” he explains. “Then I go: okay, please locate the anatomic region. Here, I have supervision for it, so I train it. On the other hand, here’s my pathology. Please locate it. I don’t have localization or bounding box supervision, but I have classification supervision, so I classify this feature. That’s roughly the way I want to go. I want to integrate everything a bit more.” In his regular work, Philip works with images and text, combining them in different ways. His focuses include multimodal learning, NLP, and weakly supervised object detection. He has published some contrastive learning works and still uses some of those ideas. Currently, he is already busy with the extension to this paper. Will we see him presenting it at MICCAI next year? “Yeah, maybe,” he responds modestly. “We’ll see! I hope it works well.” Sadly, Philip could not make it to MICCAI this year, so this paper is being presented by its second author, Felix Meissen. To learn more about Philip’s work, visit Poster 2 this afternoon at 16:00-17:30 in the Poster Hall.

Did you read Yann LeCun’s interview on page 2? It is fascinating ☺

Make your Results Reproducible with the Virtual Imaging Platform (VIP)

Sorina Pop (center) is a Research Engineer at CNRS, the French National Centre for Scientific Research, working at the CREATIS laboratory on medical imaging. Her tutorial on computational reproducibility takes place on Thursday morning, and she is here to tell us more about it.

Reproducibility ensures that the results we obtain today can be reliably reproduced tomorrow. There has been growing awareness of reproducibility concerns across various scientific domains in recent years. In medical imaging, the increasing complexity of software and research pipelines has affected the ability to replicate scientific findings consistently across different timeframes and research teams. This issue is particularly important to Sorina because she manages the Virtual Imaging Platform (VIP), a web portal that allows researchers worldwide to execute medical imaging applications as a service. Originally from Romania, Sorina came to France at 19 years old to study after her A-levels and settled there. She now holds dual Romanian and French nationality. A few years ago, becoming aware of sources of non-reproducibility and of how important reproducibility is for science in general, she thought: what if the results produced with VIP are not always reproducible? How can we help our users improve the reproducibility of their results? Driven by these questions, with colleagues from CREATIS, IPHC in Strasbourg, and Concordia University in Canada, she embarked on a project called ReproVIP, which secured funding from the French National Research Agency. “Together, our initial aim was evaluating and improving the reproducibility of results obtained with VIP,” she tells us. “Gaining experience from that, we thought it would be nice to disseminate our findings and help researchers beyond VIP benefit and understand what the main challenges and pitfalls are, why we have reproducibility issues, how to evaluate, and how to improve the reproducibility of our results. That’s why we submitted an abstract for this tutorial at MICCAI, and we’re extremely happy it was accepted.”

Reproducibility is an essential aspect of research. It goes beyond simply achieving reproducible results; it involves understanding the risks, improving research practices, and building trust in scientific outcomes. In some ways, it goes hand in hand with explainability: if works are explainable and transparent, they tend also to be reproducible. However, this is not always the case, and reproducibility is an umbrella term with many aspects behind it.

Figure: Differences in tumor segmentation outputs obtained with two different versions of the Brain Tumor Segmentation (BraTS) pipeline on the same input image, as presented in [desligneris2023].

“This tutorial tackles one small aspect of reproducibility: computational reproducibility,” Sorina reveals. “You can have explainability and still not get computational reproducibility, because lots of complex things happen at the computational level, and many younger researchers aren’t aware of that. If you have explainability, you’ll say: I obtained these results using this piece of software with this input data. You have the software, all the parameters, and all the data you need, but is that enough to have bitwise reproducibility? Are you able to obtain exactly the same results at the end?”
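To see how fragile bitwise reproducibility is even when the software, parameters, and data are all fixed, consider a small illustration of our own (not from the tutorial): the order in which floating-point numbers are summed – something parallel execution changes implicitly – can alter the result at the bit level:

    import random

    random.seed(0)
    values = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

    s1 = sum(values)             # one summation order
    s2 = sum(reversed(values))   # the same numbers, reversed order

    print(s1 == s2)              # typically False: floating-point addition is
    print(s1 - s2)               # not associative, so the last bits differ

Multiply this effect across GPU kernels, library versions, and scheduling differences in a distributed pipeline, and two runs of "the same" experiment can legitimately disagree.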

In this tutorial, she hopes to take this a step further by saying that it depends on the degree of explainability. One of the solutions for computational reproducibility is a system called Guix, which allows perfect explainability of the software you are using. However, Sorina says it is a complex technical solution, and not all researchers are ready for it. The tutorial aims to cater to a broad audience, beginning with an introduction to the fundamentals of reproducibility. “We can’t cover everything, but we’ll try to be methodological and explain which aspects of reproducibility we’ll tackle,” she clarifies. “Then we’ll go into more depth and be hands-on. The participants must bring their laptops, but we’ll give them all the necessary tools. By the end of the tutorial, we hope they’ll have achieved a minimum level of knowledge about the theoretical questions on reproducibility and about the tools and best practices they can adopt to improve their own practice.” Although MICCAI is a hybrid event this year, including many of its satellite events, the hands-on elements of this tutorial will only be available to in-person delegates, and Sorina is keen to point out that physical attendance is highly preferable. The tools and best practice examples on offer will have broad applicability. “We’ll be happy to get feedback after this tutorial to see what the participants thought,” she adds. “If there’s demand, we could bring it to other communities, but I haven’t thought about that yet!”

Figure: Sources of Variability in Computer-Based Research.

The Make your Results Reproducible with the Virtual Imaging Platform tutorial is on Thursday morning in the Satellite Rooms at Level 1.

Educational Challenge Finalists

The MICCAI Educational Initiative is a project organized by the MSB (MICCAI Student Board) in which we collect materials that can be helpful to new medical imaging students. All submissions go through peer review, for which we receive the support of a reviewer pool made up of MICCAI members, and the finalists are presented and voted on at the yearly conference. This year, voting will take place during our lunch event on October 9th – don’t miss it! We also welcome you to join our other events, which include the Academia & Industry event, the MICCAI soccer game, morning runs, and two cultural walking tours!

Women in Science: Sandy Engelhardt

“… Persistence is, I think, one of the strongest properties you need to bring…”

Sandy Engelhardt is an Assistant Professor at Heidelberg University Hospital working in the field of MICCAI.

Sandy, what is your work about?

My group is based in a hospital, and we are very much focused on translational and applicational aspects. One of our major application fields is cardiovascular disease. The group I am leading is called Artificial Intelligence in Cardiovascular Medicine. We deal with different applications in this area, ranging across medical image processing – for example, for cardiac MRI and CT – and, of course, with convolutional neural networks and AI. Also, my background initially started with computer-assisted surgery: I built an assistance system for cardiac surgery during my PhD.

Did you choose the cardiology field, or did it come to you somehow?

Somehow, it came to me, but I am very happy to work in this area [she laughs] because I think it is underexplored. In MICCAI, I see an overemphasis on cancer applications, but cardiovascular diseases are actually the leading cause of mortality and morbidity in Western countries, so it is very important that we deal with them in a much deeper way. We still do not know much about certain kinds of diseases in this field and how to treat them in a good way. Therefore, it is a privilege to work in this area. I have incredibly nice colleagues from cardiology and cardiac surgery, so we are very much based in a nice environment.

What has been your biggest satisfaction to date with your work in this area?

Oh, whoa, my biggest satisfaction in this area? That is really hard to choose because we are involved in many different projects.

Okay, tell me one!

I don’t want to choose one. I am very privileged that I can lead a group of people and that I am in this position. My group is around 20 people, and I completely enjoy working with young people from different disciplines and trying to guide them on their way to success. We are dealing with several applications and topics. One topic, for example, which we try to follow very rigorously, is building up an infrastructure between different hospitals in Germany where we can train deep neural networks in a federated sense, with federated learning. We put much emphasis on this being a long-lasting infrastructure, meaning that it can be used for many different tasks. This infrastructure connects the leading cardiology departments, hospitals, and heart centers in Germany. It is really nice to see that we got it up and running and that we work with many different people from many different hospitals in Germany.

I have a really nice PhD student who works in this area too.

They are very lucky to have you. How does a young, talented woman become a manager of a group of 20 people? How did that happen?

Honestly, I don’t know. You grow into that role. At some point, you realize your group is that big! [Sandy laughs] I would not call it a miracle, but sometimes it feels like that to me. I find it really cool to lead a team, instill a team spirit, and support young people. The biggest joy I have is seeing how much they grow, how much they learn, and how eager they are to learn.

Did you know from the start that your main task would be to instill that in the group, or is it something you learned on the job?

Working in different groups myself, I realized how important it is. You can do science on an individual level, somehow – maybe have your own topic and not deal with your colleagues much – but I think that is not really enjoyable. Getting support from your colleagues during a PhD is a very important thing to have. Otherwise, you perhaps will not succeed.

Even more than getting support from your supervisors?

No, I think it is equally important. There are no hierarchies in our group. I try to be as available as possible to the members of my group, and what I have realized is that the most important thing is communication. When you are a team leader, of course, you have to make the decisions at some point. I try to make them democratic in some sense, asking: what is your opinion on that? But in the end, you are the one making the decision.

You bear the responsibility for the result.

I have the responsibility, exactly! Sometimes, you just do not think of all the consequences. Individual people in your group can be set back because you made a decision. In that case, I want them to come to me and talk to me about it if it is really something they do not feel good about. I am happy to say that this really works.

The people in my group really have trust, and they come and discuss things with me. That proves to me that there is a kind of trustful relationship in our group.

Of all the scientists you have met during your formative years, which one do you feel you learned the most from?

I think it is super hard to give names, but I always most enjoy the people who do brilliant science but also have great personalities. People you can really look up to. Certainly, we have brilliant people in our community, but sometimes – yeah, I don’t want to finish that sentence! [she laughs] Sometimes you see some setbacks in their personality. I must say, I admire those people who have the same ideas about creating a nice group but are also, from a scientific point of view, very brilliant.

Can you give us one learning that you think should be taught in every school? In my field, I have one, for instance, but maybe you do in your field, too.

What is yours?

I did an MBA, and there is one video that I think every single MBA student should see when starting their master’s. It is a video with Christine Bourron about the ups and downs of the life of an entrepreneur. I love that video. It is so inspiring. Thank you for asking me about it, but I am sure that our readers are more eager to learn about your favorite learning.

If someone new comes into our group, like a new PhD student, I usually try to arrange some time to sit with them and tell them how our group works, what our philosophy is, and so on. What I say to these people is that it is often wrong to go for quick successes. You have to be persistent to really finish something. Often, you will have setbacks and things that will not work out. Successful people don’t stop there. You always have to be flexible in your mind to adjust the original plan based on the experience you have had, and then, eventually, you reach a result you had not expected before – but you reach a result, and maybe it is even better than what you thought in the beginning.

But the most important thing is persistence. If you stop along the way, and if you become very negative about the work, it will not work out well. Persistence is, I think, one of the strongest properties you need to bring.

Knowing you, I am not surprised that the most important thing you talk about is the people side rather than the scientific side. I would like to learn about your future. Where do you hope to go with what you are doing now?

Working in a hospital, we see patients around us every day. For example, our offices are on level two, and all the patients are on the other levels, but nevertheless, when we go to our offices, we see them. I really look forward to the day when we can apply the deep neural networks and the algorithms that we develop for a more detailed diagnosis or for treatment recommendation. If this is really applied in the treatment process, that is something I look forward to. I did a little bit of this in my PhD: the assistance system that we developed for cardiac surgery was applied in patients. It is so rewarding to see it. I think with all the new developments in the field of deep learning, we have the chance to apply them and actually see the benefit they can bring. I think these are the next major steps that are needed in our field.

For someone as sensitive as you are to other people, is it tough sometimes to work around human suffering?

The patient is quite abstract if you just see the medical image. It is not as if you know something about the patient or their relatives. Of course, when you go to the OR and see a major surgery happening to the patient, sometimes it is on your mind – oh, there is a family behind this – but you should not think like that.

I know it sounds crude, but it is more like an object, and you try to improve the situation somehow.

So, you would encourage even the most sensitive of young scholars not to be afraid of getting into the medical field? There are still so many problems that need to be solved.

Yeah! [she laughs]

Your word for the community.

Especially at MICCAI, I think we have the responsibility to bring everything that is developed in AI and computer vision closer to the patient. I don’t see this done in many research groups. Often, evaluation is not done in a rigorous way. People do simple dataset splits, evaluate the algorithm on that split, and say: okay, it has an accuracy of X and Y, and it works in this setting. But often, in the real world, it does not really work. That is still a major problem. When I am at MICCAI, I see really exciting work, and you go through the posters, and often you see that validation was not conducted in the right way or that the test set is so limited that you cannot really say anything about the applicability of the method. I think it is sad that we are still at this point, because we want to have an impact with our work, right? We really must work towards increasing the performance for clinicians.

Read 100 FASCINATING interviews with Women in Science!

ASMUS: Above, the awesome Sarina Thomas from the University of Oslo presented her joint work with GE Healthcare on improving view recognition by incorporating graph neural networks and diffusion models.

MLMI: Below, Nassir Navab at the MLMI workshop.

Do you enjoy MICCAI 2023? Do you enjoy reading our dailies? MICCAI 2023 does not end here! Like every year, Computer Vision News will publish the BEST OF MICCAI in its November issue. Yes, in just three weeks!

GET THE BEST OF MICCAI! Subscribe for free and get the BEST OF MICCAI in your mailbox.
