Q&A with Santiago Estrada and Martin Reuter


By Emilie McKinnon

The DeepMI lab. Top (left to right): Prof. Martin Reuter, Leonie Henschel, Emad Bahrami-Rad, Kersten Diers, Christian Ewert. Bottom (left to right): Saikat Roy, Dr. David Kugler, Santiago Estrada, Clemens Pollak.

June’s MRM Highlights Pick interview is with Santiago Estrada and Martin Reuter, researchers at the German Center for Neurodegenerative Diseases in Bonn, Germany, and the A.A. Martinos Center for Biomedical Imaging in Boston. Their paper is entitled “FatSegNet: a fully automated deep learning pipeline for adipose tissue segmentation on abdominal Dixon MRI”. This work demonstrates exemplary reproducible research practices: not only do the authors share the code related to their work, but they also distribute Dockerfiles to provide a reproducible and shareable computing environment for their code (i.e. a Docker container).

MRMH: Could you tell us a little about the journey that led you to MRI and how you got interested in this project?

Santiago: I study biomedical computing and this project was an interesting topic for my master’s thesis. Visceral and subcutaneous adipose tissue have been related to metabolic disorders and could be potential biomarkers for neurodegenerative diseases. Therefore, the segmentation of these tissues is of great interest to the Rhineland Study, an ongoing population-based prospective cohort study (https://www.rheinland-studie.de/). However, most abdominal adipose segmentation methods to date have been hard to generalize to a large population due to the variability of the abdominal cavity. We therefore set out to create a method that was more robust.

Martin: Our lab’s core research agenda is artificial intelligence (AI) method development for structural MRI with a focus on neuroimaging. This is the first time we have ventured into body MRI.

MRMH: What are the main take-home points of your paper?

Santiago: We have developed a fully automated pipeline for the reliable and accurate quantification of different fat tissue volumes from Dixon MRI. By using our 2D-Competitive Dense Fully Convolutional Networks (CDFNet), we have introduced an element of competition among features in order to improve selectivity. Most convolutional neural network methods calculate lots of features, and in those cases the networks can get really computationally heavy. Another important contribution of our work is that we have validated the method really well with respect to accuracy, reliability, and sensitivity to known effects on a big unseen dataset.

Martin: That’s right! There are two main points: one is that it’s a complete pipeline. It not only takes an image and segments it, but also finds the region of interest consistently across subjects. Second, as Santiago said, is the validation aspect. It’s really important that these methods are validated thoroughly in order to perform well in real-world applications.

MRMH: Can you explain, in simple terms, the added value of the CDFNet? 

Santiago: Basically the CDFNet is the core of the whole pipeline. It introduces local competition which lets us reduce the number of parameters and reduce the size of the network. A smaller network size requires less memory and is less prone to overfitting. 

Martin: Dense blocks, which introduce something like shortcuts, were invented before. The traditional approach, however, concatenates the features, so every time you add a shortcut the network grows larger. The effect of our local competition is that only the strongest signals can pass through. The result is a leaner, slimmer, and more robust architecture that outperforms the previous methods.
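The contrast between concatenation and competition can be illustrated with a toy example. This is a minimal sketch, not the FatSegNet code: it assumes that "local competition" amounts to an element-wise maximum (a maxout-style operation) over the feature maps that a dense block would otherwise concatenate. The function names and array shapes are hypothetical.

```python
import numpy as np


def concat_shortcut(previous, current):
    """Dense-style shortcut: stack feature maps along the channel axis."""
    return np.concatenate([previous, current], axis=0)


def competitive_shortcut(previous, current):
    """Competitive shortcut (assumed maxout): keep the element-wise maximum."""
    return np.maximum(previous, current)


rng = np.random.default_rng(0)
# Two hypothetical feature maps with shape (channels, height, width).
earlier = rng.normal(size=(8, 16, 16))
later = rng.normal(size=(8, 16, 16))

dense_out = concat_shortcut(earlier, later)
competitive_out = competitive_shortcut(earlier, later)

print(dense_out.shape)        # (16, 16, 16): channels grow with every shortcut
print(competitive_out.shape)  # (8, 16, 16): channel count stays constant
```

Because the competitive output has the same number of channels as each input, the layer that follows needs fewer weights than it would after concatenation, which is one way to read the "leaner, slimmer" description above.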

MRMH: Your method introduces a novel 2.5D approach to segmentation. What is the added benefit of this and how common are discrepancies between the segmentations in the different slice directions? 

Santiago: We adopted this approach because our images are not isotropic. We have high resolution in the axial plane, but you can see that there is important information in all directions. We have segmentation probabilities for each direction and normally people would just average these, but we decided to use a network to aggregate the views. 

Martin: Santiago explained that perfectly. The problem with the different resolutions is that you do not know how much you can trust each view. So, one new aspect we introduced is that we let the network learn which direction to trust more, and this can vary spatially. Our advanced 2.5D network architectures have been shown to outperform traditional 2D and also 3D architectures, and seem to represent a good compromise between the large memory requirement of full 3D versus the low spatial context of 2D or 3D patch-based approaches. Sometimes the views do disagree, but these might be regions of uncertainty, and it could be helpful to output this as a marker of quality.
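The difference between averaging the views and weighting them per voxel can be sketched as follows. This is an illustration under stated assumptions, not the authors' aggregation network: the per-voxel `weight_logits` array is a hypothetical stand-in for whatever the trained network has learned, and all names and shapes are invented for the example.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def average_views(probs):
    """Plain averaging. probs: (views, classes, D, H, W)."""
    return probs.mean(axis=0)


def weighted_views(probs, weight_logits):
    """Spatially varying fusion. weight_logits: (views, D, H, W)."""
    weights = softmax(weight_logits, axis=0)       # per-voxel trust in each view
    return (weights[:, None] * probs).sum(axis=0)  # (classes, D, H, W)


rng = np.random.default_rng(0)
views, classes, shape = 3, 4, (5, 6, 7)
# Hypothetical class probabilities from axial, coronal and sagittal models.
probs = softmax(rng.normal(size=(views, classes, *shape)), axis=1)
# Stand-in for learned values; a real pipeline would predict these.
weight_logits = rng.normal(size=(views, *shape))

avg = average_views(probs)
fused = weighted_views(probs, weight_logits)
labels = fused.argmax(axis=0)

# Voxels where at least one view disagrees with the fused label.
disagreement = (probs.argmax(axis=1) != labels).any(axis=0)
```

With all weights equal, the weighted fusion reduces to the plain average. The `disagreement` mask is one simple way to expose the regions of uncertainty mentioned above as a quality marker.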

MRMH: FatSegNet is optimized for the Rhineland Study, but could it be used for other abdominal MRI sequences? 

Santiago: We have found that if you have Dixon images, the tool is able to generalize but might need some fine-tuning. Alternatively, people could include the Rhineland Study sequence in their protocol, since it is only 40 seconds long, and our method is expected to work directly on those images.

Martin: Certainly, it would be best to include the Rhineland sequence. If you have already acquired data, you might need to fine-tune the network. We used 33 subjects (1700 slices in axial view) to train the deep learning models. For fine-tuning, a lower number is probably sufficient, but if the resolution changes, you would need to re-train the system from scratch. 

MRMH: Talking about the number of training subjects, how did you settle on 33? 

Santiago: We started with 20 cases and saw that it was working but the results were not generalizable to all cases, and so we asked our collaborators to label some more data. 

Martin: Yes, it’s always difficult to know how good you really are. We took the cases where the network didn’t work well, had them manually segmented, and added them to the training set. That way you can select the training cases to span a wide variety of body shapes. And, of course, final validation needs to be performed on a variety of cases that were not used in training. 

MRMH: This was your lab’s first non-neuroimaging project. How did you find the experience of venturing into the abdomen?

Santiago: Actually, it is easier to know what you are looking at! I am not a medical person, yet even I can see that a liver looks like a liver, and a kidney looks like a kidney, etc. 

Martin: In the abdomen, you have such high variability. Brains are always more or less the same and there is not a lot of movement; in that sense I think neuroimaging is easier. But Santiago is right, most people are probably more familiar with the appearance of the structures in the abdomen. 

MRMH: What have you been up to since this paper was written?

Santiago: I am now doing my PhD at the intersection between Martin’s group and population science (Prof. Breteler) on deep learning in large cohort studies. Currently, I am working on olfactory bulb segmentation. This is a really tiny structure that’s often overlooked and could play an important role in characterizing early stages of dementia.

Martin: One other project in my lab has been the development of a full-brain segmentation pipeline, called FastSurfer, that is capable of segmenting a brain MRI into close to 100 structures in under 1 minute with two orders of magnitude speedup. The core of this network is similar to the network introduced in this paper. More on our research can be found on our webpage https://deep-mi.org.

MRMH: Do you have anything else you would like to share with the MRI community?

Santiago: Deep learning has some clear advantages; for example, if you have a random error in your ground truth, the network will not learn it and random mistakes will be removed from the data. 

Martin: I think there are three points I’d like to share: First, deep learning is now overtaking traditional approaches, sometimes even with respect to generalizability, which is surprising. Second, very large manual training data sets are not always necessary. Finally, manual ground truth can be wrong. So don’t always blindly trust ground truth!