Reproducible Research Insights with Antonio Tristán-Vega and Santiago Aja-Fernández

March 18, 2022


By Mathieu Boudreau

Our first interviewees of the year, Antonio Tristán-Vega and Santiago Aja-Fernández, are researchers at the University of Valladolid in Spain. We chose their paper “Accurate free-water estimation in white matter from fast diffusion MRI acquisitions using the spherical means technique” because, in it, the authors demonstrated exemplary reproducible research practices; in particular, in the context of a larger software tool that they developed and manage (dMRI-Lab), they shared tools developed as part of their research. To learn more about the work done by Antonio and Santiago, check out our recent interview with them.

To discuss this Q&A, please visit our Discourse forum.

General questions

1. Why did you choose to share your code/data?

We feel that code sharing is a nice way to increase the visibility of our work. Traditionally, our research tends to focus on mathematical modeling of images, which often leads to complex conceptualizations/mathematical developments that end users cannot afford to implement themselves. Publishing our implementations gives our research outcomes the chance to be adopted (or at least tested) by a wider audience.

2. What is your lab or institutional policy on code/data sharing?

With regard to diffusion MRI processing, which is one of the main research lines developed at our lab, we are committed to integrating all the methods we develop into a continuously evolving MATLAB toolbox, which we keep publicly available.

3. At what stage did you decide to share your code/data? Is there anything you wish you had known or done sooner?

Antonio Tristán-Vega (left) and Santiago Aja-Fernández (right)

The answer to this question is related to our reply to the first one. We realized that, quite often, researchers (especially those more interested in clinical applications of dMRI than in dMRI processing itself) adopt standard pipelines based more on the availability of handy software than on the actual suitability of the given methods for their concrete problems. Although we had been in the habit, for years, of publishing small pieces of software that implemented the methods of a particular paper or conference communication, it was not until two years ago that we decided to put together all the software we had contributed over the past decade and publish a complete toolbox. We are aware that it is neither the only nor the best existing software for dMRI processing; however, it has the advantage of making all our research outcomes easily available to any interested researcher, without requiring them to be experts in either dMRI processing or software engineering.

4. How do you think we might encourage researchers in the MRI community to contribute more open-source code along with their research papers?

We think this is really a question for academic institutions. In our country, being a regular contributor to or maintainer of a given piece of software, even if it is used by hundreds of researchers worldwide, adds very little value to your CV, which is evaluated mostly in terms of your number of publications in JCR journals or citations received. Coding (and, above all, maintaining) software is really time-consuming, and it is not properly recognized. Consequently, researchers are not motivated to invest time in this very important activity.

Questions about the specific reproducible research habit

1. How does dMRI-Lab fit into the larger landscape of diffusion MRI software tools?

We do not claim that it can substitute or compete with any other software solution. Our aim, with the methods we have developed, is to complement these other tools, so that interested researchers can try and test different processing pipelines and choose the most appropriate one for their needs, without being conditioned by software availability.

Screenshot of one of the demos provided with dMRI-Lab.

2. What questions did you ask yourselves while you were developing the code that would eventually be shared?

The most recurrent question was whether we should develop easy-to-use code aimed at end users, fixing most of the design degrees of freedom and algorithm parameters to default values, or instead keep the code as flexible as possible, so that other researchers working on image processing might further develop and improve on our methods. In other words, what we wondered about most was who the target users of our software would be.

3. What advice do you have for people that want to develop and maintain a long-term software project like dMRI-Lab?

Document the code and take the time to write detailed help on your functions/modules/methods/commands. Do not simply direct the user to your paper in order to find out how to use the software.
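As an illustration of this advice, here is a minimal sketch of a function with self-contained help text. It is written in Python only for illustration (dMRI-Lab itself is a MATLAB toolbox), and the function `shell_means`, its parameters, and the default `step` value are hypothetical: they are not taken from dMRI-Lab or from the paper.

```python
from collections import defaultdict
from statistics import fmean


def shell_means(signals, bvals, step=100.0):
    """Average the diffusion signal of one voxel within each b-value shell.

    Hypothetical helper for illustration only; not part of dMRI-Lab.

    Parameters
    ----------
    signals : sequence of float
        One signal value per acquired diffusion measurement.
    bvals : sequence of float
        b-value (s/mm^2) of each measurement, same length as `signals`.
    step : float, optional
        Each b-value is rounded to the nearest multiple of `step`, so that
        slightly different nominal b-values fall into the same shell.

    Returns
    -------
    dict
        Maps each rounded b-value to the mean signal of that shell.

    Raises
    ------
    ValueError
        If `signals` and `bvals` have different lengths.
    """
    if len(signals) != len(bvals):
        raise ValueError("signals and bvals must have the same length")

    # Group the measurements by their rounded b-value.
    groups = defaultdict(list)
    for signal, bval in zip(signals, bvals):
        groups[round(bval / step) * step].append(signal)

    # Average each group, reporting the shells in increasing b-value order.
    return {bval: fmean(values) for bval, values in sorted(groups.items())}
```

With help text like this, calling `help(shell_means)` tells users what the inputs, outputs, and defaults are without sending them to the paper.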

4. Are there any other reproducible research habits that you didn’t use for this paper but might be interested in trying in the future?

Yes, sharing scripts that can exactly reproduce the results included in the paper; however, this is not always possible due to the use of data that cannot be shared.