Why should you share code?

Mike Croucher on 26 Jun 2025
Latest activity Reply by Adam on 23 Jul 2025

In a discussion on LinkedIn about my recent blog post, Do these 3 things to increase the reach of your open source MATLAB toolbox, I was asked: "Could you elaborate on why someone might consider opening/sharing their code? Thinking of early-career researchers, what might be in it for them?"
I'll give my answer here but I'm more interested in yours. How would you have answered this?
This is what I said:
  • It's the right thing to do scientifically. A computational paper is essentially just an advertisement of what you've done. The code contains vital details about how you actually did it. A computational paper is incomplete without the code.
  • If you only describe your algorithm in a paper, I have to implement it before I can apply your research to my problem. If you share the code, I can get started much more quickly using your research. This means I publish faster and since I am a good scientist, this means you get cited faster.
  • Other scientists start off as users of your code. This leads to citations. Over time, some of them start using and modifying your code more deeply, and this leads to collaborators.
  • Once you decide to share code via something like GitHub, you quickly start adopting good software engineering practices without initially realizing it. This improves the quality of your research since adopting good software practices makes it more likely that your software will give the right answers.
That last point can be a little hard to get your head around sometimes. Even if all you do is use file upload to get your stuff onto GitHub (i.e. you're not using git properly yet) you will start to naturally converge towards better code.
Why? Because as soon as you share code, you have to solve the problem of getting it to run on someone else's machine.
A trivial example concerns hard-coded paths. If you only ever run the code on your machine, then a line like datafile = "C:\Mystuff\data.csv" always works, but it breaks as soon as I try to run it on mine. You'll look at this and think "Maybe there's a better way to do that".
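One possible "better way", as a minimal sketch rather than the only answer: build the path relative to the file that is running instead of relative to one particular machine. The "data" subfolder below is a hypothetical layout chosen for illustration; only data.csv comes from the example above.

```matlab
% Minimal sketch: locate data.csv relative to this file, not to one machine.
% The 'data' subfolder is a hypothetical layout, not something from the post.
thisDir = fileparts(mfilename('fullpath'));  % folder of the running script or function
if isempty(thisDir)
    thisDir = pwd;  % mfilename is empty when called from the command line
end
datafile = fullfile(thisDir, 'data', 'data.csv');
```

Because fullfile inserts the platform's own file separator, the same line should work on Windows, macOS and Linux, provided the data file is shipped alongside the code.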
Similarly for dependencies. Your MATLAB path may be full of stuff that isn't present on my machine. As soon as I try to run your code, it won't work and you'll have to figure out how to handle dependencies in a reproducible way.
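A first step towards handling dependencies is simply finding out what they are. The sketch below uses matlab.codetools.requiredFilesAndProducts to list the user files and MathWorks products that an entry-point file needs. The file name my_analysis.m is hypothetical; substitute your own entry point.

```matlab
% Minimal sketch: report what an entry-point file depends on.
% 'my_analysis.m' is a hypothetical file name; replace it with your own.
entryPoint = 'my_analysis.m';
if exist(entryPoint, 'file') == 2
    [fileList, productList] = matlab.codetools.requiredFilesAndProducts(entryPoint);
    disp('Files that must ship with the code:')
    disp(fileList(:))
    disp('MathWorks products the code needs:')
    disp({productList.Name}')
else
    warning('Entry point %s was not found on the path.', entryPoint);
end
```

The file list tells you what has to go into the repository, and the product list is something you can copy into the README so that I know which toolboxes I need before I start.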
Documentation! An empty README.md is no good if you expect me to know how to use your code. You at least have to say something like "To run this, type runme(N) into MATLAB, where N is the size of the model", and so on.
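The same one-line instruction can also live in the code itself, so that typing help runme shows it. What follows is only a sketch of what such a header might look like: runme and N come from the sentence above, while the validation rules and the body are placeholders assumed for illustration.

```matlab
function runme(N)
%RUNME Run the model (hypothetical entry point, for illustration only).
%   RUNME(N) runs the model, where N is the size of the model.
%   N is assumed here to be a positive integer.
%
%   Example:
%       runme(100)
    arguments
        N (1,1) double {mustBeInteger, mustBePositive}
    end
    % Placeholder for the real work.
    fprintf('Running model of size %d\n', N);
end
```

With the arguments block (available from R2019b), a call such as runme(-1) fails immediately with a readable error instead of somewhere deep inside the model.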
The act of sharing, and dealing with the consequences, leads to much better code than if you keep it to yourself.
Steve Eddins on 22 Jul 2025 (Edited on 22 Jul 2025)
@Mike Croucher, when you started this thread, back in June, it immediately caught my eye and my interest. I wanted to share my thoughts then, but I had just returned from a long trip, and I was about to embark on another long trip.
Anyway, I really appreciate your insights about this topic, and I also appreciate the thoughtful comments of your readers.
I would like to expand upon a couple of your points, based on my experience as a MathWorks software developer, as well as my experience in engineering algorithms research before that.
First, it has been my long experience that a research paper based on a significant software component is unlikely to be reliably reproducible unless the code is provided. Specifically, two independent researchers who attempt to reproduce the paper’s results, using only the information in the published paper, are unlikely to end up seeing exactly the same results. And sometimes those differences can be significant.
Here’s an example from Image Processing Toolbox history. In the late 1990s and early 2000s, I and others on the development team found that all the publicly available implementations of the famous Canny edge detector produced significantly different results from each other. With further investigation, we concluded that the variations were caused by some vague wording and missing details in the 1986 IEEE TrPAMI paper that everyone was citing. We eventually worked out exactly what the toolbox implementation should do, but it took a lot of time, and we had to consult Canny’s master’s thesis for clarification about certain aspects of the TrPAMI paper.
I saw this sort of thing happen so often with published research that I eventually added something about it to my interviewing questions when hiring someone at the Ph.D. level.
Second, successfully sharing your code for use by others greatly improves the likelihood that you will be able to reproduce your own work, should it become necessary. Inexperienced software developers, and sometimes even experienced ones, tend to:
  • Discount the need to reproduce one’s own work, sometimes years later, possibly in a different location and with different computer equipment
  • Overlook or forget about computer and OS dependencies, or data dependencies, or certain pieces of code, that are necessary for complete and accurate reproduction
I was lucky to learn this lesson early in my career, while I was still in graduate school. When I was near the end of writing my doctoral thesis, I found that I needed to modify and rerun some of my experiments from a couple of years earlier. I managed it, but it was unexpectedly challenging, and I was a bit lucky. This is one of the reasons that I started using software version control so early, several years before software development became my career.
Thanks, Mike, for prompting this discussion.
Note: I have also posted these thoughts on my personal blog.
xingxingcui on 30 Jun 2025 (Edited on 30 Jun 2025)
In recent years, I have open-sourced a large number of computer vision (CV) algorithms, covering many small application areas (which can be partially seen from my personal homepage). I am very interested in using MATLAB for algorithm research and exploration, and I then use Python/C++ to develop concrete practical applications. Sharing my work and code gives me more opportunities to connect with peers, showcase my abilities, and find like-minded partners. Even more, I hope to attract employers who recognize my value! On the other hand, due to work confidentiality requirements, there is a lot of excellent work and code that I have not open-sourced.
However, considering the job market in China, age (35+) seems to be a stumbling block, and the employment environment is very unfriendly to countless people like me😢. I eagerly look forward to some positive changes.
If the MathWorks community platform could offer some online job listings, I would be very willing to join.
YuGang on 28 Jun 2025
Since 2012, I have been engaged in research in the field of signal processing and time-frequency analysis. To date, I have published over 40 journal papers and shared more than 20 code packages, accumulating over 2,700 citations on Google Scholar. I have deep insights into this issue. Beyond the reasons you mentioned earlier, I believe there is another factor. I do believe that my work can advance research in fields such as mathematics, physics, engineering, chemistry, and medicine. Whenever I see my research being applied across different disciplines, it reaffirms the original aspiration that led me to become a scientist or engineer.
Walter Roberson on 26 Jun 2025
There is a segment of the population that looks for proof of skills. Having a non-trivial amount of open-source code is often counted as proof of skills (or at least of dedication), and can lead to employment offers.
On the other hand...
  • industries that depend upon material being kept confidential may worry that you might not respect confidentiality, so you might lose some opportunities
  • there might be a feeling that it might not be necessary to pay you, as you "give away your code for free"
goc3 on 26 Jun 2025
Everything that Mike says makes sense. Hopefully his advice will be helpful to many MATLAB users.
On the other hand, essentially all of the programming work that I have done for my salaried jobs cannot be published due to confidentiality.
Rik on 28 Jun 2025
Isn't confidential code generally shared internally (whatever that means in the specific context)?
I don't have to worry about the code I write as a hobby, so I share it publicly. When working on something confidential, that work product is shared with (a selection of) colleagues.
The citations part doesn't work the same way, but having your name on an internal function that everyone internally uses is still a form of acknowledgement and respect.
goc3 on 28 Jun 2025
Sure, the code I have written is stored on internal shared drives. But, most of the people where I work (or have worked) do not have MATLAB, let alone know how to use it or even to program. Also, most of the code I have written has been used to process and analyze large data sets—the results of which are documented in reports and may be the only thing (of my work) that most employees ever see. Perhaps I am just an outlier, though.
Adam on 23 Jul 2025
I'm also in a similar situation, where I have worked in MATLAB for almost 20 years, all in the same company, and (together with some other colleagues) have built up a very large library of code containing algorithms, data read/write, maths functions, GUIs, etc, etc, but none of it has ever been shared externally. Some of it could be, to be fair, if it isn't proprietary, but we never have done. If I'm doing it for a paid job I don't have the time to deal with any responses, bug reports or whatever from putting code on File Exchange, and often my code will have lots of dependencies too.
We reuse each other's code within our repository, but a lot of it does end up not peer-reviewed in any way. But as commercial research it could never be shared externally, obviously.