The Source Code Debate

Few researchers were using computers 30 years ago. This quickly changed with the release of several commercially viable personal computers in the 1980s. Since then, processing power has increased and the cost of computers has decreased at an exponential rate (see Moore’s Law).

It’s no surprise that computers are now pivotal in chemistry research. We use them in a wide range of calculations – from determining the 40th decimal place of the absolute energy of He to modeling the release and distribution of toxic chemicals in river basins. The software used to address these complex problems is also becoming increasingly accessible and easy to use. There are already a variety of cell phone apps for chemistry-related problem solving.

Yet, while the prevalence of software and computer-based research continues to grow, the rules for publishing results and sharing software lag behind. The seemingly magical nature of black-box calculations is disconcerting to anyone who wants to know how the answers were obtained (see Sidney Harris cartoon). Concern is growing in the scientific community about the sharing of software – and the underlying source code – necessary to reproduce published results. Two recent opinion pieces, one in Science titled “Shining Light into Black Boxes” and the other in Nature titled “The case for open computer programs”, are trying to bring attention to this issue. The articles discuss the advantages of and apprehensions about sharing, and suggest possible changes. Below is a summary of the points raised by the authors of the two articles, along with the thoughts of others (including my own).

Advantages to sharing software and source code:

  • Reproducibility: As stated by Ince et al., “The vagaries of hardware, software and natural-language will always ensure that exact reproducibility remains uncertain…” without the release of source code in its entirety.
  • Catching errors: A simple mistake, such as a botched unit conversion, missing values assigned as zero, a rounding error, or a misplaced decimal point, can wildly skew outcomes (see Office Space). We can only find and correct errors if we can see the source code.
  • Facilitating progress: Publications require that data, equations, materials, methods, and instrumentation be disclosed so that the results can be tested and furthered by others. We are all better served when source code is disseminated in a similar manner so that programs can be studied and repurposed in future research.
  • Teaching tools: Real, applied examples – that are relevant to research – are useful for new students and researchers learning to program and develop code.
  • Openness: Despite the competition to acquire funding and to publish first, we are all joined in the endeavor of understanding the rules that govern the universe. The open sharing of information has been and will continue to be the foundation of scientific progress.
  • Relying on faith: No matter how prolific or respected you are as a researcher, the implicit assertion, “Trust me, the program works the way I say it does” is not an acceptable means of justifying your results. On a fundamental philosophical level, black-box justifications like that should be socially unacceptable in the sciences.
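
To make the "Catching errors" point concrete, here is a minimal, made-up sketch of the "missing values assigned as zero" mistake. The readings and function names are hypothetical and do not come from either article; the point is only that the bug is obvious once you can read the code and invisible if you only see the final number.

```python
# Hypothetical concentration readings (mg/L); None marks a missing measurement.
readings = [4.0, 5.0, None, 6.0, None]


def mean_missing_as_zero(values):
    """Buggy approach: silently treats every missing reading as 0."""
    filled = [0.0 if v is None else v for v in values]
    return sum(filled) / len(filled)


def mean_ignoring_missing(values):
    """Safer approach: averages only the readings that actually exist."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


biased = mean_missing_as_zero(readings)     # sum 15.0 divided by 5 entries
correct = mean_ignoring_missing(readings)   # sum 15.0 divided by 3 entries
print(biased, correct)
```

With these invented numbers the two averages differ substantially, yet a paper reporting only the final value would give a reader no way to tell which function produced it.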

Apprehensions about sharing software and source code:

By May 4, 2012 7 comments science policy