Best Practices for Scientific Computing

Greg Wilson,D. A. Aruliah,C. Titus Brown,Neil Chue Hong,Matt Davis,Richard T. Guy,Steven H. D. Haddock,Kathryn D. Huff,Ian M. Mitchell,Mark D. Plumbley,Ben Waugh,Ethan P. White,Paul P. H. Wilson

Best Practices for Scientific Computing

2014

Scientists spend an increasing amount of time building and using software. However, most scientists are never taught how to do this efficiently. As a result, many are unaware of tools and practices that would allow them to write more reliable and maintainable code with less effort. We describe a set of best practices for scientific software development that have solid foundations in research and experience, and that improve scientists' productivity and the reliability of their software. Software is as important to modern scientific research as telescopes and test tubes. From groups that work exclusively on computational problems, to traditional laboratory and field scientists, more and more of the daily operation of science revolves around developing new algorithms, managing and analyzing the large amounts of data that are generated in single research projects, combining disparate datasets to assess synthetic problems, and other computational tasks. Scientists typically develop their own software for these purposes because doing so requires substantial domain-specific knowledge. As a result, recent studies have found that scientists typically spend 30% or more of their time developing software [1],[2]. However, 90% or more of them are primarily self-taught [1],[2], and therefore lack exposure to basic software development practices such as writing maintainable code, using version control and issue trackers, code reviews, unit testing, and task automation. We believe that software is just another kind of experimental apparatus [3] and should be built, checked, and used as carefully as any physical apparatus. However, while most scientists are careful to validate their laboratory and field equipment, most do not know how reliable their software is [4],[5]. This can lead to serious errors impacting the central conclusions of published research [6]: recent high-profile retractions, technical comments, and corrections because of errors in computational methods include papers in Science [7],[8], PNAS [9], the Journal of Molecular Biology [10], Ecology Letters [11],[12], the Journal of Mammalogy [13], Journal of the American College of Cardiology [14], Hypertension [15], and The American Economic Review [16]. In addition, because software is often used for more than a single project, and is often reused by other scientists, computing errors can have disproportionate impacts on the scientific process. This type of cascading impact caused several prominent retractions when an error from another group's code was not discovered until after publication [6]. As with bench experiments, not everything must be done to the most exacting standards; however, scientists need to be aware of best practices both to improve their own approaches and for reviewing computational work by others. This paper describes a set of practices that are easy to adopt and have proven effective in many research settings. Our recommendations are based on several decades of collective experience both building scientific software and teaching computing to scientists [17],[18], reports from many other groups [19]–, guidelines for commercial and open source software development [26],, and on empirical studies of scientific computing [28]–[31] and software development in general (summarized in [32]). None of these practices will guarantee efficient, error-free software development, but used in concert they will reduce the number of errors in scientific software, make it easier to reuse, and save the authors of the software time and effort that can used for focusing on the underlying scientific questions. Our practices are summarized in Box 1; labels in the main text such as “(1a)” refer to items in that summary. For reasons of space, we do not discuss the equally important (but independent) issues of reproducible research, publication and citation of code and data, and open science. We do believe, however, that all of these will be much easier to implement if scientists have the skills we describe. Box 1. Summary of Best Practices Write programs for people, not computers. A program should not require its readers to hold more than a handful of facts in memory at once. Make names consistent, distinctive, and meaningful. Make code style and formatting consistent. Let the computer do the work. Make the computer repeat tasks. Save recent commands in a file for re-use. Use a build tool to automate workflows. Make incremental changes. Work in small steps with frequent feedback and course correction. Use a version control system. Put everything that has been created manually in version control. Don't repeat yourself (or others). Every piece of data must have a single authoritative representation in the system. Modularize code rather than copying and pasting. Re-use code instead of rewriting it. Plan for mistakes. Add assertions to programs to check their operation. Use an off-the-shelf unit testing library. Turn bugs into test cases. Use a symbolic debugger. Optimize software only after it works correctly. Use a profiler to identify bottlenecks. Write code in the highest-level language possible. Document design and purpose, not mechanics. Document interfaces and reasons, not implementations. Refactor code in preference to explaining how it works. Embed the documentation for a piece of software in that software. Collaborate. Use pre-merge code reviews. Use pair programming when bringing someone new up to speed and when tackling particularly tricky problems. Use an issue tracking tool. Write Programs for People, Not Computers Scientists writing software need to write code that both executes correctly and can be easily read and understood by other programmers (especially the author's future self). If software cannot be easily read and understood, it is much more difficult to know that it is actually doing what it is intended to do. To be productive, software developers must therefore take several aspects of human cognition into account: in particular, that human working memory is limited, human pattern matching abilities are finely tuned, and human attention span is short [33]–[37]. First, a program should not require its readers to hold more than a handful of facts in memory at once (1a). Human working memory can hold only a handful of items at a time, where each item is either a single fact or a “chunk” aggregating several facts [33],[34], so programs should limit the total number of items to be remembered to accomplish a task. The primary way to accomplish this is to break programs up into easily understood functions, each of which conducts a single, easily understood, task. This serves to make each piece of the program easier to understand in the same way that breaking up a scientific paper using sections and paragraphs makes it easier to read. Second, scientists should make names consistent, distinctive, and meaningful (1b). For example, using non-descriptive names, like a and foo, or names that are very similar, like results and results2, is likely to cause confusion. Third, scientists should make code style and formatting consistent (1c). If different parts of a scientific paper used different formatting and capitalization, it would make that paper more difficult to read. Likewise, if different parts of a program are indented differently, or if programmers mix CamelCaseNaming and pothole_case_naming, code takes longer to read and readers make more mistakes [35],[36].

Keywords:

Correction
Source
Cite
Save
Machine Reading By IdeaReader

References

439

Citations