The only way to write good code is to write tons of shitty code first. Feeling shame about bad code stops you from getting to good code. - Hadley Wickham
There are several practices and guidelines for writing good code.
I will talk about styling and testing.
Non-adversarial, constructive, transparent, no rejection.
Helps disseminate best practice.
Builds a community of practice.
Book: rOpenSci Packages: Development, Maintenance, and Peer Review
Code smells: structures in code that suggest refactoring. - Martin Fowler
Refactoring means changing code to make it easier to understand and modify, without changing its behaviour.
Code smells create cognitive load, and code that has them is more likely to contain errors.
The Programmer’s Brain by Felienne Hermans
Code smell: linguistic antipattern
The code does something different from what the names involved suggest.
Example: a variable is_valid that contains an integer (the name suggests a boolean).
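A minimal Python sketch of this smell. The records and the replacement names (n_invalid_records, all_records_valid) are hypothetical, invented only for illustration:

```python
# Hypothetical data: each record is a dict with an "age" field.
records = [{"age": 34}, {"age": -1}, {"age": 52}]

# Smell: the name promises a boolean, but the value is an integer count.
is_valid = sum(1 for r in records if r["age"] < 0)

# Clearer: each name says what its value actually is.
n_invalid_records = sum(1 for r in records if r["age"] < 0)
all_records_valid = n_invalid_records == 0
```

A reader who sees `if is_valid:` would expect the branch to run for good data, yet here it runs when at least one record is bad.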
Variable and function names should use only lowercase letters, numbers, and underscores.
Variable names should be nouns and function names should be verbs.
Avoid re-using the names of common functions and variables.
Strive for names that are concise and meaningful (this is not easy!).
Agree on one name mold (a pattern for combining the words in a name) and use it consistently across your code base.
The Programmer’s Brain by Felienne Hermans, The tidyverse style guide
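The guidelines above come from R sources, but they carry over to other languages. A small Python sketch, in which compute_mean and the temperature data are hypothetical names chosen for illustration:

```python
# Smell (left commented out): capitalised function name that is a noun,
# and a parameter called "list" that hides Python's built-in list type.
# def Mean(list): ...

def compute_mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)

# Variables are nouns; the function is a verb; all names are snake_case.
daily_temperatures = [18.5, 21.0, 19.5]
mean_temperature = compute_mean(daily_temperatures)
```

Both variables follow the same pattern (qualifier, then quantity), which is one way to read "be consistent".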
Style Guides: conventions for writing code.
In R, the styler and lintr packages help apply and check a style guide.
How to be sure our code produces reliable results? We can’t - not completely - but we can test its behavior against our expectations to decide if we are sure enough. - Research Software Engineering with Python: Building Software that Makes Research Possible
Assume that mistakes will happen and guard against them (defensive programming).
Assertions: used to catch errors. We add assertions to our code so that it checks itself as it runs (Python’s assert statement or R’s stopifnot function).
A precondition is something that must be true at the start of a function in order for it to work correctly.
A postcondition is something that the function guarantees is true when it finishes.
An invariant is something that is true for every iteration in a loop.
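The three kinds of assertion can be sketched in one Python function. normalize is a hypothetical example, not taken from any package, and the 1e-9 tolerance is an arbitrary choice to absorb floating-point rounding:

```python
def normalize(values):
    """Scale a list of non-negative numbers so that they sum to 1."""
    # Preconditions: must hold on entry for the function to work.
    assert len(values) > 0, "values must not be empty"
    assert all(v >= 0 for v in values), "values must be non-negative"
    total = sum(values)
    assert total > 0, "values must not all be zero"

    result = []
    running_total = 0.0
    for v in values:
        result.append(v / total)
        running_total += v / total
        # Invariant: checked on every iteration of the loop.
        assert running_total <= 1.0 + 1e-9, "partial sum exceeded 1"

    # Postcondition: what the function guarantees on exit.
    assert abs(sum(result) - 1.0) < 1e-9, "result must sum to 1"
    return result
```

Calling normalize([]) stops with an AssertionError at the first precondition instead of failing later with a less helpful division error.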
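Unit Testing: used to prevent errors. A unit test checks the correctness of a single unit of software, for example a function.
A unit test will typically have a fixture (the input being tested), an actual result (what the code produces from the fixture), and an expected result (what the actual result is compared against).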
We request that packages have a test suite, preferably with unit tests for all functions, ensuring key functionality is covered (75% test coverage).
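A minimal Python sketch of one unit test. count_words is a hypothetical function invented for illustration; the comments mark the fixture, the expected result, and the actual result:

```python
def count_words(text):
    """Return a dict mapping each lowercase word in text to its frequency."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts

def test_count_words_ignores_case():
    # Fixture: the input handed to the unit under test.
    fixture = "The cat saw the other cat"
    # Expected result: worked out by hand, not by running the code.
    expected = {"the": 2, "cat": 2, "saw": 1, "other": 1}
    # Actual result: what the code produces from the fixture.
    actual = count_words(fixture)
    assert actual == expected

test_count_words_ignores_case()
```

The test is silent when it passes and raises an AssertionError when the actual and expected results differ.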
Integration Testing: unit tests give us some confidence that our units of code work in isolation. Integration testing checks that they work correctly together.
Integration tests are structured the same way as unit tests: a fixture is used to produce an actual result that is compared against the expected result.
However, creating the fixture and running the code can be considerably more complicated.
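A sketch of an integration test in Python, assuming two hypothetical units (read_words and most_common) that are meant to work together. The fixture is a real file in a temporary directory, which already takes more setup than a plain string:

```python
import os
import tempfile

def read_words(path):
    """Unit 1: read a text file and return its words in lowercase."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().lower().split()

def most_common(words):
    """Unit 2: return the most frequent word (the first one seen wins ties)."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return max(counts, key=counts.get)

def test_most_common_word_in_file():
    # Fixture: a file on disk, created and removed around the test.
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sample.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("Tests help\ntests catch errors\n")
        # Actual result: both units run together.
        actual = most_common(read_words(path))
    expected = "tests"
    assert actual == expected

test_most_common_word_in_file()
```

The structure matches a unit test; only the fixture and the amount of code exercised are larger.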
Testing Frameworks: help to run and manage many unit tests.
The Testing chapter of the rOpenSci Dev Guide recommends several other packages if you need to test database access, plot creation, or interaction with web resources, among other tasks.
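As an illustration of what a framework adds, here is a sketch using unittest from the Python standard library (in R, testthat plays a similar role). to_celsius is a hypothetical function invented for this example:

```python
import unittest

def to_celsius(fahrenheit):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9

class TestToCelsius(unittest.TestCase):
    def test_freezing_point(self):
        self.assertEqual(to_celsius(32), 0)

    def test_boiling_point(self):
        self.assertEqual(to_celsius(212), 100)

    def test_rejects_text(self):
        # Subtracting a number from a string raises TypeError.
        with self.assertRaises(TypeError):
            to_celsius("hot")

# The framework collects every test in the class, runs them all,
# and gathers the outcomes in one result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestToCelsius)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Unlike bare assert statements, the runner keeps going after a failure and reports all tests in one summary.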
Continuous integration (CI): runs tests automatically whenever a change is made.
It tells developers immediately if changes have caused problems.
It can be set up to run tests with several different configurations of the software or on several different operating systems.
The rOpenSci Dev Guide has a chapter on Continuous Integration Best Practices.
R materials:
rOpenSci Packages: Development, Maintenance, and Peer Review
Testing in R: testthat, swamp, HTTP testing in R
Python materials:
General material:
Thanks to Greg Wilson for the Python material - yabellini@ropensci.org